Ken Muse

Decoding Binary Data in Swift


This is a post in the series Building a Workout App for watchOS in Swift.

We’ve now seen how to implement the basic shell of a decoder. We still need to implement the actual decoding logic. While we could embed all of our processing logic directly into the decoder implementation, some things are best handled by encapsulation. To do that, we’ll create a new class specifically to handle decoding the data. We’ll then integrate it into the earlier work. This allows us to separate the responsibilities for incrementally reading a data stream from the core implementation of the Decoder.

Introducing the BinaryDataReader

First, we want to make a class to encapsulate the state of the processing. It will hold a reference to the data we’re decoding, as well as the current position in that data. We’ll want this class to automatically advance the position as we read data. It will also need to know how to read the data for different variable types. While we could add other decoding functionality, I want this class to have a single purpose: providing a way to read the binary data into data types.

The code will use a class so that we can pass around the instance by reference. We’ll call this class BinaryDataReader and initialize it using a Data:

import Foundation

public class BinaryDataReader {
  // The data being decoded
  let data: Data
  init(_ data: Data) {
    self.data = data
  }
}

Last week, we created an unkeyed container. That container needs to know the current position in the data and whether we’ve reached the end of the data. Let’s expose those values:

public private(set) var currentIndex: Int = 0
public var isAtEnd: Bool {
  return currentIndex == data.count
}

Why am I using public private(set)? This is a Swift feature that allows the property to be read from anywhere, but only set from within the class. This allows the value to be exposed without making it mutable from outside the class.

Sometimes, when reading binary data we may need to skip over content that will not be used. Let’s give the class the ability to skip some amount of the data. To do that, we’ll simply attempt to increment the index so that any future processing begins at a different offset.

public func skip(withLength: Int) throws {
  let position = currentIndex + withLength
  guard position <= data.count else {
    throw BinaryDecoderError.prematureEndOfData
  }
  currentIndex = position
}

I decided to do a bit of error checking. If the code attempts to skip more data than is available, I want to throw an error. This will help us catch bugs in our code. To do that, I’m declaring an error type. Let’s define an enum to represent our possible errors:

enum BinaryDecoderError: Error {
  case prematureEndOfData
  case attemptToReuse
  case boolOutOfRange(UInt8)
}

Fun tip: we can provide a localized description for the error. This is useful for debugging and for providing feedback to the user. That requires an extension to the enum to implement the LocalizedError protocol:

extension BinaryDecoderError: LocalizedError {
  var errorDescription: String? {
    switch self {
    case .prematureEndOfData:
      return NSLocalizedString("Attempt to read past the end of the data", comment: "")
    case .attemptToReuse:
      return NSLocalizedString("Additional reads are not supported", comment: "")
    case let .boolOutOfRange(value):
      return NSLocalizedString("Boolean value out of range: \(value)", comment: "")
    }
  }
}

Localization is a MUCH bigger topic, so I won’t dive into that discussion at this point. In practice, things like string interpolation are a bit more complex than what I’m showing here. In addition, you’ll want to make sure that you have localized .strings files to support multiple languages. Just know that these features are available if you need them.
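As a quick check of how these descriptions surface at runtime, here is a minimal sketch. It redeclares a cut-down, hypothetical SketchError so the snippet stands on its own, and it swaps the interpolation for String(format:) so the lookup key stays constant. It also assumes no .strings file is present, in which case NSLocalizedString simply returns the key.

```swift
import Foundation

// Hypothetical stand-in for BinaryDecoderError, reduced to two cases.
enum SketchError: Error {
  case prematureEndOfData
  case boolOutOfRange(UInt8)
}

extension SketchError: LocalizedError {
  var errorDescription: String? {
    switch self {
    case .prematureEndOfData:
      return NSLocalizedString("Attempt to read past the end of the data", comment: "")
    case let .boolOutOfRange(value):
      // The key is a fixed format string; only the argument varies.
      let format = NSLocalizedString("Boolean value out of range: %ld", comment: "")
      return String(format: format, Int(value))
    }
  }
}

let message = SketchError.boolOutOfRange(7).errorDescription
print(message ?? "no description")
```

With a constant key, a translator only has to supply one entry per message rather than one per possible value.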

In some cases (such as strings), we may need to read some or all of the remaining data. We’ll add a special method for that as well.

public func readBytes(withLength: Int?) throws -> Data {
  // Calculate the last position to return. If no length is provided, go to the end of the data.
  let end = withLength.map { currentIndex + $0 } ?? data.count

  // If the end position is beyond the end of the data, throw an error.
  guard end <= data.count else {
    throw BinaryDecoderError.prematureEndOfData
  }

  // Create a slice of the data representing the requested portion
  let result = data[currentIndex..<end]

  // Increment the current position
  currentIndex = end

  // Return the data
  return result
}
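One detail about the slice returned above is worth seeing in isolation: a slice of a Data value keeps the indices of the data it came from. The following sketch uses made-up bytes to show the behavior and one way to get zero-based indices back.

```swift
import Foundation

let data = Data([0x0A, 0x0B, 0x0C, 0x0D])

// Slicing a Data value yields another Data that keeps the parent's indices.
let slice = data[2..<4]
print(slice.startIndex)   // 2
print(slice.count)        // 2

// Index relative to startIndex instead of assuming the slice starts at 0.
let firstByte = slice[slice.startIndex]   // 0x0C

// Copying into a new Data value gives zero-based indices again.
let rebased = Data(slice)
print(rebased.startIndex) // 0
```

Callers that iterate over the result, or pass it to an initializer that takes a collection, are unaffected. Only code that subscripts with literal offsets needs to care.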

Now we can return an arbitrary number of bytes (or all of the bytes) in a Data starting at the current position. Reads after that will begin from the next position. Now, this class just needs a way to read the data and convert it to a type dynamically. Thankfully, Swift has made this easier with the withUnsafeBytes method. Essentially, this lets us read a set of bytes from the underlying data instance and use the memory directly. In the past, you had to be careful to align the bytes correctly to avoid exceptions, but the loadUnaligned method can handle that for us. The implementation looks like this:

public func read<T>() throws -> T {
  // Get the number of bytes occupied by the type T
  let typeSize = MemoryLayout<T>.size

  // Ensure the bytes that contain the next value are available
  guard (self.currentIndex + typeSize) <= data.count else {
    throw BinaryDecoderError.prematureEndOfData
  }

  // Read the data into a value of type T
  let value: T = data.withUnsafeBytes {
    return $0.loadUnaligned(fromByteOffset: currentIndex, as: T.self)
  }

  // Move the cursor to the next position
  self.currentIndex += typeSize

  // Return the value
  return value
}

We now have a generic way to read a set of bytes and convert them into a value of any fixed-size type.
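To see the unaligned load on its own, here is a minimal sketch using a hypothetical free function, loadLittleEndian, rather than the reader class. It assumes the bytes are stored little-endian, which is an assumption for this example, and it limits itself to integer types so that it can normalize the byte order.

```swift
import Foundation

// Hypothetical helper that mirrors the bounds check and unaligned load.
// Assumes little-endian storage; that is an assumption of this sketch.
func loadLittleEndian<T: FixedWidthInteger>(_ type: T.Type, from data: Data, at offset: Int) -> T? {
  // Refuse to read if the value would run past the end of the data.
  guard offset >= 0, offset + MemoryLayout<T>.size <= data.count else {
    return nil
  }
  let raw = data.withUnsafeBytes { (buffer: UnsafeRawBufferPointer) -> T in
    buffer.loadUnaligned(fromByteOffset: offset, as: T.self)
  }
  // Normalize from little-endian storage to the host's byte order.
  return T(littleEndian: raw)
}

let payload = Data([0x01, 0x34, 0x12, 0xFF])
let level = loadLittleEndian(UInt16.self, from: payload, at: 1)   // 0x1234, read from an odd offset
let flags = loadLittleEndian(UInt8.self, from: payload, at: 3)    // 0xFF
let tooFar = loadLittleEndian(UInt32.self, from: payload, at: 2)  // nil, only two bytes remain
```

The UInt16 read starts at offset 1, which is not a multiple of two. That is exactly the case loadUnaligned exists to handle.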

The Single Use Binary Data Reader

I mentioned in the previous post that the SingleValueDecodingContainer expects to be limited to reading a single value. If we directly use this reader, we run the risk that code might attempt to read a value a second time, incrementing the reader. We could attempt to wrap all of the individual calls where this might happen, but I like to minimize repetition (i.e., Don’t Repeat Yourself). Instead, let’s define a protocol that we can use to represent an arbitrary reader:

public protocol BinaryDataReaderProtocol {
  var currentIndex: Int { get }
  var isAtEnd: Bool { get }

  func skip(withLength: Int) throws
  func read<T>() throws -> T
  func readBytes(withLength: Int?) throws -> Data
}

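BinaryDataReader already implements every one of these members, so it only needs to declare conformance to the protocol. Now, we can create a second class that conforms to this protocol. It will use an existing BinaryDataReader instance to read the data, but it will ensure that only a single read operation happens: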

public class SingleUseBinaryDataReader: BinaryDataReaderProtocol {

  let reader: BinaryDataReaderProtocol
  var isUsed = false
  init(_ reader: BinaryDataReaderProtocol) {
    self.reader = reader
  }

  public var currentIndex: Int {
    return reader.currentIndex
  }

  public var isAtEnd: Bool {
    return isUsed
  }

  public func skip(withLength: Int) throws {
    try checkIfPreviouslyUsed()
    // Forward the skip so the wrapped reader actually advances
    try reader.skip(withLength: withLength)
  }

  public func read<T>() throws -> T {
    try checkIfPreviouslyUsed()
    return try reader.read()
  }

  public func readBytes(withLength: Int? = nil) throws -> Data {
    try checkIfPreviouslyUsed()
    return try reader.readBytes(withLength: withLength)
  }

  // Throw if an operation has already happened; otherwise, mark the reader as used
  func checkIfPreviouslyUsed() throws {
    if isUsed {
      throw BinaryDecoderError.attemptToReuse
    }
    isUsed = true
  }
}

This class simply wraps an existing reader to ensure that no reads are allowed after the initial read. Because it wraps an existing reader, it allows that reader to be safely used within a SingleValueDecodingContainer.
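The wrapping pattern can be exercised in a few lines. The sketch below uses hypothetical, trimmed-down stand-ins (ByteSource, SketchReader, SingleUseSource) that read one byte at a time. They are not the article's classes, only the same idea at a smaller scale.

```swift
// Hypothetical, reduced versions of the protocol, reader, and wrapper.
enum SketchError: Error { case prematureEndOfData, attemptToReuse }

protocol ByteSource {
  func readByte() throws -> UInt8
}

final class SketchReader: ByteSource {
  private let bytes: [UInt8]
  private(set) var index = 0
  init(_ bytes: [UInt8]) { self.bytes = bytes }

  func readByte() throws -> UInt8 {
    guard index < bytes.count else { throw SketchError.prematureEndOfData }
    defer { index += 1 }   // advance after the byte has been returned
    return bytes[index]
  }
}

final class SingleUseSource: ByteSource {
  private let wrapped: any ByteSource
  private var isUsed = false
  init(_ wrapped: any ByteSource) { self.wrapped = wrapped }

  func readByte() throws -> UInt8 {
    // Only the first call is forwarded to the wrapped reader.
    guard !isUsed else { throw SketchError.attemptToReuse }
    isUsed = true
    return try wrapped.readByte()
  }
}

let shared = SketchReader([0x2A, 0x2B])
let single = SingleUseSource(shared)

let first = try single.readByte()          // 0x2A
var secondReadFailed = false
do {
  _ = try single.readByte()
} catch SketchError.attemptToReuse {
  secondReadFailed = true                  // the wrapper refused the second read
}
let next = try shared.readByte()           // 0x2B, the shared reader advanced only once
```

Because the wrapper and the shared reader refer to the same underlying instance, the position the wrapper consumed is visible to whoever reads next.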

The Binary Decoder

To use these classes in our decoder, we just need to add a new initializer.

public class BinaryDecoder: Decoder {
  // Other methods omitted for brevity

  // Store the instance of the assigned reader
  let reader: BinaryDataReaderProtocol

  // Initialize the instance by converting an array of bytes to
  // Data and use that to construct the BinaryDataReader
  public init(data: [UInt8]) {
    reader = BinaryDataReader(Data(data))
  }

  // Initialize the decoder instance with an existing reader and its data
  init(reader: BinaryDataReaderProtocol) {
    self.reader = reader
  }

  // Expose a value indicating whether the reader has any more data
  // available to process
  var isAtEnd: Bool {
    return reader.isAtEnd
  }

  // Expose the current position of the data within the reader
  var currentIndex: Int {
    return reader.currentIndex
  }
}

With these elements in place, we can now implement the decoding. First, let’s implement the generic decode method:

func decode<T: Decodable>(_ type: T.Type) throws -> T {
  switch type {
    case is any BinaryInteger.Type:
      // Integer types can be read directly from the bytes
      return try reader.read()
    default:
      // Every other type decodes itself using this decoder
      return try T(from: self)
  }
}
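The metatype check in that switch can be tried on its own. This is a small sketch with a hypothetical route function that only reports which branch a type would take. It assumes Swift 5.7 or later for the any BinaryInteger.Type spelling.

```swift
// Hypothetical helper: reports the branch chosen for a given type.
func route<T>(_ type: T.Type) -> String {
  switch type {
  case is any BinaryInteger.Type:
    return "read directly"
  default:
    return "decode via init(from:)"
  }
}

let forUInt16 = route(UInt16.self)   // "read directly"
let forInt = route(Int.self)         // "read directly"
let forDouble = route(Double.self)   // "decode via init(from:)"
let forString = route(String.self)   // "decode via init(from:)"
```

Floating-point types and strings fall through to the default branch, which is why they still need their own handling elsewhere.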

The generic decode method provides the entry point for any Decodable type. Now, we just need to add the remaining overloads for decode. Because read() is generic over its return type, Swift infers the type to read from each method’s signature, so most of the bodies are a single line. Here are the remaining methods.

func decode(_ type: Bool.Type) throws -> Bool {
  switch try decode(UInt8.self) {
    case 0: return false
    case 1: return true
    case let value: throw BinaryDecoderError.boolOutOfRange(value)
  }
}

func decode(_ type: Float.Type) throws -> Float {
  return try reader.read()
}

func decode(_ type: Double.Type) throws -> Double {
  return try reader.read()
}

func decode(_ type: Int8.Type) throws -> Int8 {
  return try reader.read()
}

func decode(_ type: Int16.Type) throws -> Int16 {
  return try reader.read()
}

func decode(_ type: Int32.Type) throws -> Int32 {
  return try reader.read()
}

func decode(_ type: Int64.Type) throws -> Int64 {
  return try reader.read()
}

func decode(_ type: Int.Type) throws -> Int {
  return try reader.read()
}

func decode(_ type: UInt8.Type) throws -> UInt8 {
  return try reader.read()
}

func decode(_ type: UInt16.Type) throws -> UInt16 {
  return try reader.read()
}

func decode(_ type: UInt32.Type) throws -> UInt32 {
  return try reader.read()
}

func decode(_ type: UInt64.Type) throws -> UInt64 {
  return try reader.read()
}

func decode(_ type: UInt.Type) throws -> UInt {
  return try reader.read()
}

func decode(_ type: String.Type) throws -> String {
  // Read all of the remaining bytes and convert them to a string
  let data = try reader.readBytes(withLength: nil)
  return String(decoding: data, as: UTF8.self)
}
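The String overload relies on String(decoding:as:), which never fails: bytes that are not valid UTF-8 are replaced with U+FFFD. The sketch below, using made-up bytes, contrasts that with the failable String(data:encoding:) initializer in case a stricter check is ever wanted.

```swift
import Foundation

let valid = Data([0x48, 0x69])     // the UTF-8 bytes for "Hi"
let invalid = Data([0x48, 0xFF])   // 0xFF never appears in valid UTF-8

// Always produces a String, repairing invalid bytes with U+FFFD.
let text = String(decoding: valid, as: UTF8.self)
let repaired = String(decoding: invalid, as: UTF8.self)   // "H\u{FFFD}"

// Returns nil instead of repairing when the bytes are not valid UTF-8.
let strict = String(data: invalid, encoding: .utf8)       // nil
```

For device data, silently repaired text may be acceptable. If corrupt input should surface as an error instead, the failable initializer gives a place to throw.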

And that’s it! We now have all of the code required for a basic binary decoder. We can use this decoder to help convert binary data from Bluetooth devices into Swift data types. In addition, we can expand this decoder to support more types if needed (Bluetooth defines a few unique types, including some special string and float encodings). We won’t need that for the current project, but now that you understand the basics, you are prepared for when you do!