- Nick Daigler
Parsing XML with Swift
If you're just here for the code, go here.
I recently needed to parse an RSS feed and display its podcast episodes in a list. I typed "xml parser apple docs" into Google and found XMLParser, the XML parser included in Apple's Foundation framework. I was immediately overjoyed, but became less enthusiastic when I realized the API left me wanting.
I wanted a way of transforming XML data into a dictionary. In this post, I'll walk you through how I was able to implement this.
You interact with XMLParser via delegate methods. Below are the three XMLParserDelegate methods I used.
func parser(_ parser: XMLParser, didStartElement elementName: String, namespaceURI: String?, qualifiedName qName: String?, attributes attributeDict: [String : String] = [:])
func parser(_ parser: XMLParser, foundCharacters string: String)
func parser(_ parser: XMLParser, didEndElement elementName: String, namespaceURI: String?, qualifiedName qName: String?)
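The high-level idea is to build up the dictionary as we parse each element. This problem becomes simpler if you think about the parallels between parsing XML and running a DFS on a tree: the parser reports the start of an element when it enters a node, and the end of that element only after all of the node's children have been visited.
Here are the steps we need to take when we begin parsing an element: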
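To make the DFS parallel concrete, here is a minimal sketch that records the order in which the start and end callbacks fire. The TraceDelegate class and the sample XML are made up for illustration and are not part of my original code.

```swift
import Foundation
#if canImport(FoundationXML)
import FoundationXML  // XMLParser lives in this module on Linux
#endif

// Hypothetical helper: records start/end callbacks, indented by depth.
final class TraceDelegate: NSObject, XMLParserDelegate {
    private(set) var events: [String] = []
    private var depth = 0

    func parser(_ parser: XMLParser, didStartElement elementName: String,
                namespaceURI: String?, qualifiedName qName: String?,
                attributes attributeDict: [String: String] = [:]) {
        // Entering a node: record it, then go one level deeper
        events.append(String(repeating: "  ", count: depth) + "enter " + elementName)
        depth += 1
    }

    func parser(_ parser: XMLParser, didEndElement elementName: String,
                namespaceURI: String?, qualifiedName qName: String?) {
        // Leaving a node: all of its children have already been reported
        depth -= 1
        events.append(String(repeating: "  ", count: depth) + "leave " + elementName)
    }
}

let traceXML = "<rss><channel><title>Show</title><item/></channel></rss>"
let traceParser = XMLParser(data: Data(traceXML.utf8))
let trace = TraceDelegate()
traceParser.delegate = trace
traceParser.parse()
trace.events.forEach { print($0) }
```

Every "enter" is matched by a "leave" only after the element's children have been reported, which is why a stack is a natural fit for tracking where we are in the document.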
Get a reference to the parent dictionary from a stack of dictionaries (the parent will be the dictionary most recently pushed onto the stack).
Generate a child dictionary.
Either group the element we're parsing into an array if it appears more than once, or set the element's name in the parent to the child we generate.
Finally, push the child we've generated onto the stack.
This is what this looks like in Swift:
public func parser(_ parser: XMLParser, didStartElement elementName: String, namespaceURI: String?, qualifiedName qName: String?, attributes attributeDict: [String : String] = [:]) {
    // Get the parent
    guard let parent = stack.last else {
        fatalError("The stack should always contain at least the root dictionary")
    }
    // Make a new child
    let child = NSMutableDictionary()
    if !attributeDict.isEmpty {
        child.setObject(attributeDict, forKey: XMLToDict.kAttributes as NSString)
    }
    // Group elements as an array if this element name already has a value in the parent
    if let existingValue = parent[elementName] {
        let array: NSMutableArray
        if let currentArray = existingValue as? NSMutableArray {
            array = currentArray
        } else {
            // Make a new array if we don't already have one
            array = NSMutableArray()
            array.add(existingValue)
            parent[elementName] = array
        }
        // Make sure we put our new child into this array
        array.add(child)
    } else {
        // In the case where we haven't seen this elementName yet, put it into the parent.
        // The parent is a reference type, so there is no need to write it back to the stack.
        parent[elementName] = child
    }
    // We need to push our new child onto the stack so we can track it
    stack.append(child)
}
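The grouping rule is easier to see in isolation. Below is a minimal sketch of just that rule using plain Swift dictionaries; the insertGrouping helper and the sample values are hypothetical and only meant to illustrate the idea.

```swift
// Hypothetical helper (not from the code above): store `value` under `key`,
// promoting the entry to an array the second time the key is seen.
func insertGrouping(_ value: Any, forKey key: String, into dict: inout [String: Any]) {
    guard let existing = dict[key] else {
        // First time we see this key: store the value directly
        dict[key] = value
        return
    }
    if var array = existing as? [Any] {
        // Already grouped: append to the existing array
        array.append(value)
        dict[key] = array
    } else {
        // Second occurrence: wrap the old and new values in an array
        dict[key] = [existing, value]
    }
}

var channel: [String: Any] = [:]
insertGrouping("Show", forKey: "title", into: &channel)
insertGrouping("Episode 1", forKey: "item", into: &channel)
insertGrouping("Episode 2", forKey: "item", into: &channel)
insertGrouping("Episode 3", forKey: "item", into: &channel)
```

Note that this sketch uses value types, so it would not work as a drop-in replacement inside the delegate: the delegate version relies on NSMutableDictionary and NSMutableArray being reference types, so the child pushed onto the stack is the very same object stored in the parent.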
While we're parsing an element, the parser calls another delegate method whenever it finds text. We need to keep track of the text found while parsing a given element so we can handle it once that element has been fully parsed.
Below is how we track the strings found while the XMLParser is parsing an element.
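func parser(_ parser: XMLParser, foundCharacters string: String) {
    // Note: the parser may deliver an element's text in several chunks, so trimming
    // each chunk can also drop meaningful whitespace at a chunk boundary
    let trimmedString = string.trimmingCharacters(in: .whitespacesAndNewlines)
    // textBeingProcessed is an instance variable on the class used to implement this functionality
    textBeingProcessed.append(trimmedString)
}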
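One thing worth knowing here is that the parser is free to report an element's text across several calls, which is why we append rather than assign. The sketch below, with a made-up TextCollector class, collects the raw chunks and trims only once at the end; this is a variation on the approach above, not what my code does.

```swift
import Foundation
#if canImport(FoundationXML)
import FoundationXML  // XMLParser lives in this module on Linux
#endif

// Hypothetical helper: keeps every chunk exactly as the parser delivers it.
final class TextCollector: NSObject, XMLParserDelegate {
    private(set) var chunks: [String] = []

    func parser(_ parser: XMLParser, foundCharacters string: String) {
        chunks.append(string)
    }
}

let titleXML = "<title>  Tom &amp; Jerry  </title>"
let titleParser = XMLParser(data: Data(titleXML.utf8))
let collector = TextCollector()
titleParser.delegate = collector
titleParser.parse()

// Join first, trim second, so whitespace between chunks is preserved
let title = collector.chunks.joined().trimmingCharacters(in: .whitespacesAndNewlines)
print(title)
```

How many chunks arrive is up to the parser, so the code should not depend on that number.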
Finally, when we finish parsing an element, we want to take the following steps:
Find a reference to the dictionary that represents the element that was just parsed.
Find a reference to the current element's parent.
If we found text while parsing this element and the element's dictionary is not empty (it has attributes or children), store the text in that dictionary under a dedicated text key. Otherwise, if we found text and the element's dictionary is empty, there are two possibilities: the parent's entry for this element name is an array (the element repeats), or it is the single placeholder dictionary we created when the element started. In both cases, we replace the placeholder with the text we found.
If we did not find text and the element's dictionary is empty, remove the element's placeholder from the parent so that empty elements do not appear in the result.
Finally, we need to reset state before beginning to parse the next element. This means we need to clear the text found while processing the last element and pop the current element from the stack.
This is what this looks like in Swift:
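func parser(_ parser: XMLParser, didEndElement elementName: String, namespaceURI: String?, qualifiedName qName: String?) {
    // Grab the element that was just parsed, making sure its parent is on the stack too
    guard stack.count >= 2, let lastDict = stack.last else {
        fatalError("Expected the finished element and its parent on the stack")
    }
    // Grab the parent
    let parent = stack[stack.endIndex - 2]
    if textBeingProcessed.count > 0 {
        if lastDict.count > 0 {
            // The element also has attributes or children, so keep the text under its own key
            lastDict[XMLToDict.kText] = textBeingProcessed
        } else if let array = parent[elementName] as? NSMutableArray {
            // Repeated element: swap the empty placeholder for the text
            array.removeLastObject()
            array.add(textBeingProcessed)
        } else {
            parent[elementName] = textBeingProcessed
        }
    } else if lastDict.count == 0 {
        // Empty element with no text: drop its placeholder
        if let array = parent[elementName] as? NSMutableArray {
            // Only remove this element, not its earlier siblings with the same name
            array.removeLastObject()
        } else {
            parent.removeObject(forKey: elementName)
        }
    }
    // Reset state before parsing the next element
    textBeingProcessed = ""
    stack.removeLast()
}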
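The three methods above share some state (the stack and the text buffer) that I haven't shown. Here is a minimal sketch of how the pieces might fit together in one class. The class name XMLDictionaryBuilder, the key strings, and the parse(_:) method are assumptions made for this sketch, and unlike the code above it trims the text once at the end of each element.

```swift
import Foundation
#if canImport(FoundationXML)
import FoundationXML  // XMLParser lives in this module on Linux
#endif

// Hypothetical wrapper: the class name, key strings, and parse(_:) are assumptions.
final class XMLDictionaryBuilder: NSObject, XMLParserDelegate {
    static let kAttributes = NSString(string: "attributes")
    static let kText = NSString(string: "text")

    private var stack: [NSMutableDictionary] = []
    private var textBeingProcessed = ""

    func parse(_ data: Data) -> NSDictionary? {
        let root = NSMutableDictionary()
        stack = [root]  // the root acts as the parent of the top-level element
        textBeingProcessed = ""
        let parser = XMLParser(data: data)
        parser.delegate = self
        return parser.parse() ? root : nil
    }

    func parser(_ parser: XMLParser, didStartElement elementName: String,
                namespaceURI: String?, qualifiedName qName: String?,
                attributes attributeDict: [String: String] = [:]) {
        guard let parent = stack.last else { return }
        let key = NSString(string: elementName)
        let child = NSMutableDictionary()
        if !attributeDict.isEmpty {
            child.setObject(attributeDict, forKey: Self.kAttributes)
        }
        if let existing = parent.object(forKey: key) {
            if let array = existing as? NSMutableArray {
                array.add(child)
            } else {
                // Second occurrence of this name: promote the entry to an array
                let array = NSMutableArray()
                array.add(existing)
                array.add(child)
                parent.setObject(array, forKey: key)
            }
        } else {
            parent.setObject(child, forKey: key)
        }
        stack.append(child)
    }

    func parser(_ parser: XMLParser, foundCharacters string: String) {
        textBeingProcessed.append(string)
    }

    func parser(_ parser: XMLParser, didEndElement elementName: String,
                namespaceURI: String?, qualifiedName qName: String?) {
        guard stack.count >= 2, let current = stack.last else { return }
        let parent = stack[stack.count - 2]
        let key = NSString(string: elementName)
        let text = textBeingProcessed.trimmingCharacters(in: .whitespacesAndNewlines)
        let siblings = parent.object(forKey: key) as? NSMutableArray
        if !text.isEmpty {
            if current.count > 0 {
                current.setObject(text, forKey: Self.kText)
            } else if let siblings = siblings {
                siblings.removeLastObject()  // swap the placeholder for the text
                siblings.add(text)
            } else {
                parent.setObject(text, forKey: key)
            }
        } else if current.count == 0 {
            // Empty element: drop only its own placeholder
            if let siblings = siblings {
                siblings.removeLastObject()
            } else {
                parent.removeObject(forKey: key)
            }
        }
        textBeingProcessed = ""
        stack.removeLast()
    }
}

let feedXML = "<rss><channel><title>Show</title><item><title>One</title></item><item><title>Two</title></item></channel></rss>"
let feed = XMLDictionaryBuilder().parse(Data(feedXML.utf8))
let feedChannel = (feed?.object(forKey: "rss") as? NSDictionary)?.object(forKey: "channel") as? NSDictionary
```

Seeding the stack with a root dictionary means the top-level element always has a parent, so the start and end handlers need no special case for it.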
I hope this explanation helps someone. If not, I hope the code does. The GitHub gist will show you how I put all of these pieces together into a nice API.
Grab the RSS data with a URLRequest, pass it to the class found here, and enjoy using a dictionary!
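One thing to keep in mind when reading from the result: because elements are grouped into an array only when they repeat, a feed with one episode gives you a single value where a feed with several gives you an array. A small helper can smooth that over. The asArray function and the sample dictionaries below are hypothetical, written with plain Swift collections for illustration.

```swift
// Hypothetical helper: view a parsed value as an array, whether the element
// appeared zero times, once, or many times.
func asArray(_ value: Any?) -> [Any] {
    guard let value = value else { return [] }           // element was absent
    if let array = value as? [Any] { return array }      // element repeated
    return [value]                                       // element appeared once
}

// Sample data shaped like the parser's output (made up for this example)
let manyEpisodes: [String: Any] = ["item": [["title": "One"], ["title": "Two"]]]
let oneEpisode: [String: Any] = ["item": ["title": "Only"]]
let noEpisodes: [String: Any] = [:]

for feed in [manyEpisodes, oneEpisode, noEpisodes] {
    print(asArray(feed["item"]).count)
}
```

With this in place, the code that builds the episode list can always iterate over an array.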
That's all for now; be well.
Nick