  • Nick Daigler

Parsing XML with Swift

If you're just here for the code, go here.


I recently needed to parse an RSS feed and display its podcast episodes in a list. I typed "xml parser apple docs" into Google and found XMLParser, the XML parser included in Apple's Foundation framework. I was immediately overjoyed, but became less enthusiastic when I realized the API left me wanting.


I wanted a way of transforming XML data into a dictionary. In this post, I'll walk you through how I implemented this.


You interact with XMLParser via delegate methods. Below are the three XMLParserDelegate methods I used.


func parser(_ parser: XMLParser, didStartElement elementName: String, namespaceURI: String?, qualifiedName qName: String?, attributes attributeDict: [String : String] = [:])

func parser(_ parser: XMLParser, foundCharacters string: String)

func parser(_ parser: XMLParser, didEndElement elementName: String, namespaceURI: String?, qualifiedName qName: String?)

The high-level idea is to build up the dictionary as we parse each element. The problem becomes simpler if you think about the parallels between parsing XML and running a depth-first search (DFS) on a tree: the parser enters an element, visits all of its children, and only then leaves it.
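To make the DFS parallel concrete, here is a small sketch that records the order of the start and end callbacks. The `TraceDelegate` class and the sample XML are made up for illustration and are not part of the final solution.

```swift
import Foundation
#if canImport(FoundationXML)
import FoundationXML  // XMLParser lives in this module on non-Apple platforms
#endif

// Hypothetical helper: records each start/end callback, indented by nesting depth.
final class TraceDelegate: NSObject, XMLParserDelegate {
    private(set) var events: [String] = []
    private var depth = 0

    func parser(_ parser: XMLParser, didStartElement elementName: String, namespaceURI: String?, qualifiedName qName: String?, attributes attributeDict: [String : String] = [:]) {
        events.append(String(repeating: "  ", count: depth) + "start \(elementName)")
        depth += 1  // we are now one level deeper in the tree
    }

    func parser(_ parser: XMLParser, didEndElement elementName: String, namespaceURI: String?, qualifiedName qName: String?) {
        depth -= 1  // we are leaving this element
        events.append(String(repeating: "  ", count: depth) + "end \(elementName)")
    }
}

let traceXML = "<rss><channel><item/><item/></channel></rss>"
let traceParser = XMLParser(data: Data(traceXML.utf8))
let traceDelegate = TraceDelegate()  // XMLParser does not retain its delegate, so keep a reference
traceParser.delegate = traceDelegate
traceParser.parse()
traceDelegate.events.forEach { print($0) }
```

Each element's "start" is followed by all of its children before its own "end", which is exactly the order in which a DFS enters and leaves nodes. That is why a stack is enough to track where we are.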


Here are the steps we need to take when we begin parsing an element:

  1. Get a reference to the parent dictionary from a stack of dictionaries (the parent will be the dictionary most recently pushed onto the stack).

  2. Generate a child dictionary.

  3. If the parent already has a value for this element's name, group the values into an array. Otherwise, store the child we generated in the parent under the element's name.

  4. Finally, push the child we've generated onto the stack.

This is what this looks like in Swift:

public func parser(_ parser: XMLParser, didStartElement elementName: String, namespaceURI: String?, qualifiedName qName: String?, attributes attributeDict: [String : String] = [:]) {

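  // Get the parent (the dictionary most recently pushed onto the stack)
  guard let parent = stack.last else {
    fatalError("The stack should always contain at least the root dictionary")
  }

  // Make a new child, keeping the element's attributes if it has any
  let child = NSMutableDictionary()
  if !attributeDict.isEmpty {
    child.setObject(attributeDict, forKey: XMLToDict.kAttributes as NSString)
  }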
  
  // Group elements as an array if this element name already has a value in the parent
  if let existingValue = parent[elementName] {
    let array: NSMutableArray
    
    if let currentArray = existingValue as? NSMutableArray {
      array = currentArray
    } else {
      // Make a new array if we don't already have one
      array = NSMutableArray()
      array.add(existingValue)
      parent[elementName] = array
    }
    
    // Make sure we put our new child into this array
    array.add(child)
  } else {
    // In the case where we haven't seen this elementName yet, put it into the parent.
    // NSMutableDictionary is a reference type, so the dictionary on the stack sees this change too.
    parent[elementName] = child
  }
  
  // We need to push our new child onto the stack so we can track it
  stack.append(child)
}
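One consequence of the grouping rule is that a lookup can return either a single value or an array, depending on how many times the element appeared. Code that reads the result may want to normalize this. Below is a minimal sketch of one way to do that; `asArray` is a hypothetical helper, and the sample dictionaries are hand-written stand-ins for parser output.

```swift
import Foundation

// Hypothetical helper: returns the values stored under a key as an array,
// whether the parser stored one value directly or grouped several into an array.
func asArray(_ value: Any?) -> [Any] {
    guard let value = value else { return [] }        // the element never appeared
    if let array = value as? [Any] { return array }   // the element appeared more than once
    return [value]                                    // the element appeared exactly once
}

// Hand-written stand-ins for what the parser might produce.
let oneEpisode: [String: Any] = ["item": ["title": "Episode 1"]]
let twoEpisodes: [String: Any] = ["item": [["title": "Episode 1"], ["title": "Episode 2"]]]

print(asArray(oneEpisode["item"]).count)     // 1
print(asArray(twoEpisodes["item"]).count)    // 2
print(asArray(oneEpisode["missing"]).count)  // 0
```

On Apple platforms the same cast should also accept an NSMutableArray, since it bridges to a Swift array.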

While we're parsing an element, the parser calls another delegate method whenever it finds characters inside that element. The parser may report an element's text across several callbacks, so we need to accumulate the strings it finds and deal with them once it has finished parsing the element.


Below is how we track the strings found while the XMLParser is parsing an element.

func parser(_ parser: XMLParser, foundCharacters string: String) {
  let trimmedString = string.trimmingCharacters(in: .whitespacesAndNewlines)
  
  // textBeingProcessed is an instance variable on the class used to implement this functionality
  textBeingProcessed.append(trimmedString)
}
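Because the text can arrive in several pieces, trimming each piece separately can drop a space that falls on a piece boundary. One possible variation, sketched below, appends the raw pieces and trims only once when the element ends. `TextCollector` and its properties are hypothetical names, and this sketch only collects text; it does not build the dictionary.

```swift
import Foundation
#if canImport(FoundationXML)
import FoundationXML  // XMLParser lives in this module on non-Apple platforms
#endif

// Hypothetical helper: collects the trimmed text of each element, keyed by element name.
final class TextCollector: NSObject, XMLParserDelegate {
    private var buffer = ""
    private(set) var texts: [String: String] = [:]

    func parser(_ parser: XMLParser, didStartElement elementName: String, namespaceURI: String?, qualifiedName qName: String?, attributes attributeDict: [String : String] = [:]) {
        buffer = ""  // start fresh for the new element
    }

    func parser(_ parser: XMLParser, foundCharacters string: String) {
        buffer.append(string)  // keep the raw piece, including any whitespace
    }

    func parser(_ parser: XMLParser, didEndElement elementName: String, namespaceURI: String?, qualifiedName qName: String?) {
        // Trim once, after all pieces have been joined
        let text = buffer.trimmingCharacters(in: .whitespacesAndNewlines)
        if !text.isEmpty { texts[elementName] = text }
        buffer = ""
    }
}

let collectorXML = "<item>\n  <title>Swift &amp; XML</title>\n</item>"
let collectorParser = XMLParser(data: Data(collectorXML.utf8))
let collector = TextCollector()
collectorParser.delegate = collector
collectorParser.parse()
print(collector.texts["title"] ?? "no title")
```

The spaces around the ampersand survive however the parser splits the text, while the whitespace-only text directly inside the item element is still discarded.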

Finally, when we finish parsing an element, we want to take the following steps:

  1. Find a reference to the dictionary that represents the element that was just parsed.

  2. Find a reference to the current element's parent.

  3. If we found text while parsing this element and the element's dictionary is not empty (it has attributes or child elements), store the text in that dictionary under a dedicated text key. If we found text and the dictionary is empty, the element is represented in its parent either as the last item of an array or as a plain entry. In both cases, we replace that empty dictionary with the text we found.

  4. If we did not find text and the element's dictionary is empty, remove the empty dictionary from the parent.

  5. Finally, we need to reset state before beginning to parse the next element. This means clearing the text found while processing the last element and popping the current element from the stack.

This is what this looks like in Swift:

func parser(_ parser: XMLParser, didEndElement elementName: String, namespaceURI: String?, qualifiedName qName: String?) {
  // Grab the element that was just parsed
  guard let lastDict = stack.last else { fatalError() }
  
  // Grab the parent
  let parent = stack[stack.endIndex - 2]
  
  if textBeingProcessed.count > 0 {
    if lastDict.count > 0 {
      lastDict[XMLToDict.kText] = textBeingProcessed
    } else if let array = parent[elementName] as? NSMutableArray {
      array.removeLastObject()
      array.add(textBeingProcessed)
    } else {
      parent[elementName] = textBeingProcessed
    }
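  } else if lastDict.count == 0 {
    // Drop the empty placeholder without discarding siblings that were grouped under the same name
    if let array = parent[elementName] as? NSMutableArray {
      array.removeLastObject()
    } else {
      parent.removeObject(forKey: elementName)
    }
  }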
  
  // Reset state before parsing the next element
  textBeingProcessed = ""
  stack.removeLast()
}
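For readers who want to see the three callbacks working together, below is a rough end-to-end sketch. It is a variation on the approach above, not the code from the gist: it uses plain Swift dictionaries instead of NSMutableDictionary and attaches each element to its parent when the element ends. The class name, the key names, and the `parse(_:)` method are all assumptions made for this example.

```swift
import Foundation
#if canImport(FoundationXML)
import FoundationXML  // XMLParser lives in this module on non-Apple platforms
#endif

// Sketch only: the names below are hypothetical and do not match the gist.
final class ValueDictBuilder: NSObject, XMLParserDelegate {
    static let attributesKey = "@attributes"  // assumed key name
    static let textKey = "#text"              // assumed key name

    private var stack: [[String: Any]] = []
    private var text = ""

    func parse(_ data: Data) -> [String: Any]? {
        stack = [[:]]  // the root dictionary sits at the bottom of the stack
        text = ""
        let parser = XMLParser(data: data)
        parser.delegate = self
        guard parser.parse(), stack.count == 1 else { return nil }
        return stack[0]
    }

    func parser(_ parser: XMLParser, didStartElement elementName: String, namespaceURI: String?, qualifiedName qName: String?, attributes attributeDict: [String : String] = [:]) {
        var child: [String: Any] = [:]
        if !attributeDict.isEmpty { child[ValueDictBuilder.attributesKey] = attributeDict }
        stack.append(child)
        text = ""
    }

    func parser(_ parser: XMLParser, foundCharacters string: String) {
        text.append(string)
    }

    func parser(_ parser: XMLParser, didEndElement elementName: String, namespaceURI: String?, qualifiedName qName: String?) {
        guard stack.count >= 2 else { return }
        var element = stack.removeLast()
        let trimmed = text.trimmingCharacters(in: .whitespacesAndNewlines)
        text = ""

        // Decide how this element is represented: plain text, a dictionary, or nothing at all
        let value: Any
        if element.isEmpty {
            if trimmed.isEmpty { return }  // empty element: leave it out of the parent
            value = trimmed
        } else {
            if !trimmed.isEmpty { element[ValueDictBuilder.textKey] = trimmed }
            value = element
        }

        // Attach to the parent, grouping repeated names into an array
        var parent = stack.removeLast()
        if let existing = parent[elementName] {
            if let array = existing as? [Any] {
                parent[elementName] = array + [value]
            } else {
                parent[elementName] = [existing, value]
            }
        } else {
            parent[elementName] = value
        }
        stack.append(parent)
    }
}

let feedXML = "<rss version=\"2.0\"><channel><title>My Show</title><item><title>Episode 1</title></item><item><title>Episode 2</title></item></channel></rss>"
let feed = ValueDictBuilder().parse(Data(feedXML.utf8))
let feedRSS = feed?["rss"] as? [String: Any]
let feedChannel = feedRSS?["channel"] as? [String: Any]
let feedItems = feedChannel?["item"] as? [[String: Any]]
print(feedItems?.count ?? 0)
```

Value types avoid the shared mutable references, at the cost of popping and re-pushing the parent on every end callback. Either design should produce the same general shape of result.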

I hope this explanation helps someone. If not, I hope the code does. The GitHub gist will show you how I put all of these pieces together into a nice API.


Grab the RSS data with a URLRequest, pass it to the class found here, and enjoy using a dictionary!



That's all for now; be well.


Nick
