Clojure for XML

(Written by Lawrence Krubner; indented passages are often quotes.) You can contact Lawrence at lawrence@krubner.com, or follow me on Twitter.

This is a fantastic overview of different approaches:

Zippers are probably the easiest way to manage xml – once you grok them.

Zippers are a strange beast. Wikipedia describes them as:

A technique of representing an aggregate data structure so that it is convenient for writing programs that traverse the structure arbitrarily and update its contents…

I like to think of a zipper as a kind of pointer to part of a tree – at any time if you have a tree of nodes like the one above, you can have a zipper that refers to a node in the tree, and use it to navigate around the tree. You can also use the zipper to produce a modified version of the xml document, but I’ll leave that for another post.

To get a zipper from an xml tree, you need another namespace, clojure.zip, which ships with Clojure itself:

(clojure.zip/xml-zip (parse-str xml))
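To make that one-liner concrete, here is a minimal sketch of going from an XML string to a zipper. The sample document and its element names are hypothetical, since the original example xml is not reproduced here. The quoted post uses parse-str (presumably from clojure.data.xml); this sketch swaps in clojure.xml/parse, which ships with Clojure, so it runs without extra dependencies. The helper name parse-xml-string is an assumption of this sketch.

```clojure
(require '[clojure.xml :as c-xml]
         '[clojure.zip :as c-zip])

;; Hypothetical sample document; the element names are illustrative only.
(def sample-xml
  "<top>Baby, I'm the top<mid><bottom>deep</bottom></mid></top>")

(defn parse-xml-string
  "Parses an XML string with clojure.xml/parse, which expects a stream
  rather than a string (hypothetical helper for this sketch)."
  [^String s]
  (c-xml/parse (java.io.ByteArrayInputStream. (.getBytes s "UTF-8"))))

;; Wrap the parsed tree in a zipper positioned at the root element.
(def root-zipper (c-zip/xml-zip (parse-xml-string sample-xml)))

;; c-zip/node returns the node at the zipper's current position.
(:tag (c-zip/node root-zipper)) ; => :top
```

Both parsers produce nodes with :tag, :attrs and :content keys, which is the shape xml-zip expects.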

The output of this isn’t very useful. Zippers are a little hard to view, because they need to keep track of the entire xml tree they are created from – so every time you output a zipper, you see the whole parsed xml structure, which doesn’t help much.

To find out more about the current state of a zipper, you can call clojure.zip/node, which returns the node pointed to by the zipper. Then you can call the same debug functions described earlier. Here are some short functions to dump zippers:

(defn dz [zipper]
  (dbg (clojure.zip/node zipper))
  zipper) ; return the zipper for more processing

(defn az [zipper] (as-short-xml (clojure.zip/node zipper)))
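The helpers dbg and as-short-xml come from an earlier part of the quoted post and are not shown here. To try dz and az in isolation, something like the following sketch could work. The two stand-in definitions are hypothetical and almost certainly differ from the originals, and the sample tree is made up for illustration.

```clojure
(require '[clojure.zip :as c-zip])

;; Hypothetical stand-ins for the helpers defined earlier in the quoted post.
(defn as-short-xml
  "Summarises a node: strings as-is, elements as their tag plus child count."
  [node]
  (if (string? node)
    node
    (str "<" (name (:tag node)) "> with " (count (:content node)) " children")))

(defn dbg
  "Prints a short summary of the node and returns the node."
  [node]
  (println (as-short-xml node))
  node)

(defn dz [zipper]
  (dbg (c-zip/node zipper))
  zipper) ; return the zipper for more processing

(defn az [zipper]
  (as-short-xml (c-zip/node zipper)))

;; Hypothetical parsed tree, in the :tag/:attrs/:content shape xml-zip expects.
(def tree
  {:tag :top :attrs nil
   :content ["Baby, I'm the top"
             {:tag :mid :attrs nil :content []}]})

(az (c-zip/xml-zip tree)) ; => "<top> with 2 children"
```

Because dz returns the zipper it was given, it can be dropped into the middle of a threading expression without disturbing the navigation.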

Basic zipper navigation

Zippers, like most of Clojure, are immutable: to “navigate” with them, you pass a zipper to a function and get back a new zipper, leaving the original untouched. The basic options are:

down – takes you to the first child of this node
up – takes you to the parent element of this node
right – takes you to the next sibling of this node
left – takes you to the previous sibling of this node
and many many other similar navigation commands
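The four basic moves can be sketched on a small hand-built tree. The tree below is hypothetical (a literal map in the shape the XML parsers produce), and c-zip is assumed to be an alias for clojure.zip.

```clojure
(require '[clojure.zip :as c-zip])

;; Hypothetical parsed tree; the element names are illustrative only.
(def tree
  {:tag :top :attrs nil
   :content ["Baby, I'm the top"
             {:tag :mid :attrs nil
              :content [{:tag :bottom :attrs nil :content ["deep"]}]}]})

(def root (c-zip/xml-zip tree))

;; down: first child of the root, here a text node.
(def first-child (c-zip/down root))
(c-zip/node first-child)                            ; => "Baby, I'm the top"

;; right: next sibling; left: back to the previous sibling.
(:tag (c-zip/node (c-zip/right first-child)))       ; => :mid
(c-zip/node (c-zip/left (c-zip/right first-child))) ; => "Baby, I'm the top"

;; up: back to the parent.
(:tag (c-zip/node (c-zip/up first-child)))          ; => :top

;; Each move returned a new zipper; root still points at the root node.
(:tag (c-zip/node root))                            ; => :top

;; Moving off the edge of the tree returns nil rather than throwing.
(c-zip/left first-child)                            ; => nil
(c-zip/up root)                                     ; => nil
```

The nil results matter in practice: chaining another move onto a nil location fails, so longer navigation chains may need a nil check along the way.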

So to continue with our example xml:

(-> xml
    c-zip/xml-zip
    c-zip/down
    dz
    c-zip/right
    dz
    c-zip/down
    az)

=> "Baby, I'm the top"
""
""

The first call to ‘dz’ dumps the first child of the root, the text node “Baby, I’m the top”.

Then we move to its right sibling and dump the value there.

Then we move down to that sibling’s first child and output it.

I hope this is making sense. Basically, you move the zipper around the tree to get to the node you want. Handy for some cases, but still a little strange.
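When the path to a node is not known in advance, one option is to walk the whole tree instead of navigating by hand. The sketch below uses clojure.zip/next and clojure.zip/end?, which step through a zipper depth-first. The find-tag helper and the sample tree are hypothetical and not part of the quoted post.

```clojure
(require '[clojure.zip :as c-zip])

;; Hypothetical parsed tree; the element names are illustrative only.
(def tree
  {:tag :top :attrs nil
   :content ["Baby, I'm the top"
             {:tag :mid :attrs nil
              :content [{:tag :bottom :attrs nil :content ["deep"]}]}]})

(defn find-tag
  "Walks the zipper depth-first and returns the first location whose node
  has the given tag, or nil if no such node exists (hypothetical helper)."
  [zipper tag]
  (loop [loc zipper]
    (cond
      (c-zip/end? loc) nil                     ; walked past the last node
      (= tag (:tag (c-zip/node loc))) loc      ; text nodes have no :tag
      :else (recur (c-zip/next loc)))))

(def root (c-zip/xml-zip tree))

;; The result is itself a zipper, so navigation can continue from there.
(c-zip/node (c-zip/down (find-tag root :bottom))) ; => "deep"
(find-tag root :missing)                          ; => nil
```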

Post external references

  1. http://blog.korny.info/2014/03/08/xml-for-fun-and-profit.html