Simplify XML Parsing in Postgres

19 06 2009

I was reading this article over at Postgres Online Journal about processing GPX XML files and it reminded me how much cruft is involved with extracting values from XML in Postgres.

The XPath function is great for some things, especially when you want to return an array of XML nodes. But it ALWAYS returns an XML array. If you want to iterate over a chunk of XML, for example pulling all of the waypoint nodes, then XPath is the perfect tool for the job. But if you want to extract a single value, it becomes clunky. So if you want to get the latitude for a node, you’ve got to get the 1st element in the XML array, cast it as varchar in order to cast it again to numeric. It is not exactly straight forward. And if you spend a lot of time parsing XML files, that cruft can really add up.

So we are going to pull a(nother) page from Oracle’s book and create a few functions that will make it easier to work with XML in Postgres.  The first thing we need is a way to extract a single value from somewhere in an XML node. Oracle’s extractvalue() function does precisely this. Here’s what it looks like in Postgres: Read the rest of this entry »