XPath Expressions Explained

Article published Monday, December 26th, 2011 at 12:00 pm

XPath is a language for selecting nodes in an XML document. You can think of it as the CSS selectors of the XML world. It can do some things that traditional CSS selectors can’t (CSS 3 covers some of them), such as selecting elements by their text content or attributes, and selecting parents as well as children. If you’re interested, there is a Zend Framework (ZF) library which will translate your CSS selectors into XPath.

Here’s an example of an XPath expression. It’s relatively complex and shows off a lot of useful XPath features:

//Item[ItemNumber='4111']//ExternalIdentifier[@Source='Alpha' and @Type='Beta']

Now, let me break it down. The leading // means that the node can be located anywhere in the document (CSS more or less assumes this, and a space in a CSS selector does the same thing for descendants). Item means that we are looking for an Item element. The predicate [ItemNumber='4111'] filters those Item elements (the name before it), keeping only the ones that have a child ItemNumber element whose text value is equal to 4111. The second // means that we are looking for a descendant at any depth below the selected Item, not just a direct child. ExternalIdentifier means that we are looking for an element with that name. @Source='Alpha' means that the ExternalIdentifier element (the name before the brackets) must have an attribute named Source whose value is Alpha. @Type='Beta' does the same thing for the Type attribute. The “and” means that both conditions must be true for the same element.
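To see what each piece contributes, here is a minimal sketch that builds the expression up one step at a time and counts the matches with SimpleXML. The <Items> wrapper and the second Item (number 5222) are made up for illustration, so the counts only hold for this sample data.

```php
<?php
// Hypothetical sample data: two Item nodes, only one with ItemNumber 4111.
$xml = <<<XML
<Items>
  <Item>
    <ItemNumber>4111</ItemNumber>
    <ExternalIdentifiers>
      <ExternalIdentifier Type="Beta" Source="Alpha">10</ExternalIdentifier>
      <ExternalIdentifier Type="Beta" Source="Gamma">20</ExternalIdentifier>
      <ExternalIdentifier Type="Delta" Source="Alpha">30</ExternalIdentifier>
      <ExternalIdentifier Type="Delta" Source="Gamma">40</ExternalIdentifier>
    </ExternalIdentifiers>
  </Item>
  <Item>
    <ItemNumber>5222</ItemNumber>
    <ExternalIdentifiers>
      <ExternalIdentifier Type="Beta" Source="Alpha">50</ExternalIdentifier>
    </ExternalIdentifiers>
  </Item>
</Items>
XML;

$doc = simplexml_load_string($xml);

// Each step adds one more piece of the full expression.
$steps = [
    "//Item",
    "//Item[ItemNumber='4111']",
    "//Item[ItemNumber='4111']//ExternalIdentifier",
    "//Item[ItemNumber='4111']//ExternalIdentifier[@Source='Alpha']",
    "//Item[ItemNumber='4111']//ExternalIdentifier[@Source='Alpha' and @Type='Beta']",
];

// Record how many nodes each partial expression selects.
$counts = [];
foreach ($steps as $expression) {
    $counts[$expression] = count($doc->xpath($expression));
    echo $counts[$expression], "  ", $expression, PHP_EOL;
}
```

With this sample data the counts shrink from 2 Item nodes to 1, grow to 4 ExternalIdentifier nodes, then narrow to 2 and finally 1 as the attribute conditions are added.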

Here’s an example chunk of XML (imagine that there are several of these Item nodes):

<Item>
  <ItemNumber>4111</ItemNumber>
  <ExternalIdentifiers>
    <ExternalIdentifier Type="Beta" Source="Alpha">10</ExternalIdentifier>
    <ExternalIdentifier Type="Beta" Source="Gamma">20</ExternalIdentifier>
    <ExternalIdentifier Type="Delta" Source="Alpha">30</ExternalIdentifier>
    <ExternalIdentifier Type="Delta" Source="Gamma">40</ExternalIdentifier>
  </ExternalIdentifiers>
</Item>

By running the XPath expression above against the provided XML document (for example with SimpleXML’s xpath() method), we get the following PHP array, which contains a single SimpleXMLElement object:

array(1) {
  [0]=>
  object(SimpleXMLElement)#2 (2) {
    ["@attributes"]=>
    array(2) {
      ["Type"]=>
      string(4) "Beta"
      ["Source"]=>
      string(4) "Alpha"
    }
    [0]=>
    string(2) "10"
  }
}
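A dump like the one above can be reproduced with a few lines of SimpleXML. This is only a sketch: it assumes the XML chunk is loaded from a string with Item as the document root, and the object handle number (#2) in your output may differ.

```php
<?php
// The XML chunk from the article, used here as a complete document.
$xml = <<<XML
<Item>
  <ItemNumber>4111</ItemNumber>
  <ExternalIdentifiers>
    <ExternalIdentifier Type="Beta" Source="Alpha">10</ExternalIdentifier>
    <ExternalIdentifier Type="Beta" Source="Gamma">20</ExternalIdentifier>
    <ExternalIdentifier Type="Delta" Source="Alpha">30</ExternalIdentifier>
    <ExternalIdentifier Type="Delta" Source="Gamma">40</ExternalIdentifier>
  </ExternalIdentifiers>
</Item>
XML;

$doc = simplexml_load_string($xml);
if ($doc === false) {
    throw new RuntimeException('Could not parse the XML');
}

// xpath() returns an array of SimpleXMLElement objects, one per matching node.
$matches = $doc->xpath(
    "//Item[ItemNumber='4111']//ExternalIdentifier[@Source='Alpha' and @Type='Beta']"
);

var_dump($matches);
```

Because the expression starts with //, it is evaluated from the top of the document, so it still finds the Item even though Item is the root element here.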

If you cast the matched element (the first item of the array, not the array itself) to a string, you get the text value of the node (in this case 10).
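Here is a short sketch of that cast, assuming a trimmed-down version of the sample XML. The variable names are just for illustration. Attributes are read with array-style access and need the same cast.

```php
<?php
// Trimmed-down sample with only the node we care about.
$xml = '<Item><ItemNumber>4111</ItemNumber><ExternalIdentifiers>'
     . '<ExternalIdentifier Type="Beta" Source="Alpha">10</ExternalIdentifier>'
     . '</ExternalIdentifiers></Item>';

$doc = simplexml_load_string($xml);
$matches = $doc->xpath("//ExternalIdentifier[@Source='Alpha' and @Type='Beta']");

// xpath() returns an array, so pick the first element before casting.
$node = $matches[0];

$value  = (string) $node;            // text content of the node
$source = (string) $node['Source'];  // attribute values are objects too, so cast them
$asInt  = (int) $node;               // numeric casts work the same way

echo $value, PHP_EOL;   // prints 10
echo $source, PHP_EOL;  // prints Alpha
```

Without the cast you are still holding a SimpleXMLElement, which is easy to miss when comparing values with ===.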

Hi, I’m Tom, and Renowned Media is my professional blog for web development tutorials. Traditionally, I’m a PHP/MySQL developer, but recently I’ve done a lot of JavaScript and Backbone.js development. Right now I’m really interested in Node.js and NoSQL technologies. I love developing on a mac and deploying apps to Linux servers.
