XPath is a non-XML language used to identify particular parts of an XML or XHTML document.

XPath indicates nodes by position, relative position, type, content, and several other criteria. XSLT uses XPath expressions to match and select particular elements in the input document for copying into the output document or for further processing. XPointer uses XPath expressions to identify the particular point in, or part of, an XML document that an XLink links to.

XPath expressions can also represent numbers, strings, or booleans, so XSLT stylesheets can carry out simple arithmetic for numbering and cross-referencing figures, tables, and equations. String manipulation in XPath lets XSLT perform tasks like making the title of a chapter uppercase in a headline but mixed case in a reference in the body text.

An XML document is a tree made up of nodes. Some nodes contain other nodes. One root node ultimately contains all other nodes. XPath is a language for picking nodes and sets of nodes out of this tree. From the perspective of XPath, there are seven kinds of nodes: the root node, element nodes, text nodes, attribute nodes, comment nodes, processing instruction nodes, and namespace nodes.

Constructs not included in this list are CDATA sections, entity references, and document type declarations. XPath operates on an XML document after these items have been merged into the document. For instance, XPath cannot identify the first CDATA section in a document or tell whether a particular attribute value was included directly in the source element start tag or merely defaulted from the declaration of the attribute in the DTD.

The XPath data model has several non-obvious features. First, the tree's root node is not the same as its root element. The tree's root node contains the entire document, including the root element as well as any comments and processing instructions that occur before the root element's start tag or after its end tag.

Second, the XPath data model does not include everything in the document. In particular, the XML declaration and the DTD are not addressable via XPath. However, if the DTD provides default values for any attributes, then XPath recognizes those attributes. Similarly, any references to parsed entities are resolved. Entity references, character references, and CDATA sections are not individually identifiable, though any data they contain is addressable. For example, XSLT does not enable you to make all text in CDATA sections bold, because XPath doesn't know which text is and isn't part of a CDATA section.
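The point that entity references, character references, and CDATA sections disappear into ordinary text can be seen with any XML parser. The following is a minimal sketch using Python's standard xml.etree.ElementTree module, not an XPath engine as such; the sample document and its element names are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: the same text written three different ways in the source.
doc = ET.fromstring(
    "<doc>"
    "<a>x &lt; y</a>"            # predefined entity reference
    "<b><![CDATA[x < y]]></b>"   # CDATA section
    "<c>x &#60; y</c>"           # character reference
    "</doc>"
)

# After parsing, all three children carry identical text; nothing records
# which notation the author used.
texts = [child.text for child in doc]
print(texts)
```

Because the tree no longer records how the text was written, an expression evaluated against it has no way to single out the CDATA section.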

Finally, xmlns attributes are reported as namespace nodes. They are not considered attribute nodes, though a non-namespace-aware parser will see them as such. Furthermore, these namespace nodes are attached to every element node for which the declaration is in scope, not just to the single element where the namespace is declared.

Root Location Path

XPath syntax was deliberately chosen to be similar to the path syntax used by the Unix shell. Just as / is the root of a Unix filesystem, / is the root node of an XML document.

The forward slash / is an absolute location path because no matter what the context node is, no matter where you were in the input document when this template was applied, it always means the same thing: the root node of the document. It is relative to the document you process, but not to anything within that document.

Child Element Location Steps

The second simplest location path is a single element name. This selects all child elements of the context node with the specified name. Exactly which elements they are depends on what the context node is, so this is a relative location path.
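A small sketch of how the same relative path selects different elements depending on the context node. It uses Python's standard xml.etree.ElementTree, which supports only a subset of XPath; the people/person/name document is a hypothetical sample.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document.
people = ET.fromstring(
    "<people>"
    "<person><name>Alan Turing</name></person>"
    "<person><name>Richard Feynman</name></person>"
    "</people>"
)

# With <people> as the context node, "person" selects its two person children.
persons = people.findall("person")

# The path "name" selects nothing from <people>, which has no name children ...
names_from_people = people.findall("name")

# ... but selects one element when the context node is the first <person>.
names_from_first = persons[0].findall("name")
```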

Attribute Location Steps

Attributes are also part of XPath. To select a particular attribute of an element, use an at sign @ followed by the name of the attribute you want.

The comment(), text(), and processing-instruction() Location Steps

Since comments and text nodes don't have names, the comment() and text() functions match any comment or text node that's an immediate child of the context node. Each comment is a separate comment node. Each text node contains the maximum possible contiguous run of text not interrupted by a tag. Entity references and CDATA sections are resolved into text and markup and do not interrupt text nodes.

With no arguments, the processing-instruction() function selects all of the context node's processing instruction children. If it has an argument, it selects only the processing instruction children with the specified target. For example, the XPath expression processing-instruction('xml-stylesheet') selects all processing instruction children of the context node whose target is xml-stylesheet.


Wildcards

Wildcards allow you to match different element and node types at the same time. There are three wildcards: *, node(), and @*.

The asterisk * matches any element node, regardless of its name. The * does not match attribute, text, comment, or processing instruction nodes. You can put a namespace prefix in front of the asterisk. In this case, only elements in that namespace are matched. For example, svg:* matches all elements whose namespace URI is the one the svg prefix is mapped to. As usual, the URI, not the prefix, matters. The prefix may differ between the stylesheet and the source document, as long as the namespace URI is the same.

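The node() wildcard matches nodes of every type: element, text, attribute, processing instruction, namespace, and comment. Which of these it actually selects depends on the axis. On the default child axis it selects only element, text, comment, and processing instruction nodes, because attribute and namespace nodes are not children of their element.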

The @* wildcard matches all attribute nodes. As with elements, you can attach a namespace prefix to the wildcard to match only attributes in a specific namespace. For instance, @xlink:* matches all XLink attributes, provided that the prefix xlink is mapped to the http://www.w3.org/1999/xlink namespace. Again, the URI, not the prefix, matters.
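The rule that the namespace URI matters and the prefix does not can be sketched with Python's standard xml.etree.ElementTree (which implements only part of XPath). The drawing below is a hypothetical sample, and the prefix g used in the query is deliberately different from the svg prefix used in the document.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"

# Hypothetical sample: two SVG elements and one element in no namespace.
drawing = ET.fromstring(
    f'<svg:svg xmlns:svg="{SVG_NS}">'
    "<svg:circle r='1'/>"
    "<svg:rect width='2' height='3'/>"
    "<note>not in the SVG namespace</note>"
    "</svg:svg>"
)

# "*" matches every child element, whatever its name or namespace.
all_children = drawing.findall("*")

# The query maps the prefix "g" to the SVG URI; the document uses "svg".
# Both resolve to the same URI, so the circle is still found.
circles = drawing.findall("g:circle", {"g": SVG_NS})
```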

Multiple Matches with |

You may want to match more than one type of element or attribute, but not all types. You can combine individual location steps with a vertical bar | to indicate that you want to match any of the named elements. For instance: object|img|embed.

For example, *|@* matches elements and attributes but does not match text, comment, or processing instruction nodes.

Compound Location Paths

The XPath expressions you've seen so far (element names, @ plus an attribute name, /, comment(), text(), node(), and processing-instruction()) are all single location steps. You can combine these location steps with the forward slash to move down the hierarchy from the matched node to other nodes. You can also use a period to refer to the current node, a double period to refer to the parent node, and a double forward slash to refer to descendants of the context node.

A double forward slash // selects from all descendants of the context node as well as the context node itself. At the beginning of an XPath expression, it selects from all descendants of the root node. For example, the XPath expression //name selects all name elements in the document.
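A sketch of compound paths using Python's standard xml.etree.ElementTree. This module supports only a subset of XPath and rejects absolute paths on an element, so the examples start with a period (the context node) instead of a leading slash. The sample document is hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document.
people = ET.fromstring(
    "<people>"
    "<person><name><first_name>Richard</first_name>"
    "<middle_initial>P</middle_initial><last_name>Feynman</last_name></name></person>"
    "<person><name><first_name>Alan</first_name>"
    "<last_name>Turing</last_name></name></person>"
    "</people>"
)

# ".//name": every name element anywhere below the context node.
names = people.findall(".//name")

# "person/name/last_name": step down the hierarchy one level at a time.
last_names = [e.text for e in people.findall("person/name/last_name")]

# ".//middle_initial/../first_name": go up to the parent, then down to a sibling.
firsts = [e.text for e in people.findall(".//middle_initial/../first_name")]
```

Only Richard has a middle_initial sibling, so the last path selects a single first_name element.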

Comparison operators

XPath supports a full complement of relational operators, including =, <, >, >=, <=, and !=. Note that if < or <= is used inside an XML document, such as an XSLT stylesheet, you must still escape the less-than sign as &lt;.

Logical operators

XPath also provides boolean 'and' and 'or' operators to combine expressions logically. For example, the XPath expression //person[@born<=1920 and @born>=1910] selects all person elements with born attribute values between 1910 and 1920, inclusive.
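Python's standard xml.etree.ElementTree does not implement the relational operators or 'and' inside predicates, so the following sketch expresses the same selection as //person[@born<=1920 and @born>=1910] with an ordinary Python filter. It is an equivalent in spirit, not an XPath evaluation, and the sample data is hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document.
people = ET.fromstring(
    "<people>"
    "<person born='1912'>Alan Turing</person>"
    "<person born='1918'>Richard Feynman</person>"
    "<person born='1879'>Albert Einstein</person>"
    "</people>"
)

# XPath compares the attribute value as a number; float() plays that role here.
# Every sample person has a born attribute, so no missing-value handling is shown.
selected = [
    p.text
    for p in people.iter("person")
    if 1910 <= float(p.get("born")) <= 1920
]
```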

Ancestor axis

All element nodes that contain the context node: the parent node, the parent's parent, the parent's parent's parent, and so on, up through the root node, in reverse document order.

Following-sibling axis

All nodes that follow the context node and are contained in the same parent element node in document order.

Preceding-sibling axis

All nodes that precede the context node and are contained in the same parent element node, in reverse document order.

Following axis

All nodes that follow the end of the context node in document order.

Preceding axis

All nodes that precede the start of the context node in reverse document order.

Namespace axis

All namespaces in scope on the context node, whether declared on the context node or one of its ancestors.

Descendant axis

All the context node's descendants, not including the context node itself.

Ancestor-or-self axis

All the context node's ancestors and the context node itself.
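To make the ordering of the axes concrete, here is a rough Python sketch that walks a tree by hand. xml.etree.ElementTree does not implement these axes (and has no parent pointers), so the helper functions ancestors, following_siblings, and preceding_siblings are hypothetical names written for this illustration. They consider element nodes only, whereas the real axes also cover text, comment, and processing instruction nodes, and the real ancestor axis ends at the root node above the root element.

```python
import xml.etree.ElementTree as ET

doc = ET.fromstring("<a><b><c/><d/><e/></b><f/></a>")

# ElementTree elements do not know their parent, so build a lookup table.
parent = {child: p for p in doc.iter() for child in p}

def ancestors(node):
    """Element ancestors, nearest first (reverse document order)."""
    found = []
    while node in parent:
        node = parent[node]
        found.append(node)
    return found

def following_siblings(node):
    """Later element siblings, in document order."""
    siblings = list(parent[node])
    return siblings[siblings.index(node) + 1:]

def preceding_siblings(node):
    """Earlier element siblings, nearest first (reverse document order)."""
    siblings = list(parent[node])
    return siblings[:siblings.index(node)][::-1]

d = doc.find("b/d")  # the context node
```

With d as the context node, the ancestors are b and then a, the only following sibling is e, and the only preceding sibling is c.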

General XPath Expressions

XPath expressions can also return numbers, booleans (for example, via true(), false(), and not()), and strings. XPath provides the five basic arithmetic operators: +, -, *, div, and mod.

XPath Functions

Each XPath function returns one of four types: boolean, number, node set, or string. There are no void functions in XPath. XPath is not as strongly typed as Java or C. You can often pass any of these types as a function argument, regardless of which type the function expects, and the processor converts it as best it can. For example, if you pass a boolean where a string is expected, the processor substitutes one of the two strings true and false for the boolean. The one exception is functions that expect to receive node sets as arguments. XPath cannot convert strings, booleans, or numbers to node sets.

An XPath processor can, however, convert a node set to a string: the result is the string value (the text content) of the first node in the set in document order.

The position() function returns the current node's position in the context node list as a number.

The last() function returns the number of nodes in the context node list, which is the same as the position of the last node in the list.

The count() function is similar to last(), except that it returns the number of nodes in its node set argument rather than in the context node list.
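A brief sketch of positional selection with Python's standard xml.etree.ElementTree, which accepts a numeric position and last() (optionally minus an offset) in predicates but has no count() function, so len() stands in for it here. The sample document is hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document.
people = ET.fromstring(
    "<people>"
    "<person>Ada</person><person>Alan</person><person>Grace</person>"
    "</people>"
)

first = people.find("person[1]").text                 # positions start at 1
last = people.find("person[last()]").text             # position of the last node
second_to_last = people.find("person[last()-1]").text

# Counterpart of count(person): the size of the selected node set.
total = len(people.findall("person"))
```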

The id() function takes a string containing one or more IDs separated by whitespace as an argument and returns a node set containing all nodes in the document that have those IDs. These are nodes with attributes declared to have type ID in the DTD, not necessarily nodes with attributes named ID or id.

String Functions

XPath includes functions for basic string operations, such as finding a string's length or changing letters from uppercase to lowercase. It does not have the full power of the string libraries in Python or Perl. For example, there's no regular expression support.

The concat(s1, s2, s3, …) function takes two or more strings as arguments and concatenates them.

The string() function converts any type of argument to a string in a reasonable fashion.

The starts-with(str1, str2) function returns true if str1 starts with str2.

The contains() function takes two string arguments and returns true if the first argument contains the second.

The substring-before() function takes two string arguments and returns the substring of the first argument string that precedes the second argument's initial appearance. If the second string doesn't appear in the first string, then substring-before() returns the empty string. For example, substring-before('MM/DD/YYYY','/') is 'MM'.

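The substring-after() function is similar to substring-before(), except that it returns the substring that follows the second argument's initial appearance. This is equivalent to post-match in Perl.

The substring() function takes three arguments: the string from which the substring is copied, the position in the string from which to start extracting (the first character is at position 1), and the number of characters to copy. The third argument may be omitted, in which case the substring runs to the end of the string.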

The string-length() function returns a number giving the length of the string value of its argument, or of the context node if no argument is given. For example: string-length(//name[position()=1]).

The normalize-space() function strips leading and trailing whitespace and collapses each run of whitespace into a single space. For example: normalize-space(string(//name[position()=1])).
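Python's standard library has no XPath string functions, so the sketch below re-implements four of them in plain Python to make their edge cases concrete. The function names are hypothetical Python spellings of the XPath names, and the code assumes finite numeric arguments (it does not handle NaN or infinity).

```python
import math
import re

def substring_before(s, sep):
    """Part of s before the first occurrence of sep, or '' if sep is absent."""
    i = s.find(sep)
    return s[:i] if i >= 0 else ""

def substring_after(s, sep):
    """Part of s after the first occurrence of sep, or '' if sep is absent."""
    i = s.find(sep)
    return s[i + len(sep):] if i >= 0 else ""

def xpath_round(x):
    """XPath rounds halves upward, unlike Python's round()."""
    return math.floor(x + 0.5)

def substring(s, start, length=None):
    """1-based substring; start and length are rounded as in XPath."""
    first = xpath_round(start)
    if length is None:
        return s[max(first - 1, 0):]
    end = first + xpath_round(length)  # 1-based position just past the last character
    return s[max(first - 1, 0):max(end - 1, 0)]

def normalize_space(s):
    """Trim the ends and collapse runs of space, tab, CR, and LF to one space."""
    return " ".join(part for part in re.split(r"[ \t\r\n]+", s) if part)

print(substring_before("MM/DD/YYYY", "/"))
print(substring_after("MM/DD/YYYY", "/"))
print(substring("MM/DD/YYYY", 4, 2))
```

Note how substring() counts from 1 rather than 0, which is the most common source of off-by-one errors when moving between XPath and most programming languages.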

Numeric Functions

The number() function can take any type as an argument and convert it to a number. If the argument is omitted, it converts the context node. Booleans are converted to 1 if true and 0 if false. Strings are converted in a plausible fashion. Node sets are converted to numbers by first converting them to their string values and then converting those strings to numbers. If the object you convert can't be reasonably interpreted as a single number, then NaN is returned.

The round(), floor(), and ceiling() functions all take a single number as an argument.

The sum() function takes a node set as an argument. It converts each node in the set to its string value, then converts each of those strings to a number. Finally, it adds the numbers and returns the result.
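The conversion chain behind sum() (node, then string value, then number) can be sketched in Python. xml.etree.ElementTree has no sum() or number() functions, so xpath_number and string_value below are hypothetical helpers written for this illustration, and the order document is a made-up sample.

```python
import math
import re
import xml.etree.ElementTree as ET

# XPath 1.0 numeric strings: optional whitespace, optional minus, digits with an
# optional fraction. Exponents such as "1e3" are not numbers in XPath 1.0.
_NUMBER = re.compile(r"[ \t\r\n]*-?(?:[0-9]+(?:\.[0-9]*)?|\.[0-9]+)[ \t\r\n]*")

def xpath_number(value):
    """Convert a bool, number, or string roughly the way number() does."""
    if isinstance(value, bool):
        return 1.0 if value else 0.0
    if isinstance(value, (int, float)):
        return float(value)
    if _NUMBER.fullmatch(value):
        return float(value.strip(" \t\r\n"))
    return math.nan

def string_value(element):
    """String value of an element: all of its descendant text, concatenated."""
    return "".join(element.itertext())

# Hypothetical sample document.
order = ET.fromstring(
    "<order>"
    "<item><name>pen</name><price>1.50</price></item>"
    "<item><name>ink</name><price>2.25</price></item>"
    "<item><name>desk</name><price>10</price></item>"
    "</order>"
)

# Counterpart of sum(item/price).
total = sum(xpath_number(string_value(p)) for p in order.findall("item/price"))
print(total)
```

If any selected node held a non-numeric string such as n/a, its value would be NaN and so would the total, which matches how sum() behaves.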


/html/body/table                                      // all <table> elements that are children of the <body> element
/html/body/table[1]                                   // the first <table> child of the <body> element
//@id                                                 // all id attributes of any element in the document
person//@id                                           // all id attributes of a <person> child of the context node or of any element it contains
//@id/..                                              // all elements in the document that have an id attribute
//middle_initial/../first_name                        // <first_name> elements that are siblings of a <middle_initial> element
//*[@id="firstname"]                                  // the element with id="firstname"
//profession[.="physicist"]                           // all <profession> elements whose value is physicist
//person[profession="physicist"]                      // all <person> elements that have a <profession> child with value "physicist"
//a[@id="blah"]                                       // the <a> element with the specified id
//*[text()="Advanced Search"]                         // any element that has a text node child equal to the specified text
count(//person)                                       // the number of <person> elements in the document
//input[following-sibling::text()="label1199734151"]  // the input box that is followed by a sibling text node equal to the given text
//*[@id="blah"]                                       // the element with the specified id
//*[contains(@href,"followMeVCR.php")]                // any element whose href attribute contains followMeVCR.php
//*[@value="Submit"]                                  // any element whose value attribute is Submit
//a[@title="Download Now"]                            // the <a> element whose title attribute is "Download Now"
//*[text()="blah"]                                    // any element that encloses the text blah
//span[text()="My Templates"]/ancestor::a             // the link whose display text is "My Templates", found via the ancestor axis
//input[@value="out" and @name="inOrOut"]             // the <input> element with value="out" and name="inOrOut"
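Several of the predicate forms listed above fall within the XPath subset that Python's standard xml.etree.ElementTree understands, so they can be tried directly. In this sketch the paths begin with a period because ElementTree rejects absolute paths on an element, and .//*[@id] stands in for //@id/.., which the module does not support. The sample document is hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document.
people = ET.fromstring(
    "<people>"
    "<person id='p1'><name>Richard Feynman</name>"
    "<profession>physicist</profession></person>"
    "<person id='p2'><name>Alan Turing</name>"
    "<profession>mathematician</profession></person>"
    "</people>"
)

# Element with a given id attribute value.
by_id = people.findall(".//*[@id='p2']")

# <profession> elements whose own text is "physicist".
professions = people.findall(".//profession[.='physicist']")

# <person> elements that have a <profession> child with that text.
physicists = people.findall(".//person[profession='physicist']")

# Every descendant element that carries an id attribute at all.
with_id = people.findall(".//*[@id]")
```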

Locate table cell with XPath

Suppose a table has two columns with headers headerA and headerB. The headerA column contains a cell, cellA, with a unique value, and we want cellB, the cell in the same row under headerB. We can locate cellB because the horizontal distance between cellA and cellB is equal to the horizontal distance between headerA and headerB.

To calculate the distance between headerA and headerB, we count how many preceding siblings each header cell has, that is, how far each one is from the first (oldest) sibling, and subtract:

count(//span[contains(text(),"#headerB#")]/ancestor::td/preceding-sibling::td) - count(//span[contains(text(),"#headerA#")]/ancestor::td/preceding-sibling::td)

Assuming that the location of cellA is:

//td[contains(text(),"#email#")]

Then the location of cellB is:

//td[contains(text(),"#email#")]/following-sibling::td[count(//span[contains(text(),"#headerB#")]/ancestor::td/preceding-sibling::td) - count(//span[contains(text(),"#headerA#")]/ancestor::td/preceding-sibling::td)]

In the above example, we assume that the element for the table cell is td. In other cases it can be a span or a div, so every td has to be replaced with span or div. The expression also assumes that headerB lies to the right of headerA, because it walks the following-sibling axis. The #email#, #headerA#, and #headerB# tokens are placeholders. Assume that we've defined a widget 'Table Cell'. In our test code, we call:

Wiget('Table Cell', array('email' => $email, 'headerA' => 'User', 'headerB' => 'Monthly E-mail'));
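The column-offset idea can also be sketched outside XPath. The Python below is an illustration of the same arithmetic, not the widget implementation referred to above: column_index and find_cell_b are hypothetical helper names, the table is made-up sample data, and the code assumes the header row is the first tr and that headerB is to the right of headerA.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample table; header cells wrap their text in <span>, as in the XPath.
table = ET.fromstring(
    "<table>"
    "<tr><td><span>User</span></td><td><span>Status</span></td>"
    "<td><span>Monthly E-mail</span></td></tr>"
    "<tr><td>ann@example.com</td><td>active</td><td>yes</td></tr>"
    "<tr><td>bob@example.com</td><td>inactive</td><td>no</td></tr>"
    "</table>"
)

def column_index(table, header_text):
    """Number of td siblings before the header cell containing header_text."""
    header_row = table.find("tr")
    for index, cell in enumerate(header_row.findall("td")):
        if header_text in "".join(cell.itertext()):
            return index
    raise ValueError(f"no header containing {header_text!r}")

def find_cell_b(table, cell_a_text, header_a, header_b):
    """Find cellA by its text, then step right by the distance between headers."""
    offset = column_index(table, header_b) - column_index(table, header_a)
    for row in table.findall("tr"):
        cells = row.findall("td")
        for index, cell in enumerate(cells):
            if cell.text and cell_a_text in cell.text:
                return cells[index + offset]
    raise ValueError(f"no cell containing {cell_a_text!r}")

cell_b = find_cell_b(table, "bob@example.com", "User", "Monthly E-mail")
print(cell_b.text)
```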


We can use the text() node test, or compare the element's string value using a period, to find an element containing specific text:

//a[.="Advanced Search"]
//a[text()="Advanced Search"]
//*[text()="Advanced Search"]

We can use the @value attribute, together with other attributes such as @type and @name, to find a button or other input with a specific value:

//input[@type="hidden" and @name="contactID"]
//input[@type="radio" and @value="team" and @name="fromMainOption"]

To select the <planet> elements that have a <name> child with text equal to "Venus":

//planet[name="Venus"]

There are seven kinds of node: root node, element nodes, text nodes, attribute nodes, comment nodes, processing instruction nodes, namespace nodes.

/                 The root node of the document
/html             The html element
/html/body/p      All p elements that are children of the body element

Attribute Nodes:

To select a particular attribute of an element, use an at sign (@) followed by the name of the attribute you want. For example, @src selects the src attribute of the context node.