In BQ test automation, when an element cannot be found with the general locators such as id, class, or name, XPath is used to locate it on the web page. XPath is a language for navigating through the elements and attributes of an XML document.
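- XPath is used for locating elements in XML documents
- XML and HTML have similar syntax (both are tag-based markup that is parsed into a tree of nodes; HTML is not strictly XML, but it can be queried in the same way)
Hence XPath can be used for locating elements in HTML documents as well.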
XPath Terminology
Nodes
In XPath, there are seven kinds of nodes: element, attribute, text, namespace, processing-instruction, comment, and document nodes.
HTML documents are treated as trees of nodes. The topmost element of the tree is called the root element.
Look at the following HTML:
<div>
<h1 id="h1">BQurious</h1>
<h2 id="h2">Software</h2>
</div>
Example of nodes in the HTML above:
<div> (root element node)
<h1 id="h1">BQurious</h1> (element node)
id="h1" (attribute node)
Atomic values
Atomic values are nodes with no children or parent.
Example of atomic values:
BQurious
"h1"
Items
Items are atomic values or nodes.
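As a rough illustration of the terms above, the sketch below loads the sample markup with Python's standard `xml.etree.ElementTree` module and reads an element node, its attribute node, and the atomic values they hold. The choice of Python and ElementTree is an assumption of this example, not something the article prescribes, and the markup is written with straight quotes so that it parses as well-formed XML.

```python
import xml.etree.ElementTree as ET

# The sample markup from the text, written as well-formed XML.
root = ET.fromstring(
    '<div><h1 id="h1">BQurious</h1><h2 id="h2">Software</h2></div>'
)

h1 = root.find("h1")         # an element node

root_name = root.tag         # name of the root element node: 'div'
h1_attributes = h1.attrib    # attribute node(s) of <h1>: {'id': 'h1'}
id_value = h1.get("id")      # atomic value held by the attribute: 'h1'
text_value = h1.text         # atomic value held by the element: 'BQurious'

print(root_name, h1_attributes, id_value, text_value)
```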
Relationship of Nodes
Parent
Each element and attribute has one parent.
In the following example, the <div> element is the parent of the <h1> and <h2> elements:
<div>
<h1 id="h1">BQurious</h1>
<h2 id="h2">Software</h2>
</div>
Children
Element nodes may have zero, one or more children.
In the following example, <h1> and <h2> are children of the <div> element:
<div>
<h1 id="h1">BQurious</h1>
<h2 id="h2">Software</h2>
</div>
Siblings
Nodes that have the same parent.
In the following example, the <h1> and <h2> elements are siblings:
<div>
<h1 id="h1">BQurious</h1>
<h2 id="h2">Software</h2>
</div>
Ancestors
A node’s parent, parent’s parent, etc.
In the following example, the ancestors of the <h1> element are the <div id="div2"> element and the <div id="div1"> element:
<div id="div1">
<div id="div2">
<h1 id="h1">BQurious</h1>
<h2 id="h2">Software</h2>
</div>
</div>
Descendants
A node’s children, children’s children, etc.
In the following example, the descendants of the <div id="div1"> element are the <div id="div2">, <h1> and <h2> elements:
<div id="div1">
<div id="div2">
<h1 id="h1">BQurious</h1>
<h2 id="h2">Software</h2>
</div>
</div>
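The relationships above can be checked in code. The following is a minimal sketch using Python's standard `xml.etree.ElementTree` (an assumed tool for illustration). ElementTree elements do not store a link to their parent, so the ancestor lookup builds a child-to-parent map first; the name `parent_of` is made up for this example.

```python
import xml.etree.ElementTree as ET

root = ET.fromstring(
    '<div id="div1"><div id="div2">'
    '<h1 id="h1">BQurious</h1><h2 id="h2">Software</h2>'
    '</div></div>'
)
inner = root.find("div")   # <div id="div2">
h1 = root.find(".//h1")

# Children: the direct sub-elements of an element.
children = [child.tag for child in inner]                  # ['h1', 'h2']

# Parent: ".." steps from the matched <h1> up to its parent.
parent = root.find(".//h1/..")                             # <div id="div2">

# Siblings: the other children of the same parent.
siblings = [el.tag for el in parent if el is not h1]       # ['h2']

# Descendants: iter() yields the element itself first, so it is skipped.
descendants = [el.get("id") for el in root.iter()][1:]     # ['div2', 'h1', 'h2']

# Ancestors: walk upwards through a child -> parent map.
parent_of = {child: p for p in root.iter() for child in p}
ancestors = []
node = h1
while node in parent_of:
    node = parent_of[node]
    ancestors.append(node.get("id"))                       # ends as ['div2', 'div1']

print(children, parent.get("id"), siblings, descendants, ancestors)
```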
XPath Syntax
We will use the following HTML in the examples below.
<div id="div1">
<div id="div2">
<h1 id="h1">BQurious</h1>
<h2 id="h2">Software</h2>
</div>
</div>
Selecting Nodes
XPath uses path expressions to select nodes in an XML document. The node is selected by following a path or steps. The most useful path expressions are listed below:
[table]
Expression | Description |
---|---|
nodename | Selects all nodes with the name "nodename" |
/ | Selects from the root node |
// | Selects nodes in the document from the current node that match the selection no matter where they are |
. | Selects the current node |
.. | Selects the parent of the current node |
@ | Selects attributes |
[/table]
In the table below we have listed some path expressions and the result of the expressions:
[table]
Path Expression | Result |
---|---|
div | Selects all nodes with the name "div" |
/div | Selects the root element div. Note: If the path starts with a slash ( / ), it always represents an absolute path to an element. |
div/h1 | Selects all <h1> elements that are children of <div> |
//h1 | Selects all <h1> elements no matter where they are in the document |
div//h1 | Selects all <h1> elements that are descendants of the <div> element, no matter where they are under the <div> element |
//@id | Selects all attributes that are named id |
[/table]
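To try these path expressions out, here is a small sketch with Python's standard `xml.etree.ElementTree`, which is an assumption of this example and implements only a subset of XPath. Paths are evaluated relative to the outer `<div id="div1">` element, so `//h1` is written as `.//h1`, absolute paths starting with `/` are not used, and `//@id` is emulated by walking the tree. The helper name `ids` is made up for illustration.

```python
import xml.etree.ElementTree as ET

root = ET.fromstring(
    '<div id="div1"><div id="div2">'
    '<h1 id="h1">BQurious</h1><h2 id="h2">Software</h2>'
    '</div></div>'
)

def ids(elements):
    """Return the id attribute of each matched element."""
    return [el.get("id") for el in elements]

# The current node is <div id="div1"> for every expression below.
child_divs = ids(root.findall("div"))       # div      -> ['div2']
child_h1 = ids(root.findall("div/h1"))      # div/h1   -> ['h1']
any_h1 = ids(root.findall(".//h1"))         # //h1     -> ['h1']
nested_h1 = ids(root.findall("div//h1"))    # div//h1  -> ['h1']

# //@id is not supported by ElementTree, so the attribute values are
# collected by iterating over the root and all of its descendants.
all_ids = [el.get("id") for el in root.iter() if "id" in el.attrib]

print(child_divs, child_h1, any_h1, nested_h1, all_ids)
```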
Predicates
Predicates are used to find a specific node or a node that contains a specific value.
Predicates are always embedded in square brackets.
In the table below we have listed some path expressions with predicates and the result of the expressions:
[table]
Path Expression | Result |
---|---|
//div/h1[1] | Selects the first <h1> element that is the child of the <div> element. Note: In IE 5, 6, 7, 8 and 9 the first node is [0], but according to the W3C it is [1]. To solve this problem in IE, set the SelectionLanguage to XPath. In JavaScript: xml.setProperty("SelectionLanguage","XPath"); |
//div/h1[last()] | Selects the last <h1> element that is the child of the <div> element |
//div/h1[last()-1] | Selects the last but one <h1> element that is the child of the <div> element |
//div/h1[position()<3] | Selects the first two <h1> elements that are children of the <div> element |
//h1[@id] | Selects all the <h1> elements that have an attribute named id |
//h1[@id='h1'] | Selects all the <h1> elements that have an "id" attribute with a value of "h1" |
//div/div[h1='BQurious'] | Selects all the <div> elements that are children of a <div> element and have an <h1> child element with the value BQurious |
//div/div[h1='BQurious']/h2 | Selects all the <h2> children of those <div> elements, that is, of the <div> children of a <div> element that have an <h1> child element with the value BQurious |
[/table]
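The predicates in the table can be exercised with a short sketch. It again assumes Python's standard `xml.etree.ElementTree`, which supports positional, attribute, and child-text predicates but not `position()<3`. The sample markup is extended with one made-up `<h1 id="h1b">` element so that the positional predicates have more than one candidate; that element and the helper name `ids` are not part of the original example.

```python
import xml.etree.ElementTree as ET

# Sample markup plus one extra, made-up <h1 id="h1b"> element.
root = ET.fromstring(
    '<div id="div1"><div id="div2">'
    '<h1 id="h1">BQurious</h1>'
    '<h1 id="h1b">Second heading</h1>'
    '<h2 id="h2">Software</h2>'
    '</div></div>'
)

def ids(path):
    """Evaluate a path relative to the outer <div> and return matched ids."""
    return [el.get("id") for el in root.findall(path)]

first = ids(".//div/h1[1]")                   # ['h1']
last = ids(".//div/h1[last()]")               # ['h1b']
last_but_one = ids(".//div/h1[last()-1]")     # ['h1']
with_id = ids(".//h1[@id]")                   # ['h1', 'h1b']
id_is_h1 = ids(".//h1[@id='h1']")             # ['h1']
div_by_h1 = ids(".//div[h1='BQurious']")      # ['div2']
h2_of_div = ids(".//div[h1='BQurious']/h2")   # ['h2']

print(first, last, last_but_one, with_id, id_is_h1, div_by_h1, h2_of_div)
```

Note that positions start at 1, as in the W3C definition, not at 0.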
Selecting Unknown Nodes
XPath wildcards can be used to select unknown XML nodes.
[table]
Wildcard | Description |
---|---|
* | Matches any element node |
@* | Matches any attribute node |
node() | Matches any node of any kind |
[/table]
In the table below we have listed some path expressions and the result of the expressions:
[table]
Path Expression | Result |
---|---|
//div/* | Selects all the child element nodes of the <div> element |
//* | Selects all elements in the document |
//div[@*] | Selects all <div> elements which have at least one attribute of any kind |
[/table]
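A brief sketch of the wildcards, assuming Python's standard `xml.etree.ElementTree`. It understands `*` in paths but, as far as its documented XPath subset goes, not `[@*]` or `node()`, so the attribute wildcard is emulated with a plain check on each element's attributes. Because paths are evaluated relative to the outer `<div id="div1">`, that element is the starting point and is not itself matched by `.//div` or `.//*`.

```python
import xml.etree.ElementTree as ET

root = ET.fromstring(
    '<div id="div1"><div id="div2">'
    '<h1 id="h1">BQurious</h1><h2 id="h2">Software</h2>'
    '</div></div>'
)

# .//div/* : all element children of every <div> below the starting node.
children_of_div = [el.get("id") for el in root.findall(".//div/*")]

# .//* : every element below the starting node.
all_below = [el.get("id") for el in root.findall(".//*")]

# //div[@*] emulated: <div> elements (including the outer one) that
# carry at least one attribute of any name.
divs_with_attrs = [el.get("id") for el in root.iter("div") if el.attrib]

print(children_of_div)   # ['h1', 'h2']
print(all_below)         # ['div2', 'h1', 'h2']
print(divs_with_attrs)   # ['div1', 'div2']
```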
Selecting Several Paths
By using the | operator in an XPath expression you can select several paths.
In the table below we have listed some path expressions and the result of the expressions:
[table]
Path Expression | Result |
---|---|
//div//h1 | //div//h2 | Selects all the <h1> and <h2> elements of all <div> elements |
//h1 | //h2 | Selects all the <h1> and <h2> elements in the document |
//div//h1 | //h2 | Selects all the <h1> elements of the <div> element and all the <h2> elements in the document |
[/table]
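Python's standard `xml.etree.ElementTree`, assumed here for illustration, does not implement the `|` operator. The sketch below therefore emulates a union rather than evaluating one: it runs each path separately, merges the matches, and returns them once each in document order, which is how an XPath node-set behaves. The function name `union` is hypothetical.

```python
import xml.etree.ElementTree as ET

root = ET.fromstring(
    '<div id="div1"><div id="div2">'
    '<h1 id="h1">BQurious</h1><h2 id="h2">Software</h2>'
    '</div></div>'
)

def union(start, *paths):
    """Emulate path1 | path2 | ... for paths relative to `start`."""
    wanted = {el for path in paths for el in start.findall(path)}
    # Walking the tree restores document order and drops duplicates.
    return [el for el in start.iter() if el in wanted]

h1_and_h2 = [el.get("id") for el in union(root, ".//h1", ".//h2")]
mixed = [el.get("id") for el in union(root, ".//h2", ".//div//h1", ".//h1")]

print(h1_and_h2)   # ['h1', 'h2']
print(mixed)       # ['h1', 'h2']
```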
XPath Axes
An axis defines a node-set relative to the current node.
[table]
AxisName | Result |
---|---|
ancestor | Selects all ancestors (parent, grandparent, etc.) of the current node |
ancestor-or-self | Selects all ancestors (parent, grandparent, etc.) of the current node and the current node itself |
attribute | Selects all attributes of the current node |
child | Selects all children of the current node |
descendant | Selects all descendants (children, grandchildren, etc.) of the current node |
descendant-or-self | Selects all descendants (children, grandchildren, etc.) of the current node and the current node itself |
following | Selects everything in the document after the closing tag of the current node |
following-sibling | Selects all siblings after the current node |
namespace | Selects all namespace nodes of the current node |
parent | Selects the parent of the current node |
preceding | Selects all nodes that appear before the current node in the document, except ancestors, attribute nodes and namespace nodes |
preceding-sibling | Selects all siblings before the current node |
self | Selects the current node |
[/table]
Location Path Expression
A location path can be absolute or relative.
An absolute location path starts with a slash ( / ) and a relative location path does not. In both cases the location path consists of one or more steps, each separated by a slash:
An absolute location path:
/step/step/…
A relative location path:
step/step/…
Each step is evaluated against the nodes in the current node-set.
A step consists of:
- an axis (defines the tree-relationship between the selected nodes and the current node)
- a node-test (identifies a node within an axis)
- zero or more predicates (to further refine the selected node-set)
The syntax for a location step is:
axisname::nodetest[predicate]
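To make the three parts of a location step concrete, the following sketch splits a step string into its axis, node test, and predicates with a regular expression. It is a simplified illustration, not a real XPath parser: it assumes predicates do not contain nested brackets, it does not expand abbreviations such as `@id` or `..`, and the names `STEP` and `parse_step` are made up for this example. A step written without an axis is treated as using the child axis, which is the XPath default.

```python
import re

# axisname::nodetest[predicate][predicate]...  (axis part is optional)
STEP = re.compile(
    r"^(?:(?P<axis>[a-z-]+)::)?"      # optional axis name followed by ::
    r"(?P<test>[^\[\]]+)"             # node test, e.g. div, *, text()
    r"(?P<preds>(?:\[[^\]]*\])*)$"    # zero or more [predicate] groups
)

def parse_step(step):
    """Split one location step into (axis, node_test, predicates)."""
    match = STEP.match(step)
    if match is None:
        raise ValueError(f"not a recognised location step: {step!r}")
    axis = match.group("axis") or "child"   # child is the default axis
    predicates = re.findall(r"\[([^\]]*)\]", match.group("preds"))
    return axis, match.group("test"), predicates

print(parse_step("child::div"))               # ('child', 'div', [])
print(parse_step("ancestor-or-self::div"))    # ('ancestor-or-self', 'div', [])
print(parse_step("child::h1[@id='h1'][1]"))   # ('child', 'h1', ["@id='h1'", '1'])

# A location path is a sequence of steps; this naive split on "/" only
# works when no predicate contains a slash.
steps = [parse_step(s) for s in "child::*/child::h1".split("/")]
```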
Examples
[table]
Example | Result |
---|---|
child::div | Selects all <div> nodes that are children of the current node |
attribute::id | Selects the id attribute of the current node |
child::* | Selects all element children of the current node |
attribute::* | Selects all attributes of the current node |
child::text() | Selects all text node children of the current node |
child::node() | Selects all children of the current node |
descendant::div | Selects all <div> descendants of the current node |
ancestor::div | Selects all <div> ancestors of the current node |
ancestor-or-self::div | Selects all <div> ancestors of the current node, and the current node as well if it is a <div> node |
child::*/child::h1 | Selects all <h1> grandchildren of the current node |
[/table]
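Python's standard `xml.etree.ElementTree`, used here as an assumed illustration tool, does not accept the `axisname::` syntax. The sketch below shows how a few of the steps in the table could be emulated with abbreviated paths and ordinary element methods, taking the outer `<div id="div1">` as the current node. The ancestor axes are left out because ElementTree elements keep no link to their parent.

```python
import xml.etree.ElementTree as ET

root = ET.fromstring(
    '<div id="div1"><div id="div2">'
    '<h1 id="h1">BQurious</h1><h2 id="h2">Software</h2>'
    '</div></div>'
)
current = root   # the current (context) node for every step below

# child::div  ->  direct <div> children of the current node
child_divs = [el.get("id") for el in current.findall("div")]

# child::*  ->  all element children of the current node
child_elements = [el.get("id") for el in current.findall("*")]

# attribute::id and attribute::*  ->  read from the element itself
id_attribute = current.get("id")
all_attributes = dict(current.attrib)

# descendant::div  ->  iter() also yields the current node, so skip it
descendant_divs = [
    el.get("id") for el in current.iter("div") if el is not current
]

# child::*/child::h1  ->  <h1> grandchildren of the current node
grandchild_h1 = [el.get("id") for el in current.findall("*/h1")]

print(child_divs, child_elements, id_attribute, all_attributes)
print(descendant_divs, grandchild_h1)
```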