Using XPath in Selenium: All you need to know

Automation testers aim to deliver applications that are bug-free and offer a seamless user experience, but that goal is often hard to achieve. One common challenge is locating web elements accurately and efficiently so that tests can interact with them. Selenium WebDriver provides several locator strategies for this, including ID, CSS Selector, and XPath. Many teams rely on XPath for its flexibility and its compatibility with older browsers, which makes it one of the most powerful and versatile locators available. It provides a way to navigate the elements and attributes of an XML document, allowing automation testers to locate and interact with web elements on a page. This blog provides an XPath tutorial on how you can use XPath in Selenium, along with examples and tips on handling dynamic elements.

What is XPath in Selenium?

XPath, or XML Path Language, is a language for navigating an XML document and selecting nodes. In the context of Selenium WebDriver, XPath is used as a locator to find web elements on a page. It is a powerful tool that can navigate the webpage's HTML structure, making it extremely useful when other simple locators like ID or class fail to find the elements reliably.

Types of XPath in Selenium

There are primarily two types of XPath used in Selenium:

Absolute XPath

It's a direct path from the root element to the desired element. It starts from the root node and ends with the desired node, providing a complete path. However, it's brittle and can break with small changes in the web page's structure.

Here's a Selenium XPath example of how you might use an absolute XPath in your Selenium code. Suppose you have the following HTML structure:


<html>
  <body>
    <div>
      <p>Hello, world!</p>
    </div>
  </body>
</html>

To find the <p> tag using an absolute XPath, you would start at the root <html> tag and provide the full path to the <p> tag:


WebElement paragraph = driver.findElement(By.xpath("/html/body/div/p"));

In this case, the XPath "/html/body/div/p" represents the absolute path from the root <html> tag to the desired <p> tag.

Absolute XPath is generally not recommended unless necessary: it is brittle, and any change in the HTML structure between the root and the target may cause your test to fail.
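To see that brittleness without launching a browser, here is a minimal sketch using Python's standard-library xml.etree.ElementTree, which supports a limited subset of XPath. It is an illustration rather than Selenium code. The redesigned markup with a <main> wrapper is a made-up change, and because ElementTree evaluates paths relative to the root element, ./body/div/p stands in for /html/body/div/p and .//p stands in for //p.

```python
import xml.etree.ElementTree as ET

ORIGINAL = "<html><body><div><p>Hello, world!</p></div></body></html>"
# Hypothetical redesign: a <main> wrapper is added around the <div>.
REDESIGNED = "<html><body><main><div><p>Hello, world!</p></div></main></body></html>"


def locate(markup):
    """Return (match for the full path, match for the relative path)."""
    root = ET.fromstring(markup)  # root is the <html> element
    by_full_path = root.find("./body/div/p")  # mirrors /html/body/div/p
    by_relative = root.find(".//p")           # mirrors //p
    return by_full_path, by_relative


full_before, relative_before = locate(ORIGINAL)
full_after, relative_after = locate(REDESIGNED)

print(full_before is not None, relative_before is not None)  # True True
print(full_after is not None, relative_after is not None)    # False True
```

After the hypothetical redesign, the full path no longer matches anything, while the relative expression still finds the paragraph.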

Relative XPath

It starts from any node and ends with the node you want to select. It's more flexible and preferred in most cases, as it's not affected by changes in other parts of the HTML structure. A relative XPath allows you to locate elements starting from any location within the HTML document, not just the root. The relative XPath expression usually starts with //.

Here's an example of using relative XPath in Selenium with Python:


```python
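from selenium import webdriver
from selenium.webdriver.common.by import By

# Create a new instance of the Firefox driver
driver = webdriver.Firefox()
# Navigate to a website
driver.get("https://example.com")
# Find an element using relative XPath
# (the older find_element_by_xpath helper is no longer available in current Selenium 4 releases)
element = driver.find_element(By.XPATH, "//div[@id='myDiv']/p[2]")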
# Perform actions on the element
element.click()
# Close the browser
driver.quit()
```

In the above example, we first import the necessary modules from the Selenium library. Then, we create a new instance of the Firefox driver and navigate to a website (in this case, "https://example.com").

Next, we find an element using a relative XPath expression. The XPath used in this example selects the second <p> element inside a <div> element with the id "myDiv". You can modify the XPath expression to suit your specific needs.

After finding the element, you can perform various actions on it, such as clicking it, entering text, or retrieving its attributes.

Finally, we close the browser using the quit() method to clean up and release the resources used by the driver.

Remember to have the Selenium library installed and the appropriate web driver executable (e.g., geckodriver for Firefox) set up in your system's PATH for this code to work.

Using relative XPaths is generally recommended over absolute XPaths because they are more resilient to page structure changes.
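If you want to check what an expression like //div[@id='myDiv']/p[2] matches before wiring it into a test, one option is to try it on a small, well-formed snippet. The sketch below uses Python's standard-library xml.etree.ElementTree instead of a browser; the markup is invented for illustration, and ElementTree needs a leading "." in front of "//".

```python
import xml.etree.ElementTree as ET

# Invented markup: two <div> elements, only one of which has id="myDiv".
PAGE = """
<html>
  <body>
    <div id="header"><p>Site title</p></div>
    <div id="myDiv">
      <p>First paragraph</p>
      <p>Second paragraph</p>
    </div>
  </body>
</html>
""".strip()

root = ET.fromstring(PAGE)

# p[2] counts <p> children of the matched <div>, not <p> elements on the whole page.
second = root.find(".//div[@id='myDiv']/p[2]")
print(second.text)  # Second paragraph
```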

Types of XPath locators

XPath locators in Selenium WebDriver are used to identify elements on a web page. These locators allow complex and flexible navigation of the web page's Document Object Model (DOM). There are several types of XPath locators, each useful in different situations.

● XPath locator by ID: This locator allows you to identify an element by its id attribute.

Example:


driver.findElement(By.xpath("//*[@id='username']"));

Note: The asterisk (*) in these snippets is a wildcard that matches any element, regardless of its tag name.

● XPath locator by class name: This locator can identify elements based on their class attribute.

Example:


driver.findElement(By.xpath("//*[@class='login-button']"));

● XPath locator by name: This locator identifies elements by their name attribute.

Example:


driver.findElement(By.xpath("//*[@name='password']"));

● XPath locator by tag name: This locator can identify elements by their HTML tag name.

Example:


driver.findElement(By.xpath("//p"));

● XPath locator by text: This locator identifies elements based on their inner text.

Example:


driver.findElement(By.xpath("//*[text()='Submit']"));

● XPath locator using contains: This locator can identify elements based on a substring of one of their attribute values.

Example:


driver.findElement(By.xpath("//*[contains(@href,'google.com')]"));

● XPath locator using starts-with: This locator identifies elements whose attribute values start with a particular string.

Example:


driver.findElement(By.xpath("//*[starts-with(@id,'user')]"));

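● XPath locator using ends-with: This locator is meant to identify elements whose attribute values end with a particular string. Note that ends-with() is an XPath 2.0 function, while browsers (and therefore Selenium) evaluate XPath 1.0, so an expression such as //*[ends-with(@id,'name')] typically fails with an invalid-selector error. The same match can be written with substring() and string-length().

Example:


driver.findElement(By.xpath("//*[substring(@id, string-length(@id) - string-length('name') + 1) = 'name']"));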
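The locator patterns above are plain strings, so they can be generated by small helper functions. The following is a sketch with hypothetical helper names (attr_equals, has_class, attr_ends_with are not part of Selenium). Two of them address common pitfalls: @class='login-button' matches only when that is the element's entire class attribute, and a suffix match has to be spelled with XPath 1.0 functions because browsers do not provide ends-with().

```python
def attr_equals(attribute, value, tag="*"):
    """Exact attribute match, e.g. //*[@id='username']."""
    return f"//{tag}[@{attribute}='{value}']"


def has_class(class_name, tag="*"):
    """Match one class token even when the element carries several classes."""
    return (f"//{tag}[contains(concat(' ', normalize-space(@class), ' '), "
            f"' {class_name} ')]")


def attr_ends_with(attribute, suffix, tag="*"):
    """XPath 1.0 stand-in for ends-with(), built from substring()."""
    return (f"//{tag}[substring(@{attribute}, "
            f"string-length(@{attribute}) - string-length('{suffix}') + 1)"
            f" = '{suffix}']")


# Assumption: values contain no single quotes; quoting is not handled here.
print(attr_equals("id", "username"))
print(has_class("login-button"))
print(attr_ends_with("id", "name"))
```

Each returned string can be passed to Selenium as an XPath locator, for example driver.find_element(By.XPATH, has_class("login-button")) in Python.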

What is chained XPath in Selenium?

Chained XPath in Selenium is a concept where multiple XPaths are used in conjunction to locate an element that might not be uniquely identifiable by a single XPath expression. In other words, instead of writing one absolute XPath, we can separate it into multiple relative XPaths.

This approach can be specifically useful when dealing with complex or dynamic web structures where elements are not easily accessible through single, unique identifiers. Chaining XPaths can provide more precision and robustness in element location strategy, thus making the automation scripts more stable.

Let us consider a scenario where we need to locate an element that doesn’t have unique attributes to create a precise XPath. Nonetheless, its parent elements do have unique attributes. Here's how you can use chained XPath in Selenium:


WebDriver driver = new ChromeDriver();
driver.get("http://www.yourwebsite.com");
// Let's assume there is a div with id="parent" and it has a child button with text "Submit"
WebElement parentDiv = driver.findElement(By.xpath("//*[@id='parent']"));
WebElement submitButton = parentDiv.findElement(By.xpath(".//button[text()='Submit']"));
submitButton.click();
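The scoping effect of a chained lookup can be sketched without a browser. The example below uses Python's standard-library xml.etree.ElementTree on invented markup that contains two "Submit" buttons, only one of which sits inside the element with id="parent". ElementTree supports only a subset of XPath, so the text match is written as [.='Submit'] rather than [text()='Submit'].

```python
import xml.etree.ElementTree as ET

# Invented markup: the page has two "Submit" buttons in different containers.
PAGE = """
<body>
  <div id="sidebar"><button>Submit</button></div>
  <div id="parent"><button>Cancel</button><button>Submit</button></div>
</body>
""".strip()

root = ET.fromstring(PAGE)

# Step 1: locate the uniquely identifiable container.
parent_div = root.find(".//*[@id='parent']")
# Step 2: search only inside that container.
submit = parent_div.find(".//button[.='Submit']")

buttons_in_parent = [b.text for b in parent_div.findall(".//button")]
all_submit = root.findall(".//button[.='Submit']")

print(buttons_in_parent)  # ['Cancel', 'Submit']
print(len(all_submit))    # 2
```

In Selenium, the leading dot in the second expression matters: an XPath that starts with // is evaluated against the whole document even when it is called on an element, so .// is what keeps the search inside the parent.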

What are XPath Axes?

XPath Axes are used for finding dynamic elements when normal XPath element search methods like name, ID, class name, etc., aren't possible. XPath Axes navigate through elements in the XML structure of a webpage. They allow you to locate elements based on their relationship with other elements, like parent, sibling, child, ancestor, or descendant. Here are some of the commonly used XPath Axes:

ancestor: This selects all ancestors (parent, grandparent, and so on) of the current node.

child: This selects all children of the current node.

descendant: This selects all descendants (children, grandchildren, and so on) of the current node.

following: This selects everything in the document after the closing tag of the current node.

following-sibling: This selects all siblings after the current node.

parent: This selects the parent of the current node.

preceding: This selects all nodes that appear before the current node in the document, excluding ancestors and attributes.

preceding-sibling: This selects all siblings before the current node.

self: This selects the current node.

attribute: This selects the attributes of the current node.

These axes provide a flexible way to traverse the DOM and locate elements based on their relationships with other elements, making XPath a very powerful tool for web scraping and testing tasks.
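To make the sibling axes concrete, here is a small sketch that computes preceding-sibling and following-sibling by hand over an invented menu. It uses Python's standard-library xml.etree.ElementTree, whose XPath subset does not implement named axes, so the helper sibling_axes is a hypothetical illustration of the semantics rather than an XPath engine.

```python
import xml.etree.ElementTree as ET

MENU = "<ul><li>Home</li><li>Products</li><li>Pricing</li><li>Contact</li></ul>"


def sibling_axes(root, node):
    """Return (preceding siblings, following siblings) of node, in document order."""
    # ElementTree elements do not know their parent, so build a child -> parent map.
    parent_of = {child: parent for parent in root.iter() for child in parent}
    siblings = list(parent_of[node])
    position = siblings.index(node)
    return siblings[:position], siblings[position + 1:]


root = ET.fromstring(MENU)
current = root.find("./li[2]")  # the "Products" item
before, after = sibling_axes(root, current)

print([li.text for li in before])  # ['Home']
print([li.text for li in after])   # ['Pricing', 'Contact']
```

In XPath terms, these correspond to //li[text()='Products']/preceding-sibling::li and //li[text()='Products']/following-sibling::li.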

Let's walk through a few of these XPath axes:

1. Following

The following axis in XPath is a powerful tool for navigating XML trees in Selenium tests. It selects all the nodes in the document that come after the closing tag of the current node, no matter where they are nested or at what level.

Let's say you have a web page with multiple sections and want to identify an element that appears after a particular section. With the following axis, you can effectively locate that element without navigating through the entire tree structure.

Here's an example of how you might use the following axis in a Selenium script:


```java
driver.findElement(By.xpath("//div[@id='main-section']/following::div"));
```

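In this code, the XPath //div[@id='main-section']/following::div matches all div elements in the document that come after the div element with the id 'main-section'. Because findElement returns only the first match, use findElements if you need the whole list.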

So, if you have a dynamically changing structure on your webpage, the following axis can be a helpful way to locate elements relative to others. It's particularly useful in instances where you need to find elements that appear after a specific point in the document, regardless of their nesting or hierarchical level.
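The rule "everything after the closing tag" is easy to misread, so here is a sketch that computes the following axis by hand on invented markup. It uses Python's standard-library xml.etree.ElementTree only to parse the snippet; the helper following is hypothetical and simply walks the document in order while skipping the current node's own descendants.

```python
import xml.etree.ElementTree as ET

# Invented markup: <div> elements before, inside, and after the main section.
PAGE = """
<body>
  <div id="intro"><div id="banner"/></div>
  <div id="main-section"><div id="inner"/></div>
  <section><div id="deep"/></section>
  <div id="footer"/>
</body>
""".strip()


def following(root, node, tag):
    """Elements named `tag` that start after `node` closes, in document order."""
    document_order = list(root.iter())
    inside_node = set(node.iter())  # the node itself plus its descendants
    start = document_order.index(node) + 1
    return [el for el in document_order[start:]
            if el not in inside_node and el.tag == tag]


root = ET.fromstring(PAGE)
main = root.find(".//div[@id='main-section']")
matches = following(root, main, "div")

print([el.get("id") for el in matches])  # ['deep', 'footer']
```

The nested "inner" div is excluded because it is a descendant, while "deep" is included even though it sits one level further down inside a later section.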

2. Ancestor

The ancestor axis in XPath is particularly useful when dealing with complex XML documents or web pages with deeply nested elements. It allows you to select all ancestor elements (parent, grandparent, and so on) of the current node in reverse document order (from the closest ancestor to the furthest).

To illustrate, consider the following HTML snippet:


```html
<html>
<body>
  <form id="loginForm">
    <div id="credentials">
      <input id="username" type="text"/>
      <input id="password" type="password"/>
    </div>
    <input id="submitButton" type="submit"/>
  </form>
</body>
</html>
```

If you wanted to find the form element that encloses the input element with the id "username", you could use the ancestor axis method as follows:


```python
# Selenium 4 syntax; requires: from selenium.webdriver.common.by import By
form = driver.find_element(By.XPATH, "//input[@id='username']/ancestor::form")
```

The XPath expression //input[@id='username']/ancestor::form selects the form ancestor of the input element with the id "username". In this case, there's only one such form, but if there were more, this expression would select all of them. If you wanted to select only the closest form ancestor, you could use the following XPath expression:


```python
form = driver.find_element(By.XPATH, "//input[@id='username']/ancestor::form[1]")
```

This XPath expression would select the first form ancestor of the input element with the id "username", where the first ancestor is the closest one.

Understanding the ancestor axis method is crucial when writing robust Selenium test scripts, as it enables you to locate elements in a more precise and flexible way, especially when dealing with dynamic or complex web pages.
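The closest-first ordering of the ancestor axis can be sketched by hand on the login form markup. The example below parses it with Python's standard-library xml.etree.ElementTree, which does not implement named axes, so the helper ancestors is a hypothetical stand-in that climbs from the node to the root.

```python
import xml.etree.ElementTree as ET

PAGE = """
<html>
<body>
  <form id="loginForm">
    <div id="credentials">
      <input id="username" type="text"/>
      <input id="password" type="password"/>
    </div>
    <input id="submitButton" type="submit"/>
  </form>
</body>
</html>
""".strip()


def ancestors(root, node):
    """Ancestors of node, closest first (the order the ancestor axis uses)."""
    parent_of = {child: parent for parent in root.iter() for child in parent}
    chain = []
    while node in parent_of:
        node = parent_of[node]
        chain.append(node)
    return chain


root = ET.fromstring(PAGE)
username = root.find(".//input[@id='username']")
chain = ancestors(root, username)
print([el.tag for el in chain])  # ['div', 'form', 'body', 'html']

# Equivalent in spirit to ancestor::form[1]: the nearest ancestor named "form".
nearest_form = next(el for el in chain if el.tag == "form")
print(nearest_form.get("id"))  # loginForm
```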

3. Child

The child method in XPath axes is used to select all children of the current node. This is one of the most commonly used XPath axes methods for locating web elements in Selenium WebDriver. It enables testers to directly access the child nodes of a specific element, helping in the navigation of an HTML document from parent to child.

To put it in perspective, consider an HTML structure where a 'div' element with id 'content' has multiple 'p' elements as its children.


```html
<div id='content'>
    <p>Paragraph 1</p>
    <p>Paragraph 2</p>
    <p>Paragraph 3</p>
</div>
```

To select all 'p' elements (children of the 'div' element), the XPath would be


`//div[@id='content']/child::p`.

Breaking this down:

- //div[@id='content']: This selects the 'div' element with the id 'content'.

- /child::p: This selects all 'p' elements that are children of the previously selected 'div' element.

Hence, the child method helps directly access the child elements of a specific node, which can be very handy while writing Selenium scripts, particularly in cases where the parent element is easily locatable, and the structure from the parent to child is stable.
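The difference between the child axis and a descendant search shows up as soon as the markup nests. The sketch below uses Python's standard-library xml.etree.ElementTree on a variation of the snippet above with one invented nested paragraph. ElementTree does not accept the child:: spelling, so ./p (the abbreviated form of child::p) is used instead.

```python
import xml.etree.ElementTree as ET

# Variation of the example markup with an invented nested <p>.
CONTENT = """
<div id="content">
  <p>Paragraph 1</p>
  <p>Paragraph 2</p>
  <blockquote><p>Nested paragraph</p></blockquote>
</div>
""".strip()

div = ET.fromstring(CONTENT)

# ./p  is the abbreviated form of child::p  -> direct children only.
children = [p.text for p in div.findall("./p")]
# .//p walks all descendants, including the <p> inside <blockquote>.
descendants = [p.text for p in div.findall(".//p")]

print(children)     # ['Paragraph 1', 'Paragraph 2']
print(descendants)  # ['Paragraph 1', 'Paragraph 2', 'Nested paragraph']
```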

4. Parent

The Parent Axis method in XPath is another crucial aspect of identifying elements in relation to other elements in the XML document or DOM structure. As the name suggests, this method is used to select the parent of the current node.

The basic XPath syntax in Selenium for the parent axis is as follows:


`//tag[@attribute='value']/parent::tagName`

Let's break down this syntax:

- //tag[@attribute='value']: This portion of the XPath identifies the current node in the document. The 'tag' represents the HTML tag of the element (like 'div', 'a', 'span', etc.), the 'attribute' refers to the attribute of the element (like 'id', 'class', etc.), and the 'value' represents the value of that attribute.

- /parent::tagName: This part of the XPath is used to select the parent of the current node. The 'tagName' is the HTML tag of the parent element you want to select.

For example, let's say you have a 'div' element with the id 'username', and you want to select its parent element, which is a 'form'. Your XPath using the parent axis would look something like this:


`//div[@id='username']/parent::form`

This XPath selects the 'form' element which is the parent of the 'div' element with the id 'username'.

In the context of Selenium, the parent axis comes in handy when the child elements have some unique attributes that can be easily located, and you want to interact or check something with their parent element.

However, it's worth noting that parent-child relationships in HTML are not always straightforward, especially with complex, nested structures, so understanding the DOM structure well is crucial to use the parent method in XPath effectively.
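A quick way to check a parent lookup is to run it on a small snippet. The sketch below uses Python's standard-library xml.etree.ElementTree on invented markup. ElementTree does not accept the parent:: spelling, so it uses "..", the abbreviated form of parent::node(), and then checks the tag name separately to mimic parent::form.

```python
import xml.etree.ElementTree as ET

# Invented markup: a <div id="username"> whose parent is a <form>.
PAGE = ('<body><form id="login">'
        '<div id="username"><input type="text"/></div>'
        '</form></body>')

root = ET.fromstring(PAGE)

# ".." steps from the matched <div> up to its parent element.
parent = root.find(".//div[@id='username']/..")
print(parent.tag, parent.get("id"))  # form login

# parent::form also requires the parent to be a <form>; check that explicitly.
is_form_parent = parent.tag == "form"
print(is_form_parent)  # True
```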

How HeadSpin helps streamline Selenium utilization

HeadSpin offers a robust AI-driven testing platform that integrates easily with multiple automation frameworks, helping teams improve testing efficiency and ship to market faster.

HeadSpin expands beyond the traditional capabilities of Selenium while adhering to the W3C WebDriver specification: each HeadSpin host operates a Selenium server that supports custom features. Moreover, HeadSpin's cloud-based Selenium load balancer accommodates extra capabilities, facilitating device selection and redundancy management.

The following are the key capabilities of HeadSpin that enable testers and QA teams to leverage Selenium for unique testing needs:

● Support for Selenium WebDriver: HeadSpin's platform fully supports Selenium WebDriver. This allows developers and testers to write Selenium scripts and execute them on real devices in HeadSpin's device cloud, thereby facilitating automated testing across various device and OS combinations.
● Parallel testing: With HeadSpin, you can conduct concurrent testing across multiple devices, significantly reducing test times and accelerating your development process. This is particularly useful in a Selenium Grid setup, where you might want to run your tests on different browser and operating system configurations simultaneously.
● Integration with CI/CD pipelines: HeadSpin can seamlessly integrate into your existing CI/CD pipelines. This means your Selenium tests can be automatically triggered each time there's a commit or before a release, helping you catch regressions early.
● Real device testing: HeadSpin allows Selenium tests to run on real, physical devices, providing more accurate results compared to emulators or simulators. This helps in identifying real-world issues that users might face.
● Detailed reports: After test execution, HeadSpin provides detailed reports, complete with videos, logs, network stats, and performance metrics, helping you identify, understand, and fix issues quickly.
● Global test coverage: HeadSpin's device cloud spans 150+ locations worldwide, enabling you to conduct Selenium testing from a global user perspective. You can understand and address region-specific issues, ensuring a top-notch user experience everywhere.

Conclusion

XPath is critical in Selenium testing, offering a robust method for locating elements within the web page's DOM. Understanding and utilizing XPath's capabilities, such as functions, axes, and expressions, can greatly improve the effectiveness and efficiency of your Selenium tests. Moreover, with platforms like HeadSpin, you can enhance your testing capabilities further, leveraging its unique features aligned with Selenium standards. As the world of web development continues to evolve, it's essential to stay updated with these tools and methodologies to ensure high-quality, reliable web applications.