rakesh kumar

Posted on Oct 2, 2023

How to extract the text of pagination elements while filtering out the hidden ones in Selenium

Using list comprehension

HTML Example:

Suppose you have a web page with pagination elements like this:

<div class="s-pagination">
  <span class="s-pagination-item">1</span>
  <span class="s-pagination-item" aria-hidden="true">2</span>
  <span class="s-pagination-item">3</span>
  <span class="s-pagination-item">4</span>
  <span class="s-pagination-item" aria-hidden="true">5</span>
</div>

Selenium Python Script:

Here's a simplified Selenium Python script that loads this HTML page and extracts the pagination text:

from selenium import webdriver
from selenium.webdriver.common.by import By

# Start a WebDriver session (you should have the appropriate driver installed)
driver = webdriver.Chrome()

# Navigate to the example HTML page
driver.get("file:///path/to/your/example.html")

# Find all pagination elements
pagination_elements = driver.find_elements(By.XPATH, '//span[@class="s-pagination-item"]')

# Create a list to store the text of all pagination elements, including hidden ones
pagination_texts = [element.text for element in pagination_elements]

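# Filter out the hidden elements: iterate over the elements (not the texts) and
# keep the text only when the element does not carry the aria-hidden attribute
visible_pagination_texts = [element.text for element in pagination_elements if "aria-hidden" not in element.get_attribute("outerHTML")]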

# Output the results
print("All pagination texts:", pagination_texts)
print("Visible pagination texts:", visible_pagination_texts)

# Close the WebDriver session
driver.quit()

Output and Explanation:

The script opens a WebDriver session and navigates to the example HTML page.

pagination_elements = driver.find_elements(By.XPATH, '//span[@class="s-pagination-item"]'): This line uses an XPath query to find all <span> elements whose class attribute is exactly "s-pagination-item".

pagination_elements would contain references to these elements.

pagination_texts = [element.text for element in pagination_elements]: This line extracts the text content of each element in pagination_elements using the text property.

pagination_texts would now contain: ["1", "2", "3", "4", "5"].

visible_pagination_texts = [element.text for element in pagination_elements if "aria-hidden" not in element.get_attribute("outerHTML")]: This line iterates over the elements themselves (not over pagination_texts, because plain strings have no get_attribute method) and keeps the text of each element whose outer HTML does not contain the "aria-hidden" attribute.

The expected result is that visible_pagination_texts would contain only the text from non-hidden elements:

visible_pagination_texts would now contain: ["1", "3", "4"].

Finally, the script outputs the results:

All pagination texts: ['1', '2', '3', '4', '5']
Visible pagination texts: ['1', '3', '4']

It displays both the text of all pagination elements and the text of visible pagination elements. The hidden elements ("2" and "5") have been filtered out in the visible_pagination_texts list.
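
The substring check on the outer HTML treats any mention of aria-hidden as "hidden", so it would also drop an element marked aria-hidden="false" or one whose child carries the attribute. A minimal sketch of a stricter check compares the attribute's value instead. The names FakeElement, is_aria_hidden, and visible_texts are hypothetical; FakeElement only stands in for a Selenium WebElement so the example runs without a browser, on the assumption that get_attribute returns None for a missing attribute.

```python
class FakeElement:
    """Hypothetical stand-in for a Selenium WebElement (no browser needed)."""

    def __init__(self, text, attributes=None):
        self.text = text
        self._attributes = attributes or {}

    def get_attribute(self, name):
        # Assumption: mirrors WebElement.get_attribute, which returns None
        # when the attribute is not present on the element
        return self._attributes.get(name)


def is_aria_hidden(element):
    """Return True only when aria-hidden is explicitly set to "true"."""
    return element.get_attribute("aria-hidden") == "true"


def visible_texts(elements):
    """Collect the text of every element that is not marked aria-hidden="true"."""
    return [element.text for element in elements if not is_aria_hidden(element)]


elements = [
    FakeElement("1"),
    FakeElement("2", {"aria-hidden": "true"}),
    FakeElement("3"),
    FakeElement("4", {"aria-hidden": "false"}),  # explicitly visible, so it is kept
    FakeElement("5", {"aria-hidden": "true"}),
]

print("Visible pagination texts:", visible_texts(elements))
# Visible pagination texts: ['1', '3', '4']
```

With real WebElements from driver.find_elements, the same visible_texts function should work unchanged, since it only relies on the text property and get_attribute.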

Using append
HTML Example:

Suppose you have a web page with pagination elements like this:

<div class="s-pagination">
  <span class="s-pagination-item">1</span>
  <span class="s-pagination-item" aria-hidden="true">2</span>
  <span class="s-pagination-item">3</span>
  <span class="s-pagination-item">4</span>
  <span class="s-pagination-item" aria-hidden="true">5</span>
</div>

Selenium Python Script:

from selenium import webdriver
from selenium.webdriver.common.by import By

# Start a WebDriver session (you should have the appropriate driver installed)
driver = webdriver.Chrome()

# Navigate to the example HTML page
driver.get("file:///path/to/your/example.html")

# Find all pagination elements
pagination_elements = driver.find_elements(By.XPATH, '//span[@class="s-pagination-item"]')

# Create empty lists to store the text of all pagination elements, including hidden ones,
# and the visible pagination texts
pagination_texts = []
visible_pagination_texts = []

# Iterate through pagination elements
for element in pagination_elements:
    # Extract the text content of each element
    text = element.text
    pagination_texts.append(text)

    # Check if the element is hidden using aria-hidden
    if "aria-hidden" not in element.get_attribute("outerHTML"):
        visible_pagination_texts.append(text)

# Output the results
print("All pagination texts:", pagination_texts)
print("Visible pagination texts:", visible_pagination_texts)

# Close the WebDriver session
driver.quit()

Using lambda function
HTML Example:

Suppose you have a web page with pagination elements like this:

<div class="s-pagination">
  <span class="s-pagination-item">1</span>
  <span class="s-pagination-item" aria-hidden="true">2</span>
  <span class="s-pagination-item">3</span>
  <span class="s-pagination-item">4</span>
  <span class="s-pagination-item" aria-hidden="true">5</span>
</div>

Selenium Python Script:

from selenium import webdriver
from selenium.webdriver.common.by import By

# Start a WebDriver session (you should have the appropriate driver installed)
driver = webdriver.Chrome()

# Navigate to the example HTML page
driver.get("file:///path/to/your/example.html")

# Find all pagination elements
pagination_elements = driver.find_elements(By.XPATH, '//span[@class="s-pagination-item"]')

# Create an empty list to store the text of all pagination elements, including hidden ones
pagination_texts = []

# Iterate through pagination elements
for element in pagination_elements:
    # Extract the text content of each element
    text = element.text
    pagination_texts.append(text)

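# Use filter() with a lambda function to drop hidden elements.
# filter() must receive the elements, not the texts, because strings have no get_attribute()
visible_elements = filter(lambda elem: "aria-hidden" not in elem.get_attribute("outerHTML"), pagination_elements)
visible_pagination_texts = [elem.text for elem in visible_elements]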

# Output the results
print("All pagination texts:", pagination_texts)
print("Visible pagination texts:", visible_pagination_texts)

# Close the WebDriver session
driver.quit()
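
To try the filter() pattern without launching a browser, the sketch below runs it against stand-in objects. StubElement is a hypothetical class, not part of Selenium; it is assumed to mimic only the two things the script uses, the text property and get_attribute("outerHTML"). The point it illustrates is that filter() has to receive the elements themselves, because get_attribute is a method of the element, not of its text.

```python
class StubElement:
    """Hypothetical stand-in for a WebElement, exposing text and outerHTML only."""

    def __init__(self, outer_html, text):
        self._outer_html = outer_html
        self.text = text

    def get_attribute(self, name):
        # Only outerHTML is modeled here; anything else is reported as missing
        return self._outer_html if name == "outerHTML" else None


stub_elements = [
    StubElement('<span class="s-pagination-item">1</span>', "1"),
    StubElement('<span class="s-pagination-item" aria-hidden="true">2</span>', "2"),
    StubElement('<span class="s-pagination-item">3</span>', "3"),
    StubElement('<span class="s-pagination-item">4</span>', "4"),
    StubElement('<span class="s-pagination-item" aria-hidden="true">5</span>', "5"),
]

# Filter the elements first, then read the text of the ones that remain
visible_elements = filter(
    lambda elem: "aria-hidden" not in elem.get_attribute("outerHTML"),
    stub_elements,
)
visible_pagination_texts = [elem.text for elem in visible_elements]

print("Visible pagination texts:", visible_pagination_texts)
# Visible pagination texts: ['1', '3', '4']
```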
