How to get all nested span elements inside a td of a table and store them in a list
How to get nested td elements of a table and store them in a list
How to get the second td element of multiple tr elements
How to get the second/third span element of a td
Requirement
Top 10 ODI teams in men’s cricket along with the records for matches, points and rating
Solution
Step 1: Inspect the data and install the libraries
pip install bs4
pip install requests
Step 2: Import the libraries
from bs4 import BeautifulSoup
import requests
Step 3: Send an HTTP GET request to the URL and check that the status code is 200
page = requests.get('https://www.icc-cricket.com/rankings/mens/team-rankings/odi')
page
output
<Response [200]>
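Looking at the printed response works in a notebook, but in a script it can help to fail loudly when the status is not 200. The following is a minimal sketch, not part of the original walkthrough: the helper name fetch_html, its session parameter, and the 10-second timeout are assumptions chosen for illustration.

```python
import requests


def fetch_html(url, session=None, timeout=10):
    """Return the page body as text, raising on any 4xx/5xx status.

    `session` is optional (hypothetical parameter): pass a requests.Session
    to reuse connections, or leave it as None to use the module-level API.
    """
    http = session or requests
    response = http.get(url, timeout=timeout)  # timeout avoids hanging forever
    response.raise_for_status()  # raises requests.HTTPError for 4xx/5xx
    return response.text
```

Calling fetch_html with the rankings URL from this step would return the HTML as a string, which can then be passed to BeautifulSoup in the next step.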
Step 4: Check the page content
soup = BeautifulSoup(page.content, 'html.parser')
soup
output
Step 5: Get the team names and store them in a list
second_span_list = []
tbody = soup.find('tbody')
if tbody:
    for tr in tbody.find_all('tr'):
        tds = tr.find_all('td')
        if len(tds) > 1:  # Make sure the second <td> exists
            td = tds[1]  # Get the second <td> element
            spans = td.find_all('span')
            if len(spans) == 3:
                second_span_content = spans[1].text  # Extract the text content of the second <span>
                second_span_list.append(second_span_content.strip())
second_span_list
Output
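The live page can change or be rendered differently over time, so it can be useful to try the nested-span lookup on a small static snippet first. This is only a sketch: the markup in SAMPLE_HTML and the team names are made up for illustration and are not copied from the ICC page, and the function name second_spans is hypothetical.

```python
from bs4 import BeautifulSoup

# Hypothetical markup for illustration only; the real page may differ.
SAMPLE_HTML = """
<table>
  <tbody>
    <tr>
      <td>1</td>
      <td><span class="flag"></span><span>Team A</span><span>TA</span></td>
      <td>10</td><td>1000</td><td>100</td>
    </tr>
    <tr>
      <td>2</td>
      <td><span class="flag"></span><span>Team B</span><span>TB</span></td>
      <td>12</td><td>1080</td><td>90</td>
    </tr>
  </tbody>
</table>
"""


def second_spans(html):
    """Collect the text of the second <span> in the second <td> of each row."""
    soup = BeautifulSoup(html, "html.parser")
    tbody = soup.find("tbody")
    names = []
    if tbody is None:
        return names
    for tr in tbody.find_all("tr"):
        tds = tr.find_all("td")
        if len(tds) < 2:
            continue  # row has no second <td>
        spans = tds[1].find_all("span")
        if len(spans) >= 2:
            names.append(spans[1].get_text(strip=True))
    return names


print(second_spans(SAMPLE_HTML))  # prints ['Team A', 'Team B']
```

To get the third span instead, the same pattern applies with spans[2] and a length check of at least 3.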
Step 6: Get the matches played by each team and store them in a list
second_td_list = []
tbody = soup.find('tbody')
if tbody:
    for tr in tbody.find_all('tr'):
        tds = tr.find_all('td')
        if len(tds) > 2:  # Make sure the third <td> exists
            second_td_content = tds[2].text
            second_td_list.append(second_td_content.strip())
second_td_list
Output
Step 7: Get the points of each team and store them in a list
third_td_list = []
tbody = soup.find('tbody')
if tbody:
    for tr in tbody.find_all('tr'):
        tds = tr.find_all('td')
        if len(tds) > 3:  # Make sure the fourth <td> exists
            third_td_content = tds[3].text
            third_td_list.append(third_td_content.strip())
third_td_list
Output
Step 8: Get the rating of each team and store it in a list
four_td_list = []
tbody = soup.find('tbody')
if tbody:
    for tr in tbody.find_all('tr'):
        tds = tr.find_all('td')
        if len(tds) > 4:  # Make sure the fifth <td> exists
            fourth_td_content = tds[4].text
            four_td_list.append(fourth_td_content.strip())
four_td_list
Output
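Steps 5 to 8 walk the table four times, and only the team loop filters rows on their span count, so the four lists could end up with different lengths if a row is unusual. One possible alternative is to read every column in a single pass so that each row either contributes all four values or none. This is a sketch under assumptions: the function name extract_rows, the sample markup, and the relaxed check of at least two spans are illustrative choices, not the original method.

```python
from bs4 import BeautifulSoup

# Hypothetical markup for illustration only; the middle row is deliberately short.
SAMPLE_HTML = """
<table><tbody>
<tr><td>1</td><td><span></span><span>Team A</span><span>TA</span></td>
    <td>10</td><td>1000</td><td>100</td></tr>
<tr><td colspan="5">not a data row</td></tr>
<tr><td>2</td><td><span></span><span>Team B</span><span>TB</span></td>
    <td>12</td><td>1080</td><td>90</td></tr>
</tbody></table>
"""


def extract_rows(html):
    """Return one dict per complete table row; incomplete rows are skipped."""
    soup = BeautifulSoup(html, "html.parser")
    tbody = soup.find("tbody")
    rows = []
    if tbody is None:
        return rows
    for tr in tbody.find_all("tr"):
        tds = tr.find_all("td")
        if len(tds) < 5:
            continue  # need position, team, matches, points, rating
        spans = tds[1].find_all("span")
        if len(spans) < 2:
            continue  # team name is expected in the second <span>
        rows.append({
            "team": spans[1].get_text(strip=True),
            "match": tds[2].get_text(strip=True),
            "points": tds[3].get_text(strip=True),
            "rating": tds[4].get_text(strip=True),
        })
    return rows


rows = extract_rows(SAMPLE_HTML)
print(len(rows))  # prints 2
```

A list of dicts like this can be passed directly to pd.DataFrame, which uses the dict keys as column names.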
Step 9: Store all the information in a DataFrame
import pandas as pd
df = pd.DataFrame({'team': second_span_list, 'match': second_td_list, 'points': third_td_list, 'rating': four_td_list})
df
Output
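Everything scraped with .text is a string, so the match, points and rating columns will not sort or sum as numbers until they are converted. A small sketch of that conversion follows. The sample values are invented, the helper name to_numeric_columns is hypothetical, and stripping commas is an assumption in case the page uses thousands separators.

```python
import pandas as pd


def to_numeric_columns(frame, columns):
    """Return a copy of `frame` with the given string columns converted to numbers."""
    result = frame.copy()
    for col in columns:
        cleaned = result[col].str.replace(",", "", regex=False)  # "1,000" -> "1000"
        # errors="coerce" turns anything unparseable into NaN instead of raising
        result[col] = pd.to_numeric(cleaned, errors="coerce")
    return result


# Invented sample data using the same column names as the DataFrame above.
sample = pd.DataFrame({
    "team": ["Team A", "Team B"],
    "match": ["10", "12"],
    "points": ["1,000", "1,080"],
    "rating": ["100", "90"],
})
numeric = to_numeric_columns(sample, ["match", "points", "rating"])
print(numeric["points"].tolist())  # prints [1000, 1080]
```

After the conversion, calls such as sort_values('rating', ascending=False) order the teams numerically rather than alphabetically.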