How to store the text of a specific tag (h3) in a list after selecting all elements of a specific class
Using strip() to remove leading and trailing \n characters
How to split each entry at a separator and store the parts in separate lists
Task:
Write a Python program to display a list of the former presidents of India (i.e. name and term of office)
from https://presidentofindia.nic.in/former-presidents.htm and make a data frame.
First method: using class and heading tags
step 1: install the libraries
pip install bs4
pip install requests
step 2: import the libraries
from bs4 import BeautifulSoup
import requests
step 3: send an HTTP GET request to the URL and check the response (status code 200 means success)
page = requests.get('https://presidentofindia.nic.in/former-presidents')
page
output
<Response [200]>
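Displaying `page` only shows the status if you happen to look at it. A minimal sketch of a helper that fails loudly on a bad status instead; the name `fetch_html` and the 10-second timeout are illustrative assumptions, not part of the original steps:

```python
import requests


def fetch_html(url, timeout=10):
    """Return the response body as bytes, or raise if the request failed.

    The function name and the timeout value are illustrative choices.
    """
    response = requests.get(url, timeout=timeout)
    # raise_for_status() raises requests.HTTPError for 4xx and 5xx responses
    response.raise_for_status()
    return response.content
```

The returned bytes can be passed to BeautifulSoup in the same way as `page.content` in the next step.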
step 4: parse the page content
soup = BeautifulSoup(page.content, 'html.parser')
soup
output: the parsed HTML of the whole page
step 5: get all div elements with the class desc-sec
data = soup.find_all('div', class_="desc-sec")
data
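To see what `find_all` with `class_` returns without fetching the live page, here is a small sketch on made-up markup. The HTML below is hypothetical and only loosely modelled on the page used in this tutorial:

```python
from bs4 import BeautifulSoup

# Hypothetical markup with placeholder names, not copied from the real site.
sample_html = """
<div class="desc-sec"><h3>President A</h3><h5>Term A</h5></div>
<div class="desc-sec featured"><h3>President B</h3><h5>Term B</h5></div>
<div class="other"><h3>Not a president</h3></div>
"""

soup = BeautifulSoup(sample_html, "html.parser")
blocks = soup.find_all("div", class_="desc-sec")

# class_ matches any one of an element's classes, so the
# "desc-sec featured" div is included and the "other" div is not.
headings = [block.find("h3").text for block in blocks]
print(len(blocks), headings)
```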
step 6: take the text of the h3 tag inside each desc-sec div and store it in a list
name = []
for tag in data:
    president_name = tag.find('h3').text
    name.append(president_name.strip())
name
output: a list of president names
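The `strip()` call in step 6 is what removes the line breaks around each name. A short sketch with a placeholder string (not real page data) shows what it does and does not remove:

```python
# Placeholder text standing in for the raw .text of an h3 tag.
raw_name = "\n    President A  \n"

# strip() removes all leading and trailing whitespace:
# spaces, tabs and newlines, not only "\n".
clean_name = raw_name.strip()

# Whitespace in the middle of the string is left untouched.
inner = "President A\nTerm A".strip()

print(repr(clean_name))
print(repr(inner))
```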
step 7: take the text of the h5 tag inside each desc-sec div and store it in a list
term = []
for tag in data:
    terms_of_office = tag.find('h5').text
    term.append(terms_of_office.strip())
term
output: a list of terms of office, in the same order as the names
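Steps 6 and 7 assume every desc-sec block contains both an h3 and an h5. If one is missing, `tag.find(...)` returns None and `.text` raises an AttributeError. A hedged sketch of a more defensive variant follows; `extract_rows` is a hypothetical helper name and the markup is made up:

```python
from bs4 import BeautifulSoup


def extract_rows(html):
    """Return (name, term) tuples, skipping blocks that lack an h3 or h5."""
    soup = BeautifulSoup(html, "html.parser")
    rows = []
    for block in soup.find_all("div", class_="desc-sec"):
        name_tag = block.find("h3")
        term_tag = block.find("h5")
        if name_tag is None or term_tag is None:
            continue  # incomplete block: skip it rather than crash
        # get_text(strip=True) trims surrounding whitespace from the text
        rows.append((name_tag.get_text(strip=True),
                     term_tag.get_text(strip=True)))
    return rows


# Hypothetical markup: the second block has no h5 tag.
sample_html = """
<div class="desc-sec"><h3> President A </h3><h5> Term A </h5></div>
<div class="desc-sec"><h3>President B</h3></div>
"""
rows = extract_rows(sample_html)
print(rows)
```

Collecting name and term together in one pass also keeps the two values paired, which two separate loops cannot guarantee once some blocks are skipped.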
step 8: finally, make a DataFrame from the two lists
import pandas as pd
df = pd.DataFrame({'president_name': name, 'terms_of_office': term})
df
output: a DataFrame with the columns president_name and terms_of_office
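Building a DataFrame from a dict of lists only works when the lists have the same length; otherwise pandas raises a ValueError. One alternative, sketched here with placeholder values rather than real page data, is to build the frame from (name, term) pairs so the columns cannot get out of step:

```python
import pandas as pd

# Placeholder rows; in practice these would come from the scraped page.
rows = [("President A", "Term A"), ("President B", "Term B")]

# Each tuple becomes one row, so name and term always stay paired.
df = pd.DataFrame(rows, columns=["president_name", "terms_of_office"])
print(df)
```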
Second method: using class and split
step 1: install the libraries
pip install bs4
pip install requests
step 2: import the libraries
from bs4 import BeautifulSoup
import requests
step 3: send an HTTP GET request to the URL and check the response (status code 200 means success)
page = requests.get('https://presidentofindia.nic.in/former-presidents')
page
output
<Response [200]>
step 4: parse the page content
soup = BeautifulSoup(page.content, 'html.parser')
soup
output: the parsed HTML of the whole page
step 5: get all div elements with the class desc-sec
data = soup.find_all('div', class_="desc-sec")
data
step 6: store the full text of each desc-sec div in a list
president = []
for tag in data:
    president.append(tag.text.strip())
president
output: a list of strings, each holding a name and a term of office separated by line breaks
step 7: split each entry of the list above into separate lists
president_na = []
for item in president:
    parts = item.split('\n')
    if len(parts) > 1:
        president_na.append(parts[0])
president_na
term_of = []
for item in president:
    parts = item.split('\n')
    if len(parts) > 1:
        term_of.append(parts[1])
term_of
output: two lists, one of names and one of terms of office
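One caveat with step 7: if a name and its term are separated by more than one line break, `parts[1]` is an empty string rather than the term. A minimal sketch of a more tolerant split follows; `split_name_and_term` is a hypothetical helper and the sample text is a placeholder:

```python
def split_name_and_term(text):
    """Return (name, term) from a block of text, or None if either is missing."""
    # Keep only non-empty lines, trimmed of surrounding whitespace.
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    if len(lines) < 2:
        return None
    return lines[0], lines[1]


# Placeholder entry with a blank line between the name and the term.
entry = "President A\n\n    Term A"

naive_term = entry.split("\n")[1]   # the blank line, not the term
pair = split_name_and_term(entry)   # blank lines are ignored

print(repr(naive_term), pair)
```

Whether this matters depends on how the real page lays out its text, so it is worth inspecting a few entries of `president` before choosing between the two approaches.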
step 8: finally, make a DataFrame from the two lists
import pandas as pd
datas = pd.DataFrame({'president_name': president_na, 'terms_of_office': term_of})
datas