python - Writing CSV from scraped HTML data
I am able to extract data from a Russian statistics website using the code below and create a CSV file. However, I have two issues. Firstly, I don't know why a blank row is inserted between every two non-blank rows. Secondly, I don't know how to write a nice table, with the data for the same month spread across different columns. Right now, it is all in one cell. Thanks.
from bs4 import BeautifulSoup
import lxml
import urllib2
import csv

f=csv.writer(open("russia.csv","w"))
mainurl='http://www.gks.ru/bgd/free/b00_25/isswww.exe/stg/d000/i000750r.htm'
urlroot='http://www.gks.ru/bgd/free/b00_25/isswww.exe/stg/d000/'
data = urllib2.urlopen(mainurl).read()
page = BeautifulSoup(data,'html.parser')
for link in page.findAll('a'):
    page = urllib2.urlopen(urlroot+link.get('href'))
    soup = BeautifulSoup(page, 'lxml')
    years=soup.findAll('title',text=True)
    table = soup.find('center').find('table')
    for row in table.find_all('tr')[3:]:
        cells = [cell.get_text(strip=True) for cell in row.find_all('td')]
        f.writerow([cells])
You are unintentionally making a list of lists here:
cells = [cell.get_text(strip=True) for cell in row.find_all('td')]
f.writerow([cells])
Instead, write the cells list directly:
f.writerow(cells)
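To see the difference without hitting the website, here is a minimal sketch using only the standard library. It assumes Python 3 (the question's code is Python 2), and the sample rows, the file names, and the helpers `write_csv` and `read_csv` are hypothetical stand-ins, not part of the original code. It also opens the file with `newline=""`, which is what the `csv` documentation recommends and is the usual cause of the blank row between records on Windows; whether that is the cause in the asker's setup is an assumption.

```python
import csv
import os
import tempfile

# Hypothetical sample rows standing in for the text of scraped <td> cells.
rows = [
    ["January", "100.5", "101.2"],
    ["February", "99.8", "100.1"],
]

def write_csv(path, rows, nested=False):
    # newline="" lets the csv module control line endings itself (Python 3);
    # without it, Windows can show a blank row after every record.
    with open(path, "w", newline="") as fh:
        writer = csv.writer(fh)
        for cells in rows:
            # nested=True reproduces the bug: the whole list becomes one field.
            writer.writerow([cells] if nested else cells)

def read_csv(path):
    # Read the file back as a list of rows, each row a list of fields.
    with open(path, newline="") as fh:
        return list(csv.reader(fh))

tmpdir = tempfile.mkdtemp()
nested_path = os.path.join(tmpdir, "nested.csv")
flat_path = os.path.join(tmpdir, "flat.csv")

write_csv(nested_path, rows, nested=True)
write_csv(flat_path, rows)

nested_rows = read_csv(nested_path)  # one field per row
flat_rows = read_csv(flat_path)      # one field per cell
```

With `writerow([cells])` each row holds a single field containing the string form of the whole list, which is why everything ends up in one cell. With `writerow(cells)` each value gets its own column. In Python 2, the rough equivalent of `newline=""` is opening the file in binary mode, `open("russia.csv", "wb")`.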