python - How to use BeautifulSoup to parse a table?

- June 15, 2010

This is a context-specific question about how to use BeautifulSoup to parse an HTML table in Python 2.7.

I would like to extract the HTML table here and place it in a tab-delimited CSV, and I have tried playing around with BeautifulSoup.

Code for context:

import requests
from bs4 import BeautifulSoup

proxies = {
    "http://": "198.204.231.235:3128",
}
site = "http://sloanconsortium.org/onlineprogram_listing?page=11&institution=&field_op_delevery_mode_value_many_to_one[0]=100%25%20online"

r = requests.get(site, proxies=proxies)
print 'r: ', r
html_source = r.text
print 'src: ', html_source
soup = BeautifulSoup(html_source)

Why doesn't this code find the 4th row?

soup.find('table','views-table cols-6').tr[4]

How do I print out all of the elements in the first row (not the header row)?

Okay, someone might be able to give you a one-liner, but the following should get you started:

table = soup.find('table', class_='views-table cols-6')
for row in table.find_all('tr'):
    # Collect the text of every data cell in this row
    row_text = list()
    for item in row.find_all('td'):
        text = item.text.strip()
        row_text.append(text.encode('utf8'))
    print row_text

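I believe tr[4] is treated as an attribute lookup, not as an index as you suppose. The .tr shortcut returns only the first tr tag, and square brackets on a tag look up one of its attributes, so [4] never reaches the other rows. Index into the list returned by find_all('tr') instead.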
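To make the difference concrete, here is a minimal sketch, assuming the bs4 package is installed. SAMPLE_HTML is a made-up stand-in for the page (the real markup may differ), and row_cells is a hypothetical helper, not part of BeautifulSoup.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for the real page; only the table class is taken
# from the question.
SAMPLE_HTML = """
<table class="views-table cols-6">
  <thead><tr><th>Institution</th><th>Program</th></tr></thead>
  <tbody>
    <tr><td> Alpha College </td><td>Nursing</td></tr>
    <tr><td>Beta University</td><td>Accounting</td></tr>
  </tbody>
</table>
"""


def row_cells(row):
    """Return the stripped text of every <td> cell in a <tr> tag."""
    return [cell.get_text(strip=True) for cell in row.find_all('td')]


soup = BeautifulSoup(SAMPLE_HTML, 'html.parser')
table = soup.find('table', class_='views-table cols-6')

# table.tr is a single Tag (the first <tr>), so table.tr[4] is an attribute
# lookup on that tag and raises KeyError.
# find_all returns a list-like result, so integer indexing works on it.
rows = table.find_all('tr')

# rows[0] is the header row (it holds <th> cells), so the first data row
# is rows[1].
first_data_row = row_cells(rows[1])

# One tab-delimited line for that row, as a step toward the tab-delimited
# output the question asks for.
tsv_line = '\t'.join(first_data_row)
```

With the real page, the same indexing should apply as long as the table keeps the class shown in the question: rows[n] selects a row by position, and row_cells(rows[n]) gives its cell text.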
