python - BeautifulSoup not parsing every tag of the HTML


I'm having a problem with BeautifulSoup not parsing all of the HTML I receive. I tried both the lxml and html5lib parsers and had the same problem with each.

html = '<td style="vertical-align: top">1</td> <td style="vertical-align: top"><span class="ui-icon country flg-fr"></span>\t</td><td class="pn"><a class="player-link" href="/players/25604">hugo lloris <span class="incident-wrapper"></span> </a><span class="player-meta-data">29</span><span class="player-meta-data">,  gk  </span></td>   <td class="shotstotal ">0\t</td><td class="shotontarget ">0\t</td><td class="keypasstotal ">0\t</td><td class="passsuccessinmatch ">88\t</td><td class="duelaerialwon ">0\t</td><td class="touches ">35\t</td><td class="rating ">6.24</td> <td style="text-align: left"><span class="incident-wrapper"></span></td> '

parsed_html =

ipdb> BeautifulSoup(html, 'html5lib')
<html><head></head><body>1 <span class="ui-icon country flg-fr"></span> <a class="player-link" href="/players/25604">hugo lloris <span class="incident-wrapper"></span> </a><span class="player-meta-data">29</span><span class="player-meta-data">,  gk  </span>   0   0   0   88  0   35  6.24 <span class="incident-wrapper"></span> </body></html>

It is working for me. I executed the following code (using beautifulsoup4==4.4.1):

from bs4 import BeautifulSoup

html = """
<td style="vertical-align: top">1</td>
<td style="vertical-align: top"><span class="ui-icon country flg-fr"></span>\t</td>
<td class="pn"><a class="player-link" href="/players/25604">hugo lloris <span class="incident-wrapper"></span> </a><span         class="player-meta-data">29</span><span class="player-meta-data">,  gk  </span></td>
<td class="shotstotal ">0\t</td>
<td class="shotontarget ">0\t</td>
<td class="keypasstotal ">0\t</td>
<td class="passsuccessinmatch ">88\t</td>
<td class="duelaerialwon ">0\t</td>
<td class="touches ">35\t</td>
<td class="rating ">6.24</td>
<td style="text-align: left"><span class="incident-wrapper"></span></td>
"""

parsed_html = BeautifulSoup(html, 'html5lib')
# Note: this prints the original input string, not parsed_html
print(html)

and I got the following HTML printed:

<td style="vertical-align: top">1</td>
<td style="vertical-align: top"><span class="ui-icon country flg-fr"></span>    </td>
<td class="pn"><a class="player-link" href="/players/25604">hugo lloris <span class="incident-wrapper"></span> </a><span         class="player-meta-data">29</span><span class="player-meta-data">,  gk  </span></td>
<td class="shotstotal ">0   </td>
<td class="shotontarget ">0 </td>
<td class="keypasstotal ">0 </td>
<td class="passsuccessinmatch ">88  </td>
<td class="duelaerialwon ">0    </td>
<td class="touches ">35 </td>
<td class="rating ">6.24</td>
<td style="text-align: left"><span class="incident-wrapper"></span></td>

I don't see anything missing.
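That said, the output in the question may be explained by the parser rather than by BeautifulSoup itself: html5lib follows the HTML5 parsing rules, under which a td start tag that appears outside a table is ignored, so only the cell contents survive. A minimal sketch of one possible workaround, assuming the fragment is a run of cells from a single table row: wrap it in a table and row before parsing. The helper name parse_cells is hypothetical, and the sketch defaults to Python's built-in html.parser so that it runs without extra dependencies; passing 'html5lib' instead should also keep the cells once they are wrapped.

```python
from bs4 import BeautifulSoup


def parse_cells(fragment, parser="html.parser"):
    """Return the <td> elements found in a bare table-row fragment.

    Hypothetical helper (not part of BeautifulSoup): the fragment is
    wrapped in <table><tr>...</tr></table> so that parsers which follow
    the HTML5 rules see the cells in a valid table context.
    """
    wrapped = "<table><tr>{}</tr></table>".format(fragment)
    soup = BeautifulSoup(wrapped, parser)
    return soup.find_all("td")


# Shortened version of the fragment from the question.
fragment = (
    '<td class="pn"><a class="player-link" href="/players/25604">'
    "hugo lloris</a></td>"
    '<td class="rating ">6.24</td>'
)

cells = parse_cells(fragment)
for cell in cells:
    # Print each cell's class list and its stripped text content.
    print(cell.get("class"), cell.get_text(strip=True))
```

Printing parsed_html (or the result of find_all('td')) rather than the input string is what shows whether the td tags were kept.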

