Extract HTML table with superscripts using R


I am trying to extract the table at this webpage using R, with the following code:

library('htmltab')
url <- "http://www.math.leidenuniv.nl/~desmit/abc/index.php?set=2"
# which = 3 selects the third table on the page
app.data <- htmltab(url, which = 3, rm_superscript = FALSE, rm_whitespace = FALSE, rm_invisible = FALSE)

However, the superscripts are merged into the main text, so a table entry such as 3^{10}109 comes out as 310109, which is not the same thing. If one sets rm_superscript = TRUE, the output is e.g. 3109, i.e. the superscripts are dropped entirely, which is not right either. I'd like the superscripts to be indicated, so that the output is 3^{10}109. Can anyone help? Thanks!

Here's an alternate approach.

library(xml2)
library(rvest)

url <- "http://www.math.leidenuniv.nl/~desmit/abc/index.php?set=2"

pg <- read_html(url)

Extract the table and convert it to raw HTML:

tab <- as.character(html_nodes(pg, "table")[[3]]) 

Manually replace <sup> and </sup> with { and }, convert the string back to an HTML document, and extract the table:

dat <- html_table(read_html(gsub("</sup>", "}", gsub("<sup>", "{", tab))))[[1]]

head(dat)
##     quality  size merit             on                  b             c
## 1 1  1.6299  6.81  8.64       er 19870101       2     3{10}​109         23{5}
## 2 2  1.6260  7.68 10.18      bdw 19850920   11{2} 3{2}​5{6}​7{3}       2{21}​23
## 3 3  1.6235 15.70 26.86    jb jb 19940401 19·​1307 7·​29{2}​31{8} 2{8}​3{22}​5{4}
## 4 4  1.5808  9.92 13.01 jb jb 19930312     283   5{11}​13{2} 2{8}​3{8}​17{3}
## 5 5  1.5679  3.64  2.89      bdw 19880106       1       2·​3{7}         5{4}​7
## 6 6  1.5471  4.77  4.17      bdw 19880106    7{3}        3{10}       2{11}​29
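
The substitution step can be tried in isolation with base R alone. The sketch below is only an illustration: the one-cell HTML fragment and the variable names are made up, not taken from the page. It writes "^{" instead of "{", on the assumption that the 3^{10}109 notation from the question is what is wanted.

```r
# Hypothetical one-cell fragment standing in for the real table HTML.
cell <- "<td>3<sup>10</sup>109</td>"

# fixed = TRUE matches the tags as literal strings rather than regular expressions.
marked <- gsub("</sup>", "}", gsub("<sup>", "^{", cell, fixed = TRUE), fixed = TRUE)

# Strip the remaining tags so only the cell text is left.
cell_text <- gsub("<[^>]+>", "", marked)

print(cell_text)
## [1] "3^{10}109"
```

The same change of replacement string should carry over to the html_table() call on the full table, since only the text that stands in for the opening tag differs.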
