r - Scraping dataTable gets only header

I'm trying to scrape salary data from the Feds Data Center. There are 1537 entries to read. I thought I'd gotten the table XPath from Chrome's Inspect. However, my code is only returning the header. I'd love to know what I'm doing wrong.

    library(rvest)

    url1 = 'http://www.fedsdatacenter.com/federal-pay-rates/index.php?n=&l=&a=consumer+financial+protection+bureau&o=&y=2016'

    read_html(url1) %>%
      html_nodes(xpath="//*[@id=\"example\"]") %>%
      html_table()

I get only the (lonely) header:

    [[1]]
    [1] name       grade      pay plan   salary     bonus      agency     location
    [8] occupation fy
    <0 rows> (or 0-length row.names)

My desired result is a data frame or data.table with all 1537 entries.

Edit: here's the relevant info from Chrome's Inspect: the header is in thead, and the data is in tbody tr.

The site does not expressly forbid scraping the data. Its Terms of Use are generic and are taken from the main http://www.fedsmith.com/terms-of-use/ site (so they appear to be boilerplate). They aren't doing anything with the free source data that adds value. I agree that you should use the source data at http://www.opm.gov/data/index.aspx?tag=fedscope rather than rely on this site being around.

But…

It doesn't require using RSelenium.

    library(httr)
    library(jsonlite)

    # Call the endpoint that the page's DataTable queries for its rows
    res <- GET("http://www.fedsdatacenter.com/federal-pay-rates/output.php?n=&a=&l=&o=&y=&sEcho=2&iColumns=9&sColumns=&iDisplayStart=0&iDisplayLength=100&mDataProp_0=0&mDataProp_1=1&mDataProp_2=2&mDataProp_3=3&mDataProp_4=4&mDataProp_5=5&mDataProp_6=6&mDataProp_7=7&mDataProp_8=8&iSortingCols=1&iSortCol_0=0&sSortDir_0=asc&bSortable_0=true&bSortable_1=true&bSortable_2=true&bSortable_3=true&bSortable_4=true&bSortable_5=true&bSortable_6=true&bSortable_7=true&bSortable_8=true&_=1464831540857")

    # Parse the JSON response body
    dat <- fromJSON(content(res, as="text"))

The page makes an XHR request for the data, and it's paged. In the event it's not obvious, you can increment iDisplayStart by 100 to page through the results. I made this using the curlconverter package. The dat variable has an iTotalDisplayRecords component that tells you the total.
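The paging described above could be sketched roughly as follows. This is only an illustration under stated assumptions: `page_starts`, `page_url`, and `fetch_all` are hypothetical helper names, the request URL is assumed to contain an `iDisplayStart=` parameter, and the rows are assumed to come back in an `aaData` component (the usual legacy DataTables field name, which the answer does not confirm). Check `names(dat)` before relying on it.

```r
# Offsets of each page: 0, 100, 200, ... up to the last record.
page_starts <- function(total, page_size = 100) {
  if (total <= 0) return(integer(0))
  as.integer(seq(0, total - 1, by = page_size))
}

# Swap the iDisplayStart value inside an existing request URL.
page_url <- function(url, start) {
  sub("iDisplayStart=[0-9]+",
      sprintf("iDisplayStart=%d", as.integer(start)),
      url)
}

# Hypothetical helper: fetch every page and stack the rows.
# ASSUMPTION: each response carries its rows in `aaData`.
fetch_all <- function(url) {
  get_page <- function(start) {
    res <- httr::GET(page_url(url, start))
    httr::stop_for_status(res)
    jsonlite::fromJSON(httr::content(res, as = "text", encoding = "UTF-8"))
  }
  first  <- get_page(0)
  total  <- as.integer(first$iTotalDisplayRecords)
  starts <- page_starts(total)[-1]  # page 0 is already in hand
  pages  <- c(list(first$aaData),
              lapply(starts, function(s) get_page(s)$aaData))
  as.data.frame(do.call(rbind, pages), stringsAsFactors = FALSE)
}
```

Calling `fetch_all()` with the same URL string passed to `GET()` above would issue one request per 100 rows. Adding a short `Sys.sleep()` between requests would be a polite tweak.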

The entirety of the browser Developer Tools is your friend, and it can help you avoid the clunkiness, slowness and flakiness of browser instrumentation.
