node.js - Scrape web with x-ray
I'm using x-ray to extract data from a web site, but when I try to crawl to another page using its built-in functionality, it doesn't work.
The unitprice parameter I want to extract is "undefined" every time.
As you can see, I'm passing the same href value that is extracted for the url property.
var xray = require('x-ray');

// Custom filters, applied in selectors with the "| name" syntax.
var x = xray({
  filters: {
    // Strip line breaks, tabs and the euro sign from a price string.
    cleanprice: function (value) {
      return typeof value === 'string' ? value.replace(/\r|\t|\n|€/g, "").trim() : value
    },
    // Collapse runs of spaces into a single space.
    whitespaces: function (value) {
      return typeof value === 'string' ? value.replace(/ +/g, ' ').trim() : value
    }
  }
});

x('https://www.simply.es/compra-online/aceite-vinagre-y-sal.html', '#content > ul', [{
  name: '.descripcionproducto | whitespaces',
  categoryid: 'input[name="idcategoria"]@value',
  productid: 'input[name="idproducto"]@value',
  url: 'li a@href',
  price: 'span | cleanprice',
  image: '.miniaturaproducto@src',
  // Nested crawl: follow the product link and read the unit price there.
  unitprice: x('li a@href', '.preciokilo')
}])
  .paginate('.link@href')
  .limit(1)
  // .delay(500, 1000)
  // .throttle(2, 1000)
  .write('results.json');
There's a pull request that fixes this. In the meantime, you can apply the fix yourself; it is only one line of code. See this:
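As a separate alternative that does not depend on that fix, the nested crawl can be replaced with a second pass: scrape the list without unitprice first, then visit each extracted url and read `.preciokilo` from the product page. The following is only a minimal sketch of that idea. `addUnitPrices` and `scrapeField` are hypothetical names, not part of x-ray, and the wiring shown in the comments assumes x-ray's callback form `x(url, selector)(callback)`.

```javascript
// Hypothetical helper (not part of x-ray): fill in `unitprice` for each
// scraped item in a second pass, instead of nesting x() in the list scope.
// `scrapeField` is any function (url) => Promise resolving to the value.
async function addUnitPrices(items, scrapeField) {
  const results = [];
  // Visit the detail pages one at a time to stay gentle on the site.
  for (const item of items) {
    let unitprice = null;
    if (item.url) {
      try {
        unitprice = await scrapeField(item.url);
      } catch (err) {
        // Keep the item even if its detail page could not be scraped.
        unitprice = null;
      }
    }
    // Copy the item so the original list is left untouched.
    results.push(Object.assign({}, item, { unitprice: unitprice }));
  }
  return results;
}

// Possible wiring with x-ray (assumption: callback form of the API):
//
//   var scrapeField = function (url) {
//     return new Promise(function (resolve, reject) {
//       x(url, '.preciokilo')(function (err, value) {
//         if (err) reject(err); else resolve(value);
//       });
//     });
//   };
//
//   x(listUrl, '#content > ul', [{ /* fields without unitprice */ }])(
//     function (err, items) {
//       if (err) throw err;
//       addUnitPrices(items, scrapeField).then(function (full) {
//         console.log(full);
//       });
//     });
```

Because the page fetcher is passed in as a function, the merge logic can be tried with a stub before pointing it at the real site. Items without a url, or whose product page fails to load, keep `unitprice: null` rather than aborting the whole run.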