If you want to scrape with Selenium, Puppeteer, or Playwright on sites like Idealista that are protected by DataDome, you can use Kameleo. On their own, these automation frameworks fail because, as mentioned: "Headless browsers have unique characteristics that differ from regular browsers. DataDome can detect these discrepancies and identify the browser as automated." Kameleo's browsers, Chroma and Junglefox, are designed for web scraping. They stay under the radar and blend into the crowd by providing a natural browser fingerprint. With this you will be able to scrape with Selenium+Kameleo, Puppeteer+Kameleo, or Playwright+Kameleo.
Well well, it can be easily bypassed / scraped by solving the audio challenge with any speech-to-text API. Now I can effortlessly bypass DataDome-protected sites in Puppeteer itself.
Seriously guys, there is no need to use an external service!
Interesting approach, I’d love to investigate it further. I’m proposing both commercial and open-source solutions, since people sometimes want to delegate to third parties rather than handle a complex infrastructure themselves.
Ohh that is true, I completely forgot that aspect. But day by day DataDome is updating their security. They are even using some AI stuff to monitor the way I type input: only if I made a mistake and went back to fix it (more like a real human) was I able to bypass DataDome. That was the only loophole. But now they have fixed that and the security is even stronger... Ahh, maybe now I realise why third parties take the trouble of handling these...