The Lab Archive

Real-world use-case posts about web scraping and anti-bot techniques

The Web Scraping Club
THE LAB #14: Scraping Cloudflare Protected Websites (early 2023 version)
This article is sponsored by MobileHop, your mobile IP proxy provider. MobileHop provides native mobile IPs on dedicated 4G/5G modems via Verizon and AT&T Wireless to bypass almost all website blocks. A single multihop license gives you access to 50 USA markets and growing…
THE LAB #13: Managing a fleet of scrapers with Scrapeops
This article is sponsored by Serply, the solution to scrape search engine results easily. Web Scraping Club readers can save 25% on all SERP scraping plans by using the code TWSC25. The Web Scraping Club is a free weekly newsletter about web scraping. Once every two weeks, I publish…
THE LAB #12: Reverse-engineering Mobile API
This post is sponsored by Smartproxy, the premium proxy and web scraping infrastructure focused on the best price, ease of use, and performance. In this case, for all The Web Scraping Club Readers, using the discount code WEBSCRAPINGCLUB10 you can save 10% OFF…
THE LAB #11: The Anti-Detect Anti-Bot matrix
This post is sponsored by Oxylabs, your premium proxy provider. Sponsorships help keep The Web Scraping Club free and are a way to give some value back to the readers. In this case, for all The Web Scraping Club readers, using the discount code WSC25 you can…
THE LAB #10: Bypass Cloudflare Bot Protection with GoLogin
This article is sponsored by Serply, the solution to scrape search engine results easily. Web Scraping Club readers can save 25% on all SERP scraping plans by using the code TWSC25. Cloudflare anti-bot detection: if you google “Cloudflare bypass”, you will find hundreds of articles and GitHub repositories explaining how to bypass Cloudflare (or sell a solution for doing it…
THE LAB #9: Scraping OpenSea NFT's data
The NFT hype cycle: in the past week, a scandal involving the famous influencer Logan Paul and his crypto project “CryptoZoo” exploded, thanks to the Coffeezilla investigations. Basically, it seems that this crypto game was never delivered, for multiple reasons, but people, trusting Logan's public profile, put several million USD into it, hoping for a return that never came. It’s nothing new under the crypto sun: Ponzi schemes promising impossible returns on investments are discovered every day, and surprisingly there’s always someone who gets caught in the net…
THE LAB #8: Using Bezier curves for human-like mouse movements
Here’s another post of “THE LAB”: in this series, we'll cover real-world use cases, with code and an explanation of the methodology used.
THE LAB #7: Scraping PerimeterX protected websites
Here’s another post of “THE LAB”: in this series, we'll cover real-world use cases, with code and an explanation of the methodology used.
THE LAB #6: Changing Ciphers in Scrapy to avoid bans by TLS Fingerprinting
Here’s another post of “THE LAB”: in this series, we'll cover real-world use cases, with code and an explanation of the methodology used.
The Lab #5 - Scraping Airbnb.com using GraphQL
Here’s another post of “THE LAB”: in this series, we'll cover real-world use cases, with code and an explanation of the methodology used.
THE LAB #4: Scrapyd - how to manage and schedule a fleet of scrapers
Here’s another post of “THE LAB”: in this series, we'll cover real-world use cases, with code and an explanation of the methodology used. In the future, this kind of content will be available only to paying subscribers. Being one of the first of the series, this one will be available for free until the 19th of Oct 2022, then it will be behind a paywall…
THE LAB #3: Scraping Cloudflare protected websites
Here’s another post of “THE LAB”: in this series, we'll cover real-world use cases, with code and an explanation of the methodology used. In the future, this kind of content will be available only to paying subscribers. Being one of the first of the series, this one will be available for free until the 2nd of Oct 2022, then it will be behind a paywall…
THE LAB #2: scraping data from a website with Datadome and xsrf tokens
Here’s another post of “THE LAB”: in this series, we'll cover real-world use cases, with code and an explanation of the methodology used. In the future, this kind of content will be available only to paying subscribers. Being one of the first of the series, this one will be available for free until the 22nd of Sept 2022, then it will be behind a paywall…
THE LAB #1: Scraping data from an app
This is the first post of “THE LAB”: in this series, we'll cover real-world use cases, with code and an explanation of the methodology used.