Web Scraping Tutorials

Tutorials about web scraping, Scrapy, Playwright, Selenium, and other tools

The Web Scraping Club
THE LAB #13: Managing a fleet of scrapers with Scrapeops
This article is sponsored by Serply, the solution to scrape search engine results easily. Web Scraping Club readers can save 25% on all SERP scraping plans by using the code TWSC25. The Web Scraping Club is a free weekly newsletter about web scraping. Once every two weeks, I publish…
The Web Scraping Club
Introducing the Web Scraping 101 Wiki
This article is sponsored by Serply, the solution to scrape search engine results easily. Web Scraping Club readers can save 25% on all SERP scraping plans by using the code TWSC25…
The Web Scraping Club
THE LAB #12: Reverse-engineering Mobile API
This post is sponsored by Smartproxy, the premium proxy and web scraping infrastructure focused on the best price, ease of use, and performance. In this case, all The Web Scraping Club readers can save 10% by using the discount code WEBSCRAPINGCLUB10…
The Web Scraping Club
Bypass Cloudflare with these web scraping tools
This post is sponsored by Oxylabs, your premium proxy provider. Sponsorships help keep The Web Scraping Club free, and they are a way to give some value back to the readers. In this case, by using the discount code WSC25, all The Web Scraping Club readers can…
The Web Scraping Club
The most interesting GitHub Repositories about web scraping (2023)
This post is sponsored by Oxylabs, your premium proxy provider. Sponsorships help keep The Web Scraping Club free, and they are a way to give some value back to the readers. In this case, by using the discount code WSC25, all The Web Scraping Club readers can…
The Web Scraping Club
How I've built my home made mobile proxy
This post is sponsored by Oxylabs, your premium proxy provider. Sponsorships help keep The Web Scraping Club free, and they are a way to give some value back to the readers. In this case, by using the discount code WSC25, all The Web Scraping Club readers can…
The Web Scraping Club
Web Scraping experts: Is AI stealing our job?
This post is sponsored by Oxylabs, your premium proxy provider. Sponsorships help keep The Web Scraping Club free, and they are a way to give some value back to the readers. In this case, by using the discount code WSC25, all The Web Scraping Club readers can…
The Web Scraping Club
HTTP requests in Python explained
Hi, this is Pierluigi from The Web Scraping Club, a newsletter where you can find news, insights, and tutorials with real-world examples about web scraping. Being a paying user gives you access to paid content, like the post series called “The LAB”, where we’ll dive deep into real-world cases with code …
The Web Scraping Club
The rise of antidetect browsers
Hi, this is Pierluigi from The Web Scraping Club, a newsletter where you can find news, insights, and tutorials with real-world examples of web scraping. Being a paying user gives you access to paid content, like the post series called “The LAB”, where we’ll dive deep into real-world cases with code …
The Web Scraping Club
Selenium vs Playwright, a comparison
Hi, this is Pierluigi from The Web Scraping Club, a newsletter where you can find news, insights, and tutorials with real-world examples about web scraping. Being a paying user gives you access to paid content, like the post series called “The LAB”, where we’ll dive deep into real-world cases with code …
The Web Scraping Club
The Kallax Index - Scraping Ikea websites
Hi, this is Pierluigi from The Web Scraping Club, a newsletter where you can find news, insights, and tutorials with real-world examples about web scraping. Being a paying user gives you access to paid content, like the post series called “The LAB”, where we’ll dive deep into real-world cases with code …
The Web Scraping Club
THE LAB #4: Scrapyd - how to manage and schedule a fleet of scrapers
Here’s another post of “THE LAB”: in this series, we'll cover real-world use cases, with code and an explanation of the methodology used. In the future, this kind of content will be available only to paying subscribers. As one of the first posts in the series, this one will be available for free until the 19th of Oct 2022, then it will be behind a paywall…
The Web Scraping Club
Create your first python scraper with Scrapy
Hi, this is Pierluigi from The Web Scraping Club, a newsletter where you can find news, insights, and tutorials with real-world examples about web scraping. Being a paying user gives you access to paid content, like the post series called “The LAB”, where we’ll dive deep into real-world cases with code …
The Web Scraping Club
What's a proxy server?
Hi, this is Pierluigi from The Web Scraping Club, a newsletter where you can find news, insights, and tutorials with real-world examples about web scraping. Being a paying user gives you access to paid content, like the post series called “The LAB”, where we’ll dive deep into real-world cases with code …
The Web Scraping Club
The costs of web scraping
There's no doubt that cloud computing has enabled a wide range of new opportunities in the tech space, and this is also true for web scraping. Cheap virtual machines and storage have made it possible to scale activities to a new level, allowing companies to crawl a larger number of websites at a fraction of the traditional cost…
The Web Scraping Club
Is web scraping becoming harder?
Rising costs and difficulties. Do you have the feeling that web scraping is becoming more difficult and expensive…
The Web Scraping Club
From 0 to 2 Billion Prices scraped per month
In this post of The Web Scraping Club blog, I’ll write about what we did at Databoutique.com to scale from 0 to 2 billion prices scraped per month, bootstrapped and with a minimal team of developers. I’ll describe the principles that guided development in our company and let us scale as a data provider while keeping costs and headcount low…
The Web Scraping Club
Wanted a parka and got an "Error 429: Too many requests"
In this post of The Web Scraping Club, we’ll see why some websites, when we load them for the first time, throw a 429 error before starting to work. TL;DR: it’s because of Kasada's anti-bot solution. So you want to buy a parka? Let's say you're looking for a nice parka for the coming winter season: you go to your preferred e-commerce site and from the d…
The Web Scraping Club
3 THINGS + 1 TO DO BEFORE STARTING CODING YOUR SCRAPER
Welcome to the first post of The Web Scraping Club blog. The checklist for your web scraping projects: before you start coding your scraper, a good analysis of the target website could save you a lot of time. CHECK THE TECHNOLOGY STACK OF THE TARGET WEBSITE: I usually do a double check to get a quick understanding of the website and to identify known anti-bot soluti…