The Web Scraping Club

The DMCA Was Built to Stop DVD Piracy. Google Wants to Use It Against Scrapers
How a 12-page complaint is trying to turn every CAPTCHA into a federal copyright perimeter
24 hrs ago • Pierluigi Vinciguerra
THE LAB #99: HTTP Caching for Web Scraping
How Conditional Requests Can Cut Your Proxy Bill, using HTTP caching.
Mar 5 • Pierluigi Vinciguerra
Kadoa: Simplify Your Scraping Workflows with Automation and AI
My review of Kadoa: An AI-powered tool that lets you create scraping workflows in minutes
Mar 1 • Federico Trotta
Why LLM-Ready Scrapers Return Content in Markdown: A Deep Dive
Why do all AI-ready scraping solutions produce Markdown results? Let’s find out!
Feb 22 • Antonello Zanini
THE LAB #98: Scraping Google Search Results in 2026: Device, Location, and Identity
Google does not have one set of results. It has millions. The hard part is knowing which one you are looking at.
Feb 19 • Pierluigi Vinciguerra
Most Popular
THE LAB #1: Scraping data from an app
Sep 4, 2022 • Pierluigi Vinciguerra
THE LAB #3: Scraping Cloudflare protected websites
Sep 27, 2022 • Pierluigi Vinciguerra
THE LAB #73: How to Bypass Cloudflare in 2025
Jan 23, 2025 • Pierluigi Vinciguerra
Ten years of web scraping: a personal perspective about selling web data
Mar 24, 2024 • Pierluigi Vinciguerra

Platinum Partner

NetNut: The Fastest & Most Reliable Proxy Network for Web Scrapers
Meet the Platinum Partner of The Web Scraping Club
Feb 6, 2025 • Pierluigi Vinciguerra
The Great Web Unblocker Benchmark - Cloudflare Edition
Testing different web unblockers against Indeed.com
Sep 22, 2024 • Pierluigi Vinciguerra
The Web Unblocker Cost Benchmark
Price comparison between the most well-known web unblockers on the market
Dec 24, 2023 • Pierluigi Vinciguerra

Web Scraping

How to Avoid Copyright Violations While Scraping
Discover how copyright violations can occur in web scraping and how to avoid them
Feb 15 • Federico Trotta
Gold Partners
Ping Proxies
Tailored to meet all your Static ISP, Residential and Datacenter proxy needs. A 1 GB trial is available.
Rayobyte
Rayobyte is offering a 30% discount on rack rates for residential proxies. To claim it, email sales@rayobyte.com.

AI

Build an AI Agent for Scraping and Analyzing Research Papers
Let’s build an AI agent in Python for research paper scraping and analysis.
Dec 14, 2025 • Antonello Zanini
Using NLP for Entity Extraction From Scraped Data
From theory to practice: how to extract entities from textual scraped data using NLP
Dec 7, 2025 • Federico Trotta
Using AI to Detect Patterns in Scraped Data
A practical guide on finding patterns in scraped data with advanced techniques
Nov 23, 2025 • Federico Trotta
When Browsers Start to Think: ChatGPT Atlas, Stagehand, Cursor, and the Future of Web Scraping
How recent browser integrations with LLMs are changing the way we explore and scrape the web.
Nov 2, 2025 • Pierluigi Vinciguerra
© 2026 Pierluigi · Privacy ∙ Terms ∙ Collection notice