The Web Scraping Club

Why LLM-Ready Scrapers Return Content in Markdown: A Deep Dive
Why do all AI-ready scraping solutions produce Markdown results? Let’s find out!
Feb 22 • Antonello Zanini
THE LAB #98: Scraping Google Search Results in 2026: Device, Location, and Identity
Google does not have one set of results. It has millions. The hard part is knowing which one you are looking at.
Feb 19 • Pierluigi Vinciguerra
How to Avoid Copyright Violations While Scraping
Discover how copyright violations can occur in web scraping and how to avoid them
Feb 15 • Federico Trotta
Google vs IPIDEA: Anatomy of a Residential Proxy Takedown
Google Took Down 16 Million Proxy IPs. Here is Why It Will Not Be Enough.
Feb 8 • Pierluigi Vinciguerra
THE LAB #97: My first week with OpenClaw
160,000 Stars in Two Months: What OpenClaw Means for Scrapers
Feb 5 • Pierluigi Vinciguerra
Most Popular
THE LAB #1: Scraping data from an app
Sep 4, 2022 • Pierluigi Vinciguerra
THE LAB #3: Scraping Cloudflare protected websites
Sep 27, 2022 • Pierluigi Vinciguerra
THE LAB #73: How to Bypass Cloudflare in 2025
Jan 23, 2025 • Pierluigi Vinciguerra
Ten years of web scraping: a personal perspective about selling web data
Mar 24, 2024 • Pierluigi Vinciguerra

Platinum Partner

NetNut: The Fastest & Most Reliable Proxy Network for Web Scrapers
Meet the Platinum Partner of The Web Scraping Club
Feb 6, 2025 • Pierluigi Vinciguerra
The Great Web Unblocker Benchmark - Cloudflare Edition
Testing different web unblockers against Indeed.com
Sep 22, 2024 • Pierluigi Vinciguerra
The Web Unblocker Cost Benchmark
Price comparison between the most well-known web unblockers on the market
Dec 24, 2023 • Pierluigi Vinciguerra

Web Scraping

WebDriver vs Chrome DevTools Protocol (CDP) vs WebDriver BiDi: How We Control Browsers
Do you know how browser automation libraries actually manage to control browsers? Let’s find out!
Feb 1 • Antonello Zanini
Gold Partners
Ping Proxies
Tailored to meet all your Static ISP, Residential, and Datacenter proxy needs. 1 GB trial available.
Rayobyte
Rayobyte is offering a 30% discount on rack rates for residential proxies. Email sales@rayobyte.com to claim it.

AI

Build an AI Agent for Scraping and Analyzing Research Papers
Let’s build an AI agent in Python for research paper scraping and analysis.
Dec 14, 2025 • Antonello Zanini
Using NLP for Entity Extraction From Scraped Data
From theory to practice: how to extract entities from textual scraped data using NLP
Dec 7, 2025 • Federico Trotta
Using AI to Detect Patterns in Scraped Data
A practical guide on finding patterns in scraped data with advanced techniques
Nov 23, 2025 • Federico Trotta
When Browsers Start to Think: ChatGPT Atlas, Stagehand, Cursor, and the Future of Web Scraping
How recent browser integrations with LLMs are changing the way we explore and scrape the web.
Nov 2, 2025 • Pierluigi Vinciguerra
© 2026 Pierluigi