Sitemap - 2024 - The Web Scraping Club

The Web Scraping Club 2024 Wrap

THE LAB #71: Sending Scrapy logs to RabbitMQ

How We Scraped Global Hotel Data to Track Economic Trends

How to Parse JSON with Python: A Beginner-Friendly Guide

THE LAB #70: Advanced logging in Scrapy

How Scraping the Web Became an Expensive Business

Optimizing Proxy Usage for Large-Scale Scraping

THE LAB #69: Building a dashboard for your scrapers with Grafana

Scraping The Inflation

Making Playwright scrapers undetected with open source solutions

THE LAB #68: Scheduling Scrapers with Airflow

How to start with Scrapy and Playwright - Part 2

THE LAB #67: Scraping Telegram using its APIs

How to start with Scrapy and Playwright - Part 1

AI and data: different faces of the same coin

THE LAB #66: How to properly scrape a booking website

Building a generic scraper for multiple websites

HTTP Toolkit, your best friend for network inspection

THE LAB #65: Scraping Datadome protected websites with Camoufox

The Zyte's Extract Summit 2024 Wrap up

THE LAB #64: JWT Tokens and API scraping

Is web scraping a profitable industry?

Building a custom GPT using Firecrawl

THE LAB #63: Oxymouse and Playwright for human-like mouse movements

The Oxycon 2024 wrap up

THE LAB #62: Bypassing Cloudflare with Nodriver

The Great Web Unblocker Benchmark - Cloudflare Edition

Proxy Pricing Playbook - September 2024

THE LAB #61: Evaluating your proxy provider

The AI-Powered web scraping tools landscape

THE LAB #60: Writing scrapers with LLMs

Open source Python libraries for your web scraping projects

The Web Scraping Club season 3!

The Lab #59: Bypassing certificate pinning with Frida and Fiddler - part 2

Two years of The Web Scraping Club

The Lab #58: Intercepting traffic from an App - part 1

Web Scraping Idealista and Bypass Idealista Blockers

The importance of scraping inventory levels data in the retail industry

Scrape like a pro... but not like an AI company

The Lab #57: Improving your Playwright scraper and avoid CDP detection

How to Scrape E-Commerce Websites With Python

Scraping Cloudflare websites using an API

Scraping Insights - A video interview series by The Web Scraping Club - Join us

Google has exclusive access to a browser API

The Lab #56: Bypassing PerimeterX 3

Legal Zyte-geist #5: The X vs Bright Data case

Web scraping and journalism: the Chiara Ferragni case

The Lab #55: Checking your browser fingerprint

Testing the new Botasaurus 4

How LLMs are affecting the costs of web scraping

The Lab #54: Scraping from Algolia APIs

The Great Web Unblocker Benchmark: Kasada edition

Analyzing the cost of a web scraping project

No-Code Web Scraping with Make.com

The Lab #53: Bypassing AWS WAF

The Anti-Detect Browser Royal Rumble - updated with notes

About LLMs, AI and Web Scraping

The Lab #52: Scraping with LLMs and ScrapeGraphAi - part 1

Legal Zyte-geist #4: Overview of the EU AI Act

Web Scraping from 0 to hero: kickstart your career in web scraping

Web Scraping and Coding: Five Programming Languages to Check Out

Scraping Akamai-protected websites with Scrapy

The Lab #51: APIs with Bearer Token

Web Scraping from 0 to hero: data cleaning processes

Celebrating the 50th article of The Lab series

The state of public web data in 2024

The Lab #49: Bypassing Cloudflare with open source repositories

Web Scraping from 0 to hero: XPATH and CSS Selectors in Web Scraping

The Anti-Detect Browser Royal Rumble

The Web Data Landscape Map

How Can Multi-Accounting Browsers Help with Web Scraping?

Web Scraping from 0 to hero: Everything about proxies

The Lab #48: Scraping with AWS Lambda

What is a web unblocker and how does it work?

The Lab #47: Scraping real time data with Python

How to Improve the Performance of Puppeteer Stealth Evasions

Easter egg: ScrapeCon 2024

Web Scraping from 0 to hero: Why my scraper is getting blocked?

The Lab #46: Fingerprint injection in Playwright

Ten years of web scraping: a personal perspective about selling web data

THE LAB #45: Bypassing Geo-fencing While Scraping

The Great Web Unblocker Benchmark: March 2024

Web Scraping from 0 to hero: Our first scraper with Selenium

What is a residential proxy?

Botasaurus: an anti-ban web scraping framework

The Lab #44: Scraping the dark web

Behind the scenes of anti-detect browsers

Web Scraping from 0 to hero: Selenium

The Lab #43: Scraping inventory data: why, how and where

Is Web Scraping Dead?

The Lab #42: Bypassing PerimeterX without a browser automation tool

Web Scraping from 0 to hero: tips and tricks for Microsoft Playwright

The Lab #41: Scrapoxy, the super proxy aggregator

Legal Zyte-geist #3: What the court’s ruling in the Meta v Bright Data case really means for web scrapers

The latest papers about browser fingerprinting

The Lab #40: start a web data monetization project with Data Boutique

Web Scraping from 0 to hero: our first scraper with Microsoft Playwright

The Lab #39: Mouse movements in Playwright

How scraping a single website cost thousands of dollars in proxy

The Lab #38: Bypassing Kasada for web scraping 2024 edition

Web scraping from 0 to hero: Microsoft Playwright

The Lab #37: Bypassing Cloudflare with anti-detect browsers - Part 2

Monetize your web scraping skills

The Lab #36: Bypassing Cloudflare with anti-detect browsers

Legal Zyte-geist #2: Web Scraping and AI 2023 Legal Wrap-Up

Web scraping from 0 to hero: creating our first Scrapy spider - Part 2

What to expect from The Lab posts in 2024
