Crawl of Fame - June 2026
What ISP proxies are, how they get sourced, and how to spawn browsers for cheap
Episode #0 of a new series of posts in which I share the most interesting content I’ve read lately about web scraping and everything related to it.
Before proceeding, let me thank NetNut, the platinum partner of the month. Their set of solutions covers all your scraping needs.
IP sourcing in the proxy industry
Two articles written by Spur Intelligence caught my attention in the past week.
It started with a webinar I attended, where they explained what ISP proxies are and how proxy companies get those IPs.
The analysis is great, and it gave me a better understanding of how the proxy industry works.
They also covered how NetNut gets its high-quality ISP IP addresses, which they describe in detail in this article.
How Proxy Providers Co-opt Entire Networks
Spur reverse-engineers how NetNut sources its ISP proxies through a partner called DiviNetworks, which installs GRE tunnels and policy-based routing directly on the border routers of real ISPs. The result is genuine ISP IP space sold as proxy inventory, with actual subscribers still living on the same addresses. The piece is worth reading for the detection detail: outbound connections sit in the TCP source port range 40,000 to 40,200, partner routers expose an rtr-<isp>.divinetworks.com passive-DNS pattern, and the Howard University case shows an entire /16 (AS919, around 17,000 exit IPs) co-opted at the network edge. They even quote DiviNetworks’ own figure of $13,208 a month for a US /16.
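To make those two indicators concrete, here is a minimal Python sketch of how I would check them, plus the arithmetic behind the /16 figures. It is my own illustration, not Spur's tooling: the function names are hypothetical, the inclusive port bounds are an assumption (the article only says 40,000 to 40,200), and so is the character set I allow for the <isp> label.

```python
import re

# Assumption: both ends of the reported source port range are inclusive.
REPORTED_SRC_PORTS = range(40_000, 40_201)

# Assumption: the <isp> label is made of letters, digits, and hyphens.
RTR_HOSTNAME = re.compile(r"^rtr-[a-z0-9-]+\.divinetworks\.com\.?$", re.IGNORECASE)


def port_in_reported_range(src_port: int) -> bool:
    """True if a TCP source port falls in the range reported in the article."""
    return src_port in REPORTED_SRC_PORTS


def matches_router_pattern(hostname: str) -> bool:
    """True if a passive-DNS hostname looks like rtr-<isp>.divinetworks.com."""
    return RTR_HOSTNAME.match(hostname.strip()) is not None


def addresses_in_prefix(prefix_len: int) -> int:
    """Number of IPv4 addresses covered by a prefix of the given length."""
    return 2 ** (32 - prefix_len)


total = addresses_in_prefix(16)                    # 65,536 addresses in a /16
print(port_in_reported_range(40_117))              # inside the reported range
print(matches_router_pattern("rtr-exampleisp.divinetworks.com"))  # made-up hostname
print(round(13_208 / total, 4))                    # quoted monthly price per address
print(round(17_000 / total, 3))                    # share of the /16 seen as exit IPs
```

Taken at face value, the quoted price works out to roughly $0.20 per address per month, and the 17,000 exit IPs are about 26% of the /16. Either indicator alone is weak evidence (plenty of ordinary traffic uses ports in that range), so I would only treat them as signals to combine with other data.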
Full article here
Smart TV Apps and Residential Proxy SDKs
A more recent article looks at how proxy companies source residential proxies. It was no surprise, at least to me, to read that Smart TVs are used as exit nodes for proxies. Just like with mobile apps, developers can use SDKs from proxy providers to monetize their app installations. In some cases, the consent screen makes it quite clear what’s happening under the hood.
All the industries that rely on web-scraped data (AI first and foremost, but not only) need residential proxies, so proxy providers, who are the gatekeepers for these tools, should be very careful to onboard only users with legitimate use cases and to keep fraudsters away. In my career, I’ve worked with almost every big name out there, and I’ve found that they are. As scraping professionals, we should always remember that we’re guests (sometimes unwanted) on both the target website and the proxy infrastructure, so we should be as respectful as possible and not let greed drive our data collection.
Full article here
Scraping Infrastructure
A great article by the engineering team at Browser Use shows how they’re spawning browsers cheaply and quickly.
How We Made Cloud Browsers 3x Cheaper and Faster
Browser Use walks through rebuilding its cloud so every session is its own Firecracker microVM, and the interesting twist is that they run Firecracker on plain EC2 instead of bare metal, accepting nested virtualization to get faster scale-up and lower cost.
The numbers are great: $0.02 per browser hour down from $0.06, VM cold start under 400ms, and create latency of 825ms at p50 across a 10,000-session test.
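For readers who want to plug in their own volumes, here is a small Python sketch of the arithmetic behind those figures and of what "p50" means. The two prices come from the article; the workload size and the latency samples are invented for illustration, and the helper names are mine.

```python
import math

# Prices from the article, kept in integer cents to avoid float rounding noise.
OLD_CENTS_PER_BROWSER_HOUR = 6
NEW_CENTS_PER_BROWSER_HOUR = 2


def cost_ratio(old_cents: int, new_cents: int) -> float:
    """How many times cheaper the new price is than the old one."""
    return old_cents / new_cents


def monthly_savings_usd(browser_hours: int) -> float:
    """Dollars saved per month for a given number of browser hours."""
    saved_cents = browser_hours * (OLD_CENTS_PER_BROWSER_HOUR - NEW_CENTS_PER_BROWSER_HOUR)
    return saved_cents / 100


def percentile(samples: list[float], pct: int) -> float:
    """Nearest-rank percentile: the smallest sample with at least pct% of samples at or below it."""
    if not samples:
        raise ValueError("samples must not be empty")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[rank - 1]


print(cost_ratio(OLD_CENTS_PER_BROWSER_HOUR, NEW_CENTS_PER_BROWSER_HOUR))  # 3.0
print(monthly_savings_usd(50_000))          # hypothetical workload of 50,000 browser hours
print(percentile([900, 600, 1500, 700, 800], 50))  # made-up create latencies in ms
```

The ratio is where the "3x cheaper" in the title comes from. A p50 of 825ms means half of the sessions in the test were created in 825ms or less; it says nothing about the slow tail, which is why p95 or p99 are worth asking for too.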
Several engineering details make the difference, from mapping memory in 2MB pages plus a userfaultfd handler that preloads hot pages (resume-to-ready drops from 9.8s to 3.1s, with roughly 91x fewer page-fault stalls) to two-phase vCPU pinning that took a 1,000-browser launch from 17% failed sessions to zero. They also make the case for running fully headless rather than headful, since their low-level Chromium fork pushes block-avoidance to 81% on their own benchmark.
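A quick back-of-the-envelope sketch in Python helps me see why page size matters here. It assumes the baseline is the standard 4KB x86-64 page and uses a hypothetical 2GB guest, neither of which is stated in the summary above; the 9.8s, 3.1s, 17%, and 1,000 figures are the article's.

```python
KIB = 1024
MIB = 1024 * KIB
GIB = 1024 * MIB


def pages_needed(mem_bytes: int, page_bytes: int) -> int:
    """Number of pages required to cover a memory region (rounded up)."""
    return -(-mem_bytes // page_bytes)


def speedup(before_s: float, after_s: float) -> float:
    """How many times faster the 'after' measurement is."""
    return before_s / after_s


guest_memory = 2 * GIB                                # hypothetical guest size
small_pages = pages_needed(guest_memory, 4 * KIB)     # assumed 4KB baseline
huge_pages = pages_needed(guest_memory, 2 * MIB)      # 2MB pages, as in the article

print(small_pages, huge_pages, small_pages // huge_pages)  # 524288 1024 512
print(round(speedup(9.8, 3.1), 2))                         # resume-to-ready speedup
print(round(1_000 * 17 / 100))                             # failed sessions before pinning
```

Each 2MB page covers what would otherwise be 512 separate 4KB pages, so there are far fewer units to fault in on resume. That 512 is only a ceiling on the reduction in page count, not the measured figure: the 91x in the article is Browser Use's own number for stalls. The resume time itself improves by about 3.2x, and 17% of a 1,000-browser launch means roughly 170 sessions were failing before the pinning change.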
Read and share more of this
Want to flag and discover more articles like the ones above? Scraping News (still in beta) is where the community surfaces them. Sign up and submit what you find worth reading.
Want a concept map of everything covered here, cross-referenced with our own work and other sources worth your time?
The Web Scraping Club Wiki lives on GitHub and as an interactive site. It is plain Markdown, so you can also clone it into your own Obsidian vault and read it locally.


