The Web Scraping Club

The Web Scraping Club

THE LAB #105: If LLMs Can Bypass CAPTCHAs, Are CAPTCHA Solver Services Cooked?

Bypassing hCaptcha in the AI era

Pierluigi Vinciguerra's avatar
Pierluigi Vinciguerra
May 29, 2026
∙ Paid

Think back to 2023. The story about large language models was that they would automate most work involving reading or writing, and a good slice of the work that means looking at a screen the way you do. We bought that story too. We even wrote Are CAPTCHAs still a thing? that August, reporting on the ETH Zurich paper that claimed AI bots beat humans at reCAPTCHA v2 image challenges by roughly 15%. CAPTCHAs were on the long list of things LLMs were supposed to make irrelevant on the road to general agency.

Before proceeding, let me thank NetNut, the platinum partner of the month. Their set of solutions cover all your needs for scraping.

Visit Netnut


What the 2023 LLM hype promised, and what 2026 actually shipped

The product category that took that promise most literally is the agentic browser. A real Chromium running under an LLM that reads the page, decides what to do, and clicks. Browserbase, Hyperbrowser, Skyvern, Browser Use, BrowserOS, Owl Browser, and a long tail of proxy companies rebranding their scraping browsers as “AI-powered”. The pitch in 2024 and 2025 never changed. You hand the agent a task in plain language, and the model handles the rest, including whatever defensive challenge the site throws back.

It is 2026 now, and that prediction has not played out. CAPTCHAs are still in your pipeline. So we went looking for the answer. Can an LLM actually solve a production-grade CAPTCHA like hCaptcha? We read the code where we could, checked the default configs and the docs, and probed the public surfaces of the solver services these products lean on. The picture is consistent, and it is not the one the marketing sells.

Every major agentic browser ships a CAPTCHA-solving bullet point on its site. Open the code of the open-source agents, though, and you find a different story. Almost none of them actually use an LLM to solve a CAPTCHA. They either refuse to try, or they try and fail.


For your scraping needs, having a reliable proxy provider like Decodo on your side improves the chances of success.

Try Decodo Now


One thing to get straight before we open any code. The two families of CAPTCHA behave very differently, and the hype lands on them unevenly. The invisible ones, reCAPTCHA v3 and Cloudflare Turnstile, score your session in the background and rarely show a puzzle. A stealth-first browser on clean residential proxies usually walks past them without anyone seeing a challenge, which we covered for Turnstile in Cloudflare Turnstile: what is that and how it works? and THE LAB #73: How to Bypass Cloudflare in 2025. The visible image challenges, hCaptcha and reCAPTCHA v2, actually demand an answer. We went deep on reCAPTCHA v2 in Bypassing reCAPTCHAs With Open Source and Commercial Tools - Part 2. The one that matters in 2026 is hCaptcha, and that is where this article lives, because it is where most scrapers break.

So here is the question that drives everything below. When an agentic browser says it “solves” an hCaptcha, what does its code actually do?

Tool landscape

What’s the commercial offer today

There are four families of strategy in the commercial set. In none of them is the LLM in the agent layer doing the CAPTCHA work.

The first family is stealth-first. Residential proxies, fingerprint shaping, and request patterning lower the bot score so the visible challenge never triggers. The CAPTCHA is not solved. It is prevented from appearing. That gives you the cleanest legal posture in the set, because no automated solving is happening. ZenRows is the example.

The second family is the opposite of stealth. Instead of hiding that the traffic is automated, the vendor declares it openly and relies on a business arrangement with the CAPTCHA providers to be let through. Browserbase is explicit about this. Its Stealth Mode documentation says “through Browserbase’s partnerships with CAPTCHA providers, Browserbase can resolve challenges automatically so your sessions continue without interruption”, with solving “enabled by default for all sessions”. This is the verified-bot path, the same idea behind Cloudflare’s Web Bot Auth. A declared, allowlisted agent rather than a disguised one. No model solves a puzzle on either side. The provider recognizes the partner and waves it through. For the common challenge types, according to their documentation, this works without a fight.

The third family pairs a proprietary solver with documented third-party integrations. The vendor ships its own solving for some challenge types. For the rest, it documents how to wire in an external solver service. The external solver watches for the challenge and returns the response token through its extension or REST API. The agent then submits. Hyperbrowser and Skyvern Cloud both present this shape, a native or closed-source component plus a documented third-party path. Hyperbrowser advertises “Native Cloudflare Turnstile & CAPTCHA Solving” with “No external plugins” in its post on native CAPTCHA solving, then points to an external solver for the challenge types the native one does not cover.


Start your scraping journey with Byteful: 10GB New Customer Trial | Use TWSC for 15% OFF | $1.75/GB Residential Data | ISP Proxies in 15+ Countries

Claim your 10GB here


The fourth family runs an in-house solver alongside the agent. Could be a vision LLM, could be classical computer vision, could be something else. No third party is in the critical path. The vendor owns the whole stack. Bright Data, Oxylabs, and Owl Browser sit here. Owl Browser puts hard numbers on the claim: “detect and automatically solve reCAPTCHA v2 (1.2s), hCaptcha (0.8s), Turnstile (0.3s), and image CAPTCHAs.” Bright Data sells “AI-based unlocking logic” that handles “CAPTCHA solving, fingerprinting, retries, best headers, location and more” on its Web Unlocker page.

The table below maps each vendor to the mechanism its public documentation surfaces, with the source page that proves it.


Check the TWSC YouTube Channel


The open-source agents we can read

For the proprietary CAPTCHA-solving strategies we cannot see how AI is used. For the open-source ones we can. So we opened their code and read exactly how they handle CAPTCHAs. Three projects matter here.

browser-use is the most popular open-source LLM agent framework for browser automation. MIT-licensed, vendor-agnostic on the LLM side. The repo contains no LLM-driven CAPTCHA-solving logic. The one CAPTCHA-related file it ships, captcha_watchdog.py, does not solve anything either. It waits for a solver running in the BrowserUse cloud proxy and blocks the agent loop until that solver reports back. Run the library locally with your own model and no cloud proxy, and the watchdog has nothing to wait for. That makes browser-use the cleanest test of the claim that a local LLM agent solves the CAPTCHA by reading the page.

Skyvern OSS is the open core of the Skyvern Cloud product. AGPL-3.0, focused on form-filling and structured workflows, written in Python.

BrowserOS is a YC S24 open-source agent-driven browser. AGPL-3.0, 11k stars on GitHub, active development. It pairs a Chromium fork with an integrated agent runtime.

Modeling hCaptcha and reading what the open-source agents actually do

Before testing, it helps to model the target. hCaptcha embeds a widget on the host page through a script served from hcaptcha.com.

The widget renders inside an iframe whose origin is hcaptcha.com, cross-origin to the host page. When the user clicks the “I am human” checkbox, the widget decides whether to issue a challenge. If it does, a second iframe opens with the puzzle dialog. The puzzle layout varies between runs (3x3 grid, 4x3 grid, area-select with a single click on an image, bounding-box, multiple-choice prompt). When the puzzle is solved, the widget writes a response token to a hidden textarea[name="h-captcha-response"] on the host page. The host form reads that textarea on submit and posts the token along with the rest of the data. The whole solve interaction happens inside a frame the host page cannot script. The same-origin policy boundary blocks it.

That last detail decides what works and what does not. An agent driving the host page through Playwright or CDP has full control over the outer page. Inside the hcaptcha.com frame, its control is limited. A browser extension runs with cross-origin privileges. It can both observe and click inside the widget frame. That asymmetry explains most of what follows.

We started by reading three open-source agent repos to see what each one had decided to do at this boundary.

Skyvern OSS bails to the human

Skyvern’s README is candid about the scope of the OSS release: “All of the core logic powering Skyvern is available in this open source repository licensed under the AGPL-3.0 License, with the exception of anti-bot measures available in our managed cloud offering.” That single sentence puts the CAPTCHA section of every Skyvern Cloud feature page outside the repo you can read.

The repo confirms the framing. In agent_functions.py, lines 800-804, the auto_solve_captchas helper returns False unconditionally. In handler.py, lines 1149-1162, handle_solve_captcha_action does exactly one thing of substance.

async def handle_solve_captcha_action(
    action: actions.SolveCaptchaAction,
    page: Page,
    scraped_page: ScrapedPage,
    task: Task,
    step: Step,
) -> list[ActionResult]:
    LOG.warning(
        "Please solve the captcha on the page, you have 30 seconds",
        action=action,
    )
    await asyncio.sleep(30)
    return [ActionSuccess()]

Thirty seconds of asyncio.sleep and a log message asking a human to handle it. Then a success result, regardless of what the human actually did. The script-generation path in skyvern_page.py, lines 1252-1254 is even more direct. solve_captcha raises NotImplementedError. A Skyvern user opened issue #1117 asking how CAPTCHA solving was meant to work in OSS. A maintainer answered plainly: “We haven’t open sourced anything related to our captcha solver / anti-bot measures. We don’t want people abusing these things, so they must remain closed source unfortunately.”

The Skyvern OSS code does not pretend to solve CAPTCHAs. Whatever Skyvern Cloud does on top of this, the OSS release hands the problem to a human and moves on.

As always, the code that will use to bypass hCaptcha can be found in our GitHub repository reserved to paying users, inside the folder 105.HCAPTCHA.

User's avatar

Continue reading this post for free, courtesy of Pierluigi Vinciguerra.

Or purchase a paid subscription.
© 2026 The Web Scraping Club SRL · Privacy ∙ Terms ∙ Collection notice
Start your SubstackGet the app
Substack is the home for great culture