The Web Scraping Club

THE LAB #106: Is Camoufox still effective, and do the forks help?

The project moved to CloverLabs and the fork tree keeps growing. We read the code and ran four builds against DataDome to see what still works.

Pierluigi Vinciguerra
Jun 04, 2026

Camoufox has been our default anti-detect browser for more than a year. We said so in THE LAB #73: How to Bypass Cloudflare in 2025, and again when we put it on the level of a commercial product in the Kasada article. Lately, that confidence has started to decline. In hallway conversations at PragueCrawl, more than one person told us the same thing we had started to feel. Camoufox does not pass the harder targets the way it used to.

Before proceeding, let me thank NetNut, the platinum partner of the month. Their set of solutions covers all your scraping needs.


Part of that is the cat-and-mouse game every stealth tool plays. Part of it is specific to open source. When the entire fingerprint-spoofing codebase is public, the anti-bot vendors can read it line by line and build the exact counter-signal. We made that argument in the rayobrowse review. The openness that made Camoufox popular is the same openness that let the anti-bot giants study it and catch up.

Two things changed in 2026 that make this worth a fresh look. First, the project moved. The repository at github.com/daijro/camoufox now carries a note at the top of its README:

Browser development is active at github.com/CloverLabsAI/camoufox and github.com/VulpineOS/VulpineOS. This repo is being used to merge checkpoint releases and should be used as the source of truth.

Clover Labs is a Toronto venture studio building AI agents, listed among the project sponsors. The alpha features (per-context fingerprints, hardware spoofing) now ship first in their cloverlabs-camoufox package, and daijro’s repo became the checkpoint mirror. This is not an abandoned project; the main maintainers changed.

Second, that public repo has more than 750 forks. Open source means that when one person stops, others can pick up the work, add features on features, and keep the chase going in parallel. So the real question is not only “is Camoufox still effective”, it is “has anyone in the fork tree built something better”. That is what we set out to find in this article.



The forks we actually tested

We pulled the fork list from the GitHub API and sorted it by recent pushes. Most of it is noise. Many forks share the exact pushed_at timestamp of the parent, which is the signature of mirror bots that never wrote a line of their own. Once you count how many commits each fork is ahead of daijro:main and read what those commits do, the field collapses to a handful. Many of the survivors only touch CI or rebrand the binary. Three of them touch the anti-detect surface for real.
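The filtering described above can be sketched in a few lines. This is a minimal illustration, not the script used for this article: the fork names, timestamps, and ahead_by numbers are made up, and we assume the records carry the full_name and pushed_at fields that GitHub's fork listing returns, with ahead_by taken from GitHub's compare endpoint.

```python
# Sample records mimic fields GitHub returns for a fork. All names,
# timestamps and ahead_by values below are hypothetical.
PARENT_PUSHED_AT = "2026-05-20T10:00:00Z"

forks = [
    {"full_name": "mirror-bot/camoufox", "pushed_at": "2026-05-20T10:00:00Z"},
    {"full_name": "alice/camoufox", "pushed_at": "2026-05-28T08:30:00Z"},
    {"full_name": "bob/camoufox", "pushed_at": "2026-04-02T17:45:00Z"},
]

# Commits ahead of the parent's main branch, keyed by fork name.
ahead_by = {"alice/camoufox": 12, "bob/camoufox": 0}


def shortlist(forks, parent_pushed_at, ahead_by):
    """Drop mirrors and forks with no commits of their own, newest push first."""
    # A fork sharing the parent's exact pushed_at is treated as a mirror.
    candidates = [f for f in forks if f["pushed_at"] != parent_pushed_at]
    # Keep only forks that are ahead of the parent by at least one commit.
    candidates = [f for f in candidates if ahead_by.get(f["full_name"], 0) > 0]
    # ISO 8601 timestamps in the same format sort correctly as strings.
    return sorted(candidates, key=lambda f: f["pushed_at"], reverse=True)


result = [f["full_name"] for f in shortlist(forks, PARENT_PUSHED_AT, ahead_by)]
print(result)
```

The shortlist only tells you which forks wrote code. Deciding whether that code touches the anti-detect surface still means reading the commits.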

Official Camoufox (github.com/daijro/camoufox) is the baseline. A custom Firefox build with a fingerprint database and stealth patches, driven through Playwright’s Juggler protocol. We covered how it hides Playwright’s own traces in THE LAB #65, so we will not repeat that here.

camoufox-reverse (github.com/WhiteNightShadow/camoufox-reverse) goes the other way. Instead of hiding harder, it adds a PropertyTracer at the SpiderMonkey engine layer that records which DOM properties a page reads. It is an instrument for watching the detector work, not a better scraper. That makes it the most useful tool in the set for understanding what we are up against.

LeooNic/camoufox (github.com/LeooNic/camoufox) is the most ambitious on paper. Its commits add content-aware canvas noise that claims to defeat a 2025 academic pixel-recovery attack, a sigma-lognormal humanized mouse engine, and RDPBrowser, an automation path that drives Firefox over the Remote Debugging Protocol instead of Juggler.

JWriter20/camoufox (github.com/JWriter20/camoufox) is the pragmatic one. Targeted stealth fixes, the headline being a closed WebRTC IP leak under a proxy on Firefox 146 (daijro issue #538), plus a real pytest suite, which none of the others ship.

Let’s start by using camoufox-reverse to see what the DataDome deployment on leboncoin.fr actually reads.

What DataDome reads, watched from inside the engine

Before testing who passes, we wanted to see what the detector looks at. We have explained the three detection layers before: behavioral, browser, and HTTP, in THE LAB #6. camoufox-reverse lets us watch the browser layer from below the JavaScript, which is a view we have never had in these pages.

The PropertyTracer is documented to be enabled via a config flag. We drove the macOS arm64 build directly with Playwright, set the trace config through the CAMOU_CONFIG environment variable, and pointed it at a DataDome-protected page. Our target throughout this article is leboncoin.fr, the French classifieds site, because it runs only DataDome. That isolates the signal we care about, with no second anti-bot muddying the result.

The full probe is in code/camoufox_fork_analysis/trace_datadome.py. The core of it sets the trace and lets DataDome’s script run:

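import json
import os
from pathlib import Path

LOG_DIR = Path("trace_logs")  # assumed location; the full probe defines its own

# Enable the engine-level PropertyTracer through the Camoufox config.
config = {
    "propertyTrace": {
        "enabled": True,
        "logDir": str(LOG_DIR),
        "objects": [],            # empty = trace all covered getters
        "maxEventsPerSession": 200000,
    }
}
# Hand the config to the browser process through its environment.
env = os.environ.copy()
env["CAMOU_CONFIG"] = json.dumps(config)
env["MOZ_DISABLE_CONTENT_SANDBOX"] = "1"  # required on macOS for the tracer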


The tracer writes one JSON line per getter access, each shaped like {"o": "navigator", "p": "hardwareConcurrency", ...}. Loading the leboncoin homepage produced 140 engine-level reads across 30 distinct properties. Aggregated by object and property, the access pattern maps onto the surfaces a browser fingerprint is built from:

 COUNT  PROPERTY
    14  window.outerWidth
    13  window.devicePixelRatio
    13  window.outerHeight
    13  navigator.plugins.indexedGetter
     9  navigator.hardwareConcurrency
     7  canvas.toDataURL
     6  window.innerWidth
     6  screen.rect
     4  navigator.platform
     4  navigator.userAgent
     4  webgl.getParameter
     4  canvas2d.getImageData
     3  navigator.maxTouchPoints
     2  offscreenCanvas.getContext
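The aggregation behind a table like the one above is simple to reproduce. The sketch below assumes only the two keys shown in the tracer output, "o" for the object and "p" for the property, and ignores any other field. The sample lines are made up for illustration and are not taken from our logs.

```python
import json
from collections import Counter


def aggregate_trace(lines):
    """Count getter accesses per object.property from JSON-lines input."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines in the log
        event = json.loads(line)
        counts[f'{event["o"]}.{event["p"]}'] += 1
    return counts


# Hypothetical sample; a real run would iterate over the file in logDir.
sample = [
    '{"o": "navigator", "p": "hardwareConcurrency"}',
    '{"o": "window", "p": "outerWidth"}',
    '{"o": "window", "p": "outerWidth"}',
    '{"o": "canvas", "p": "toDataURL"}',
]

counts = aggregate_trace(sample)
print(" COUNT  PROPERTY")
for name, n in counts.most_common():
    print(f"{n:6d}  {name}")
```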


This is happening entirely below the JavaScript layer. From the page’s point of view, nothing was instrumented, because the recording lives in the C++ getter, not in a JavaScript proxy. DataDome reads the screen geometry, the navigator core, the plugin and mime enumeration, and then it reaches for the canvas and WebGL. Both canvas.toDataURL and canvas2d.getImageData are in the list, alongside webgl.getParameter and offscreenCanvas.getContext.

That last detail is what connects this experiment to the rest of the article. The canvas readback is exactly the surface LeooNic’s content-aware noise patch sets out to protect, and the WebRTC and screen reads are where the other forks claim improvements. We now know the detector touches it all.

The homepage is the light version. The pages that hold the data are watched far more closely, and the tracer shows it. We pointed the same probe at a car listing (a leboncoin /ad/voitures/ URL). Those pages block direct connections, so this run went through a residential proxy, which is the setup we explain in the next section. The listing loaded its real content (the page title came back as “Alfa romeo Tonale 1.5 Ibrida 175ch Veloce TCT”), so we were tracing a passing ad page, not a challenge screen. The read pattern is a different animal: 584 engine-level reads across 35 properties, against 140 across 30 on the homepage.

 COUNT  PROPERTY (ad page)
   220  document.cookie.get
    47  window.innerWidth
    30  window.innerHeight
    26  navigator.plugins.indexedGetter
    26  screen.rect
    25  sessionStorage.setItem
    22  sessionStorage.getItem
    16  document.cookie.set
    12  performance.timing
     8  window.scrollY
     7  canvas.toDataURL
     6  webgl.getParameter
     4  canvas2d.getImageData
     3  navigator.globalPrivacyControl
     1  mediaDevices.enumerateDevices

The cookie reads jump from one on the homepage to 220 on the ad page. Session storage, which the homepage barely touched, is read and written dozens of times. New surfaces appear that the homepage never queried: window.scrollY for behavior, navigator.globalPrivacyControl, and mediaDevices.enumerateDevices. The canvas and WebGL reads are still there. This is the same DataDome, running a heavier script on the page that matters. It is the concrete reason a clean browser gets through the homepage but not the listings. It also tells you where to spend your effort. The protection you have to beat lives on the content pages, not the landing page.

Setting up a fair comparison

The shared virtual environment we use already had camoufox 0.4.11 installed, which fetches the official Firefox 135 build. We ran on an Apple M2 Max, so we pulled the macOS arm64 binaries for each build, signed them ad hoc (the cross-compiled bundles need it), and pointed the same launcher at each one with executable_path.

Two version details matter for fairness. JWriter20’s WebRTC fix targets a regression introduced in Firefox 146, so we did not compare it against the 135 cache. We pulled the official v146-hardware build (Firefox 146.0.1) as the baseline and JWriter20’s own 146.0.1 build as the patched version. Same Firefox, two builds. camoufox-reverse only ships at 135, which is fine because we used it only as a tracer, not as a contender.

Every test drives the binaries the way a real user would, through the camoufox launcher with proxy and geoip set, so the fingerprint database, the locale coherence, and the stealth patches are all active. The one exception is the WebRTC probe, explained below, where the page the probe runs on matters.

The WebRTC leak that JWriter20 actually fixes

JWriter20’s headline fix is a closed WebRTC IP leak under a proxy. We checked it on the official 146 build against the JWriter20 146 build, same launcher, same Bright Data proxy, geoip=True. The probe gathers ICE candidates from a STUN server and reports any IP that escapes (webrtc_leak_test.py).

A quick detour on what those candidates are, because the whole leak lives in them. WebRTC connects two peers directly, and to do that each side has to advertise every network address it could be reached on. Each address it offers is an ICE candidate. A candidate is an IP, a port, a protocol, and a type, and it reaches JavaScript as a string like this:

candidate:842163049 1 udp 1677729535 203.0.113.25 54321 typ srflx raddr 192.168.1.45 rport 54321

Two types matter here. A host candidate is an address of a local network interface, so it carries your LAN IP. A srflx (server-reflexive) candidate is the public address a STUN server reports back when the browser asks which IP it appears to come from, so it carries your real WAN IP. A page gathers all of this with no permission. It opens an RTCPeerConnection pointed at a STUN server, calls setLocalDescription, and reads each candidate as it arrives. The key is that STUN runs over UDP, and an HTTP proxy only tunnels TCP. The STUN request leaves from the real interface, the proxy never sees it, and the srflx candidate comes back with the real WAN IP even though every HTTP request went through the proxy.
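To make the candidate format concrete, here is a small parser for the string shown above. It is a sketch that handles only the fields in that example (the fixed six leading tokens, then key/value pairs such as typ, raddr, and rport). The function name is our own, not part of any WebRTC library.

```python
def parse_ice_candidate(line):
    """Split an ICE candidate string into its named fields."""
    parts = line.split()
    candidate = {
        "foundation": parts[0].split(":", 1)[1],
        "component": int(parts[1]),
        "protocol": parts[2].lower(),
        "priority": int(parts[3]),
        "ip": parts[4],
        "port": int(parts[5]),
    }
    # Everything after the port comes as key/value pairs: typ, raddr, rport...
    extras = parts[6:]
    for key, value in zip(extras[0::2], extras[1::2]):
        candidate[key] = value
    return candidate


example = (
    "candidate:842163049 1 udp 1677729535 203.0.113.25 54321 "
    "typ srflx raddr 192.168.1.45 rport 54321"
)
parsed = parse_ice_candidate(example)

# srflx carries the address the STUN server saw; raddr is the local base.
print(parsed["typ"], parsed["ip"], parsed["raddr"])
```

In the example, the ip field holds the public address and raddr holds the LAN address it was derived from, which is why a single srflx candidate is enough to expose both.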

The first version of our probe ran the RTCPeerConnection on about:blank and showed both builds leaking the real IP. That was our mistake, not a result. Camoufox’s content-level injection is not active on about:blank, so we were measuring an unprotected page. Moving the probe onto a real https origin changed everything:

official-146   HTTP exit IP (proxy): 189.173.138.17
               ICE candidates: 1  [srflx] ips=['203.0.113.25']   <- real WAN IP leaks

jwriter20-146  HTTP exit IP (proxy): 93.44.185.102
               ICE candidates: 0                                <- nothing leaks

Our real WAN IP is 203.0.113.25. The official build, behind a working proxy, still hands it to any page through the WebRTC reflexive candidate. The proxy exit IP rotates on each run, so the constant 203.0.113.25 in the candidate is unmistakably the real address, not the proxy.

The fix is real, and it is baked into the binary.

We unzipped both camoufox.cfg files to confirm. The official build sets only media.peerconnection.ice.no_host. JWriter20 adds default_address_only, proxy_only_if_behind_proxy, proxy_only_if_pbmode, and obfuscate_host_addresses, all under the same media.peerconnection.ice prefix. Behind a proxy that cannot carry UDP, those preferences make WebRTC gather no candidates at all, so there is nothing to leak. We reproduced this across two runs.
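A comparison like that is easy to script. The sketch below assumes the cfg files use the usual Firefox autoconfig calls (pref, defaultPref, lockPref) and that the prefs are set to true. Both cfg snippets are hypothetical reconstructions for illustration, not copies of the shipped files.

```python
import re

# Matches pref("name", value), defaultPref(...) and lockPref(...) calls.
PREF_RE = re.compile(
    r'(?:pref|defaultPref|lockPref)\(\s*"([^"]+)"\s*,\s*([^)]+?)\s*\)'
)
ICE_PREFIX = "media.peerconnection.ice."


def ice_prefs(cfg_text):
    """Return the WebRTC ICE prefs found in a cfg file as {name: raw value}."""
    return {
        name: value
        for name, value in PREF_RE.findall(cfg_text)
        if name.startswith(ICE_PREFIX)
    }


# Hypothetical cfg contents; the true values are an assumption.
official_cfg = """
defaultPref("media.peerconnection.ice.no_host", true);
"""
patched_cfg = """
defaultPref("media.peerconnection.ice.no_host", true);
defaultPref("media.peerconnection.ice.default_address_only", true);
defaultPref("media.peerconnection.ice.proxy_only_if_behind_proxy", true);
defaultPref("media.peerconnection.ice.proxy_only_if_pbmode", true);
defaultPref("media.peerconnection.ice.obfuscate_host_addresses", true);
"""

added = sorted(set(ice_prefs(patched_cfg)) - set(ice_prefs(official_cfg)))
for name in added:
    print(name)
```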

If WebRTC leaks were your problem, JWriter20 solves them. Hold that thought, because it does not end where you would expect.

The canvas patch that we could read but not run

To see why this patch exists, you have to understand the small-arms race it sits within. The PropertyTracer run above caught DataDome calling toDataURL and getImageData. Those two calls are how a canvas fingerprint is taken. A script draws the same text and shapes into an off-screen canvas on every machine, reads the pixels back, and hashes them. The drawing commands are identical everywhere. The pixels are not, because the final image depends on your GPU, your graphics driver, and how your system rasterizes fonts. That hash is stable for your device and different from the next one, which is most of what a tracker wants.

The standard way to hide is to add noise. Camoufox, Brave, Firefox’s resist-fingerprinting mode, and a long tail of extensions all nudge a few pixels so the hash will not stay constant across sites. The weakness is in how that noise is generated. If it is a fixed per-session perturbation that depends only on a seed and the pixel position, it can be undone. A 2025 paper at The Web Conference, Breaking the Shield by Hoang Dai Nguyen and Phani Vadrevu, showed exactly that against eighteen extensions and five browsers. Their Pixel-Recovery attack paints a second canvas filled with a known solid color and reads it back. Because it knows what every pixel should have been, it solves for the perturbation and subtracts it from the real fingerprint canvas. Reload ten times, and the recovered fingerprint stays constant while the noised one keeps changing. That is the proof the noise was reversible all along.
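The arithmetic of the attack fits in a toy model. The sketch below is our own simplification, not the paper's code: a canvas is a flat list of single-channel values, the noise is a delta of -1, 0, or +1 derived only from a session seed and the pixel index, and pixel values are kept away from 0 and 255 so clamping never interferes.

```python
import hashlib


def position_noise(seed, index):
    """Delta in {-1, 0, +1} that depends only on (seed, index)."""
    digest = hashlib.sha256(f"{seed}:{index}".encode()).digest()
    return digest[0] % 3 - 1


def read_back(pixels, seed):
    """What a script reads when the browser adds position-only noise."""
    return [p + position_noise(seed, i) for i, p in enumerate(pixels)]


def recover(noised_fingerprint, noised_known, known_value):
    """Solve for the perturbation on the known canvas, then subtract it."""
    return [
        fp - (known - known_value)
        for fp, known in zip(noised_fingerprint, noised_known)
    ]


# Made-up pixel values standing in for a device's real canvas output.
true_fingerprint = [17, 64, 64, 200, 131, 90, 90, 42]
KNOWN = 128  # solid color the attacker paints on the second canvas

results = []
for session_seed in (1, 2, 3):  # three "reloads" with different noise
    noised_fp = read_back(true_fingerprint, session_seed)
    noised_known = read_back([KNOWN] * len(true_fingerprint), session_seed)
    results.append(recover(noised_fp, noised_known, KNOWN))

# Every session recovers the same un-noised fingerprint.
print(all(r == true_fingerprint for r in results))
```

Because the delta at index i is the same on both canvases within a session, subtracting it always cancels, whatever the seed is.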

Two changes defeat the attack, and the same paper points at both. Leave the flat regions alone, so a detector that paints a solid block and reads it back finds no tampering to measure. And make each perturbation depend on the pixel content rather than its position, so there is no single value to solve for and subtract. The second idea is what Brave’s Farbling does, deriving its noise from the canvas content so two different canvases are altered differently, and it is the one defense the Pixel-Recovery attack could not reverse.

LeooNic’s patch implements both moves, and it is the most interesting code in the whole fork tree. The rewritten ApplyCanvasNoise skips flat regions and only perturbs edges, and the comments name the attack directly:

// Content-aware + content-dependent canvas noise.
//   - Tier 1 known-pixel checks (DataDome, Castle): flat regions are skipped
//     because flat_score < FLAT_THRESHOLD. fillRect(R,G,B) is undisturbed.
//   - WWW'25 Pixel-Recovery Attack (Nguyen & Vadrevu): noise depends on the
//     pixel content AND its 4 neighbors, not just (seed, index).

FLAT_THRESHOLD is the cutoff that decides what counts as a flat region. The edge pixels that survive it get a content-dependent nudge of plus or minus one, small enough to stay invisible but enough to move the hash. The logic is sound on paper. We wanted to confirm it at runtime.
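To illustrate the idea, and only the idea, here is a toy Python version of content-aware, content-dependent noise. It is not LeooNic's ApplyCanvasNoise: the threshold value, the flat_score definition (largest difference to the four neighbors), and the hashing scheme are all our assumptions, and the image is a single-channel grid.

```python
import hashlib

FLAT_THRESHOLD = 4  # assumed cutoff; the fork's real value is not shown here


def neighbors(img, y, x):
    """Values of the 4 neighbors; out-of-bounds falls back to the pixel itself."""
    height, width = len(img), len(img[0])
    coords = [(y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)]
    return [
        img[ny][nx] if 0 <= ny < height and 0 <= nx < width else img[y][x]
        for ny, nx in coords
    ]


def apply_noise(img, seed):
    """Nudge edge pixels by +/-1 based on content; leave flat regions alone."""
    out = [row[:] for row in img]
    for y, row in enumerate(img):
        for x, value in enumerate(row):
            around = neighbors(img, y, x)
            flat_score = max(abs(value - n) for n in around)
            if flat_score < FLAT_THRESHOLD:
                continue  # flat region: a known-pixel check sees no tampering
            # Noise depends on the pixel and its neighbors, not on its index.
            material = f"{seed}:{value}:{around}".encode()
            delta = 1 if hashlib.sha256(material).digest()[0] & 1 else -1
            out[y][x] = min(255, max(0, value + delta))
    return out


solid = [[128] * 6 for _ in range(4)]
edged = [[50, 50, 50, 200, 200, 200] for _ in range(4)]

print(apply_noise(solid, seed=7) == solid)  # solid fill stays untouched
noised = apply_noise(edged, seed=7)
print([row[2:4] for row in noised])         # only the boundary columns move
```

In this toy version, a solid block comes back pixel for pixel, so painting a known color reveals no perturbation to solve for.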

We built a probe that draws two solid blocks with one sharp boundary and counts perturbed pixels in the flat interior versus the edge (canvas_fingerprint_test.py).

First, we learned that canvas noise is off by default in every current build, which lines up with the CloverLabs “Disable Canvas Noise” commit. The noise only runs when canvas:seed is non-zero. With the seed forced on the official 146 build, the original algorithm shows its tell:

official-146 (original noise)
  interior flat pixels perturbed : 9105 / 18240   (~50%)
  boundary edge pixels perturbed : 98 / 192
  max edge delta (per channel)   : 1
  hash varies across sessions    : True

The stock algorithm perturbs roughly half of all pixels, flat fills included. That is precisely the behavior a known-pixel check catches, and precisely what LeooNic set out to fix. So the official baseline is the “before” picture, captured at runtime.
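The counting step of a probe like this boils down to comparing the pixels you drew with the pixels you read back. The sketch below is a simplified stand-in for canvas_fingerprint_test.py, not the script itself: it works on single-channel grids, takes the boundary columns as an argument, and uses made-up sample data.

```python
def count_perturbed(expected, actual, edge_columns):
    """Count changed pixels in the flat interior versus the boundary columns."""
    stats = {
        "interior_changed": 0, "interior_total": 0,
        "edge_changed": 0, "edge_total": 0,
        "max_edge_delta": 0,
    }
    for exp_row, act_row in zip(expected, actual):
        for x, (exp, act) in enumerate(zip(exp_row, act_row)):
            zone = "edge" if x in edge_columns else "interior"
            stats[f"{zone}_total"] += 1
            if exp != act:
                stats[f"{zone}_changed"] += 1
            if zone == "edge":
                stats["max_edge_delta"] = max(
                    stats["max_edge_delta"], abs(exp - act)
                )
    return stats


# Two solid blocks with one sharp boundary between columns 2 and 3.
expected = [[10, 10, 10, 200, 200, 200] for _ in range(2)]
# Hypothetical readback with noise in both the interior and the edge.
actual = [
    [10, 11, 10, 199, 200, 200],
    [10, 10, 11, 200, 201, 200],
]

stats = count_perturbed(expected, actual, edge_columns={2, 3})
print(stats)
```

A build that leaves flat regions alone should report zero changed interior pixels, while position-only noise shows up in both zones.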

The “after” picture is what we were not able to collect. LeooNic ships only a Windows binary, so we ran it on a Windows cloud box. It would not launch under Playwright at all. Every attempt, headless or headful, with the stock launcher or LeooNic’s own 0.5.0 launcher installed from source, ended the same way:

console.error: "Warning: unrecognized command line flag" "-juggler-pipe"
Remote Settings startup changesets bundle could not be extracted (JSON.parse...)
JavaScript error: AsyncShutdown.sys.mjs, line 587: uncaught exception: undefined
<process did exit: exitCode=0>

The official 135 build launches and drives fine on the same box, so the machine and Playwright are healthy. The Firefox 149 build that LeooNic publishes aborts at startup before Juggler attaches.

This is not only our environment. LeooNic’s own issue #1 is titled “fix: port patches and build system to Firefox 149.0”, an open work in progress, and daijro carries issues #620 and #572 about Juggler failing to initialize in constrained environments. The build does run through LeooNic’s native RDP path, which is the whole point of their RDPBrowser, but even there we could not activate the canvas seed. The global config is ignored, and the per-context setCanvasSeed function the build exposes only at document start was never present on the page when driven over RDP.

So we report LeooNic honestly. The content-aware algorithm is real and well-reasoned in source, and the original algorithm’s weakness is confirmed at runtime on the official build. The published Firefox 149 binary is not something you can pick up and drive with the standard stack today. For a reader choosing a fork, that is the practical signal. The innovation lives in the code, not yet in a usable artifact you can run.

The block-rate test, and the result we did not expect

Given that, the only fork we could test against the official build was JWriter20’s. As always, the code used for testing is in our GitHub repository reserved for paying users, inside the folder 106.CAMOUFOX.
