I’m sure this has already happened to you when building a headful scraper: you run it on your machine and it works smoothly, but after you deploy it to a VM or a server, it gets detected and stops working. It doesn’t matter that you’re using the same configuration or proxy provider: the program is identical and the IP is a residential one, but there’s no way to make it work. The only difference is the hardware the scraper runs on. For browserless scrapers this doesn’t matter, but if you’re using a browser to scrape data, it can mean only one thing: the target website is marking your browser fingerprint as suspicious.
But what is a browser fingerprint, and how is it collected? In this post, we will explore these questions together.
Before proceeding, let me thank NetNut, the platinum partner of the month. They have prepared a juicy offer for you: up to 1 TB of web unblocker for free.
Introduction to Browser Fingerprinting
As the Mozilla developer's blog states, browser fingerprinting is a collection of techniques for identifying a browser (and, by extension, the device or user) based on its unique characteristics. Using a simple script running inside a browser, a server can collect a wide variety of information from public interfaces called Application Programming Interfaces (APIs) and HTTP headers.
This is nothing new: cookies were (and still are) used on the web to save information about users so that websites can better understand their visitors, help them log in, and show more tailored ads.
In this case, instead of assigning an ID stored in a cookie, fingerprinting gathers myriad device and browser attributes (OS, browser version, language, screen size, installed fonts, etc.) via scripts to create a statistically unique profile. Unlike cookies or other client-side identifiers that leave a trail in the browser, fingerprinting is stateless – it does not store any data on the client device. This means a site can recognize a returning device even if cookies are cleared because the browser’s configuration acts as an identifier.
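As a rough sketch of the idea, assume a fingerprinting script has already gathered a handful of attributes; combining them into one stable hash is all it takes to get a stateless identifier. The attribute names and values below are illustrative, not from any real tracker:

```python
import hashlib
import json

def fingerprint(attributes: dict) -> str:
    # Serialize with sorted keys so the same attribute set always yields
    # the same hash, no matter the order in which values were collected.
    canonical = json.dumps(attributes, sort_keys=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

# Hypothetical profile a fingerprinting script might assemble client-side
profile = {
    "userAgent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) ...",
    "language": "en-US",
    "screen": "1920x1080x24",
    "timezone": "Europe/Rome",
    "hardwareConcurrency": 8,
}

device_id = fingerprint(profile)  # stable across visits, survives cookie clearing
```

No data is written to the client: the identifier is recomputed on every visit from the same exposed attributes, which is exactly why clearing cookies doesn’t help.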
A rather old piece of research from 2010, the EFF’s Panopticlick project, demonstrated how effective this can be: 84% of the 470,000+ browsers tested had an instantaneously unique fingerprint based on basic configuration and version info, and with Flash/Java enabled (revealing fonts and plugins), 94% were unique. In other words, most browsers can be distinguished by their exposed settings and APIs, making the fingerprint comparable to identifiers like cookies or IP addresses.
Today, Flash has disappeared, but more recent research shows similar results. In this study from 2024, we can see how the demographics of users in the USA can be deduced from their browser fingerprints.
How is this possible? Let’s look in detail at the most common fingerprinting techniques used today.
Common Browser Fingerprinting Techniques
Modern fingerprinting scripts use a variety of web platform features to gather entropy (i.e. identifying information). Below are some of the most prevalent techniques and what they entail:
Canvas Fingerprinting
One notorious technique involves the HTML5 <canvas> element. A script can draw hidden text or graphics on a canvas and then read the pixel data back, using the small differences in rendering between systems as a marker. For example, the browser renders text using the system’s font rendering engine and GPU. By instructing the browser to draw a known text string or shape and hashing the resulting pixel output, sites obtain a canvas fingerprint that varies per browser/OS/graphics combination. Because these differences are consistent on the same device but differ between devices, the canvas image hash serves as a unique ID. Canvas fingerprinting gained attention in 2014 when the famous study “The Web Never Forgets” was published; its authors found that canvas fingerprinting was deployed on popular sites as a stealthy tracking method. Nowadays, every modern browser supports canvas, so this is a powerful mechanism for assigning a unique identifier to almost every person browsing the web.
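To make the mechanism concrete, here is a sketch of the kind of probe a site might inject, kept as a JavaScript string you could pass to Playwright’s page.evaluate(), plus the hashing step. The drawing commands and test string are illustrative, not taken from any real tracker:

```python
import hashlib

# Illustrative canvas probe: draws known text and shapes, then returns the
# pixels as a data URL. Rendering differs subtly per OS / font engine / GPU.
CANVAS_PROBE_JS = """
() => {
  const canvas = document.createElement('canvas');
  canvas.width = 280; canvas.height = 60;
  const ctx = canvas.getContext('2d');
  ctx.textBaseline = 'top';
  ctx.font = '14px Arial';
  ctx.fillStyle = '#f60';
  ctx.fillRect(125, 1, 62, 20);
  ctx.fillStyle = '#069';
  ctx.fillText('Cwm fjordbank glyphs vext quiz', 2, 15);
  return canvas.toDataURL();
}
"""

def canvas_hash(data_url: str) -> str:
    # The collector hashes the pixel output; the digest is the canvas fingerprint.
    return hashlib.sha256(data_url.encode("utf-8")).hexdigest()
```

In a Playwright script, `page.evaluate(CANVAS_PROBE_JS)` would return the data URL to hash; two machines with different GPUs or font stacks will usually produce different digests, while the same machine keeps producing the same one.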
WebGL Fingerprinting
WebGL fingerprinting is closely related to canvas fingerprinting, using the browser’s WebGL (Web Graphics Library) capabilities. WebGL allows the rendering of 3D graphics via the GPU, which introduces hardware-level variations. A fingerprint script might ask WebGL for information like the graphics renderer/driver strings or draw a 3D scene off-screen and read back pixels. Because GPUs and driver implementations differ, the output can be used to identify the device’s graphics stack. In another study called “Pixel Perfect”, the authors combined canvas text and WebGL rendering to strengthen fingerprints, effectively measuring the GPU model and driver quirks. The result was a fingerprint highly consistent for a given device, yet divergent between devices – even among machines with the same browser and OS, differences in GPU or driver version caused different WebGL outputs. Thus, WebGL provides additional entropy orthogonal to other fingerprint metrics (it adds hardware-level uniqueness beyond just user-agent strings). Many fingerprinting libraries will capture WebGL vendor/renderer info (e.g. a string like “ANGLE (NVIDIA GeForce GTX 1070...)”), which helps distinguish, say, one Windows/Chrome user with an Nvidia GPU from another with an AMD GPU.
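The renderer string alone is already useful for triage. Here is a minimal sketch of the kind of rule an anti-bot system might apply; the category names and keyword lists are my own assumptions, not any vendor’s actual logic:

```python
def classify_webgl_renderer(renderer: str) -> str:
    """Rough triage of a WebGL renderer string."""
    r = renderer.lower()
    # Software renderers are typical of headless or server environments
    if "swiftshader" in r or "llvmpipe" in r:
        return "software"
    # Strings naming a real GPU vendor suggest consumer hardware
    if any(vendor in r for vendor in ("nvidia", "amd", "radeon", "intel", "apple")):
        return "hardware"
    return "unknown"
```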
Thanks to the gold partners of the month: Smartproxy, Oxylabs, and Scrapeless. They’re offering great deals to the community. Have a look for yourself.
AudioContext Fingerprinting
Another vector uses the Web Audio API to generate a unique audio signature. In an AudioContext fingerprint, scripts create an oscillator (producing a sound wave) and pass it through audio nodes (filters, compressors, etc.), then sample the resulting audio output (often without actually playing any audible sound).
The premise is that the audio processing pipeline—hardware, OS, driver, and browser implementation—introduces small differences in the resulting signal. The final hashed audio output can serve as a fingerprint. Just like canvas images, the same device produces a stable audio fingerprint, and different devices yield different hashes. AudioContext fingerprinting is less common than canvas or WebGL but offers another high-entropy signal, especially useful in combination with other factors.
Font and DOM Enumeration
The particular set of fonts installed on a system is a well-known fingerprinting vector. Font fingerprinting scripts will try to detect available fonts by measuring text dimensions. One method is to render the same string in dozens of fonts and use Canvas or DOM to check the width/height of the rendered text. If a font isn’t present, the browser falls back to a default font, causing a measurable size difference. By doing this for a large list of font names, scripts can infer which fonts are installed. Another technique uses special Unicode glyphs with known default shapes; the presence of a custom font changes the rendered shape or size, which can be hashed as part of a fingerprint. Since font sets vary by OS, language, and user-installed packages, this provides substantial identifying information (e.g. distinguishing two Windows 10 machines if one has Adobe Photoshop installed fonts).
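The inference step itself is simple once the measurements exist. A simplified sketch, with hard-coded widths standing in for the values a browser script would obtain by measuring a rendered test string:

```python
# Width of the test string when the browser falls back to its default font
# (a hypothetical value for illustration).
FALLBACK_WIDTH = 220

def detect_installed_fonts(measured_widths: dict) -> list:
    # If rendering in a candidate font produced a different width than the
    # fallback, the font must be installed on the system.
    return sorted(
        font for font, width in measured_widths.items()
        if width != FALLBACK_WIDTH
    )

# Hypothetical measurements for three candidate fonts
measurements = {"Arial": 210, "SomeMissingFont": 220, "Myriad Pro": 198}
```

Run over a list of hundreds of font names, this yields a bitmap of installed fonts that goes straight into the fingerprint.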
DOM fingerprinting refers more broadly to examining properties of the browser’s Document Object Model and JavaScript environment that can reveal the browser type or even the specific device. This can include enumerating the navigator object for attributes like navigator.platform, navigator.hardwareConcurrency (number of CPU cores), navigator.deviceMemory, navigator.languages, the time zone (Intl.DateTimeFormat().resolvedOptions().timeZone), and more. Many of these are standard APIs, but the combination can be particular to a device’s configuration. Additionally, differences in how browsers implement DOM APIs can be telling – for instance, the order of properties in navigator, or the presence/absence of certain objects (the window.chrome object exists in Chromium-based browsers). A fingerprint script might also examine the plugins list (navigator.plugins) or the supported MIME types, which vary between browsers. Even small quirks like the precision of timers, the behavior of various JavaScript methods, or error message strings can serve as fingerprinting data points. These DOM-based attributes often comprise the “baseline” fingerprint used by libraries like FingerprintJS, alongside the more exotic canvas/WebGL signals. They tend to be easy to retrieve via JavaScript and collectively provide a fingerprint that is hard for a bot to fake entirely.
Device & OS Level Identifiers
Beyond pure browser APIs, some fingerprinting techniques reach into device or OS characteristics. The User-Agent string itself encodes a lot (browser name, version, OS, CPU architecture). Other examples include the device’s touch support (max touch points), screen resolution and color depth, and even available media devices (number of webcams/microphones via MediaDevices.enumerateDevices). In the past, APIs like the Battery Status API leaked information (battery level, charging time) which could be used to fingerprint a device’s power profile, though browsers have since limited that for privacy. Modern fingerprinting scripts also check for features or properties tied to specific OS/browser combinations – for instance, only Firefox exposes the InstallTrigger object, only Chrome sets navigator.webdriver under automation, etc. By probing a multitude of such indicators, a script can deduce, with high probability, the exact browser and OS version and even the environment (some can detect if it’s running in a virtual machine or emulator based on inconsistencies). One of the biggest red flags we encountered when scraping from a server was an audio and video device count of zero: a server has no need for a monitor, speakers, or webcams. But that also means a script, not a real person, is browsing the website.
While any one of these attributes may not be unique, together they significantly narrow the fingerprint. This category covers everything from CPU/GPU info to OS-specific fonts or UI metrics that seep through the browser.
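The zero-devices check mentioned above is trivial to express. A sketch, with made-up field names standing in for counts derived from MediaDevices.enumerateDevices():

```python
def looks_like_server(fp: dict) -> bool:
    # A consumer device almost always exposes at least one microphone,
    # speaker, or webcam; a data-center VM usually exposes none.
    return fp.get("audio_devices", 0) == 0 and fp.get("video_devices", 0) == 0
```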
Fingerprinting and Anti-Bot Systems
Beyond tracking for ads, browser fingerprinting has become a cornerstone of modern anti-bot and anti-fraud systems. Detecting malicious bots – such as web scrapers, credential stuffing tools, or automated click fraud agents – is a cat-and-mouse game. Bots often try to imitate legitimate users to slip past defenses. Fingerprinting gives defenders a way to examine subtle signals that are hard for bots to fake consistently.
One basic application is to use the fingerprint as a device identifier to rate-limit or block known bad actors. For example, if a scraper farm rotates IP addresses to avoid IP-based blocking, a consistent fingerprint can reveal that those requests all come from the same browser environment (thus likely the same bot). Anti-bot services correlate fingerprints across visits to recognize when the same client is returning under a different identity. A concrete case study is credential stuffing (mass login attempts with stolen passwords): Even if the attacker cycles through thousands of proxy IPs, if they reuse the same automated browser or toolkit, its fingerprint may remain the same. A defense system can spot that and block further attempts from that fingerprint, essentially “tagging” the bot.
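As a sketch of that defense, here is a counter keyed by fingerprint hash instead of IP, so rotating proxies does not reset it; the class and method names are illustrative, not from any real anti-bot product:

```python
from collections import Counter

class FingerprintRateLimiter:
    """Rate-limits by fingerprint hash rather than by IP address."""

    def __init__(self, limit: int):
        self.limit = limit
        self.attempts = Counter()

    def allow(self, fingerprint_hash: str) -> bool:
        # Rotating IPs doesn't help the attacker: the key is the fingerprint,
        # which stays the same as long as the browser environment does.
        self.attempts[fingerprint_hash] += 1
        return self.attempts[fingerprint_hash] <= self.limit
```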
More sophisticated is using fingerprinting to distinguish humans from bots in real time. Bots often run in instrumented or headless browsers, which introduce minor anomalies in the fingerprint. Anti-bot scripts run in the client browser to collect a plethora of data (canvas, WebGL, audio, fonts, timing, etc.) and send it back for analysis. Machine learning models or rule-based systems then analyze the fingerprint for inconsistencies or known patterns of automation. For example, does the reported User-Agent string match the fingerprint details? If the UA claims “Chrome on Windows,” but the canvas/WebGL fingerprint looks nothing like a real Windows Chrome (perhaps the GPU is reported as “SwiftShader,” an emulated GPU used in headless Chrome), then it’s flagged.
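Two of the cross-checks described above, written as explicit rules; the field names are illustrative, and real systems combine hundreds of such signals, often inside ML models:

```python
def ua_is_consistent(fp: dict) -> bool:
    ua = fp.get("userAgent", "")
    # A real Windows Chrome reports its actual GPU, not an emulated one
    if "Windows" in ua and "SwiftShader" in fp.get("webglRenderer", ""):
        return False
    # Chromium-based browsers expose a window.chrome object
    if "Chrome" in ua and not fp.get("hasWindowChrome", True):
        return False
    return True
```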
Case studies from the field illustrate how fingerprinting thwarts bots. One notable example is how headless Chrome (Chrome running without a GUI, commonly used for automation) was quickly detected by its fingerprint. Headless Chrome initially announced itself via a navigator.webdriver=true flag, which bots learned to override. But even with obvious flags hidden, subtle differences remained in the JS environment and rendering. In past articles of The Web Scraping Club, we’ve seen how bot detection techniques evolved using the CDP protocol and other more refined methods.
As said before, it’s a cat-and-mouse game. Even after these advances in browser fingerprinting, some tools can stay undetected in most cases – consider Camoufox or most of the anti-detect browsers on the market.
How to Modify a Browser Fingerprint and Available Tools
Given the sophistication of fingerprinting, how can web scrapers or researchers modify their browser’s fingerprint to avoid detection?
This is challenging, but there are both open-source techniques and commercial tools aiming to control or randomize the fingerprint a browser presents. Here are some approaches:
Headless Browsers vs. Real Browsers: One immediate consideration is whether to use headless browser automation or a real, full browser. Headless modes (like headless Chrome/Firefox, PhantomJS, etc.) are convenient but come with fingerprinting liabilities. As we saw, headless Chrome advertises itself via navigator.webdriver and lacks certain integrations (like plugins and proper GPU rendering) that make its fingerprint unusual by default.
In contrast, using a real browser (headed) with automation – driving a real Chrome via Puppeteer or Playwright in non-headless mode – yields a much more native-looking fingerprint. Anti-bot systems can certainly still detect automation in a real browser, but it’s significantly harder if you take care to replicate human-like patterns. This approach has its limitations: while we can remove the navigator.webdriver flag by passing the right arguments when launching Playwright –
args=CHROMIUM_ARGS, ignore_default_args=["--enable-automation"]
– this is, of course, not enough to hide the hardware where the scraper is running. If your scraper runs in a data center, even behind a residential proxy, it will be easily detected, since its fingerprint will be full of red flags. On the contrary, it could be enough if your scraper runs locally on your consumer-grade device.
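For reference, a sketch of how those arguments fit into a Playwright launch. CHROMIUM_ARGS is a placeholder for whatever extra switches your setup uses (the flag shown is one commonly paired with this technique, not a requirement), and the actual launch call is commented out so the sketch stays self-contained:

```python
# Hypothetical extra switches; adjust for your own setup.
CHROMIUM_ARGS = ["--disable-blink-features=AutomationControlled"]

launch_kwargs = {
    "headless": False,  # headed mode: a more native-looking fingerprint
    "args": CHROMIUM_ARGS,
    # Dropping --enable-automation removes the automation banner and
    # the navigator.webdriver flag it implies.
    "ignore_default_args": ["--enable-automation"],
}

# With Playwright installed, you would launch like this:
# from playwright.sync_api import sync_playwright
# with sync_playwright() as p:
#     browser = p.chromium.launch(**launch_kwargs)
```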
Open-source tools like selenium-stealth, puppeteer-extra, and the Playwright stealth plugins helped bypass these challenges in the past by patching the browser automation tools. However, these tools could be detected, and they have not been updated in years.
Today, Camoufox and Nodriver are the most effective open-source tools for changing the browser fingerprint (at least in Python). The problem with open-source tools is that anyone can see their code, so they can be studied by anti-bot software engineers and detected in the long run.
Anti-Detect Browser Tools and Services: anti-detect browsers are tools that, in the past, were mainly used for automating browser interactions with websites that require a consistent user profile (mainly social media). Since they automate a browser while assigning a legitimate, consistent fingerprint to each profile, they look like absolutely real users from the anti-bot perspective. Some examples of anti-detect browsers are GoLogin, Dolphin{anty}, Kameleo, MultiLogin, and OctoBrowser, and on our new Club Deals page, you can find some interesting offers to save a few bucks.
Like every other solution using a browser, there are some cons, including the limited parallelism we can use in scraping operations.
Browsers as a Service: with the advent of AI and the need for a more efficient way to interact programmatically with browsers, many browser-as-a-service solutions have been popping up recently, as we’ve seen in this post. In some cases, they’re launched by the same proxy companies that create web unblockers, like the Bright Data Scraping Browser, but in most cases they are new actors that will shake up the scraping industry (and not only). If they reach the same degree of effectiveness in masking bot traffic as the most “seasoned” players, it will be great for the whole web scraping community.
Final remarks
In conclusion, modifying a browser fingerprint is a complex, ever-evolving craft. Professional scrapers combine multiple strategies: using real browsers when possible, employing stealth plugins or anti-detect browsers to cover obvious tells, using browsers-as-a-service, and, above all, staying informed about new fingerprinting techniques discovered by the research community. It’s an arms race – each time scrapers manage to evade detection, anti-bot systems devise new fingerprinting traps or ML models to catch them. For cybersecurity experts and web scraping professionals, understanding the depth and breadth of browser fingerprinting is essential. It enables defenders to build more robust identification systems, and attackers (or automated-tool developers) to know what they’re up against and design bots that can operate under the radar.
Thank you once again, Pier, for this excellent summary!
While reading the sections on different browser fingerprinting techniques, I wanted to add a few extra insights. If you're interested in diving deeper into this topic, check out the Browser Fingerprint section of our knowledge base:
🔗 https://help.kameleo.io/hc/en-us/sections/360000880477-Browser-fingerprint-technologies
In particular, I recommend this article on Intelligent Canvas Spoofing, which explains Kameleo’s unique approach:
🔗 https://help.kameleo.io/hc/en-us/articles/7021925786397-Intelligent-Canvas-Spoofing-Our-research-on-canvas-fingerprinting
Pier mentions in the article:
"You run it on your machine, and it works smoothly, but then, after you deploy it on a VM or a server, it gets detected and stops working."
If this sounds familiar, just watch the video in the linked article, and you’ll see why. With our method, it's possible to emulate a macOS device on a Windows Server while maintaining a consistent browser fingerprint. The next challenge we’re tackling is achieving a bulletproof, masked fingerprint in headless mode within Docker. We hope to soon reach the same level of success as we have on Windows Server.
Another interesting topic in the article is "Headless Browsers vs. Real Browsers." I’m happy to share that when you run our custom-built browsers, Chroma and Junglefox, in headless mode, you get the same high-quality fingerprint as in headful mode.
Lastly, a thought on Browser as a Service: If you're reading this, chances are you're already considering building your own web scraping infrastructure to cut costs. That’s exactly what you can do with Kameleo for web automation. Unlike other solutions, we don’t charge based on bandwidth or requests—only on the number of Parallel Automated Browsers you use. This lets you optimize costs efficiently.