Antidetect browsers?
As anti-bot and user-profiling techniques become more and more invasive, a new niche of browsers has emerged: antidetect browsers. They offer the user an additional level of privacy, with a set of features tailored to counter these new threats:
Fingerprints spoofing
Canvas and WebGL disabling
Proxy integration
API for Selenium/Playwright integration
Typically, they let you create a set of different profiles and apply them to the browser. These profiles can be temporary and clean, so it looks as if a different person is browsing, rather than you with your cookie history and your real fingerprint.
Let's make a quick tour of the most famous ones and see how we can use them for web scraping.
GoLogin
GoLogin is a relatively new player in the antidetect browser industry, founded in 2019. Its browser, built on the Chromium engine, spoofs the fingerprint and makes the user's parameters as common as possible.
You can create unlimited profiles that don’t overlap with each other, and you can use them either locally in the browser or virtually through the API, which is what allows GoLogin to be integrated with Selenium (link to the tutorial on their website) or Playwright.
Inside the GoLogin GitHub repository you can find the code to integrate a GoLogin user profile into a Selenium or Pyppeteer project. I’ve also written an example integration for Playwright, which you can find in The Web Scraping Club GitHub repository, together with the code from past articles. Basically, instead of opening a standard Playwright Chromium, we connect to an already open GoLogin browser instance, which can be remote or local.
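To make the idea of connecting to an already open browser more concrete, here is a minimal sketch using Playwright's `connect_over_cdp`. It is not the code from the repository: the host, the port and the helper names `cdp_endpoint` and `fetch_title` are assumptions for illustration, and it presumes the antidetect browser exposes a Chromium DevTools (CDP) endpoint at that address.

```python
from typing import Optional


def cdp_endpoint(host: str = "127.0.0.1", port: int = 9222) -> str:
    """Build the HTTP URL of a Chromium DevTools (CDP) endpoint.

    Host and port are placeholders: use the values your running
    browser profile actually exposes.
    """
    return f"http://{host}:{port}"


def fetch_title(url: str, endpoint: Optional[str] = None) -> str:
    """Open `url` in an already running browser and return the page title."""
    # Imported lazily so cdp_endpoint() works without Playwright installed.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        # Attach to the existing browser instead of launching a new Chromium.
        browser = p.chromium.connect_over_cdp(endpoint or cdp_endpoint())
        # Reuse the profile's default context when the browser exposes one.
        context = browser.contexts[0] if browser.contexts else browser.new_context()
        page = context.new_page()
        page.goto(url)
        title = page.title()
        page.close()
        return title
```

With a profile already running, a call could look like `fetch_title("https://example.com", cdp_endpoint(port=3500))`, where the port is again a made-up value.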
We can see the difference in fingerprinting by running a test first with a standard Playwright Chromium instance and then with a GoLogin one.
In the first case, the test is able to detect my device and my browser.
With GoLogin, the fingerprint uniqueness changes, probably because of the added noise, and the test fails to detect my machine type and browser.
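A rough way to reproduce this kind of comparison yourself is to read a few fingerprint signals from each browser and diff them. The sketch below is not the test used above: the selection of signals and the helper names `collect_fingerprint` and `diff_fingerprints` are my own assumptions, and real fingerprinting tests look at many more attributes (canvas, WebGL, fonts and so on).

```python
from typing import Any, Dict, Tuple

# JavaScript evaluated in the page. These navigator/screen properties are
# standard browser APIs, picked here only as an illustrative subset.
FINGERPRINT_JS = """() => ({
    userAgent: navigator.userAgent,
    platform: navigator.platform,
    languages: navigator.languages,
    hardwareConcurrency: navigator.hardwareConcurrency,
    webdriver: navigator.webdriver,
    screen: [screen.width, screen.height],
})"""


def collect_fingerprint(page: Any) -> Dict[str, Any]:
    """Return the signals above for a Playwright `page` object."""
    return page.evaluate(FINGERPRINT_JS)


def diff_fingerprints(
    a: Dict[str, Any], b: Dict[str, Any]
) -> Dict[str, Tuple[Any, Any]]:
    """Map every signal that differs to its (a, b) pair of values."""
    keys = sorted(set(a) | set(b))
    return {k: (a.get(k), b.get(k)) for k in keys if a.get(k) != b.get(k)}
```

Calling `collect_fingerprint` once on a page from a standard Chromium and once on a page from the antidetect browser, then passing both results to `diff_fingerprints`, shows which of these signals the profile actually changes.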
If we run the same test against a Cloudflare-protected website (https://www.off---white.com/) with a standard Playwright Chromium, we can see the home page but not the product catalog.
With GoLogin, instead, we can see both. Test passed!
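If you want to automate this kind of check instead of looking at the pages by eye, a simple heuristic can flag responses that look like a block or a challenge page. The sketch below rests on assumptions: the status codes and the title markers are typical of Cloudflare block and challenge pages but are not guaranteed, and `looks_blocked` and `check_pages` are hypothetical helper names.

```python
from typing import Any, Dict, Iterable

# Assumption: status codes commonly returned when a request is blocked.
BLOCK_STATUSES = {403, 429, 503}
# Assumption: text typically found in challenge/block pages (lowercase).
CHALLENGE_MARKERS = ("just a moment", "attention required")


def looks_blocked(status: int, html: str) -> bool:
    """Heuristically decide whether a response is a block or challenge page."""
    if status in BLOCK_STATUSES:
        return True
    text = html.lower()
    return any(marker in text for marker in CHALLENGE_MARKERS)


def check_pages(page: Any, urls: Iterable[str]) -> Dict[str, bool]:
    """Visit each URL with a Playwright `page`; map URL -> blocked or not."""
    results = {}
    for url in urls:
        response = page.goto(url)
        # page.goto() can return None, so fall back to a neutral status.
        status = response.status if response is not None else 0
        results[url] = looks_blocked(status, page.content())
    return results
```

Running `check_pages` on the home page and on a catalog page, once per browser, gives a comparable yes/no result for each setup.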
Compared to the other solutions we’ll see later, GoLogin is one of the most affordable, and with the cheapest plan, we have all we need for our web scraping projects.
Incogniton
Incogniton is very similar to GoLogin, with the same features and integrations with Selenium or Playwright.
Unfortunately, the browser our scrapers need to connect to is available only for Mac and Windows and cannot run in the cloud. This can be a limitation for larger web scraping projects.
Since there's no free plan with the API included, I could not run any tests, but it is one of the best-known solutions in the field.
Octo Browser
Octo Browser is an alternative to GoLogin and Incogniton. It has an API for integration with Selenium or Playwright, but its most interesting feature is a database of real fingerprints to use in your profiles.
In this case too, the browser is available only for Mac and Windows and cannot run in the cloud, which limits its scalability. There’s no free plan, so I could not test it.
VMLogin
VMLogin, a Chinese product, is another alternative to the previous antidetect browsers, and a bit more expensive. The client is available only for Windows, and the API is included starting from the cheapest plan.
Kameleo
Kameleo sets itself apart from the previous browsers starting from its website, where there’s an extensive section about automation and integration with Selenium or Playwright. It is also the only one in this selection with a mobile app, which lets you use mobile fingerprints for your profiles, but unfortunately the desktop client is only available for Windows.
From the outside, it looks like a more mature product, but its price is also one of the highest. For full automation support and the API, it costs around $200 a month per user.
Key takeaways
We have seen five antidetect browser solutions that can be integrated into our web scraping projects.
The quick test we ran earlier shows that they can be an interesting option for some websites. A Cloudflare-protected website like off---white.com can be read using GoLogin, so the technique is worth a try for the hardest cases, but infrastructure and license costs can become a problem when scaling to larger projects.
For a broader view of antidetect browsers, here’s a list of the most famous ones in this article on Zorbasmedia.
Can you test Multilogin and MoreLogin? Because I don't know how to choose between Multilogin, Gologin and MoreLogin.