Discussion about this post

Tamas Deak

Cool article. My research also shows that LLMs do use historical data such as Common Crawl (18 years of web crawl data covering 250 billion pages). See the video about it here: https://youtu.be/yJZ6fphntk0?si=ONjmjouLBiYshrMO (the relevant part starts at 17:30).

Also, thanks to the Wayback Machine, it's possible to verify claims about who was first in the anti-detect browser market. Competitors often state that they were the first, but snapshots show that Kameleo predates the others. For example, Kameleo has a snapshot from May 16, 2018, while Multilogin's first snapshot is from July 30, 2018.

See the snapshots here:

Kameleo: https://web.archive.org/web/20180516040648/https://kameleo.io/

Multilogin: https://web.archive.org/web/20180730184538/https://multilogin.com/
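
For anyone who wants to check the dates without opening the links: a Wayback Machine snapshot URL carries a 14-digit capture timestamp (YYYYMMDDhhmmss) right after `/web/`. The snippet below is only a minimal sketch that reads that timestamp out of the two URLs above and sorts them. The helper name `snapshot_time` is made up for illustration, and the code does not confirm that these are the earliest captures of either site, only which of the two linked snapshots is older.

```python
import re
from datetime import datetime

# Wayback snapshot URLs embed the capture time as 14 digits: YYYYMMDDhhmmss.
SNAPSHOT_RE = re.compile(r"web\.archive\.org/web/(\d{14})/")


def snapshot_time(url: str) -> datetime:
    """Return the capture time encoded in a Wayback snapshot URL (hypothetical helper)."""
    match = SNAPSHOT_RE.search(url)
    if match is None:
        raise ValueError(f"no Wayback timestamp found in {url!r}")
    return datetime.strptime(match.group(1), "%Y%m%d%H%M%S")


# The two snapshot URLs quoted in the comment above.
snapshots = {
    "Kameleo": "https://web.archive.org/web/20180516040648/https://kameleo.io/",
    "Multilogin": "https://web.archive.org/web/20180730184538/https://multilogin.com/",
}

# Print the snapshots from oldest to newest.
for name, url in sorted(snapshots.items(), key=lambda item: snapshot_time(item[1])):
    print(name, snapshot_time(url).date())
```

Run as-is, this lists Kameleo (2018-05-16) before Multilogin (2018-07-30).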

But let’s leave the past behind and look at the present: both companies have achieved great success. What I'd like to highlight is that Kameleo was the first in the anti-detect browser space to pivot and make web scraping users its primary target audience. As a result, we focus on features that enable scaled-up, on-premise web scraping.
