Interview #10 - Germanas Latvaitis
We all know proxy providers, but what's happening under the hood? Let's figure it out.
July is the Smartproxy month on The Web Scraping Club. For the whole month, following this link, you can use the discount code SPECIALCLUB and get a massive 50% off on any proxy and scraper subscription.
In this interview, I’ll try to dig deeper into how a proxy provider works and what the technical challenges are behind such a complex business.
Hi Germanas, thanks for joining us at The Web Scraping Club. I’m really happy to have you here. Be prepared: I’ll ask you a lot of questions about the proxy industry and its use cases.
First, tell us a bit about yourself and your company Smartproxy.
Hi, my friends and colleagues call me Germa (short for Germanas), and I’ve been working for Smartproxy as Head of Product for over a year now. Previously, I built products ranging from air travel fare scraping and aggregation to sports data collection and distribution for organizations like the NCAA, FIFA, and FIBA.
Smartproxy is the best value proxy and scraping (data collection) infrastructure provider. It was easy for me to choose this company as I’ve been building similar products in the past; in addition, Smartproxy is a product-led organization which I really empathize with.
Proxies are necessary for web scraping nowadays, but the more we get used to a service, the more we tend to trivialize it. Can you describe the challenges behind a proxy infrastructure? What’s the greatest one?
Building infrastructure for us as the provider is, as you said, trivial. There are always some innovations within the proxy market that require us to adapt quickly and stay up-to-date. However, at Smartproxy, we’re great at launching new products as we have the right people to create the infrastructure.
The main challenge for us and the market is the ethical sourcing of IPs (specifically, residential, mobile, and ISP) for scraping purposes. Another significant obstacle is the increasing difficulty of scraping certain targets due to evolving technologies or the prevention of crawlers or bots. Many users struggle to adapt to rapid changes and get their IPs blocked, and that's why we introduced Scraping APIs as out-of-the-box solutions to fulfill their scraping needs.
Smartproxy also recently launched a mobile proxy service. Does it pose different infrastructure challenges than datacenter or residential proxies?
Indeed, this service brings forth new challenges that need to be addressed. As it mainly uses the residential service backend and infrastructure, the obstacle comes not from the IP pool itself, but from the methods we employ to detect and filter instances when the user, whose IP we source, is connected to a cell tower. Also, since these IPs are mostly used for social media, they require longer sessions due to occasional timeouts resulting from the nature of mobile data connections.
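To make the point about longer sessions concrete, here is a minimal client-side sketch of a "sticky" session that survives occasional timeouts. It assumes, purely for illustration, that a provider pins an exit IP to a caller-chosen session ID; `StickySession`, `fetch_with_retries`, and the `fetch` callback are hypothetical names, not part of any Smartproxy API.

```python
import time
import uuid


class StickySession:
    """Keeps one session ID alive for ttl_seconds, then issues a new one."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so the behavior can be tested
        self._id = None
        self._started = 0.0

    def session_id(self):
        now = self._clock()
        # Rotate only when the session has expired, not on every request.
        if self._id is None or now - self._started >= self._ttl:
            self._id = uuid.uuid4().hex
            self._started = now
        return self._id


def fetch_with_retries(session, fetch, max_attempts=3):
    """Retry timeouts on the SAME session instead of rotating to a new one.

    `fetch` is a hypothetical callable that takes a session ID and either
    returns a response or raises TimeoutError.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    last_error = None
    for _ in range(max_attempts):
        try:
            return fetch(session.session_id())
        except TimeoutError as exc:
            last_error = exc  # transient mobile timeout: try again
    raise last_error
```

The design choice here is that a timeout does not trigger rotation: for logged-in social media workflows, switching IPs mid-session is usually more disruptive than waiting out a brief mobile network drop.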
How does a typical mobile proxy infrastructure work? Do you have physical dongles or virtual SIMs, or do you buy mobile IPs from a third party? Does CGNAT add some difficulty in this case?
We acquire mobile IPs from our partners, and while some of these proxies may utilize dongles, the majority of them use real devices as their endpoints, similar to residential proxies.
I suppose another difference between datacenter and mobile proxies is that, in many countries such as Italy, there are no “unlimited data” plans, so you need to balance the bandwidth used per IP more carefully. Is this a real challenge?
Yes, there are some price balance challenges here. That’s why mobile proxies are more expensive than, let’s say, datacenter proxies.
Several competitors in the market offer hourly-, daily-, and weekly-based plans with higher prices, and some of them have hidden soft caps. Nevertheless, we’ve successfully achieved a balance in this regard, thanks to the R&D team and sourcing partners.
The ethical sourcing of IPs is something our customers and I really care about. Can you explain the standards your partners must adhere to in order to be considered legitimate?
We source a portion of our proxies independently, but we also collaborate with other providers to build and maintain our proxy pool. Our R&D team is careful to choose only legitimate and best-quality IPs.
We acquire our exit nodes from multiple providers, including wholesalers who receive their exit nodes from application owners. We require these providers to ensure that end users are both reasonably informed and have consented to such use of their devices.
What are the most common use cases for mobile proxies you’re seeing?
The majority of use cases revolve around social media automation, social media scraping, and similar scenarios, which didn’t come as a significant surprise to us.
How does your company handle legal and compliance issues related to the use of mobile proxies for web scraping?
Firstly, we’ve implemented KYC processes, ensuring only legit actors can use our infrastructure. Additionally, we block most of the fraudulent or risky targets, such as bank endpoints, government sites, and gaming sites. We’ve established monitoring systems to check if the actors use our infrastructure for the right purposes.
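As an illustration of what target blocking can look like in principle, the sketch below checks a request URL against a list of blocked domain suffixes. This is not Smartproxy's actual mechanism, which the interview does not describe; `BLOCKED_SUFFIXES` and `is_blocked` are hypothetical, and the listed domains are placeholders.

```python
from urllib.parse import urlsplit

# Hypothetical blocklist; these entries are illustrative placeholders only.
BLOCKED_SUFFIXES = ("examplebank.com", "gov")


def is_blocked(url, blocked_suffixes=BLOCKED_SUFFIXES):
    """Return True if the URL's host equals or is a subdomain of a blocked suffix."""
    host = (urlsplit(url).hostname or "").lower().rstrip(".")
    labels = host.split(".")
    for suffix in blocked_suffixes:
        suffix_labels = suffix.lower().split(".")
        # Compare whole DNS labels so "notexamplebank.com" is not a false match.
        if labels[-len(suffix_labels):] == suffix_labels:
            return True
    return False
```

Matching on whole labels rather than with a plain string `endswith` avoids blocking unrelated domains that merely share a trailing substring.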
Let’s say I want to enter the proxy business and sell mobile IPs to Smartproxy. How can I start?
This would require a conversation with our commerce team, but potentially we do have some reselling options with wholesale prices.
Our last question: any fun facts from the early days of your career or from your time at Smartproxy?
It took me a while to understand the meaning behind our logo, which features arrows pointing left, indicating a shift in direction – a clever representation of a proxy!
And now I fully understand the Smartproxy logo too! Thanks, Germanas, for all the insights you shared in this interview.