Scraping Through Tor for Increased Anonymity
Learn how to route your web scraping requests through a local Tor SOCKS proxy to get free IP rotation and enhanced anonymity.
You probably already know that Tor enables you to browse the Internet anonymously. It’s more complex than that, of course—but that’s about all I knew about the project a few years ago. Then I had an idea: why couldn’t I connect to Tor and use it to anonymize my IP while web scraping?
So I went down the rabbit hole and discovered that it’s totally possible. Here, I’ll show you the approach I followed to configure Tor as a local service and use it as a local SOCKS proxy for free IP rotation in your scraping scripts.
Why Is Tor Useful for Web Scraping?
Tor, short for “The Onion Router,” is a decentralized network that anonymizes Internet traffic by routing it through a series of volunteer-operated servers using a technique called onion routing.
In detail, the onion routing method provides strong anonymity by masking your real IP address, making it very difficult for servers to trace the origin of your requests. In practice, Tor has a similar effect to using a proxy server—one of your biggest allies in web scraping.
When configured as a local service, Tor functions like a rotating proxy: it periodically builds new circuits, so your exit IP changes over time without any extra work on your side. That's great to avoid rate limiters and IP bans.
Plus, unlike trusted proxy providers, which typically charge based on bandwidth or number of IPs, Tor is completely free to use. Free online proxy services cost nothing either, but they usually can't be trusted: running a proxy network is costly, so the rule “if something is free, you’re the product” applies. Tor, by contrast, is maintained by a global community of privacy-focused volunteers committed to open access and anonymity.
How to Use Tor for Web Scraping
To use Tor for web scraping, you must first configure it as a local service. Keep in mind that this procedure is different from installing the Tor Browser. In particular, what you need is the Tor daemon running in the background as a system service on your machine.
To configure the Tor service locally, follow the appropriate guide for your operating system:
Linux and macOS: Tor service on Linux (but works on macOS, too)
Windows: How to install Tor and create a Tor hidden service on Windows
After installation, the Tor service will be listening as a SOCKS proxy on port 9050.
In-depth insight: A SOCKS proxy operates at a lower level than a traditional HTTP/HTTPS proxy: it relays raw connections without interpreting the application protocol. Thus, a SOCKS proxy can forward HTTP and HTTPS traffic, but also FTP, POP, IMAP, and other application-level protocols. Keep in mind that Tor itself only carries TCP traffic.
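To make the "lower level" point concrete, here is a minimal sketch of the bytes a client sends to a SOCKS5 proxy (as described in RFC 1928) to ask for a TCP connection to a hostname. The function name is hypothetical, and you never need to write this yourself, since cURL and PySocks handle it for you. It only illustrates that the request carries a host and a port, and nothing about HTTP.

```python
import struct

# Greeting sent first: version 5, one auth method offered, "no authentication"
SOCKS5_GREETING = b"\x05\x01\x00"


def build_socks5_connect_request(host: str, port: int) -> bytes:
    """Build a SOCKS5 CONNECT request that lets the proxy resolve the hostname."""
    host_bytes = host.encode("idna")
    if not 1 <= len(host_bytes) <= 255:
        raise ValueError("hostname must be 1-255 bytes long")
    if not 0 < port < 65536:
        raise ValueError("port must be between 1 and 65535")
    return (
        b"\x05"  # SOCKS protocol version 5
        b"\x01"  # command: CONNECT (open a TCP stream)
        b"\x00"  # reserved byte
        b"\x03"  # address type: domain name, resolved by the proxy
        + bytes([len(host_bytes)])  # one-byte hostname length
        + host_bytes
        + struct.pack("!H", port)  # port as a 2-byte big-endian integer
    )


print(build_socks5_connect_request("httpbin.org", 443).hex(" "))
```

After the proxy accepts this request, it simply relays bytes in both directions, which is why the same mechanism works for any TCP-based protocol.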
To verify that the Tor service is running, run this on Linux:
netstat -tnlp | grep 9050
On macOS, netstat takes different flags, so you can use lsof instead:
lsof -nP -iTCP:9050 -sTCP:LISTEN
On Windows, run:
netstat -aon | findstr :9050
In each case, the output should show a process in the LISTEN state (LISTENING on Windows) on port 9050. That's the Tor SOCKS proxy!
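If you'd rather run the same check from a script on any operating system, a minimal sketch with Python's standard library could look like this. The function name is my own, and the check only tells you that something accepts TCP connections on the port, not that it is actually Tor.

```python
import socket


def is_port_open(host: str = "127.0.0.1", port: int = 9050, timeout: float = 2.0) -> bool:
    """Return True if something accepts TCP connections on host:port."""
    try:
        # create_connection raises OSError if the connection is refused or times out
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


print("Tor SOCKS port reachable:", is_port_open())
```

This is handy as a guard at the start of a scraper, so it fails fast when the Tor service isn't up.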
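Note: By default, the Tor service considers building a new Onion circuit every 30 seconds (the NewCircuitPeriod setting), and it stops attaching new connections to a circuit once that circuit has been in use for 10 minutes (the MaxCircuitDirtiness setting). Each new circuit potentially gives you a new exit IP, which is what makes Tor act like a rotating proxy out of the box.
In detail, socks5://127.0.0.1:9050 is the local SOCKS proxy endpoint you’ll connect your web scraping tool or HTTP client to in order to route traffic through the Tor network.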
Advanced Settings
If you'd like to change how frequently Tor creates new circuits, you can modify the NewCircuitPeriod setting in your torrc configuration file. For example, you can set it to 10 seconds as follows:
NewCircuitPeriod 10
Keep in mind that lowering the NewCircuitPeriod setting too much may lead to connection instability and put unnecessary strain on the Tor network.
Take a look at the other configuration options supported by the Tor local service in the official docs. After modifying the torrc file, remember to restart the Tor service for the changes to take effect.
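If you manage the torrc file from a script, the edit can be sketched as a small text transformation. This is only an illustration: the helper name is hypothetical, and the file location (often /etc/tor/torrc on Linux) is an assumption about your installation.

```python
def set_torrc_option(torrc_text: str, option: str, value: str) -> str:
    """Return torrc_text with `option` set to `value`.

    An existing active line for the option is replaced; commented-out
    lines are left alone; the option is appended if it is missing.
    """
    new_line = f"{option} {value}"
    result = []
    replaced = False
    for line in torrc_text.splitlines():
        parts = line.split()
        # Match on the first word of the line, ignoring case to be lenient
        if parts and parts[0].lower() == option.lower():
            if not replaced:
                result.append(new_line)
                replaced = True
            continue  # drop duplicate definitions of the same option
        result.append(line)
    if not replaced:
        result.append(new_line)
    return "\n".join(result) + "\n"


example = "SocksPort 9050\n#NewCircuitPeriod 30\n"
print(set_torrc_option(example, "NewCircuitPeriod", "10"))
```

You would read the real file, pass its content through the function, write it back, and then restart the Tor service as described above.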
Using Tor for Web Scraping: A Practical Guide
Now that you have the Tor service running locally, you’re ready to connect to it as a SOCKS proxy. As a target site, we’ll use the /ip endpoint from the HTTPBin project, which returns the IP address of the client making the request.
If the Tor SOCKS5 proxy works as expected, you should see an IP address that’s different from your original one.
Run this cURL command in the terminal to see your actual IP address:
curl https://httpbin.org/ip
Note: In Windows PowerShell, use curl.exe instead of curl, which is an alias for Invoke-WebRequest.
The expected result is something like:
{
"origin": "203.8.111.32"
}
That’s your IP address (or the one currently assigned by your VPN).
Now, rerun the same request, but this time using the Tor SOCKS5 proxy listening on port 9050:
curl -x socks5://127.0.0.1:9050 https://httpbin.org/ip
This time, the result will be something like:
{
"origin": "185.220.101.5"
}
Notice how the returned IP is different from your actual one, which means the Tor proxy is working like a charm!
Also, that new IP address belongs to a Tor exit node. You can confirm this by looking the address up in the Tor Project's public list of exit nodes.
Note: If you wait a while and rerun the command, you’ll likely see a different IP address. That’s because Tor has moved your traffic to a new circuit with a new exit node. How long that takes depends on the NewCircuitPeriod and MaxCircuitDirtiness settings in your torrc file.
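To watch the rotation happen rather than rerunning cURL by hand, here is a minimal sketch that samples the exit IP several times and reports the distinct addresses it saw. Both function names are hypothetical, the 30-second default delay is just a starting point, and the fetcher assumes the requests[socks] extra is installed and Tor is listening on port 9050.

```python
import time
from typing import Callable, List


def collect_exit_ips(fetch_ip: Callable[[], str], samples: int = 5, delay: float = 30.0) -> List[str]:
    """Call fetch_ip() `samples` times, `delay` seconds apart.

    Returns the distinct IPs in the order they were first seen.
    """
    seen: List[str] = []
    for i in range(samples):
        try:
            ip = fetch_ip()
        except Exception as exc:  # requests through Tor fail fairly often
            print(f"sample {i + 1}: request failed ({exc})")
        else:
            print(f"sample {i + 1}: {ip}")
            if ip not in seen:
                seen.append(ip)
        if i < samples - 1:
            time.sleep(delay)
    return seen


def fetch_ip_via_tor() -> str:
    """Ask httpbin.org/ip for our address through the local Tor SOCKS proxy."""
    import requests  # needs: pip install 'requests[socks]'

    proxy = "socks5h://127.0.0.1:9050"
    response = requests.get("https://httpbin.org/ip", proxies={"http": proxy, "https": proxy}, timeout=30)
    response.raise_for_status()
    return response.json()["origin"]


# Example (requires a running Tor service):
# print(collect_exit_ips(fetch_ip_via_tor, samples=5, delay=30))
```

Seeing the same IP several times in a row is normal, since a new circuit doesn't guarantee a new exit node.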
Up next, I’ll show you how to connect programmatically to the Tor SOCKS proxy using Python and JavaScript for web scraping.
Using the Tor SOCKS Proxy in Python with Requests
Requests, the most popular Python HTTP client, supports SOCKS proxies through its optional requests[socks] extra, which pulls in the PySocks dependency. Install it with:
pip install 'requests[socks]'
Here’s how to send a request to your target page through the Tor SOCKS proxy:
# pip install requests 'requests[socks]'
import requests
# Configure the proxy to use the Tor SOCKS service running on localhost:9050
proxies = {
    "http": "socks5://127.0.0.1:9050",
    "https": "socks5://127.0.0.1:9050",
}
# Make a request to an API endpoint that returns your IP
response = requests.get("https://httpbin.org/ip", proxies=proxies)
# Print the endpoint response
print(response.text)
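Note that the way you specify a SOCKS proxy in Requests is the same as for an HTTP/HTTPS proxy. The only difference is the protocol in the proxy URL (socks5:// instead of http:// or https://). With socks5://, Requests resolves hostnames on your machine. Switch to socks5h:// if you want DNS lookups to go through Tor as well.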
Run the above Python script, and the result will be something like:
{
"origin": "185.220.101.8"
}
Again, that IP belongs to a Tor exit node.
Great! You now have a local rotating proxy to use for your scraping goals.
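For a scraper that sends many requests, you may prefer to set the proxy once on a session instead of passing it to every call. The sketch below is one way to do that. The helper names are mine, and it defaults to the socks5h:// scheme so that hostnames are resolved through Tor rather than locally.

```python
def tor_proxies(host: str = "127.0.0.1", port: int = 9050, remote_dns: bool = True) -> dict:
    """Build the proxies mapping Requests expects for a local Tor SOCKS proxy."""
    # socks5h:// asks the proxy to resolve hostnames; socks5:// resolves them locally
    scheme = "socks5h" if remote_dns else "socks5"
    url = f"{scheme}://{host}:{port}"
    return {"http": url, "https": url}


def make_tor_session(port: int = 9050):
    """Return a requests.Session whose traffic goes through the Tor SOCKS proxy."""
    import requests  # needs: pip install 'requests[socks]'

    session = requests.Session()
    session.proxies.update(tor_proxies(port=port))
    return session


print(tor_proxies())

# Example (requires a running Tor service):
# session = make_tor_session()
# print(session.get("https://httpbin.org/ip", timeout=30).json())
```

Requests has no default timeout, so passing one explicitly on each call matters even more than usual when going through Tor.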
Using the Tor SOCKS Proxy in JavaScript with Axios
Axios, one of the most widely used HTTP clients in JavaScript, doesn’t support SOCKS proxies out of the box. To route your requests through the local Tor SOCKS proxy, you need to install the socks-proxy-agent package:
npm install socks-proxy-agent
Now, here’s how to configure Axios to send requests through the Tor proxy:
// npm install axios socks-proxy-agent
const axios = require("axios");
// ESM modules: import axios from "axios"
const { SocksProxyAgent } = require("socks-proxy-agent");
// ESM modules: import { SocksProxyAgent } from "socks-proxy-agent"
// configure the Tor SOCKS proxy running on localhost:9050
const socksProxyAgent = new SocksProxyAgent("socks5://127.0.0.1:9050");
async function getTorIp() {
  // make a request to an API that returns your IP through the Tor proxy
  const response = await axios.get("https://httpbin.org/ip", {
    httpAgent: socksProxyAgent,
    httpsAgent: socksProxyAgent,
  });
  // print the API response
  console.log(response.data);
}
// execute the async function
getTorIp();
Launch the above JavaScript script, and the output will be something like:
{
"origin": "192.42.116.192"
}
Amazing! Another Tor exit IP.
Limitations of Routing Traffic Through Tor for Web Scraping
Before diving into why using Tor in a production scraping pipeline might not be ideal, let me share a real story. A few years ago, I was hired to scrape a site with millions of pages and strict rate limits for a startup in its early days. As you can imagine, the budget was tight—so I had to get creative…
That’s when I first explored Tor as an alternative to paid proxies!
The main problem was that establishing a new Tor circuit takes time (up to 10 seconds, in my experience), and Tor exit IPs were frequently rate-limited or temporarily banned. So I took a different approach: I configured 50 local Tor services, each listening on a separate port (9050, 9051, etc.). While this setup increased memory usage on the server running the scraper, it gave me a pool of around 50 rotating proxies for free.
Now, this setup didn’t guarantee that each instance would get a different or working exit IP. Still, it was enough to get the job done. In a few days, I scraped millions of rate-limited pages for free.
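A multi-instance setup like this one can be sketched by generating one small torrc file per instance. This is not the exact script I used back then, just an illustration: the function name and directory layout are hypothetical. What matters is that each instance gets its own SocksPort and its own DataDirectory, since two Tor processes cannot share one.

```python
from pathlib import Path
from typing import List


def write_tor_configs(base_dir: str, count: int, first_port: int = 9050) -> List[Path]:
    """Write one torrc per Tor instance, each with a unique port and data directory."""
    base = Path(base_dir).resolve()
    config_paths = []
    for i in range(count):
        port = first_port + i
        instance_dir = base / f"tor-{port}"
        data_dir = instance_dir / "data"
        # Keep the data directory private to the current user
        data_dir.mkdir(parents=True, exist_ok=True, mode=0o700)
        torrc = instance_dir / "torrc"
        torrc.write_text(
            f"SocksPort 127.0.0.1:{port}\n"
            f"DataDirectory {data_dir}\n"
        )
        config_paths.append(torrc)
    return config_paths


# Example (assumes the `tor` binary is on your PATH):
# import subprocess
# for path in write_tor_configs("./tor-pool", count=5):
#     subprocess.Popen(["tor", "-f", str(path)])
```

Each process then exposes its own SOCKS proxy, and your scraper can spread requests across the ports.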
You might now be wondering: “Cool, Tor gives me a rotating proxy pool at zero cost—so why buy proxies at all?”
Well, it’s not that easy. Compared to premium proxy providers, Tor comes with some real limitations:
While you can request a new circuit, there’s no guarantee it will be fast, that the connection will work, or that the new IP is actually different. Basically, you have very limited control over the IP rotation mechanism.
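By default, you don't control the country of your Tor exit IP. The torrc ExitNodes option does accept country codes, but restricting exits shrinks an already limited pool of IPs. That’s fine for bypassing geo-restrictions in your country, but not ideal for targeting country-specific content (something residential proxies handle well).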
Based on my experience, the Tor network is significantly slower and more error-prone than most proxy networks. Thus, you’ll need patience and some sort of retry logic to handle failed requests.
And perhaps the biggest issue: Tor exit IPs are publicly listed. This means anti-scraping solutions can easily block them all, with no chance for recovery.
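The retry logic mentioned above can be sketched as a small wrapper that moves to the next local Tor port whenever a request fails. Everything here is an assumption for illustration: the function names, the port list, and the simple linear backoff are mine, and the fetcher relies on the requests[socks] extra.

```python
import itertools
import time
from typing import Callable, Iterable, TypeVar

T = TypeVar("T")


def fetch_with_retries(fetch: Callable[[int], T], ports: Iterable[int],
                       max_attempts: int = 5, backoff: float = 2.0) -> T:
    """Call fetch(port), cycling through ports, until it succeeds or attempts run out."""
    port_list = list(ports)
    if not port_list:
        raise ValueError("at least one port is required")
    port_cycle = itertools.cycle(port_list)
    last_error = None
    for attempt in range(1, max_attempts + 1):
        port = next(port_cycle)
        try:
            return fetch(port)
        except Exception as exc:
            last_error = exc
            if attempt < max_attempts:
                time.sleep(backoff * attempt)  # wait a little longer after each failure
    raise RuntimeError(f"all {max_attempts} attempts failed") from last_error


def get_page_via_tor(url: str, port: int) -> str:
    """Fetch a URL through the Tor SOCKS proxy listening on the given local port."""
    import requests  # needs: pip install 'requests[socks]'

    proxy = f"socks5h://127.0.0.1:{port}"
    response = requests.get(url, proxies={"http": proxy, "https": proxy}, timeout=30)
    response.raise_for_status()  # treat 403, 429, and other error statuses as failures
    return response.text


# Example (requires Tor instances listening on these ports):
# html = fetch_with_retries(lambda port: get_page_via_tor("https://httpbin.org/ip", port),
#                           ports=[9050, 9051, 9052])
```

With a single Tor instance, you can pass one port and rely on the backoff alone to give Tor time to move to another circuit.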
Does that mean Tor has no place in scraping? Not at all. Many websites only apply generic rate limits and don’t specifically block Tor IPs. If you're just getting started or need a free rotating proxy solution for projects where speed or location doesn’t matter, Tor is absolutely worth trying.
Conclusion
The goal of this article was to show how Tor can be used as a free rotating proxy for web scraping. By configuring it as a local service, you can access it as a SOCKS proxy listening on port 9050 from a scraper written in Python, JavaScript, or any other language.
All instructions shared here are for educational purposes only. Please use them responsibly and avoid sending excessive requests that could overload the Tor network, which is maintained by a volunteer-driven community.
I hope you found this deep dive into the anonymous world of Tor helpful. Feel free to share your thoughts or experiences in the comments—until next time!