Dealing with Rate Limiting Using Exponential Backoff
A rational approach to throttling requests to minimize errors
Imagine this scenario: your web scraping script is making too many requests to a web server, and you get blocked by a rate limiter. What should you do? Well, there are several solutions to handle that (common) situation, and implementing a retry strategy with exponential backoff is one of the best.
But how does this strategy work, and why should you choose it over other waiting approaches? In this post, I’ll answer these questions and show you how to implement exponential backoff in both Python and JavaScript!
Before proceeding, let me thank NetNut, the platinum partner of the month. They have prepared a juicy offer for you: up to 1 TB of web unblocker for free.
What Is Rate Limiting?
Rate limiting is a strategy employed by API backends and web servers to control the amount of incoming traffic. It works by tracking the number of requests a specific user makes within a defined time window. Users are typically identified by their IP address, an API token, or session information like cookies.
The primary purpose of rate limiting is to prevent automated bots from overwhelming a server with requests. This helps to avoid server overload, which could occur during DoS attacks or aggressive web scraping activities. Thus, it's a very common anti-scraping technique to prevent web scrapers from making too many requests.
If the number of requests from a user exceeds a pre-set threshold, the server typically temporarily blocks new requests from that user until the current time window resets. Once the window ends, the user is allowed to make requests again. In some scenarios, instead of an outright block, requests exceeding the limit might be slowed down or added to a queue for later processing.
Example: A server’s rate limiter is set to allow a maximum of 10 requests per minute. If you send 10 requests within the first 35 seconds, your 11th request (and any subsequent ones) will likely be blocked—or delayed—for the remaining 25 seconds of that minute.
When you are rate-limited and your requests are blocked, you will typically receive an HTTP 429 Too Many Requests error. In some cases, you might encounter a 403 Forbidden error, and—from what I've seen—even a 503 Service Unavailable error.
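To make the mechanism more concrete, here’s a minimal Python sketch of a server-side fixed-window rate limiter, matching the example above (10 requests per minute). All names are illustrative, and production implementations are far more sophisticated:

import time
from collections import defaultdict

MAX_REQUESTS = 10    # maximum requests per window (illustrative)
WINDOW_SECONDS = 60  # window length in seconds (illustrative)

# Maps a client identifier (e.g., IP address) to [window_start, request_count]
windows = defaultdict(lambda: [0.0, 0])

def is_allowed(client_id):
    now = time.time()
    window_start, count = windows[client_id]
    if now - window_start >= WINDOW_SECONDS:
        # The previous window expired: start a new one
        windows[client_id] = [now, 1]
        return True
    if count < MAX_REQUESTS:
        windows[client_id][1] += 1
        return True
    # Limit exceeded: the server would respond with HTTP 429 here
    return False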
This episode is brought to you by our Gold Partners. Be sure to have a look at the Club Deals page to discover their generous offers available for the TWSC readers.
Strategies to Overcome Rate Limiting in Web Scraping
To overcome rate limiters, the two primary approaches are:
Using proxies: Route your requests through a rotating proxy network so that each request appears to come from a different IP address. That’s a common workaround in large-scale web scraping.
Implementing retries: Configure your HTTP client or browser automation tool to retry requests that fail due to rate limiting errors. The most commonly adopted waiting technique for this is the exponential backoff strategy.
Web scraping with proxies is well-known and highly effective, so it won’t be covered here. Instead, I’ll focus on how a well-designed backoff strategy can greatly improve your chances of successful scraping when dealing with rate limits!
Exponential Backoff as a Solution to Rate Limiters
Let’s say you can’t afford proxies, don’t want to use them, or simply can’t due to your project’s specifications. What can you do to deal with rate limiters?
The short answer: wait for the time window to reset so you're allowed to send requests again!
In general, when a request fails due to a rate limiter, it’s smart to wait some time before trying again. If you don’t, your chances of getting another error—or worse, being permanently blocked—go up a lot.
Now, there are a few common waiting strategies for handling failed requests:
Fixed wait: You wait a specific amount of time before retrying (e.g., always wait 2 or 5 seconds).
Linear wait: You increase the wait time linearly (e.g., wait 2 seconds, then 4, then 6, and so on).
Exponential backoff: You increase the wait time exponentially after each failed request (e.g., wait 2 seconds, then 4, then 8, and so on). See the comparison sketch below.
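Here’s a quick Python sketch that makes the difference concrete, computing the wait time each strategy produces after the n-th consecutive failure (the 2-second base delay is illustrative):

BASE_DELAY = 2  # seconds (illustrative)

def fixed_wait(n):
    return BASE_DELAY       # 2, 2, 2, 2, ...

def linear_wait(n):
    return BASE_DELAY * n   # 2, 4, 6, 8, ...

def exponential_backoff(n):
    return BASE_DELAY ** n  # 2, 4, 8, 16, ...

for n in range(1, 5):
    print(n, fixed_wait(n), linear_wait(n), exponential_backoff(n))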
Among the three, exponential backoff is widely regarded as the most effective strategy for handling rate limiters because it:
Gives the server time to recover: By exponentially increasing the wait time between retries, it gives the server progressively more time to recover from the current load.
Adapts to the severity of the rate limit: If you hit a light rate limit, you won't wait excessively long. If the rate limit is strict, you'll back off significantly, preventing further blocks.
Reduces the risk of being permanently blocked: Slowing down after repeated failures makes your traffic appear less aggressive, lowering the likelihood of getting blocked by the target server.
Great! But how does an exponential backoff strategy actually work? Time to find out!
Understanding Exponential Backoff: How Retry Timing Works
Exponential backoff is an algorithm used in computer networks to control the rate of retries after a failure. It progressively increases the time between retry attempts in order to find an acceptable request rate.
The delay between retries can be modeled with an exponential function:

t = b^c

Where:
t is the delay before the next retry.
b is the base or multiplier.
c is the number of consecutive failed attempts.
In many real-world implementations, a small random delay—called “jitter”—is added to avoid a "retry storm" (where multiple HTTP clients retry at the exact same time). With jitter, the formula becomes:

t = b^c + r

Here, r is the jitter: a small, randomly selected delay (e.g., a few hundred milliseconds) added to each retry attempt to prevent multiple HTTP clients—especially ones started at the same time—from retrying simultaneously.
One of the most common versions of this algorithm is binary exponential backoff, where b = 2. Here’s how it works in a simulated scenario:

1st consecutive failure: wait 2^1 = 2 seconds
2nd consecutive failure: wait 2^2 = 4 seconds
3rd consecutive failure: wait 2^3 = 8 seconds
4th consecutive failure: wait 2^4 = 16 seconds
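If you wanted to hand-roll the algorithm, here’s a minimal Python sketch of binary exponential backoff with jitter. The function and parameter names are illustrative, not from any specific library:

import random
import time

import requests

def fetch_with_backoff(url, max_retries=5, base=2, max_jitter=0.3):
    # Retry a GET request with binary exponential backoff plus jitter
    for failed_attempts in range(max_retries + 1):
        response = requests.get(url)
        if response.status_code != 429:
            return response
        if failed_attempts == max_retries:
            break
        # t = b^c + r, as in the formula above
        delay = base ** (failed_attempts + 1) + random.uniform(0, max_jitter)
        time.sleep(delay)
    raise RuntimeError(f"Still rate-limited after {max_retries} retries")

In practice, you’d typically rely on a library instead, as shown in the next sections.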
Before continuing with the article, I wanted to let you know that I've started my community in Circle. It’s a place where we can share our experiences and knowledge, and it’s included in your subscription. Give it a try at this link.
How to Implement Exponential Backoff
The good news is that most programming languages offer libraries that handle exponential backoff, so you don’t have to implement the waiting logic from scratch.
In this section, we’ll look at these two libraries:
urllib3.util.Retry: Used with Python’s Requests library to implement retry strategies.
axios-retry: A plugin for Axios that adds retry functionality to the JavaScript HTTP client.
Let’s see how to implement exponential backoff in two of the most common HTTP clients used for web scraping!
Exponential Backoff with Python Requests
Requests doesn’t support retries natively. However, since it uses urllib3 under the hood, you can enable retries with exponential backoff by configuring urllib3.util.Retry.
Note: That utility comes from the urllib3 package, which is installed automatically as a dependency of requests, so there’s no need to install it separately.
Here’s how you can implement an exponential backoff strategy with optional jitter using Retry in requests:
# pip install requests
import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry

# Define a Retry strategy with exponential backoff
retry_strategy = Retry(
    total=5,  # Total number of retries
    backoff_factor=1,  # Base delay factor (seconds)
    status_forcelist=[429, 500, 502, 503, 504],  # HTTP status codes to retry on
    backoff_jitter=0.3,  # Add a random jitter of no more than 300ms
)

# Create an adapter with the retry strategy
adapter = HTTPAdapter(max_retries=retry_strategy)

# Create a session and mount the adapter
http = requests.Session()
http.mount("https://", adapter)
http.mount("http://", adapter)

# Make your requests with automatic retries and exponential backoff...
The backoff_factor argument is a factor to apply between attempts after the second try, using this formula:

random.uniform(0, {backoff jitter}) + {backoff factor} * (2 ** {number of previous retries})
In most scraping scenarios, it makes sense to set it to 1 to avoid waiting too long. Also, note how the library adopts binary exponential backoff (b = 2) by default.
FYI: By default, Retry will apply the retry strategy only to idempotent HTTP methods (DELETE, GET, HEAD, OPTIONS, PUT, TRACE). You can configure that via the allowed_methods argument.
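For instance, if your scraper also submits POST requests, you could extend the retried methods as in this sketch, which reuses the options from the example above. Keep in mind that retrying non-idempotent methods like POST can trigger duplicate operations on the server:

retry_strategy = Retry(
    total=5,
    backoff_factor=1,
    status_forcelist=[429, 500, 502, 503, 504],
    allowed_methods=["GET", "POST"],  # also retry POST requests
)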
Test the above code against an API endpoint that always returns 429:
response = http.get("https://httpstat.us/429")
The script will retry 5 times and then fail with:
urllib3.exceptions.ResponseError: too many 429 error responses
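In a real scraper, you’d likely want to catch that failure instead of letting it crash the script. Here’s a minimal sketch—requests surfaces the exhausted retries as a requests.exceptions.RetryError:

from requests.exceptions import RetryError

try:
    response = http.get("https://httpstat.us/429")
except RetryError as e:
    # All retries were consumed without a successful response
    print(f"Giving up: {e}")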
To log each retry attempt, increase the urllib3 logging level to DEBUG:
import logging
logging.basicConfig(level=logging.DEBUG)
logging.getLogger("urllib3.connectionpool").setLevel(logging.INFO)
logging.getLogger("urllib3.util.retry").setLevel(logging.DEBUG)
Now, before the error, you’ll see:
DEBUG:urllib3.util.retry:Incremented Retry for (url='/429'): Retry(total=4, connect=None, read=None, redirect=None, status=None)
DEBUG:urllib3.util.retry:Incremented Retry for (url='/429'): Retry(total=3, connect=None, read=None, redirect=None, status=None)
DEBUG:urllib3.util.retry:Incremented Retry for (url='/429'): Retry(total=2, connect=None, read=None, redirect=None, status=None)
DEBUG:urllib3.util.retry:Incremented Retry for (url='/429'): Retry(total=1, connect=None, read=None, redirect=None, status=None)
DEBUG:urllib3.util.retry:Incremented Retry for (url='/429'): Retry(total=0, connect=None, read=None, redirect=None, status=None)
Exponential Backoff with Axios in JavaScript
Just like the Requests library in Python, Axios doesn’t support automatic retries out of the box. Luckily, there’s a popular plugin called axios-retry that makes that easy.
First, install it with:
npm install axios-retry
Then, configure it to use an exponential backoff retry strategy as shown below:
// npm install axios axios-retry
import axios from "axios";
import axiosRetry from "axios-retry";

// create an Axios instance
const axiosInstance = axios.create();

// apply axios-retry to the instance
axiosRetry(axiosInstance, {
  retries: 5, // total number of retries
  shouldResetTimeout: true, // reset the client timeout across retries
  // configure an exponential backoff strategy
  retryDelay: (retryCount, error) =>
    axiosRetry.exponentialDelay(retryCount, error, 1000),
  // retry only on specific HTTP error codes or network errors
  retryCondition: (error) => {
    const status = error?.response?.status;

    // retry on specific HTTP status codes
    const retryStatusCodes = [429, 500, 502, 503, 504];
    if (status && retryStatusCodes.includes(status)) {
      return true;
    }

    // retry on network errors
    if (error.code) {
      const networkErrors = [
        "ECONNABORTED",
        "ENOTFOUND",
        "ECONNREFUSED",
        "ECONNRESET",
        "EPIPE",
        "ETIMEDOUT",
        "EHOSTUNREACH",
        "EAI_AGAIN",
      ];
      if (networkErrors.includes(error.code)) {
        return true;
      }
    }

    // in all other scenarios, do not retry
    return false;
  },
  // log each retry attempt
  onRetry: (retryCount, error, requestConfig) => {
    console.log(
      `Retrying request to "${requestConfig.url}" due to error "${error.message}". Attempt #${retryCount}`
    );
  },
});

// make your requests with automatic retries and exponential backoff...
As explained in the docs, the exponentialDelay() function implements an exponential backoff strategy. By setting the retry base delay to 1000 milliseconds, the delay (in milliseconds) before each retry is calculated with this formula:

1000 * (2 ** {number of previous retries})

On top of that, exponentialDelay() adds a small random jitter to each computed delay, so concurrent clients don’t retry in lockstep.
Pro tip: Setting shouldResetTimeout to true is essential when your Axios instance has a timeout configured. For example, when dealing with slow web servers that may take several seconds to respond, you want to consider a request failed once the timeout expires and then retry it. If you don’t set shouldResetTimeout: true, the timeout isn’t reset between retries, so the request can eventually fail with a connection timeout error before new retry attempts get a chance to run.
Now, execute a request against a sample endpoint that always returns HTTP 429:
(async () => {
  try {
    const response = await axiosInstance.get("https://httpstat.us/429");
    console.log("Response:", response.status);
  } catch (error) {
    console.error(error.message);
  }
})();
You’ll get this output:
Retrying request to "https://httpstat.us/429" due to error "Request failed with status code 429". Attempt #1
Retrying request to "https://httpstat.us/429" due to error "Request failed with status code 429". Attempt #2
Retrying request to "https://httpstat.us/429" due to error "Request failed with status code 429". Attempt #3
Retrying request to "https://httpstat.us/429" due to error "Request failed with status code 429". Attempt #4
Retrying request to "https://httpstat.us/429" due to error "Request failed with status code 429". Attempt #5
Request failed with status code 429
Amazing! The exponential backoff retry strategy is now implemented exactly as desired.
Challenges and Best Practices in Exponential Backoff
Set appropriate timeouts: Verify that your client’s connection and request timeouts are properly configured to avoid timeout errors caused by long waits during retries.
Analyze rate limiting headers: Check server responses for standard rate limiting headers (e.g., Retry-After or RateLimit-*). Based on my experience, most web servers don’t send these headers, but their API backends (used by web pages to retrieve data) sometimes do. Use these headers to dynamically adjust your retry timing, as shown in the sketch after this list.
Respect robots.txt rules: Always honor the robots.txt file, which may include directives about how many requests you’re allowed to make in a given time span.
Beware of dynamic rate limiting windows: Some sites use dynamic time windows where repeated failures cause the request intervals to shorten. In such cases, exponential backoff alone isn’t generally enough. Consider integrating proxies—or even Tor—to distribute the load and avoid IP-based blocks.
Monitor and log retry attempts: Keep detailed logs of retry attempts and backoff timings to fine-tune your retry strategy.
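Regarding the rate limiting headers mentioned above, here’s a minimal Python sketch that honors Retry-After when present and falls back to exponential backoff otherwise. The helper name is illustrative, and it assumes Retry-After carries a number of seconds (the header can also contain an HTTP date, which this sketch doesn’t handle):

import random
import time

def wait_before_retry(response, failed_attempts, base=2, max_jitter=0.3):
    # Honor Retry-After if the server sent it
    retry_after = response.headers.get("Retry-After")
    if retry_after and retry_after.isdigit():
        time.sleep(int(retry_after))
    else:
        # Fall back to binary exponential backoff with jitter
        time.sleep(base ** (failed_attempts + 1) + random.uniform(0, max_jitter))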
Extra: IP Rotation Isn’t Always the Solution
Some (rare) websites apply rate limits based on specific queries—not on IP addresses. For example, the Royal Mail tracking page restricts access to individual parcel tracking URLs, regardless of who is making the request.
In other words, the limitation is tied to the unique URL, not the IP address. This is probably done to prevent users from repeatedly refreshing the same page.
In such cases:
The server might return rate limiting headers (like Retry-After)—so be sure to inspect the response headers.
If no headers are present, a general exponential backoff strategy is usually enough to avoid hitting blocks.
Conclusion
The goal of this post was to explain how exponential backoff works and how it can be used to intelligently and respectfully deal with rate limits. As you’ve seen here, you can easily implement it in both Python and JavaScript thanks to built-in or third-party libraries.
I hope you found this technical guide helpful and learned something new today. Feel free to share your thoughts or experiences in the comments—until next time!