Web Scraping & Crawling
What is rate limiting?
A server-side policy that caps how many requests a client can make in a given window, returning 429 Too Many Requests when the cap is exceeded.
Rate limiting protects services from overload, abuse, and runaway clients. The server tracks request counts per identifier (usually IP, API key, or user account) and rejects new requests once the count exceeds a configured threshold. Common algorithms include fixed window, sliding window, token bucket, and leaky bucket.
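As one illustration of the algorithms named above, here is a minimal token bucket sketch. The class name, the parameters, and the injectable clock are assumptions made for this example, not any particular library's API. Tokens refill at a steady rate and each request spends one, so short bursts up to the bucket's capacity are allowed while the long-run rate stays capped.

```python
import time


class TokenBucket:
    """Illustrative token bucket: steady refill, one token spent per request."""

    def __init__(self, capacity, refill_per_sec, clock=time.monotonic):
        self.capacity = capacity
        self.refill_per_sec = refill_per_sec
        self.clock = clock              # injectable so the logic can be tested
        self.tokens = float(capacity)   # start full, which permits an initial burst
        self.updated = clock()

    def allow(self):
        now = self.clock()
        # Top up for the time elapsed since the last call, capped at capacity.
        elapsed = now - self.updated
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_sec)
        self.updated = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False  # a server would answer 429 Too Many Requests here
```

A server would typically keep one bucket per identifier (IP, API key, or user account) and call allow() on each incoming request.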
When you're building against an API, the limit shows up as 429 Too Many Requests plus a Retry-After header (or sometimes X-RateLimit-Remaining and X-RateLimit-Reset). Robust clients read those headers and apply exponential backoff with jitter; naive clients hammer the same endpoint and get themselves blocked harder.
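Reading those headers might look like the sketch below. It assumes the headers arrive as a plain mapping keyed by the exact names shown, and that X-RateLimit-Reset carries a Unix timestamp in seconds. Some APIs send seconds remaining instead, so check the provider's documentation. The function name wait_seconds is hypothetical.

```python
import time
from email.utils import parsedate_to_datetime


def wait_seconds(headers, now=None):
    """Return how long the server asks us to wait, or None if it does not say."""
    now = time.time() if now is None else now

    retry_after = headers.get("Retry-After")
    if retry_after is not None:
        value = retry_after.strip()
        if value.isdigit():
            # Retry-After given as a number of seconds.
            return float(value)
        try:
            # Retry-After given as an HTTP date.
            return max(0.0, parsedate_to_datetime(value).timestamp() - now)
        except (TypeError, ValueError):
            pass  # unparseable value: fall through to the other header

    reset = headers.get("X-RateLimit-Reset")
    if reset is not None:
        try:
            # Assumption: the value is a Unix timestamp in seconds.
            return max(0.0, float(reset) - now)
        except ValueError:
            pass
    return None
```

When the function returns None, the client has no server hint and falls back to its own backoff schedule.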
When you're building a service, rate limiting is non-negotiable past a certain scale. The questions are which dimension you bucket on (IP, key, user, route), what algorithm you use, and where the counter lives (in-memory, Redis, edge). Most modern stacks settle on sliding-window-with-Redis at the application layer plus IP-based limits at the edge.
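To make the sliding-window idea concrete, here is an in-memory sketch bucketed on an arbitrary key. It is a single-process illustration only: in the Redis setup described above, a shared store would take the place of the dictionary so that every application instance sees the same counts. The class and parameter names are assumptions made for this example.

```python
import time
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Illustrative sliding-window log: at most `limit` hits per key per window."""

    def __init__(self, limit, window_sec, clock=time.monotonic):
        self.limit = limit
        self.window_sec = window_sec
        self.clock = clock               # injectable so the logic can be tested
        self.hits = defaultdict(deque)   # key -> timestamps of accepted requests

    def allow(self, key):
        now = self.clock()
        q = self.hits[key]
        # Drop timestamps that have aged out of the window.
        while q and q[0] <= now - self.window_sec:
            q.popleft()
        if len(q) >= self.limit:
            return False  # over the limit: the caller would respond with 429
        q.append(now)
        return True
```

The key is whichever dimension you bucket on, for example an IP address, an API key, or a user-and-route pair. Storing one timestamp per accepted request is exact but costs memory in proportion to the limit.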
In the wild
- GitHub capping authenticated REST clients at 5,000 requests/hour
- Stripe applying a 100 req/sec live-mode limit, with increases available on request
- A scraper backing off when it hits a 429 with Retry-After: 60
How Brand.dev uses rate limiting
Endpoints in the Brand.dev API where this concept comes up directly.
FAQ
What is HTTP 429?
429 Too Many Requests is the standard status code servers return when a client exceeds the rate limit. The response may include a Retry-After header telling the client how long to wait before trying again.
How do I handle rate limits in client code?
Read the Retry-After header, sleep for at least that long, and use exponential backoff plus jitter if further retries are needed. Never retry immediately: that's how a temporary block gets escalated to a permanent one.
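A minimal sketch of that retry loop follows. The send callable is hypothetical: it stands in for whatever performs the request and is assumed to return a (status, retry_after_seconds_or_None) pair. The default attempt count, base delay, and cap are arbitrary illustration values.

```python
import random
import time


def call_with_backoff(send, max_attempts=5, base=1.0, cap=60.0,
                      sleep=time.sleep, rng=random.random):
    """Call send() until it returns a non-429 status or attempts run out."""
    for attempt in range(max_attempts):
        status, retry_after = send()
        if status != 429:
            return status
        if attempt == max_attempts - 1:
            break  # out of attempts: do not sleep again, just give up
        # Exponential ceiling: base, 2*base, 4*base, ... capped at `cap`.
        ceiling = min(cap, base * 2 ** attempt)
        # Wait at least as long as the server asked, plus random jitter so
        # that many clients do not all retry at the same instant.
        delay = (retry_after or 0.0) + rng() * ceiling
        sleep(delay)
    return 429
```

Adding the jitter on top of the server's hint, rather than taking the larger of the two, keeps the wait at or above Retry-After while still spreading retries out.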
What's the difference between rate limiting and throttling?
They're used interchangeably. If you split hairs: rate limiting rejects excess traffic, throttling slows it down without rejecting. Most real systems do both.
Related terms
API: An Application Programming Interface, a contract that lets one program request actions or data from another in a stable, documented way.
HTTP: The application protocol the web is built on, a simple request/response format for asking a server for a resource.
Web scraping: Programmatically extracting structured data from websites that were designed to be read by humans.
Web crawler: A program that systematically follows links between web pages to discover and index content at scale.
Proxy: A server that forwards your network requests, presenting its own IP address to the destination instead of yours.