Rate Limiting

If you let a single client send unlimited requests per second to your API, two bad things happen. The first is brute force: an attacker tries every password for an account, every recovery code, every TOTP value, until something works. The second is denial of service: a single misbehaving client (malicious or just buggy) consumes so many resources that legitimate users see slow responses or errors.

Rate limiting solves both by capping how many requests a single source can make in a given time window. This page explains where the caps live, how they work, and how Dashify tunes them.

The basic idea

"You may make at most 5 login attempts per minute per IP address."

That sentence is a rate limit rule. It has three parts:

  • A bucket key: what we are counting against. Often the IP address, sometimes the user id, sometimes the email address.
  • A limit: the number of requests allowed.
  • A window: the time period over which the limit applies.

When a request arrives, the rate limiter increments the counter for that bucket. If the counter now exceeds the limit, the request is rejected with 429 Too Many Requests. Otherwise the request proceeds.

After the window passes, the counter resets (or, more commonly, slides forward).
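The fixed-window variant of that idea fits in a few lines. This is an illustration only (Dashify's real limiter is rate-limiter-flexible backed by Redis); all names here are hypothetical:

```typescript
// A minimal fixed-window rate limiter: one counter per bucket key,
// reset when the window expires.
type WindowState = { count: number; windowStart: number };

class FixedWindowLimiter {
  private buckets = new Map<string, WindowState>();

  constructor(private limit: number, private windowMs: number) {}

  // Returns true if the request is allowed, false if it should get a 429.
  allow(key: string, now: number = Date.now()): boolean {
    const state = this.buckets.get(key);
    // New bucket, or the previous window has expired: start a fresh count.
    if (!state || now - state.windowStart >= this.windowMs) {
      this.buckets.set(key, { count: 1, windowStart: now });
      return true;
    }
    if (state.count >= this.limit) return false; // limit reached
    state.count += 1;
    return true;
  }
}

// "5 login attempts per minute per IP"
const limiter = new FixedWindowLimiter(5, 60_000);
const results = Array.from({ length: 6 }, () => limiter.allow("203.0.113.7", 0));
console.log(results); // [true, true, true, true, true, false]
```

Note the weakness of a pure fixed window: a client can spend the full limit at the end of one window and again at the start of the next, which is why sliding-window counters are more common in practice.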

How Dashify implements it

The library is rate-limiter-flexible, backed by Redis. Redis is the right place to store rate limit counters because it is fast, atomic, and shared across multiple API instances behind a load balancer (so a client cannot cheat by hitting different instances).

The increment and the limit-check are atomic Redis operations, so two simultaneous requests cannot both squeak under the limit.
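The race that atomicity prevents can be sketched in plain TypeScript. The fake in-memory "store" below stands in for Redis, and the `await` between read and write stands in for a network round-trip; all names are hypothetical:

```typescript
// Two "simultaneous" requests against a limit of 1, with and without atomicity.
const tick = () => new Promise<void>((resolve) => setImmediate(resolve));

// Racy: read the counter, yield (another request sneaks in), then write.
async function racyConsume(store: Map<string, number>, key: string, limit: number) {
  const current = store.get(key) ?? 0;
  await tick(); // simulated round-trip between the GET and the SET
  if (current >= limit) return false;
  store.set(key, current + 1);
  return true;
}

// Atomic: increment and check happen as one indivisible step, which is
// what a Redis INCR (or a Lua script / MULTI-EXEC) gives you.
async function atomicConsume(store: Map<string, number>, key: string, limit: number) {
  const next = (store.get(key) ?? 0) + 1; // no yield point inside
  store.set(key, next);
  return next <= limit;
}

async function demo() {
  const racy = new Map<string, number>();
  const atomic = new Map<string, number>();
  const [a, b] = await Promise.all([racyConsume(racy, "ip", 1), racyConsume(racy, "ip", 1)]);
  const [c, d] = await Promise.all([atomicConsume(atomic, "ip", 1), atomicConsume(atomic, "ip", 1)]);
  console.log(a && b); // true: both squeaked under the limit
  console.log(c && d); // false: only one got through
}
demo();
```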

Where rate limits apply

Not every endpoint needs rate limiting; adding a tight limit on every API call would frustrate real users. Dashify applies rate limits where the cost of abuse is high.

Endpoint family                Limit                                   Why
POST /auth/login               5 / minute / IP; 10 / minute / email    Brute-force protection
POST /auth/2fa/verify          5 / minute / user                       Block guessing the 6-digit code
POST /auth/forgot-password     3 / hour / email                        Block spamming reset emails
POST /auth/sso/callback        10 / minute / IP                        Block SSO replay floods
POST /webauthn/login/finish    10 / minute / user                      Block passkey replay/probing
POST /api/v1/* (general)       120 / minute / IP                       Block runaway clients
GET /api/v1/* (general)        600 / minute / IP                       More generous for reads

The limits are tuned for actual usage: a normal user never comes close to them. They only kick in for attackers and broken clients.

Per-IP vs per-user

A subtle question: should the bucket key be the IP, the user, or both?

  • Per-IP is good against a single attacker on a single network. It can be unfair to large NATs (a whole office sharing one external IP) but the limits are loose enough that real offices rarely hit them.
  • Per-user (or per-email) is good against attackers rotating IPs. If an attacker is trying to brute force one specific account, they will hit the per-email limit no matter how many IPs they use.

The login endpoint applies both limits, per-IP and per-email, simultaneously. Whichever fills up first triggers the rejection.
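The dual-key check can be sketched with a simple counter (hypothetical shapes, not rate-limiter-flexible's actual API). Note that both buckets are debited even when one rejects, so an attacker cannot probe one limit without spending the other:

```typescript
// A bare counter per bucket key; windows omitted for brevity.
class Counter {
  private counts = new Map<string, number>();
  constructor(private limit: number) {}
  consume(key: string): boolean {
    const next = (this.counts.get(key) ?? 0) + 1;
    this.counts.set(key, next);
    return next <= this.limit;
  }
}

const perIp = new Counter(5);     // 5 / minute / IP
const perEmail = new Counter(10); // 10 / minute / email

function allowLogin(ip: string, email: string): boolean {
  // Consume both buckets; whichever fills first triggers the rejection.
  const ipOk = perIp.consume(ip);
  const emailOk = perEmail.consume(email);
  return ipOk && emailOk;
}
```

An attacker rotating through fresh IPs against one account still runs out of the per-email budget after ten attempts.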

What the client sees

A 429 response includes a Retry-After header telling the client how many seconds to wait before trying again. The browser app (RTK Query) reads this header and surfaces it to the user as a friendly "please try again in N seconds" message. It does not silently retry; that would defeat the purpose.
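Both halves of that contract are small. A sketch, assuming the limiter reports "milliseconds until the bucket frees up" (hypothetical names; the header derivation rounds up to whole seconds, since Retry-After is an integer):

```typescript
// Server side: derive the header value from the limiter's countdown.
function retryAfterSeconds(msBeforeNext: number): number {
  return Math.max(1, Math.ceil(msBeforeNext / 1000));
}

// What a handler would attach to the 429 response.
function rejectWith429(msBeforeNext: number) {
  return {
    status: 429,
    headers: { "Retry-After": String(retryAfterSeconds(msBeforeNext)) },
  };
}

// Client side: turn the header into a user-facing message, never a retry.
function throttleMessage(headers: Record<string, string>): string {
  const seconds = Number(headers["Retry-After"] ?? "60");
  return `Too many attempts. Please try again in ${seconds} seconds.`;
}

console.log(throttleMessage(rejectWith429(4200).headers));
// Too many attempts. Please try again in 5 seconds.
```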

Burst tolerance

Strict rate limiting can hurt legitimate spiky workloads: a user opening the dashboard might trigger fifteen API calls in a single second to fill all the widgets. A pure "120 per minute, no exceptions" limit could reject part of a perfectly normal page load.

rate-limiter-flexible supports a leaky-bucket mode where short bursts are permitted as long as the long-term average stays under the limit. Dashify uses this for general API calls. Authentication endpoints use the strict mode; there is no legitimate reason to fire ten login attempts in two seconds.
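A generic token-bucket sketch shows the idea: each request spends a token, tokens refill at the long-term rate, and the bucket depth sets the burst size. (Illustration of the concept only, not rate-limiter-flexible's actual implementation; names and numbers are hypothetical.)

```typescript
class TokenBucket {
  private tokens: number;
  private lastRefill: number;

  constructor(
    private capacity: number,    // burst size, e.g. 20 tokens
    private refillPerMs: number, // long-term rate, e.g. 120/min = 0.002/ms
    now: number = Date.now(),
  ) {
    this.tokens = capacity;
    this.lastRefill = now;
  }

  allow(now: number = Date.now()): boolean {
    // Top up tokens for the time elapsed, capped at the bucket depth.
    const elapsed = now - this.lastRefill;
    this.tokens = Math.min(this.capacity, this.tokens + elapsed * this.refillPerMs);
    this.lastRefill = now;
    if (this.tokens < 1) return false;
    this.tokens -= 1;
    return true;
  }
}

// A dashboard load firing 15 calls in the same instant is fine,
// because the burst fits inside the bucket depth.
const bucket = new TokenBucket(20, 120 / 60_000, 0);
const burst = Array.from({ length: 15 }, () => bucket.allow(0));
console.log(burst.every(Boolean)); // true
```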

Distributed enforcement

When Dashify scales horizontally, every API instance shares the same Redis instance for rate-limit counters. An attacker whose requests the load balancer routes to different instances still hits the same counter and is throttled correctly.

If Redis briefly goes away, the rate limiter fails open by default, meaning requests are allowed through. This is a deliberate tradeoff: better to be temporarily un-rate limited than to reject every request. The Redis page covers this in more detail.
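The fail-open decision comes down to distinguishing "the limiter rejected you" from "the limiter is unreachable". A sketch with hypothetical shapes (not rate-limiter-flexible's exact API, which makes a similar distinction between a limit rejection and an error):

```typescript
class RateLimitExceeded extends Error {}

// consume() resolves when allowed, throws RateLimitExceeded when over
// the limit, and throws any other Error when the store is unreachable.
type Limiter = { consume(key: string): Promise<void> };

async function shouldAllow(limiter: Limiter, key: string): Promise<boolean> {
  try {
    await limiter.consume(key);
    return true;
  } catch (err) {
    if (err instanceof RateLimitExceeded) return false; // genuine 429
    // Any other failure means the backing store is unavailable:
    // fail open and let the request proceed.
    return true;
  }
}
```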

Audit and observability

Every 429 is logged with the bucket key, the endpoint, and the IP. Prometheus increments a counter, Grafana dashboards show 429 rates over time. A sudden spike in 429s on the login endpoint usually means someone is trying to brute force a specific account and is worth investigating.

For severe abuse (say, an IP firing 10,000 requests an hour), the platform supports a longer-lived soft block: the IP is blacklisted for an hour rather than being rate limited per-request. This prevents persistent attackers from creating noise even if they are technically below the per-request threshold.
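The escalation logic can be sketched as an hourly total alongside the per-request counters: once a key crosses an abuse threshold, it is blocked outright for a fixed period. (A sketch only; names and thresholds are hypothetical, and a real implementation would expire the hourly totals with a TTL rather than keep them forever.)

```typescript
class SoftBlocker {
  private hourly = new Map<string, number>();
  private blockedUntil = new Map<string, number>();

  constructor(private threshold: number, private blockMs: number) {}

  // Called once per request, after the per-request limiter.
  record(key: string, now: number): void {
    const total = (this.hourly.get(key) ?? 0) + 1;
    this.hourly.set(key, total);
    if (total >= this.threshold) {
      this.blockedUntil.set(key, now + this.blockMs); // e.g. one hour
    }
  }

  // Checked before any other processing.
  isBlocked(key: string, now: number): boolean {
    return now < (this.blockedUntil.get(key) ?? 0);
  }
}
```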

What rate limiting does not solve

Rate limiting is one defence. It does not stop:

  • A botnet of 10,000 IPs each sending one request per minute. (Aggregate volume can still hurt.) The mitigation is a CDN with bot-management.
  • An attacker who has already obtained valid credentials. The mitigation is 2FA, anomaly detection, and audit logging.
  • A logic bug in the API that causes a single legitimate request to be expensive. The mitigation is performance monitoring.

Rate limiting raises the cost of attack and limits the blast radius of accidents. It is necessary, not sufficient.

Key takeaways

  • Rate limiting caps how many requests a single source can make in a given time window.
  • Dashify uses rate-limiter-flexible backed by Redis for atomic, distributed counters.
  • Auth endpoints use strict limits per-IP and per-email; general endpoints use a more generous leaky bucket.
  • 429 responses include a Retry-After header so well-behaved clients back off correctly.
  • The system fails open if Redis is briefly unavailable: better degraded than dead.