Refresh Tokens & the Circuit Breaker

The previous page introduced JWTs and noted that they should expire quickly. Fifteen minutes is a typical expiry. But that creates an obvious problem: if the user's token expires every fifteen minutes, do we make them log in every fifteen minutes?

Of course not. The trick is refresh tokens.

The two-token pattern

The API issues two tokens at login (or when the WebSocket needs auth):

An access token — short-lived, used on every request.
A refresh token — longer-lived, used only to get a new access token.

When the access token expires, the client uses the refresh token to silently obtain a new access token, without bothering the user. The user logs in once and stays logged in for hours or days.
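
As a rough illustration of the two lifetimes, the sketch below models a token pair and the decision the client makes before each request. The names `TokenPair`, `issueTokenPair`, and `nextStep` are hypothetical, and the seven-day refresh lifetime is an assumption chosen only to match "hours or days"; only the fifteen-minute access expiry comes from this page.

```typescript
// Hypothetical shapes for illustration; not Dashify's actual token format.
interface TokenPair {
  accessToken: string;
  accessExpiresAt: number; // epoch milliseconds
  refreshToken: string;
  refreshExpiresAt: number; // epoch milliseconds
}

const ACCESS_TTL_MS = 15 * 60 * 1000; // fifteen minutes, as described above
const REFRESH_TTL_MS = 7 * 24 * 60 * 60 * 1000; // assumed: seven days

// Placeholder token strings; a real server would sign or randomly generate them.
function issueTokenPair(userId: string, now: number): TokenPair {
  return {
    accessToken: "access-for-" + userId,
    accessExpiresAt: now + ACCESS_TTL_MS,
    refreshToken: "refresh-for-" + userId,
    refreshExpiresAt: now + REFRESH_TTL_MS,
  };
}

type NextStep = "use-access-token" | "refresh" | "login";

// Decide what the client should do before sending a request.
function nextStep(pair: TokenPair, now: number): NextStep {
  if (now < pair.accessExpiresAt) return "use-access-token";
  if (now < pair.refreshExpiresAt) return "refresh";
  return "login";
}
```

Under these assumptions the user only sees a login screen in the third case, when both tokens have expired.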

The user clicks once. The browser handles the dance silently. Everything works.

In Dashify, the primary authentication is still the session cookie — but for narrow purposes (Socket.IO and a few API operations) the access/refresh pattern is used. The same shape applies to both.

The runaway-loop bug

The pattern looks innocent. It is not. The access/refresh dance has a famous failure mode that has burned many teams, including Dashify, hard enough that it earned its own engineering memory.

What goes wrong

Imagine the refresh token itself is invalid. Perhaps the user's session was revoked on another device, perhaps the refresh secret was rotated, or perhaps something else went wrong. Now:

1. Browser makes an API request with an expired access token.
2. API returns 401.
3. Browser calls /auth/refresh with the refresh token.
4. API rejects the refresh token (it is no longer valid).
5. The "refresh failed" error handler decides to retry the original request.
6. Browser makes the API request again with… the still-expired access token.
7. API returns 401.
8. Browser calls /auth/refresh.
9. API rejects the refresh again.
10. Go to step 5.
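
The loop above can be reproduced in a few lines. This is a deliberately buggy sketch, not Dashify's actual client code: `Reply`, `Send`, and `naiveRequest` are hypothetical names, and the `maxRounds` cap exists only so the example terminates (the real bug has no cap).

```typescript
// Hypothetical shapes for illustration only.
type Reply = { status: number; accessToken?: string };
type Send = (path: string, token: string) => Promise<Reply>;

// Returns the number of HTTP calls made, so the damage is easy to measure.
async function naiveRequest(
  send: Send,
  path: string,
  tokens: { access: string; refresh: string },
  maxRounds: number, // safety cap for the sketch; a real buggy client has none
): Promise<number> {
  let calls = 0;
  for (let round = 0; round < maxRounds; round++) {
    // The original request, sent with whatever access token we hold.
    const reply = await send(path, tokens.access);
    calls++;
    if (reply.status !== 401) break;

    // The refresh call.
    const refreshed = await send("/auth/refresh", tokens.refresh);
    calls++;
    if (refreshed.status === 200 && refreshed.accessToken) {
      tokens.access = refreshed.accessToken;
    }
    // BUG: a failed refresh is not treated as terminal. Control falls through
    // to the next round and retries with the same expired access token.
  }
  return calls;
}
```

With a healthy refresh token this function makes three calls and stops. With a rejected refresh token it makes two calls per round for as long as it is allowed to run.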

The browser hammers the API with hundreds of requests per second. The API hammers the database. Logs explode. Sentry fires a thousand events. CPU pins. Eventually the user notices the fan on their laptop.

This was real. Dashify saw a spike of 5,000 requests in 14 seconds (roughly 350 requests per second) from this exact pattern in the field.

The fix — a circuit breaker

The fix is conceptually simple: once a refresh has failed, do not try again. Mark the auth state as "broken" and short-circuit every subsequent request to fail immediately, until something changes (the user logs in again).

In Dashify the circuit breaker has two parts:

Single-flight refresh. When multiple requests fire simultaneously and all hit a 401 at once, we do not start ten parallel refreshes. The first refresh starts; the rest wait for its result. If it succeeds, they all use the new token. If it fails, they all fail.

Persistent failure flag. Once a refresh has definitively failed, an authFailed flag is set in memory. Every subsequent request short-circuits to a rejection without even trying. The flag stays set until the user explicitly logs in again, at which point the auth state is rebuilt from scratch.

The flag is in-memory only — it does not persist across page reloads. A page reload triggers a fresh auth check, which either resurrects a valid session or sends the user to the login page.
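
The two parts can be sketched together as follows. This is a minimal illustration under stated assumptions, not the code in Dashify's client: `GuardedClient`, `SendFn`, `RefreshFn`, `AuthFailedError`, and `onLogin` are hypothetical names, and the transport and refresh functions are injected so the sketch stays self-contained. Only the `authFailed` flag name comes from the description above.

```typescript
// Hypothetical injected dependencies; a real client would wrap fetch.
type SendFn = (path: string, accessToken: string) => Promise<{ status: number }>;
type RefreshFn = () => Promise<string>; // resolves with a new access token, rejects on failure

class AuthFailedError extends Error {
  constructor() {
    super("Authentication failed; the user must log in again");
    this.name = "AuthFailedError";
  }
}

class GuardedClient {
  private accessToken: string;
  private send: SendFn;
  private refresh: RefreshFn;
  private authFailed = false; // persistent failure flag (in memory only)
  private inFlightRefresh: Promise<string> | null = null; // single-flight slot

  constructor(accessToken: string, send: SendFn, refresh: RefreshFn) {
    this.accessToken = accessToken;
    this.send = send;
    this.refresh = refresh;
  }

  async request(path: string): Promise<{ status: number }> {
    // Breaker open: fail immediately without touching the network.
    if (this.authFailed) throw new AuthFailedError();

    const first = await this.send(path, this.accessToken);
    if (first.status !== 401) return first;

    // Retry exactly once with the refreshed token; never loop.
    const token = await this.refreshOnce();
    return this.send(path, token);
  }

  private refreshOnce(): Promise<string> {
    if (this.authFailed) return Promise.reject(new AuthFailedError());
    if (this.inFlightRefresh === null) {
      // First caller starts the refresh; later callers share this promise.
      this.inFlightRefresh = this.refresh().then(
        (token) => {
          this.accessToken = token;
          this.inFlightRefresh = null;
          return token;
        },
        () => {
          this.authFailed = true; // trip the breaker for every later request
          this.inFlightRefresh = null;
          throw new AuthFailedError();
        },
      );
    }
    return this.inFlightRefresh;
  }

  // An explicit login rebuilds the auth state and closes the breaker.
  onLogin(accessToken: string): void {
    this.accessToken = accessToken;
    this.authFailed = false;
  }
}
```

In this sketch, three concurrent requests that all receive a 401 produce one refresh call. If that refresh is rejected, all three fail, and every later request is rejected without a network call until `onLogin` runs.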

Why it stays simple

The temptation, looking at this, is to make the circuit breaker clever — exponential backoff, jitter, automatic retries on a timer. We deliberately did not. Clever circuit breakers fail in ways that are hard to debug. A one-shot, persistent failure flag is the simplest thing that works, and the simplicity is a feature.

Where this lives

The circuit-breaker logic lives in two places in the client:

apiClient.ts — the lower-level HTTP client used for raw fetch calls.
baseApi.ts — the RTK Query base, used by every Redux Toolkit slice that talks to the API.

Both implement the same flag-and-single-flight pattern. They both have to, because requests can flow through either path.

Refresh token security

A refresh token is more dangerous than an access token because it has a longer life. Two things keep it from becoming a liability:

Refresh tokens are sent only to the /auth/refresh endpoint. They are not used on every request.
Refresh tokens are rotated on every successful refresh. The old refresh token is invalidated; a new one is issued. If a refresh token leaks but the legitimate user refreshes once, the leaked token is now useless.

Some implementations skip rotation. We do not. Rotation costs a database write per refresh and pays off the moment any token leaks.
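
Rotation itself is a small amount of logic. The sketch below uses an in-memory `Map` as a stand-in for the database write mentioned above; `RefreshTokenStore`, `issue`, and `rotate` are hypothetical names, and the counter-based token strings are placeholders that would not be safe in production.

```typescript
// Hypothetical in-memory store; a real server would persist this in a database.
class RefreshTokenStore {
  private active = new Map<string, string>(); // refresh token -> user id
  private counter = 0;

  // Issue a new refresh token for a user.
  issue(userId: string): string {
    this.counter++;
    // Placeholder only: real tokens must be long, random, and unguessable.
    const token = "rt-" + userId + "-" + this.counter;
    this.active.set(token, userId);
    return token;
  }

  // Exchange a refresh token for a new one. Returns null if the presented
  // token is unknown or has already been rotated away.
  rotate(presented: string): string | null {
    const userId = this.active.get(presented);
    if (userId === undefined) return null;
    this.active.delete(presented); // invalidate the old token
    return this.issue(userId);
  }
}
```

Once the legitimate client has rotated, a copy of the old token presented by anyone else gets `null` and no new token.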

What about the session cookie?

You might be wondering why the session cookie does not have the same problem. The answer is that the session cookie is stateful: every check looks up the session in Redis. If the session is gone, the API responds 401 cleanly, the client redirects to login, and there is nothing to refresh. No loop.

The refresh-token loop is a hazard specific to the stateless token style. Where Dashify uses stateless tokens, it uses the circuit breaker. Where it uses stateful sessions, the protocol itself prevents the bug.
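
To make the contrast concrete, here is a minimal sketch of the stateful path. A `Map` stands in for Redis, and `checkSession` and `handleSessionStatus` are hypothetical names rather than Dashify functions. The client has exactly one reaction to a 401, and it does not involve a retry.

```typescript
// Stand-in for the Redis session store; keys are session ids from the cookie.
type SessionStore = Map<string, { userId: string }>;

type SessionCheck = { status: 200; userId: string } | { status: 401 };

// Server side: every request looks the session up; a missing session is a clean 401.
function checkSession(store: SessionStore, sessionId: string): SessionCheck {
  const session = store.get(sessionId);
  if (session === undefined) return { status: 401 };
  return { status: 200, userId: session.userId };
}

// Client side: a 401 leads straight to the login page. There is no refresh
// step, so there is nothing that could be retried in a loop.
function handleSessionStatus(status: number): "continue" | "redirect-to-login" {
  return status === 401 ? "redirect-to-login" : "continue";
}
```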

Key takeaways

The access + refresh pattern lets short-expiry tokens stay invisibly fresh.
A naive client implementation can hammer the server with hundreds of requests per second when refresh fails.
Dashify's circuit breaker uses single-flight refresh and a persistent failure flag to make that bug impossible.
Refresh tokens are rotated on every use — a leaked one is killed by the next legitimate refresh.
The session-cookie path does not need a circuit breaker because it is stateful by design.
