Case Study: Finding an OIDC Callback Race Condition

This is an illustrative, sanitized scenario. Names and implementation details are generic and do not describe a customer repository or claim a measured production outcome.

1. The scenario

An authentication middleware refactor adds OIDC support. The callback handler exchanges an authorization code, creates or retrieves a session, and then calls a session API guarded by a mutex. The proposed change looks concurrency-aware because a lock is visibly present.

The review request is narrow:

Review the OIDC callback refactor for race conditions, unsafe assumptions, and missing tests.

The initial hypothesis is that locking the final session acquisition call prevents duplicate sessions.

2. Reconstruct the concurrent timeline

The relevant question is not “is there a mutex?” It is “which state transition must be atomic?” Consider two callbacks for the same browser session arriving close together:

Time	Request A	Request B
T1	Checks for an existing session: none	—
T2	Starts token exchange	Checks for an existing session: none
T3	—	Starts token exchange
T4	Creates candidate session	Creates candidate session
T5	Acquires lock for final API call	Waits for lock

By T5, the lock is too late. Both requests already observed the same precondition and created competing state. Serializing the final API call does not make the earlier read-exchange-create sequence atomic.

3. Evidence and adversarial critique

A useful review artifact separates the following:

Observed: the existing lock surrounds only the final acquisition call.
Inferred: two callbacks can pass the no-session check before either acquires the lock.
Assumed: token exchange is not already idempotent at an external boundary.
Open: whether the backing store enforces a unique session key transactionally.

The critique matters because the race may already be prevented elsewhere. A unique database constraint, compare-and-swap operation, or idempotency key could invalidate the duplicate-session hypothesis. Those mechanisms must be inspected rather than assumed absent.

4. The bounded engineering decision

If no stronger transactional guard exists, protect the full state transition that must be atomic: the existing-session check, exchange result handling, and session creation or acquisition. Keep slow external work outside a broad process-wide lock where possible; prefer a key-scoped critical section or a transactional store primitive.

Artifact field	Result
Decision	Move protection to the full per-session transition or enforce atomicity in the store.
Rejected hypothesis	Locking only the final acquisition call is sufficient.
Primary risk	Duplicate session state from concurrent callbacks.
Open check	Confirm external idempotency and backing-store uniqueness semantics.
Verdict	Needs review until the atomicity boundary and tests are verified.

5. Tests required before merge

Launch two callbacks for the same logical session and assert one resulting session.
Delay token exchange in one request to force the vulnerable interleaving.
Verify failed exchange does not leave a partial session or locked key.
Verify retries remain idempotent after a successful callback.
Exercise refresh-token behavior so the fix does not serialize unrelated sessions.

A deterministic concurrency test should control the interleaving with barriers or test doubles rather than rely on timing sleeps. Passing the test supports the specific race scenario; it does not prove the entire authentication flow safe.

Concurrency review is about the atomic state transition, not the visual presence of a lock.

The compact version also appears in Undes examples. Read the evidence-verification workflow for the general method behind this case.