Building an Inventory Engine That Never Oversells Under Concurrency

2026-06-13

Concurrency

PostgreSQL

Distributed Systems

System Design

Backend

Idempotency

Quick Commerce

Node.js

Written by Shailesh Chaudhari

Full-stack engineer with a backend focus

TL;DR: Overselling is a read-then-write race. I built Holdfast, a reservation engine that prevents it three ways (atomic conditional update, pessimistic FOR UPDATE, optimistic version CAS), keeps multi-item baskets deadlock-free by locking SKUs in a fixed order, makes placement idempotent with a unique key, shards hot SKUs to beat single-row contention, and proves it all with a chaos test that kills transactions mid-flight. Live demo + repo at the end.

Why overselling is harder than it looks

Hello everyone! I'm Shailesh Chaudhari, a backend engineer. I kept seeing the same bug in inventory code: under real concurrency, two buyers both get the "last" unit. So I built Holdfast — a small, focused reservation engine whose entire job is to never oversell, and to prove it.

The naive version looks innocent:

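// Reserve stock: DON'T do this. The check and the write are separate steps.
// (db is assumed to be a node-postgres-style client, where result rows live in .rows)
const { rows } = await db.query("SELECT available FROM inventory WHERE sku = $1", [sku]);
// Another request can read the same value here, before either one writes.
if (rows[0].available >= qty) {
  await db.query("UPDATE inventory SET available = available - $1 WHERE sku = $2", [qty, sku]);
}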

With one user, fine. With 500 users hitting it at once, it's broken. Two requests both SELECT "1 available", both pass the if, both UPDATE, and now you've sold 2 units of 1. The check and the write are two separate steps, and anything can happen between them. This is a classic read-then-write race, and it's the core correctness problem in quick-commerce, ticketing, and flash sales.
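To make that interleaving concrete, here is a minimal in-memory sketch. No database is involved: `stock`, `naiveReserve`, and `simulateRace` are hypothetical stand-ins, and the `await` between the read and the write plays the role of the round-trip to the database.

```typescript
// In-memory stand-in for one inventory row (hypothetical, not Holdfast code).
const stock = { available: 1 };

// Same shape as the naive handler: read, yield, then write.
async function naiveReserve(qty: number): Promise<boolean> {
  const seen = stock.available; // step 1: read
  await Promise.resolve();      // gap: the other request runs here
  if (seen >= qty) {
    stock.available -= qty;     // step 2: write based on a stale read
    return true;
  }
  return false;
}

// Two buyers race for the last unit.
async function simulateRace(): Promise<{ sold: number; available: number }> {
  stock.available = 1;
  const results = await Promise.all([naiveReserve(1), naiveReserve(1)]);
  return { sold: results.filter(Boolean).length, available: stock.available };
}
```

Both calls report success and the counter ends at -1: two units sold from a stock of one.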

Strategy 1 — Atomic conditional update (the default)

The fix is to make the check and the decrement a single atomic statement, so the database does the guarding:

UPDATE inventory
SET available = available - $1          -- $1 = qty
WHERE sku = $2 AND available >= $1;     -- 0 rows changed = not enough stock

The WHERE available >= $1 clause is the guard. There's no separate read, so there's no window for a race. If the statement changes 0 rows, there wasn't enough stock (or the SKU doesn't exist), and the caller returns out-of-stock. I also add a database CHECK (available >= 0) as a backstop, so even a buggy write aborts its transaction instead of driving stock negative.

This is the default in Holdfast: one statement, no application-side read, no lock held open. It's simple and it's correct.
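As a rough sketch of how that statement might be wrapped in application code: the `Queryable` interface below is an assumption modeled on node-postgres' `query(text, values)` method, and `reserveAtomic` and the constraint name are hypothetical, not taken from the Holdfast source.

```typescript
// Minimal client shape for this sketch (assumption, modeled on node-postgres).
interface Queryable {
  query(text: string, values: unknown[]): Promise<{ rowCount: number | null }>;
}

// Backstop constraint as DDL (the constraint name is hypothetical):
//   ALTER TABLE inventory ADD CONSTRAINT available_non_negative CHECK (available >= 0);

const RESERVE_SQL = `
  UPDATE inventory
  SET available = available - $1
  WHERE sku = $2 AND available >= $1`;

// Returns true if stock was reserved, false if the guard rejected the update.
async function reserveAtomic(db: Queryable, sku: string, qty: number): Promise<boolean> {
  if (!Number.isInteger(qty) || qty <= 0) {
    throw new RangeError("qty must be a positive integer");
  }
  const result = await db.query(RESERVE_SQL, [qty, sku]);
  // Exactly one row changed means the guard passed and the decrement happened.
  return result.rowCount === 1;
}
```

The caller never sees the current stock level. It only learns whether the guarded decrement applied, which is all a reservation needs.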

Strategy 2 & 3 — when you need locking or versioning

The atomic update handles a single quantity column. But sometimes you need to read more state, decide, and write back. Two more strategies cover that:

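Pessimistic (SELECT … FOR UPDATE): inside a transaction, lock the row, read it, then decrement. Other transactions that want the same row wait until you commit or roll back. Simple, and it serializes the hot row cleanly.
Optimistic (version CAS): read the row along with a version column, then UPDATE … SET version = version + 1 WHERE version = $v. If someone else changed the row first, 0 rows match and you re-read and retry. No locks are held between the read and the write, which is great when conflicts are rare.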
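The optimistic variant is the less obvious of the two, so here is a minimal sketch of its retry loop. It assumes a table shaped like `inventory(sku, available, version)` with integer columns. The `Queryable` interface (modeled on node-postgres), the `reserveOptimistic` name, and the `maxAttempts` cap are illustrative assumptions, not Holdfast's actual API.

```typescript
// Minimal client shape for this sketch (assumption, modeled on node-postgres).
interface VersionedRow { available: number; version: number }
interface Queryable {
  query(text: string, values: unknown[]): Promise<{ rows: VersionedRow[]; rowCount: number | null }>;
}

type Outcome = "reserved" | "out_of_stock" | "gave_up";

async function reserveOptimistic(
  db: Queryable, sku: string, qty: number, maxAttempts = 5,
): Promise<Outcome> {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const read = await db.query(
      "SELECT available, version FROM inventory WHERE sku = $1", [sku]);
    const row = read.rows[0];
    if (row === undefined || row.available < qty) return "out_of_stock";

    // Compare-and-swap: matches only if nobody bumped the version since our read.
    const write = await db.query(
      `UPDATE inventory
       SET available = available - $1, version = version + 1
       WHERE sku = $2 AND version = $3`,
      [qty, sku, row.version]);
    if (write.rowCount === 1) return "reserved";
    // 0 rows matched: another writer won the race, so re-read and retry.
  }
  return "gave_up"; // too much contention; the caller decides what to do
}
```

Capping the attempts is a design choice for the sketch: on a very hot row an uncapped loop can spin for a long time, which is the workload where this strategy does worst.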

Which is best? I didn't guess — I benchmarked them against real PostgreSQL. On a single hot row (all buyers, one SKU), pessimistic and atomic win (~4,200 reservations/sec) because they serialize efficiently; optimistic is slowest there because it averages two attempts per success (it retries the lost races). Optimistic's advantage is the opposite workload — contention spread across many rows, where conflicts are rare and you avoid holding locks. The lesson: pick the strategy for your contention shape, and measure.

The hard part: multi-item baskets without deadlocks

A real cart reserves several SKUs at once, all-or-nothing. The trap is deadlock: cart A locks milk then waits for eggs; cart B locks eggs then waits for milk. Each holds one lock and waits for the other, neither can proceed, and the database has to detect the cycle and abort one of them.

The fix is elegant: always acquire rows in one fixed global order — sorted by SKU. If every basket grabs shared SKUs in the same sequence, no lock cycle can form. I have a test that fires 100 baskets locking the same two SKUs in opposite order and asserts zero deadlocks with a consistent ledger. Any out-of-stock line rolls the whole basket back.
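A minimal sketch of the ordering rule, under some assumptions: `Queryable` is modeled on node-postgres, `toLockOrder` and `reserveBasket` are hypothetical names, and each line uses a guarded UPDATE (which also takes the row lock) rather than a separate SELECT … FOR UPDATE.

```typescript
// Minimal client shape for this sketch (assumption, modeled on node-postgres).
interface Queryable {
  query(text: string, values?: unknown[]): Promise<{ rowCount: number | null }>;
}
interface BasketLine { sku: string; qty: number }

// Merge duplicate SKUs and sort by plain code-unit order, so every basket
// touches shared rows in the same global sequence.
function toLockOrder(lines: BasketLine[]): BasketLine[] {
  const totals = new Map<string, number>();
  for (const { sku, qty } of lines) totals.set(sku, (totals.get(sku) ?? 0) + qty);
  return Array.from(totals.entries())
    .map(([sku, qty]) => ({ sku, qty }))
    .sort((a, b) => (a.sku < b.sku ? -1 : a.sku > b.sku ? 1 : 0));
}

// `tx` must be a single dedicated connection, not a pool, so that
// BEGIN/COMMIT and the updates all run in one session.
async function reserveBasket(tx: Queryable, lines: BasketLine[]): Promise<boolean> {
  await tx.query("BEGIN");
  try {
    for (const { sku, qty } of toLockOrder(lines)) {
      const res = await tx.query(
        "UPDATE inventory SET available = available - $1 WHERE sku = $2 AND available >= $1",
        [qty, sku]);
      if (res.rowCount !== 1) {
        await tx.query("ROLLBACK"); // one short line fails the whole basket
        return false;
      }
    }
    await tx.query("COMMIT");
    return true;
  } catch (err) {
    await tx.query("ROLLBACK");
    throw err;
  }
}
```

The comparator deliberately avoids locale-aware comparison: the order only has to be identical on every server, not meaningful to humans.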

Placing an order exactly once (idempotency)

Networks retry. A user double-clicks. A mobile client resends on a flaky connection. Without protection, one intent becomes two reservations. Holdfast accepts an Idempotency-Key with each request, enforced by a UNIQUE constraint:

orders (id, idempotency_key UNIQUE, ...)

The reservation runs in one transaction that both inserts the order row and decrements stock. If a retry arrives with the same key, its insert collides with the unique constraint, the whole transaction (including the decrement) rolls back, and we return the original order. A test with 100 concurrent identical requests confirms that exactly one reservation is produced. It's the same idea as Stripe's idempotency keys for payments, applied to inventory.
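Here is one way that flow could look, as a sketch rather than Holdfast's actual code. The assumptions are: `Queryable` is modeled on node-postgres, whose errors expose the SQLSTATE as `code`; the `sku` and `qty` columns on `orders` are invented for illustration; and `id` is an integer column. This sketch inserts the order row before decrementing, so a replay is detected before any stock is touched. Holdfast's statement order may differ.

```typescript
// Minimal client shape for this sketch (assumption, modeled on node-postgres).
interface Queryable {
  query(text: string, values?: unknown[]): Promise<{ rows: Array<{ id: number }>; rowCount: number | null }>;
}

const UNIQUE_VIOLATION = "23505"; // PostgreSQL SQLSTATE for unique_violation

function isUniqueViolation(err: unknown): boolean {
  return typeof err === "object" && err !== null &&
    (err as { code?: unknown }).code === UNIQUE_VIOLATION;
}

// `tx` must be one dedicated connection so the statements share a transaction.
async function placeOrder(
  tx: Queryable, key: string, sku: string, qty: number,
): Promise<{ orderId: number; replayed: boolean } | null> {
  await tx.query("BEGIN");
  try {
    // A duplicate key fails here, before stock is touched.
    const inserted = await tx.query(
      "INSERT INTO orders (idempotency_key, sku, qty) VALUES ($1, $2, $3) RETURNING id",
      [key, sku, qty]);
    const stock = await tx.query(
      "UPDATE inventory SET available = available - $1 WHERE sku = $2 AND available >= $1",
      [qty, sku]);
    if (stock.rowCount !== 1) {
      await tx.query("ROLLBACK"); // also discards the order row
      return null;                // out of stock
    }
    await tx.query("COMMIT");
    return { orderId: inserted.rows[0].id, replayed: false };
  } catch (err) {
    await tx.query("ROLLBACK");
    if (!isUniqueViolation(err)) throw err;
    // Same key seen before: hand back the original order instead of a new one.
    const original = await tx.query(
      "SELECT id FROM orders WHERE idempotency_key = $1", [key]);
    return { orderId: original.rows[0].id, replayed: true };
  }
}
```

When two identical requests arrive at the same moment, PostgreSQL makes the second insert wait for the first transaction to finish. If the first commits, the second fails with the unique violation and takes the replay path.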

Scaling the hot SKU

One viral product is one inventory row, which is one lock, which is your throughput ceiling. The standard fix is to shard the stock: split a SKU's quantity across N sub-rows, decrement a random shard, and fall back through the others if one is empty. Now you have N locks instead of one. I benchmarked this in Holdfast — throughput climbed roughly 3.5× at 16 shards before plateauing on the database itself (about 49,000 reservations/sec). For extreme drops, the next step is reserving in Redis with periodic reconciliation to the database, but that's only worth the added complexity once you've proven you need it.
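The shard fallback can be sketched as below. The table `inventory_shards(sku, shard, available)` with shards numbered 0 to N-1, the `Queryable` interface (modeled on node-postgres), and the function names are all assumptions for illustration. The real schema may look different.

```typescript
// Minimal client shape for this sketch (assumption, modeled on node-postgres).
interface Queryable {
  query(text: string, values: unknown[]): Promise<{ rowCount: number | null }>;
}

// Visit every shard once, starting at `start` and wrapping around.
function shardProbeOrder(shardCount: number, start: number): number[] {
  const order: number[] = [];
  for (let i = 0; i < shardCount; i++) order.push((start + i) % shardCount);
  return order;
}

// Returns the shard that supplied the stock, or null if no shard could.
async function reserveSharded(
  db: Queryable, sku: string, qty: number, shardCount: number,
  pickStart: () => number = () => Math.floor(Math.random() * shardCount),
): Promise<number | null> {
  for (const shard of shardProbeOrder(shardCount, pickStart())) {
    // Same guarded decrement as the single-row case, scoped to one shard.
    const res = await db.query(
      `UPDATE inventory_shards
       SET available = available - $1
       WHERE sku = $2 AND shard = $3 AND available >= $1`,
      [qty, sku, shard]);
    if (res.rowCount === 1) return shard;
  }
  return null; // no single shard could cover qty
}
```

One limitation of this simple version: a request for more units than any single shard has left fails even when the shards together could cover it. Handling that means taking from several shards in one transaction, ideally in a fixed shard order for the same deadlock reasons as baskets.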

Proving it fails closed (the chaos test)

Claiming "no oversell" is easy. Proving it under failure is the interesting part. Holdfast has a chaos test that kills PostgreSQL backends mid-transaction while buyers race. The guarantee that has to hold: when a transaction dies, it rolls back — no phantom stock, no partial reservation, the ledger stays consistent. The system fails closed: under stress it refuses orders rather than risk overselling. In quick-commerce, a refused order is a minor annoyance; an oversold one is a cancelled order and a lost customer.
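To give a feel for what such a test checks, here is a hedged sketch of two pieces: a query a separate "killer" connection could run to terminate busy sessions, and a pure function that checks the ledger afterwards. `pg_terminate_backend`, `pg_stat_activity`, and `pg_backend_pid` are standard PostgreSQL; the `LedgerSnapshot` shape and `checkLedger` are hypothetical, and Holdfast's real test may be structured differently.

```typescript
// Run from a dedicated connection while buyers race. Terminating other sessions
// requires sufficient privileges (for example superuser or pg_signal_backend).
const KILL_BUSY_BACKENDS_SQL = `
  SELECT pg_terminate_backend(pid)
  FROM pg_stat_activity
  WHERE datname = current_database()
    AND pid <> pg_backend_pid()
    AND state IN ('active', 'idle in transaction')`;

// Values gathered after the run (field names are assumptions for this sketch).
interface LedgerSnapshot {
  initialStock: number; // units seeded before the run
  available: number;    // inventory.available after the run
  committedQty: number; // total quantity across orders that actually committed
}

// Returns the violated invariants; an empty list means the ledger is consistent.
function checkLedger(s: LedgerSnapshot): string[] {
  const problems: string[] = [];
  if (s.available < 0) problems.push("available went negative");
  if (s.committedQty > s.initialStock) problems.push("oversold");
  if (s.available + s.committedQty !== s.initialStock) {
    problems.push("stock leaked or was duplicated");
  }
  return problems;
}
```

The third check is the one that catches partial reservations: a killed transaction that decremented stock without committing an order (or the reverse) would break the sum.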

Stack & honest notes

Holdfast is TypeScript on Fastify, with a deliberate hybrid data layer. Drizzle ORM owns the schema, migrations, and ordinary reads (type-safe and model-driven), while the reservation hot path is raw SQL: the whole point is the explicit locking, which an ORM's query builder hides, and interactive ORM transactions can get flaky under heavy concurrency. "ORM for productivity, raw SQL where correctness and performance demand it" is how many production teams actually work. There are Prometheus metrics, a Docker deploy that self-seeds, and 14 tests against real PostgreSQL.

Two honest caveats: the throughput numbers are a single-machine local benchmark — treat them as relative, not absolute. And Holdfast is the reservation core, not a full store; payments live in separate demos by design. Every claim here is backed by a test or a benchmark you can run yourself.

Try it

The engine is live and open source. Fire concurrent requests at it and watch it hold the line:

Live demo: holdfast-50gt.onrender.com
Source: github.com/Shailesh93602/holdfast

If you're building anything that touches stock, money, or limited capacity under concurrency, these patterns — atomic guards, fixed lock ordering, idempotency keys, and failing closed — are the ones that keep you out of trouble. Thanks for reading!