The distributed lock you already deployed: Azure blob leases vs. Redis and etcd

October 24, 2023

A blob lease is a distributed lock with the expiry baked into the storage primitive itself. It can’t outlive its holder by more than the lease duration, because the holder is the only one renewing it: when the holder dies, the renewals stop and the storage service lets the lease lapse. The property that decides which lock you need isn’t mutual exclusion; it’s what happens when the holder dies.

TL;DR

A worker pool needs to stop two machines from processing the same work item at once. The reflex answer is Redis (SET … NX PX) or etcd. Both mean a new dependency, a new failure domain, and a new thing to run.
Azure Blob Storage has a lease primitive: AcquireAsync(60s) on any blob returns a LeaseId, or throws a RequestFailedException whose error code is LeaseAlreadyPresent. That try/catch is your SET … NX. The store you already use is the lock service, and it’s not just Blob: Postgres, Redis, etcd, and S3 each expose the same atomic-claim move.
The whole mutex is one small class: acquire-or-poll, a 55s timer renewing a 60s lease, and an IDisposable that releases on scope exit. No quorum, no cluster, no extra SLA.
The lock key is per work item (e.g. {resourceId}#{stage}), not global. Two workers on different items never contend; two workers on the same item serialize. Granularity is a string.
The limits are real. Azure leases have no fencing token, so this is a liveness lock (don’t waste work), not a safety lock (never double-write). The renewal runs async void. And the fixed-lease cap is 60s, so the heartbeat is mandatory, not optional. etcd and Redlock each fix one of these — at the cost of being etcd and Redlock.

The problem: same work item, two workers, wasted effort

A service fans work messages onto a queue and lets a pool of workers drain it. Most of the time that’s exactly what you want — more workers, more throughput. The trouble is duplicate delivery. Most queues are at-least-once: a message can be redelivered after a lock renewal hiccup, a worker can be slow enough that the broker hands the same work to a second worker, and a single source event can produce several messages targeting the same output. Now two workers are processing the same work item at the same moment, each spending real time and compute to produce a result the other is about to overwrite. The work is expensive, so serializing it is worth something — that’s the job for a lock.

Aside: the lock is one of three layers, not the whole story. A lock only makes duplicate work rare; it doesn’t make a duplicate safe (that’s an idempotent write) and it doesn’t make a redelivered duplicate cheap (that’s the idempotent-consumer pattern). Those layers are table stakes here and orthogonal to the lease mechanics, so they’re spelled out at the end, in “The three layers: a lock isn’t enough.” The short version: get idempotency right first; the lock is a cost optimization layered on top, not a correctness substitute.

The naive fix is a lock, and the naive lock is a new piece of infrastructure. Stand up a Redis, call SET lock:file123 <token> NX PX 60000, delete on completion. Or stand up etcd, take a lease, put a key under it. Both work. Both also mean: another service to provision, monitor, patch, and pay for; another network dependency in the hot path of every single message; and another thing that can be down when your workers are up.

The insight that removes the new dependency: a lock doesn’t need a lock service. It needs one atomic claim primitive — and the stateful store your workers already talk to almost certainly has one. Reach for what’s already in your stack before you provision something new:

If you already run…	Borrow its claim primitive
Azure Blob Storage	Blob lease — `AcquireAsync` returns a `LeaseId` or throws `LeaseAlreadyPresent`
PostgreSQL	Advisory locks (`pg_try_advisory_lock`), or `INSERT … ON CONFLICT` on a lock row
Redis	`SET key val NX PX` — atomic claim with a TTL baked in
etcd / Consul / ZooKeeper	A lease + key, or an ephemeral node that vanishes when the holder dies
S3 / object storage	A conditional write (`If-None-Match: *`) — first writer wins

This post walks the blob-lease version end to end, but the shape is the same everywhere: an atomic acquire, an expiry so a dead holder doesn’t wedge the lock, and a release on scope exit. Pick the row that matches your stack.
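To make that shape concrete, here is a minimal in-memory sketch of the contract every row in the table offers: an atomic acquire, an expiry, and a holder-only renew and release. `InMemoryLeaseStore` and its method names are hypothetical, not any store's real API, and the clock is passed in explicitly so expiry can be exercised deterministically. One simplification to note: this model refuses to renew an expired lease, whereas real stores differ on that point.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical in-memory stand-in for any store with an atomic claim primitive.
public sealed class InMemoryLeaseStore
{
    private readonly object _gate = new();
    private readonly Dictionary<string, (string LeaseId, DateTimeOffset ExpiresAt)> _leases = new();

    // Atomic acquire: returns a lease id, or null if a live lease already exists.
    public string? TryAcquire(string key, TimeSpan ttl, DateTimeOffset now)
    {
        lock (_gate)
        {
            if (_leases.TryGetValue(key, out var held) && held.ExpiresAt > now)
                return null;                          // contention, not an error
            string leaseId = Guid.NewGuid().ToString("N");
            _leases[key] = (leaseId, now + ttl);      // an expired lease is simply replaced
            return leaseId;
        }
    }

    // Renew: only the current, unexpired holder can push the deadline out.
    public bool TryRenew(string key, string leaseId, TimeSpan ttl, DateTimeOffset now)
    {
        lock (_gate)
        {
            if (!_leases.TryGetValue(key, out var held) || held.LeaseId != leaseId || held.ExpiresAt <= now)
                return false;
            _leases[key] = (leaseId, now + ttl);
            return true;
        }
    }

    // Release: does nothing unless the caller still owns the lease.
    public bool Release(string key, string leaseId)
    {
        lock (_gate)
        {
            if (_leases.TryGetValue(key, out var held) && held.LeaseId == leaseId)
                return _leases.Remove(key);
            return false;
        }
    }
}
```

A holder that stops calling `TryRenew` loses the claim once `now` passes `ExpiresAt`, which is the self-healing behavior the rest of the post leans on.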

A lease is `SET NX PX` with the deadline baked into the primitive

Azure Blob Storage lets you take a lease on a blob: an exclusive, time-bounded claim. Acquire it and you get a LeaseId; while you hold it, nobody else can acquire a lease on that blob. Here is the entire acquire path:

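// Needs: using System.IO; using Azure; using Azure.Storage.Blobs; using Azure.Storage.Blobs.Specialized;
public virtual async Task<string?> AcquireLeaseBlob(string key)
{
    BlobClient blobClient = _leaseContainerClient.GetBlobClient(key);
    try
    {
        if (!await blobClient.ExistsAsync())          // the lock target must exist as a blob
        {
            using var ms = new MemoryStream();        // a zero-byte blob is a fine lock
            await blobClient.UploadAsync(ms);         // this overload does not overwrite
        }
    }
    catch (RequestFailedException e) when (e.Status is 409 or 412)
    {
        // A rival created (and may already have leased) the blob between the check
        // and the upload. That is contention too: fall through and race for the lease.
    }

    try
    {
        BlobLeaseClient blobLeaseClient = blobClient.GetBlobLeaseClient();
        return (await blobLeaseClient.AcquireAsync(TimeSpan.FromSeconds(60))).Value.LeaseId;
    }
    catch (RequestFailedException e) when (e.ErrorCode == "LeaseAlreadyPresent")
    {
        return default;                               // someone else holds it: contention, not error
    }
}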

Read that catch filter again, because it’s the whole idea. LeaseAlreadyPresent is not an exception in the “something went wrong” sense — it’s the lock working. Acquire returns a token or signals contention, atomically, on a store you’re already authenticated against. That is precisely the contract of Redis SET key val NX PX <ttl> — claim-if-absent with an expiry, in one atomic step. (The old SETNX is the same claim-if-absent without the expiry, which is exactly why it’s unsafe alone: a holder that dies leaves the key forever.) The difference from Redis is you didn’t deploy anything: the lock blobs live in a dedicated container, and the “lock service” is the same storage account already holding your content.

One subtlety worth pausing on: the blob has to exist before you can lease it. A lock service invents the lock on demand; blob leases lease a thing, so the code creates a zero-byte blob first. The blob’s content is irrelevant — it’s a named peg to hang an exclusive claim on.

The mutex is acquire-or-poll plus a heartbeat

Azure caps a fixed-duration lease at 60 seconds (the valid range is 15 to 60). (An infinite lease, duration -1, also exists, but it’s the wrong tool here: it never expires, so a worker that dies holding one blocks the work item until someone breaks the lease by hand, which defeats the whole self-healing property.) That 60s cap is the design’s center of gravity. A 60-second lock is useless for a job that takes minutes unless you renew it. So the renewal isn’t an optimization; it’s load-bearing. The lease expires on its own within 60 seconds of the holder’s last successful renewal, which is exactly the property you want when a worker is killed mid-flight: the lock heals itself without anyone to clean up.

A BlobLeaseLock wraps acquire, renew, and release into one scope-bound object:

// class BlobLeaseLock
public async Task<DelegatingDisposable?> TryCreate(string key, TimeSpan timeout)
{
    string? leaseId = null;
    using var cts = new CancellationTokenSource(timeout);   // bound the wait, don't block forever

    int retryCount = 0;
    while (!cts.Token.IsCancellationRequested &&
           (leaseId = await _blobAccessor.AcquireLeaseBlob(key)) == null)
    {
        retryCount++;
        await Task.Delay(1000);                      // contended: back off and retry (tune the interval)
    }

    if (leaseId != null)
    {
        // Renew the lease every 55 seconds, before it expires at 60 seconds.
        var timer = new System.Timers.Timer(55000);   // System.Timers, not System.Threading: it raises Elapsed
        timer.Elapsed += async (sender, e) => await _blobAccessor.RenewLeaseBlob(key, leaseId);
        timer.Start();

        return new DelegatingDisposable(async () =>
        {
            timer.Dispose();
            await _blobAccessor.ReleaseLeaseBlob(key, leaseId);   // release on scope exit
        });
    }
    return null;                                     // timed out waiting — caller decides
}

Three numbers carry the whole design:

Constant	Value	Why this value
Lease duration	60s	The Azure hard cap for a fixed lease. Not a choice — a ceiling.
Renewal interval	55s	A 5s margin under the cap. Renew too late and the lease lapses; a rival can steal the lock mid-job.
Poll backoff	~1s	The one knob you’d actually tune. A loser checks on a short interval instead of spinning; under real contention, switch to exponential backoff. Fixed-interval is fine here because contention is rare.
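If contention stops being rare, the fixed one-second poll is the first thing to change. The sketch below is one common shape for that change, exponential backoff with full jitter; it is not taken from the original code, and `Backoff.NextDelay`, its parameters, and the 30-second cap in the usage note are illustrative assumptions.

```csharp
using System;

public static class Backoff
{
    // Returns a random delay in [0, min(cap, baseDelay * 2^attempt)).
    // The jitter spreads out losers so they do not all retry on the same tick.
    public static TimeSpan NextDelay(int attempt, TimeSpan baseDelay, TimeSpan cap, Random rng)
    {
        int exponent = Math.Min(Math.Max(attempt, 0), 30);   // clamp so the power cannot overflow
        double ceilingMs = Math.Min(
            cap.TotalMilliseconds,
            baseDelay.TotalMilliseconds * Math.Pow(2, exponent));
        return TimeSpan.FromMilliseconds(rng.NextDouble() * ceilingMs);
    }
}
```

In the retry loop this would replace the constant, for example `await Task.Delay(Backoff.NextDelay(retryCount, TimeSpan.FromSeconds(1), TimeSpan.FromSeconds(30), Random.Shared))`. Keeping the cap well under the lease duration means a waiter still notices a freed lock reasonably soon.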

And the call site shows the two things that make this usable rather than just correct:

string key = $"{workItem.ResourceId}#{workItem.Stage}";
var mutex = new BlobLeaseLock(_logger, _blobAccessor);
DelegatingDisposable? disposable = await mutex.TryCreate(key, TimeSpan.FromMinutes(/* queue timeout */));
if (disposable != null) return disposable;          // hold it for the life of the message
throw new InvalidOperationException("Timeout waiting for another worker to finish the same work item.");

First, the lock key is the unit of work, not a global name. A composite key like {resourceId}#{stage} means two workers on different stages of the same resource don’t contend, and two workers on the same stage serialize. You tune lock granularity by changing a format string — no partitioning scheme, no shard map.

Second, the lock is an IDisposable tied to message scope. Acquire when you start processing a message, dispose when the message is done, and the using machinery releases the lease on every path including exceptions. The group-completed events that need no lock at all return DelegatingDisposable.None, a no-op disposer, so the lock-free fast path costs essentially nothing.
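The post never shows DelegatingDisposable itself, so the following is only a guess at its shape: a disposer that runs a callback at most once, plus a shared no-op instance. The Action-based constructor and the once-only guard are assumptions made for illustration.

```csharp
using System;
using System.Threading;

// Hypothetical reconstruction: runs the supplied callback the first time Dispose is called.
public sealed class DelegatingDisposable : IDisposable
{
    // Shared no-op instance for code paths that need no lock.
    public static readonly DelegatingDisposable None = new(() => { });

    private Action? _onDispose;

    public DelegatingDisposable(Action onDispose) => _onDispose = onDispose;

    // Swap the callback out atomically so a second Dispose does nothing.
    public void Dispose() => Interlocked.Exchange(ref _onDispose, null)?.Invoke();
}
```

With a constructor that takes an Action, passing an async lambda compiles to an async void callback, so Dispose returns before the release has finished. That matches the "sync Dispose over async release" limit discussed later in the post.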

And note what the loser does on timeout: it throws, which is deliberate. The tempting alternative — “another worker already has this, so drop the message and move on” — is a quiet bug. The lock holder is not guaranteed to finish: it can crash, its lease can lapse, its write can fail. If the loser acks its message on the assumption that someone else has the job covered, and the holder then dies, the job is silently lost — no one holds the lock, and the message that would have triggered a retry is gone. Throwing leaves the message un-acked, so the queue redelivers it later, where the dedupe layer sorts out whether there’s still work to do. That TimeSpan.FromMinutes(/* queue timeout */) is sized to the message’s visibility window for exactly this reason: wait up to the redelivery deadline, then hand the message back. The loser acks early only on confirmed completion, never on a presumed-busy holder.

flowchart TD
    A[Message received] --> B{AcquireLeaseBlob}
    B -- LeaseId --> C[Start 55s renew timer]
    B -- LeaseAlreadyPresent --> D[Back off, retry]
    D --> E{Timeout hit?}
    E -- no --> B
    E -- yes --> F[Throw: let the queue redeliver]
    C --> G[Process file]
    G --> H[Dispose: stop timer + release lease]
    C -. worker dies .-> I[No renewal -> lease expires ~60s -> lock frees itself]

How it stacks up against Redis and etcd

The comparison isn’t “blob leases win.” It’s “blob leases win when you’re already in Azure Storage and need a liveness lock, and lose the moment you need a safety guarantee.” Here’s the axis that actually matters.

	Azure blob lease	Redis (`SET NX PX` / Redlock)	etcd lease + txn
New infra to run	None — reuse the storage account	A Redis (or 5, for Redlock)	An etcd cluster (Raft quorum)
Lock acquire =	`AcquireAsync` / `LeaseAlreadyPresent`	`SET k v NX PX`	`Txn` on `CreateRevision==0` under a lease
Auto-expiry on holder death	Yes — lease lapses ~60s after last renewal	Yes — key TTL	Yes — lease TTL, keys deleted on expiry
Keep-alive	Manual renew (55s timer)	Manual `PEXPIRE` / Redlock extend	`KeepAlive` stream, client-managed
Fencing token	No	No (Redlock’s lock value is a random unique string, not a monotonic counter)	Yes: `mod_revision` is a real fence
Failure model	Single region/account; Azure’s own redundancy	Single-node TTL is unsafe under failover; Redlock is contested	Linearizable via Raft quorum
Watch / wait	Poll (short interval)	Pub/sub or poll	Native watch — block until freed
Right when…	You’re in Azure and want liveness, not safety	You already run Redis; ms-level locks	You need correctness under partition

One row decides most real choices: fencing token. A liveness lock prevents wasted work: usually one worker runs, and the rare double-run is merely inefficient. A safety lock promises a resource is never touched by two holders at once, and that needs a fencing token. Here’s the scenario a lease can’t survive: worker A holds the lease, stalls in GC for 70 seconds, its renewal misses, the lease lapses, worker B acquires and starts writing. Then A wakes up believing it still holds the lock and writes too. No lease on its own stops this, because the expiry is enforced at the lock, not at the resource being written. Only a fencing token does: a monotonically increasing number the downstream store checks, rejecting any write whose token is lower than the highest it has seen. etcd’s mod_revision is exactly that token; blob leases, like single-node Redis and like Redlock under Kleppmann’s critique, have no equivalent. (The narrow exception is a write to the leased blob itself, which Azure rejects without the current lease ID; here the lock blob is only a peg, so the real output gets no such protection.) So a blob lease is correct for “don’t waste two workers on one file” and wrong for “this file must never be written twice.” Which of those you actually need is the next section.
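The check a fencing token enables is small enough to sketch. Below is a hypothetical in-memory downstream store that remembers the highest token it has accepted per key and refuses anything older; `FencedStore` and `TryWrite` are illustrative names, and in a real system the comparison would have to happen atomically inside the datastore itself.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical downstream store that enforces a fence: every write carries the
// token its lock was granted with, and a token older than the last one seen is rejected.
public sealed class FencedStore
{
    private readonly object _gate = new();
    private readonly Dictionary<string, (long Token, string Value)> _rows = new();

    public bool TryWrite(string key, long fencingToken, string value)
    {
        lock (_gate)
        {
            if (_rows.TryGetValue(key, out var row) && fencingToken < row.Token)
                return false;                  // a newer holder has already written
            _rows[key] = (fencingToken, value);
            return true;
        }
    }

    public string? Read(string key)
    {
        lock (_gate)
            return _rows.TryGetValue(key, out var row) ? row.Value : null;
    }
}
```

Replaying the GC-pause scenario with made-up tokens: A was granted 33, B was granted 34 after A's lease lapsed. B's write with 34 lands; when A wakes up and writes with 33, the store returns false and B's result stays intact. A blob lease hands out no such number, which is why it cannot play this role for a separate output.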

Which should you use?

It depends — but the “it depends” collapses to a single question, and everything else is tie-breaking.

Are you preventing wasted work, or preventing corruption?

Liveness — usually one holder runs; a rare double-run merely wastes work. Pick the cheapest lock that auto-expires on death. A blob lease if you’re already in blob storage; Redis SET NX PX if you already run Redis. Don’t pay for a consensus cluster you don’t need.
Safety — the resource must never be touched by two holders at once. You need a fencing token, which means etcd (mod_revision) or a fence you enforce in your own datastore — a monotonic version the write checks and rejects if stale. No lease — blob, Redis, or Redlock — can give you this.

That’s the whole decision. Once you know the tier:

On the liveness tier, if you…	Reach for
Already hold a blob-storage connection	Blob lease — zero new infra
Already run Redis, want millisecond locks	Redis `SET NX PX`
Have neither	Whichever atomic-claim primitive your existing store already gives you

On the safety tier there’s no tie-break: etcd, or a fence in your own store. Redlock is the trap here — it looks like safety but, per Kleppmann, doesn’t deliver it without a fencing token the downstream actually enforces.

The reason this whole pattern is worth knowing: most “we need a distributed lock” asks are secretly the liveness kind — don’t double-process, don’t waste two workers on one item. For those, the lock you already deployed beats standing up a new failure domain. Save the consensus cluster for when a double-acquire genuinely corrupts state.

Honest limits

This is a deliberately small lock, and small has edges:

No fencing token ⇒ liveness only. The load-bearing caveat: never reach for this where a double-acquire corrupts state. If you need safety, you need etcd’s mod_revision (or a fence in your own datastore), full stop.
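The renewal is async void. timer.Elapsed += async (s, e) => await Renew(...) is fire-and-forget: nothing awaits the handler, so if a renewal call throws (a transient 500, a socket reset) the code holding the lock never sees it. System.Timers.Timer suppresses exceptions only from synchronous handlers; one thrown after an await in an async void handler surfaces as an unhandled exception on a thread-pool thread, which can take the worker process down. And if the failure is caught and dropped somewhere lower down instead, the timer keeps ticking and the holder keeps working on a lease that may already belong to a rival. A hardened version catches inside the handler and proactively surfaces a lost lease so the holder stops working immediately.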
Sync Dispose over async release. IDisposable.Dispose() invokes an async lambda; the release isn’t awaited by the using block. The lease still expires on its own within ~60s, so correctness holds, but a tighter design would use IAsyncDisposable and await using.
Polling, not waiting. Losers retry on a short interval — fine when contention is rare (the common case: workers usually land on different work items). Under heavier contention you’d switch to exponential backoff; etcd’s native watch would wake a waiter the instant the lock frees, where here you eat a little latency. The poll/renew/lease timings are tuned for low-contention coordination, not a hot lock thousands of workers fight over.
One region, one account. The lock’s blast radius is the storage account. That’s a feature (no extra failure domain) and a limit (no cross-region consensus). If the account is unavailable, so is the lock — but so is the content the workers were going to write anyway, so the dependency is already shared.
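Two of the limits above, the async void renewal and the sync Dispose over an async release, can be addressed together. The following is a sketch of one way to do it, not the post's implementation: `LeaseHeartbeat` is a hypothetical helper that takes the renew and release calls as delegates, cancels a `LeaseLost` token as soon as a renewal fails, and implements IAsyncDisposable so the release is actually awaited. Treating a single failed renewal as a lost lease is a conservative assumption; a retry inside the 5s margin would be a reasonable refinement.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper: owns the renewal loop and the release for one held lease.
public sealed class LeaseHeartbeat : IAsyncDisposable
{
    private readonly CancellationTokenSource _stop = new();
    private readonly CancellationTokenSource _lost = new();
    private readonly Func<Task> _release;
    private readonly Task _loop;
    private int _disposed;

    // Cancelled when a renewal fails; the holder should pass this to its work.
    public CancellationToken LeaseLost { get; }

    public LeaseHeartbeat(Func<CancellationToken, Task> renew, Func<Task> release, TimeSpan interval)
    {
        _release = release;
        LeaseLost = _lost.Token;
        _loop = RunAsync(renew, interval);
    }

    private async Task RunAsync(Func<CancellationToken, Task> renew, TimeSpan interval)
    {
        try
        {
            while (true)
            {
                await Task.Delay(interval, _stop.Token);
                await renew(_stop.Token);      // expected to throw if the store rejects the renewal
            }
        }
        catch (OperationCanceledException) when (_stop.IsCancellationRequested)
        {
            // Normal shutdown requested by DisposeAsync.
        }
        catch (Exception)
        {
            _lost.Cancel();                    // surface the failure instead of losing it
        }
    }

    public async ValueTask DisposeAsync()
    {
        if (Interlocked.Exchange(ref _disposed, 1) == 1) return;
        _stop.Cancel();
        await _loop;                           // the loop catches everything, so this does not throw
        try { await _release(); }
        catch (Exception) { /* best effort: the lease expires on its own */ }
        _stop.Dispose();
    }
}
```

A caller would hold it with `await using` and hand `LeaseLost` to the processing code, so a missed renewal stops the work instead of letting it run on a lease that may already belong to someone else.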

The three layers: a lock isn’t enough

The lock is the subject of this post, but on its own it’s the least important of three layers that keep duplicate work from hurting. They’re orthogonal — each covers a failure the others don’t — and getting the order right matters more than the lock mechanics.

First, get idempotency right; the lock does not replace it. At-least-once delivery is a fact of life, so the write must be safe to repeat: last-writer-wins to the same blob, or a content-addressed name, so a double-write lands the same bytes and corrupts nothing. If processing twice produces a wrong result, that’s a correctness bug a lock can’t fix: the lock is best-effort and can itself let two holders overlap (see the fencing discussion above). Assume the lock will occasionally fail, and make the write idempotent regardless.

So why lock at all? Because the work is expensive. Idempotency makes a double-process correct; it does nothing about the cost. Two workers each burning ninety seconds to render the same resource — only for one result to clobber the other — is wasted compute, not corrupted data. This is exactly Kleppmann’s efficiency-vs-correctness split: a lock taken for efficiency “saves you from unnecessarily doing the same work twice,” and a failure merely costs you a redo, not a corruption. The lock is an optimization on top of idempotency: it makes the common case cheap by serializing redundant effort, while idempotency keeps the rare case (lock fails) correct.

But the lock only covers concurrent collisions. There’s a second way the same work arrives twice: not two workers at once, but the same message redelivered later, because a rival timed out waiting for the first holder, or a worker crashed, or the broker simply re-sent it. By the time it comes back, no one holds the lock, so the lock has nothing to say. If the redelivered worker just redoes the job, you’ve paid the full expensive recompute the lock was supposed to save; the optimization evaporates across the redelivery boundary. The fix is the idempotent consumer pattern: before doing the work, dedupe against what’s already happened. The textbook mechanism is a processed-message-ID table (insert fails on a duplicate id); the natural one here is an effect check that asks whether the result already exists (a status row, the output blob itself), since our output is cheaply observable. Either way: if it’s already done, ack and move on; if not, take the lock and do the work. That dedupe is what tells a finished job apart from a crashed one. Both look identical (“no lock held, message in hand”), and only the prior effect distinguishes them. So:

Idempotent write — the correctness floor. Duplicates never corrupt.
Idempotent consumer (dedupe) — makes a redelivered duplicate cheap: skip the recompute when the work is already done.
Lock — makes a concurrent duplicate rare: serialize the workers racing right now.

Three layers, not three alternatives. The lock is the clever part, but the dedupe is what keeps the clever part from leaking its savings the moment a message comes back around.
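Put together, the three layers suggest an order of operations for a message handler. The sketch below assumes three hypothetical seams, passed in as delegates so the flow stays visible: `resultExists` is the effect check, `tryLock` stands in for something like TryCreate and returns null on timeout, and `processAndWrite` is the expensive step ending in an idempotent write.

```csharp
using System;
using System.Threading.Tasks;

public enum Outcome { AlreadyDone, Processed }

public static class ThreeLayerHandler
{
    // The three delegates are hypothetical stand-ins for real storage and lock calls.
    public static async Task<Outcome> HandleAsync(
        Func<Task<bool>> resultExists,          // idempotent consumer: effect check
        Func<Task<IDisposable?>> tryLock,       // lock: returns null if the wait timed out
        Func<Task> processAndWrite)             // must end in an idempotent write
    {
        if (await resultExists())
            return Outcome.AlreadyDone;         // redelivered duplicate: ack cheaply

        IDisposable? held = await tryLock();
        if (held is null)
            throw new TimeoutException("Lock still held; leave the message un-acked for redelivery.");

        using (held)
        {
            // Re-check under the lock: the previous holder may have finished while we waited.
            if (await resultExists())
                return Outcome.AlreadyDone;

            await processAndWrite();
            return Outcome.Processed;
        }
    }
}
```

The second effect check is what stops a waiter from redoing work that completed during its wait. If the lock ever lets two holders through, both reach `processAndWrite`, and the idempotent write is what keeps that overlap harmless.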

If you take only three things

A lock service is sometimes a primitive you already deployed. Before adding Redis or etcd, ask whether a store you already authenticate against has an atomic-claim primitive. Azure blob leases, S3 conditional writes, a DB unique constraint — each can be an atomic claim-if-absent you don’t have to operate.
Decide liveness vs. safety first, because it picks the technology. “Don’t waste duplicate work” (liveness) is satisfied by any TTL lease, including this one. “Never double-write” (safety) demands a fencing token, which leases and Redlock don’t have and etcd does. Most “we need a distributed lock” asks are secretly the liveness kind.
The lock is the least important of the three layers. Idempotent writes keep duplicates correct; an idempotent consumer keeps a redelivered duplicate cheap; the lock only makes a concurrent duplicate rare. Get the first two right and a missed lock costs you a redo, not a corruption — which is exactly why a best-effort lease is enough.

Source: notes/blob-leases-as-a-distributed-lock.md @ 15cd198