
A Surgical Fix for Row-Level Deadlocks: Why Your Task Queue Needs 'SKIP LOCKED' to Scale

Transitioning your background workers from standard locking to non-blocking concurrency is the only way to prevent the dreaded 'Thundering Herd' from stalling your database.


I remember staring at a Grafana dashboard three years ago, watching the "Database Connections" metric spike vertically while "Tasks Processed" plummeted to zero. We were running a fleet of twenty background workers, all polling a standard PostgreSQL tasks table. I thought I was being clever by using a simple status column to coordinate work. I’d run a query to find pending tasks, mark them as 'processing', and then get to work.

Under light load, it was fine. But the moment we hit a surge in traffic, the system didn't just slow down; it ground to a complete halt. Workers tripped over each other, each waiting on row locks held by the others, and eventually threw deadlock errors that rolled back entire batches of work. I was essentially asking twenty people to grab the same single pen to sign twenty different documents at the exact same time. It was a mess.

The fix wasn't more RAM or a bigger RDS instance. It was a two-word addition to our SQL queries: SKIP LOCKED.

The Naive Approach and the Thundering Herd

Most of us start building a task queue the same way. We have a table that looks something like this:

CREATE TABLE task_queue (
    id SERIAL PRIMARY KEY,
    payload JSONB,
    status VARCHAR(20) DEFAULT 'pending',
    locked_at TIMESTAMP,
    attempts INT DEFAULT 0
);
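Enqueueing work against this table is then just an INSERT; the payload shape below (a hypothetical send_email job) is purely illustrative:

```sql
-- Enqueue a task; status defaults to 'pending', attempts to 0
INSERT INTO task_queue (payload)
VALUES ('{"type": "send_email", "to": "user@example.com"}'::jsonb);
```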

When a worker is ready for work, it tries to grab a batch of tasks. The "intuitive" logic usually looks like this:

-- The "I think this works" approach
UPDATE task_queue
SET status = 'processing', locked_at = NOW()
WHERE id IN (
    SELECT id 
    FROM task_queue 
    WHERE status = 'pending' 
    LIMIT 10
)
RETURNING *;

This query is a ticking time bomb. Here's why: Postgres (and most RDBMSs) evaluates the inner SELECT first, and that SELECT takes no row locks. If you have ten workers running this simultaneously, they can all see the same ten "pending" rows. When they each attempt the UPDATE, the engine takes row-level locks to ensure data integrity, and the workers are forced to wait for one another.

This is the Thundering Herd problem. Worker A locks the rows. Worker B waits for Worker A. Worker C waits for Worker B. If Worker A takes too long or if Worker B and C try to lock the rows in a slightly different order due to an index scan, you get a deadlock. Even if you don't deadlock, you’ve turned your parallel processing system into a sequential one. You aren't scaling; you're just queuing.

Understanding Row-Level Locking

To understand the solution, we have to look at what's happening under the hood. When you run SELECT ... FOR UPDATE, you are telling the database: "I intend to change these rows. Give me exclusive access."

If Worker 1 has an exclusive lock on Row 5, and Worker 2 executes a query that *includes* Row 5, Worker 2 will stop and wait. It doesn't matter if there are 10,000 other rows ready for processing. Worker 2 is stuck at the gate because Row 5 is currently occupied.

In a high-throughput environment, this "waiting" behavior creates a bottleneck that compounds. As latency increases, more workers pile up, more locks are requested, and eventually, the database spends more time managing lock contention than actually moving data.

The Surgical Fix: SKIP LOCKED

In PostgreSQL 9.5+ and MySQL 8.0.1+, a beautiful feature was introduced specifically for concurrency: SKIP LOCKED.

When you add SKIP LOCKED to a SELECT ... FOR UPDATE query, it changes the fundamental behavior of the lock request. Instead of waiting for a locked row to become available, the database simply ignores it and moves on to the next available row that satisfies the query.
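The semantics are easy to picture without a database. The toy model below is not how Postgres implements locking; it's just a sketch of the "skip rather than wait" behavior, using an in-memory dict of rows:

```python
# Toy model of SKIP LOCKED semantics (illustration only, not Postgres internals).
rows = {i: {"status": "pending", "locked": False} for i in range(1, 21)}

def claim(limit):
    """Return up to `limit` unlocked pending row ids, locking them as we go.

    A plain FOR UPDATE would *wait* on locked rows; SKIP LOCKED passes them by.
    """
    claimed = []
    for row_id, row in sorted(rows.items()):
        if len(claimed) == limit:
            break
        if row["status"] == "pending" and not row["locked"]:
            row["locked"] = True          # analogous to taking the row lock
            claimed.append(row_id)
        # locked rows are simply skipped, never waited on
    return claimed

worker_a = claim(10)  # gets ids 1-10
worker_b = claim(10)  # gets ids 11-20: skips A's locked rows instead of blocking
```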

Here is how you write a truly scalable worker query:

BEGIN;

-- Claim up to 10 unlocked pending rows and mark them, in one atomic statement
WITH claimed AS (
    SELECT id
    FROM task_queue
    WHERE status = 'pending'
    ORDER BY id ASC
    LIMIT 10
    FOR UPDATE SKIP LOCKED
)
UPDATE task_queue
SET status = 'processing', locked_at = NOW()
WHERE id IN (SELECT id FROM claimed)
RETURNING *;

-- ...do the actual work here, then mark the rows 'completed'...

COMMIT;

With SKIP LOCKED, your workers never wait. If Worker 1 is processing IDs 1-10, Worker 2 will immediately jump to IDs 11-20. There is no contention, no waiting, and—most importantly—no deadlocks.

Why Not Just Use a Message Broker?

At this point, you might be thinking, "Just use RabbitMQ or SQS." That's a valid argument, but it ignores the "Transactional Integrity" benefit of a database-backed queue.

If your task involves updating a user's balance *and* scheduling a confirmation email, doing both in a single Postgres transaction ensures that you never send the email if the balance update fails. If you use an external message broker, you have to deal with the "Dual Write" problem. You might update the DB but fail to send the message, or vice versa.
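A sketch of what avoiding the dual write looks like in practice; the accounts table, column names, and ids here are hypothetical:

```sql
BEGIN;

-- Business logic and queue bookkeeping succeed or fail together
UPDATE accounts SET balance = balance - 100 WHERE user_id = 42;
INSERT INTO task_queue (payload)
VALUES ('{"type": "confirmation_email", "user_id": 42}'::jsonb);

COMMIT;  -- if the balance update fails, the email task is never enqueued
```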

Keeping the queue in the database allows for Atomic Task Processing. You grab the task, update your business logic, and mark the task as complete all within the same COMMIT. If the worker crashes halfway through, the transaction rolls back, the lock is released, and another worker picks up the task automatically. It’s incredibly robust.

A Practical Python Implementation

Let’s see how this looks in a real application. I’ll use Python with psycopg2, but the logic translates to any language with a decent DB driver.

import psycopg2
import time
from psycopg2.extras import RealDictCursor

def fetch_and_process_tasks(conn):
    try:
        with conn.cursor(cursor_factory=RealDictCursor) as cur:
            # Step 1: Atomically claim a batch using SKIP LOCKED.
            # A CTE picks the unlocked rows; the outer UPDATE marks them as ours.
            # The stale-lock branch reclaims 'processing' rows whose worker died.
            cur.execute("""
                WITH claimed AS (
                    SELECT id
                    FROM task_queue
                    WHERE status = 'pending'
                       OR (status = 'processing'
                           AND locked_at < NOW() - INTERVAL '5 minutes')
                    ORDER BY id ASC
                    LIMIT 5
                    FOR UPDATE SKIP LOCKED
                )
                UPDATE task_queue
                SET status = 'processing',
                    locked_at = NOW(),
                    attempts = attempts + 1
                WHERE id IN (SELECT id FROM claimed)
                RETURNING *;
            """)
            
            tasks = cur.fetchall()
            
            if not tasks:
                conn.commit()  # close the empty claim transaction instead of leaving it open
                return False

            for task in tasks:
                print(f"Processing task {task['id']} with payload {task['payload']}")
                # Perform actual work here...
                
                # Step 2: Mark as completed
                cur.execute(
                    "UPDATE task_queue SET status = 'completed' WHERE id = %s", 
                    (task['id'],)
                )
            
            conn.commit()
            return True

    except Exception as e:
        conn.rollback()
        print(f"Error occurred: {e}")
        return False

# Example of a worker loop
def worker_loop():
    conn = psycopg2.connect("dbname=mytasks user=postgres")
    while True:
        work_found = fetch_and_process_tasks(conn)
        if not work_found:
            time.sleep(1) # Back off if there's no work

In the claiming query above, notice the stale-lock condition on locked_at. This is a crucial safety net: it lets the system re-queue tasks that were picked up by a worker that subsequently died or timed out. Because we are using SKIP LOCKED, we don't have to worry about these "stuck" tasks blocking new work.
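The fixed one-second sleep in the worker loop works, but when the queue stays empty for a while, exponential backoff with jitter keeps idle workers from hammering the database in lockstep. A minimal sketch; the base and cap values are arbitrary:

```python
import random

def next_delay(consecutive_empty_polls, base=0.5, cap=30.0):
    """Exponential backoff with full jitter: 0.5s, 1s, 2s, ... capped at 30s.

    `consecutive_empty_polls` is how many polls in a row found no work.
    """
    delay = min(cap, base * (2 ** consecutive_empty_polls))
    # Full jitter spreads workers out so they don't all poll at the same instant
    return random.uniform(0, delay)
```

Reset the counter to zero whenever a poll actually finds work, so a busy queue is drained at full speed.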

The Visibility Gotcha

There is one quirk with SKIP LOCKED that often trips people up: it affects how LIMIT is applied.

Imagine you have a WHERE clause that is quite restrictive, and many of your rows are currently locked. If you ask for LIMIT 10, Postgres will scan until it finds 10 *unlocked* rows. However, if your index isn't optimized, the database might have to scan thousands of locked rows to find those 10.

Usually, this isn't an issue for a task queue where rows are processed and deleted (or moved to a different status) quickly. But if you have millions of rows in the processing state and very few in pending, your SELECT could become slow.

The Fix: Always ensure you have a partial index on your queue table.

CREATE INDEX idx_task_queue_pending 
ON task_queue (id) 
WHERE status = 'pending';

This index makes the "scan" for available tasks nearly instantaneous, as the database doesn't even have to look at the "completed" or "processing" rows.

Comparing SKIP LOCKED vs. NOWAIT

You might also see NOWAIT in some documentation. It’s important to distinguish the two.
- FOR UPDATE NOWAIT: If the row you want is locked, the query immediately fails with an error.
- FOR UPDATE SKIP LOCKED: If the row you want is locked, the query just ignores it and looks for the next one.

For a task queue, NOWAIT is almost never what you want. You don't want your workers throwing errors; you want them to be productive. SKIP LOCKED is the surgical instrument that allows workers to slide past each other silently.
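You can watch the difference with two psql sessions against the same table (the error text below is approximate):

```sql
-- Session A: take a row lock and hold it
BEGIN;
SELECT * FROM task_queue WHERE id = 1 FOR UPDATE;

-- Session B, meanwhile:
SELECT * FROM task_queue WHERE id = 1 FOR UPDATE NOWAIT;
-- ERROR: could not obtain lock on row in relation "task_queue"

SELECT * FROM task_queue WHERE id = 1 FOR UPDATE SKIP LOCKED;
-- Succeeds immediately, returning zero rows
```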

When Scaling Hits the Wall

While SKIP LOCKED is a massive improvement over standard row locking, it isn't infinite. If you have 500 workers all hitting the same table every 100ms, your database CPU will eventually redline just handling the overhead of the UPDATE statements and the index maintenance.

Before you reach that point, consider these optimizations:

1. Batching: Don't grab 1 task at a time. Grab 50 or 100. This reduces the number of round-trips to the database and the number of times the transaction log needs to be flushed.
2. Table Partitioning: If your queue table gets massive, partition it by task type or date. This keeps indexes small and manageable.
3. The "Unlogged" Table: If you don't care about losing tasks in the event of a total database crash, you can use an UNLOGGED table in Postgres. This skips the Write-Ahead Log (WAL), making writes significantly faster. (Use this with extreme caution!)
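For the third option, the change is a single keyword at table-creation time. Remember the caveat: an unlogged table is truncated during crash recovery, so this is only for genuinely disposable work.

```sql
-- Same schema as before, but writes skip the Write-Ahead Log entirely
CREATE UNLOGGED TABLE task_queue_fast (
    id SERIAL PRIMARY KEY,
    payload JSONB,
    status VARCHAR(20) DEFAULT 'pending',
    locked_at TIMESTAMP,
    attempts INT DEFAULT 0
);
```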

Conclusion

Building a custom task queue is often seen as a "junior" mistake when tools like Celery or Sidekiq exist. But those tools often introduce their own complexities—Redis persistence issues, serialization overhead, and the loss of transactional guarantees.

If you understand SKIP LOCKED, you can build a task queue directly in your existing database that is more reliable, easier to debug, and capable of handling millions of tasks per day. You move from a "Thundering Herd" that breaks your database to a streamlined, non-blocking pipeline where workers operate with surgical precision.

The next time you see row-lock contention in your logs, don't reach for a bigger instance. Reach for SKIP LOCKED. It’s the difference between a system that fights itself and a system that scales.