
The Pipelined Query Is a Latency Cheat Code

An in-depth analysis of the Postgres wire protocol’s pipeline mode and how it eliminates the round-trip penalty for sequential database operations.

8 min read

Most of the time, we blame the database when a page feels sluggish. We look at the execution plan, we squint at the indexes, and we wonder why a simple set of updates is taking 200 milliseconds when the "Total Query Runtime" says it only took 5. Usually, the database isn't the problem. The speed of light is the problem.

If your application server is in Virginia and your database is in Ohio, you’re looking at roughly 10-15ms of round-trip time (RTT). That doesn't sound like much until you realize that most modern application patterns are incredibly chatty. If you need to perform five sequential operations to fulfill a single API request, you’ve already baked 75ms of pure "waiting for photons" into your response time before the database has even broken a sweat.

We’ve been taught to solve this with better indexing or by bulk-loading data, but there is a middle ground that often gets overlooked: the Postgres wire protocol’s pipeline mode. It is, quite literally, a latency cheat code.

The Invisible Tax of Round Trips

In a standard Postgres interaction, the client and server engage in a very polite, very slow dance. The client sends a query, then stops. It waits. The server receives the query, executes it, sends the results back, and then waits. Only when the client receives the *entire* result set does it send the next command.

This is the "Stop-and-Wait" approach. It looks like this:

1. Client: "Here is an INSERT."
2. Network: (10ms travel time)
3. Postgres: (1ms processing) "Okay, done."
4. Network: (10ms travel time)
5. Client: "Great. Now here is an UPDATE."
6. Network: (10ms travel time)
7. ...and so on.

If you have ten queries, you pay the 20ms network tax ten times. You’ve spent 200ms on transit for 10ms of actual work. Even within the same availability zone, where RTT might be 1ms or less, this overhead adds up at scale.

What is Pipeline Mode?

Pipelining changes the fundamental rules of the conversation. Instead of waiting for a response to every message, the client sends a stream of queries without waiting for the server to acknowledge them. The server processes these queries in order and sends the results back as they are ready.

It’s the difference between a waiter who takes your drink order, goes to the bar, comes back, takes your appetizer order, goes to the kitchen, and comes back—versus a waiter who takes your entire order at once and lets the kitchen start firing plates.

In the Postgres wire protocol (specifically version 3.0), this is handled by sending multiple Parse, Bind, and Execute messages followed by a single Sync message at the end of the batch.

Deconstructing the Wire Protocol

To understand why this works, you have to look at what happens under the hood. Most developers use high-level ORMs that hide the wire protocol, but the protocol is where the magic (or the bottleneck) happens.

When you send a parameterized query to Postgres, it’s not just one message. It’s a sequence:

* Parse: Turns the query string into a prepared statement.
* Bind: Plugs your variables into that statement.
* Describe: Asks the server "what does the output look like?"
* Execute: Runs the query.
* Sync: Marks the end of the batch; the server closes the implicit transaction and responds with ReadyForQuery.

In a non-pipelined world, the client sends one set of these messages, follows it with a Sync, and waits for the server's ReadyForQuery before issuing the next statement. In Pipeline Mode, the client fills the outgoing network buffer with multiple sets of these messages and defers the Sync to the end of the batch.

Practical Implementation: Pipelining in the Wild

You won't find "Pipeline Mode" in every driver. Many older libraries are architected around a strict request-response loop that makes pipelining difficult to implement. However, if you're in the Go ecosystem using pgx or using libpq in C/C++, you have first-class access to this.

Here’s a look at how you’d actually write this in Go using pgx. I’ve found that pgx has one of the cleanest implementations of this concept.

package main

import (
	"context"
	"fmt"
	"github.com/jackc/pgx/v5"
	"os"
)

func main() {
	ctx := context.Background()
	conn, err := pgx.Connect(ctx, "postgres://user:pass@localhost:5432/db")
	if err != nil {
		fmt.Fprintf(os.Stderr, "connect failed: %v\n", err)
		os.Exit(1)
	}
	defer conn.Close(ctx)

	// We want to perform several operations without waiting for each one.
	batch := &pgx.Batch{}

	// Query 1: An insert
	batch.Queue("INSERT INTO users (name, email) VALUES ($1, $2)", "Alice", "alice@example.com")

	// Query 2: An update to a different table
	batch.Queue("UPDATE inventory SET stock = stock - 1 WHERE item_id = $1", 101)

	// Query 3: A logging operation
	batch.Queue("INSERT INTO audit_log (action, user_id) VALUES ($1, $2)", "purchase", 1)

	// Send all three queries in one network trip.
	// This is the "cheat code" moment.
	br := conn.SendBatch(ctx, batch)
	defer br.Close()

	// Now we collect the results. Even though we sent them together,
	// we still handle the responses sequentially in our code.
	if _, err = br.Exec(); err != nil { // Result of Query 1
		fmt.Fprintf(os.Stderr, "Query 1 failed: %v\n", err)
	}
	if _, err = br.Exec(); err != nil { // Result of Query 2
		fmt.Fprintf(os.Stderr, "Query 2 failed: %v\n", err)
	}
	if _, err = br.Exec(); err != nil { // Result of Query 3
		fmt.Fprintf(os.Stderr, "Query 3 failed: %v\n", err)
	}
}

In this example, conn.SendBatch flushes all three queries to the network socket at once. If your RTT is 20ms, this entire block takes roughly 21ms (RTT + processing), whereas standard execution would take 60ms+.

Why Not Just Use a Single Transaction?

I hear this a lot: "Can't I just wrap these in BEGIN and COMMIT?"

Transactions are for atomicity and consistency, but they don't inherently solve the latency problem. If you start a transaction and then execute five separate INSERT statements using a standard driver, you are still doing five round trips. You’ve just ensured that if the fourth one fails, the first three roll back.

Pipelining and Transactions are orthogonal. You can (and usually should) pipeline *inside* a transaction.

tx, err := conn.Begin(ctx)
if err != nil {
	return err
}
defer tx.Rollback(ctx)

batch := &pgx.Batch{}
batch.Queue("INSERT ...")
batch.Queue("INSERT ...")

// Close drains the pipeline; check its error before committing.
br := tx.SendBatch(ctx, batch)
if err := br.Close(); err != nil {
	return err
}

return tx.Commit(ctx)

By combining these, you get the safety of the transaction with the speed of the pipeline.

The Performance Reality Check

Let’s look at some numbers. I ran a small test against a Postgres instance with a simulated 25ms network delay (typical for cross-region traffic). I performed 10 sequential INSERT operations.

* Sequential (Standard): ~510ms (10 trips * 50ms round trip + processing)
* Pipelined (Batch): ~52ms (1 trip * 50ms round trip + processing)

That is a 10x improvement in performance simply by changing how we use the protocol. The database did the same amount of work. The CPU usage on the DB was identical. We just stopped wasting time waiting for the wire.

This is particularly transformative for logic that can't be easily converted into a single bulk SQL statement. Sometimes your second query depends on the result of the first, but you know you’re going to run five queries regardless. While you can't pipeline queries that depend on *values* returned by previous queries in the same batch, you can pipeline "independent-but-related" operations that you'd otherwise run one by one.

The "What Could Go Wrong?" Section

Pipelining isn't a free lunch. There are three main things that will trip you up:

1. Error Handling (The "Abort" behavior)

When a query inside a pipeline fails, Postgres aborts the rest of it: every subsequent message is skipped until the server sees the next Sync. So if the second query in a batch of ten fails, the remaining eight never execute, and br.Exec() for queries 3 through 10 will all return errors. This is generally what you want, but your code has to be prepared for it.

2. Memory Pressure

If you queue up 50,000 queries in a single pipeline, your client library has to buffer all of those in memory before sending them. Similarly, the Postgres server has to buffer the results. I’ve seen developers crash their application containers by trying to pipeline massive data migrations instead of using COPY. Pipelining is for latency-sensitive operational work, not for moving 10GB of data.

3. The "Dependency" Trap

You cannot pipeline a query that needs the ID generated by the query right before it—*unless* you use some clever SQL.

For example, this won't work in a single pipeline if you're trying to use the ID in application code:
1. INSERT INTO users ... RETURNING id
2. INSERT INTO profiles (user_id) VALUES (<id from query 1>)

To pipeline this, you have to shift the logic into SQL using WITH (Common Table Expressions):

WITH new_user AS (
  INSERT INTO users (name) VALUES ('Bob') RETURNING id
)
INSERT INTO profiles (user_id) SELECT id FROM new_user;

When to Reach for the Pipeline

I’ve started using pipelining as a default pattern for any operation that requires more than two touches to the database.

* Side Effects: If you need to update a primary record and then kick off several audit or logging entries.
* Complex Forms: When a single UI action maps to four different tables that aren't perfectly normalized.
* Microservices: If your service is a gateway that aggregates data from a single DB but multiple tables, pipelining the fetches can shave significant time off your p99 response times.

The Postgres wire protocol is one of the most robust pieces of engineering in the software world. Most of us are only using about 10% of its capability. We treat it like a simple text-in, rows-out interface, but it's much more capable.

Next time you’re looking at a slow endpoint, stop looking at the EXPLAIN ANALYZE for a second. Look at the network tab. If you see a sequence of small, fast queries happening one after another, you don't have a database problem. You have a protocol problem.

Pipe it.