
Stop Writing Transient Data to the WAL: Why Postgres UNLOGGED Tables Are the Secret to High-Velocity Staging
Learn how to bypass the Write-Ahead Log for temporary data to achieve massive I/O gains without the overhead of an external caching layer.
I used to watch my disk I/O metrics during massive data imports and wonder why my database felt like it was trying to run through a pool of molasses. I had the RAM, I had the NVMe drives, yet every INSERT felt heavy and sluggish. It took me way too long to realize I was forcing Postgres to meticulously record every single byte of temporary staging data into the Write-Ahead Log (WAL) like it was a sacred historical text. Once I realized that I didn't actually care if that transient data survived a sudden power failure, I stopped treating it like permanent data, and my ingestion speed practically tripled.
The WAL Tax You Didn't Know You Were Paying
By default, Postgres is obsessed with durability. When you insert a row, it doesn't just write to the table (the "heap"); it first writes that change to the Write-Ahead Log. This is a brilliant safety mechanism. If your server explodes, Postgres looks at the WAL upon reboot to figure out what it missed.
But here is the rub: Writing to the WAL takes time and disk I/O.
If you are pulling 10 million rows from an S3 bucket into a staging table, transforming them, and then moving them to a final production table, you are writing that data twice. You're paying the "WAL Tax" on data that you’re probably going to delete in five minutes anyway.
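You can put a number on that tax. This psql sketch (the table name and row count are hypothetical) uses pg_current_wal_lsn() and pg_wal_lsn_diff() to measure how much WAL a single bulk insert generates:

```sql
-- Snapshot the WAL position before the load (psql \gset stores it in a variable)
SELECT pg_current_wal_lsn() AS before_lsn \gset

-- Hypothetical bulk insert into an ordinary (logged) table
INSERT INTO staging_events_logged (raw_payload)
SELECT jsonb_build_object('n', g) FROM generate_series(1, 1000000) AS g;

-- Bytes of WAL generated by the statement above
SELECT pg_size_pretty(
    pg_wal_lsn_diff(pg_current_wal_lsn(), :'before_lsn')
) AS wal_written;
```

Run the same measurement against an UNLOGGED table and the reported WAL volume drops to almost nothing, because the row data itself never touches the log.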
The Solution: UNLOGGED Tables
Postgres has a specific feature for exactly this scenario. It's the UNLOGGED keyword. When you create a table as UNLOGGED, you’re telling Postgres: "Hey, don't worry about the WAL for this one. If the database crashes, I’m fine with losing the data in this specific table."
Here is how you do it:
CREATE UNLOGGED TABLE staging_events (
    id UUID PRIMARY KEY,
    raw_payload JSONB,
    created_at TIMESTAMP DEFAULT now()
);

That's it. One word. No extra infrastructure, no Redis sidecar, no complicated caching logic. Just a regular table that happens to be on steroids.
How much faster is it?
In my experience, for write-heavy workloads, you're looking at a 2x to 5x performance increase.
Because Postgres doesn't have to wait for the WAL flush to return a "success" to your application, the throughput is limited mostly by your CPU and the raw write speed of your disk.
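If you want to sanity-check that range on your own hardware, a minimal comparison looks like this (table names are made up; run it in psql with \timing so each statement reports its duration):

```sql
\timing on

CREATE TABLE t_logged (id int, payload text);
CREATE UNLOGGED TABLE t_unlogged (id int, payload text);

-- Identical workload against both; compare the reported times
INSERT INTO t_logged SELECT g, md5(g::text) FROM generate_series(1, 1000000) g;
INSERT INTO t_unlogged SELECT g, md5(g::text) FROM generate_series(1, 1000000) g;
```

The gap tends to widen when synchronous_commit is on and your WAL sits on slow storage, since that's exactly the wait the UNLOGGED table skips.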
If you already have a regular table and want to flip the switch, you don't even have to recreate it:
-- Speed up an existing table for a big batch job
ALTER TABLE heavy_lifting_table SET UNLOGGED;

-- Do your massive imports here...

-- If you want it to be permanent/safe again afterward
ALTER TABLE heavy_lifting_table SET LOGGED;

*Note: Changing a table from unlogged to logged (or vice versa) triggers a full rewrite of the table (and SET LOGGED also writes the entire table to the WAL), so don't do this every five seconds on a 500GB table.*
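If you're flipping tables back and forth, it helps to be able to audit which ones are currently unlogged. The relpersistence column in pg_class is 'u' for unlogged ('p' is permanent, 't' is temporary):

```sql
-- List every unlogged table in the database
SELECT relname
FROM pg_class
WHERE relpersistence = 'u'
  AND relkind = 'r';  -- ordinary tables only
```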
When should you actually use this?
Don't go making your users or orders tables unlogged. That's a recipe for a very bad Friday afternoon. Use UNLOGGED for:
1. ETL Staging: Moving data from a CSV/API into a middle-man table before cleaning it.
2. Session Storage: If your app stores sessions in Postgres and you don't mind users getting logged out if the DB crashes.
3. Intermediate Calculation Tables: If you're running complex reports that require building large temporary result sets.
4. Caches: Using Postgres as a key-value store where the data can be rebuilt from a source of truth.
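For that last use case, a sketch might look like this (table and key names are hypothetical). Since every row can be rebuilt from the source of truth, losing the table on a crash just means a cold cache:

```sql
CREATE UNLOGGED TABLE kv_cache (
    k          text PRIMARY KEY,
    v          jsonb,
    expires_at timestamptz
);

-- Upsert without paying the WAL tax on every write
INSERT INTO kv_cache (k, v, expires_at)
VALUES ('user:42', '{"name": "Ada"}', now() + interval '10 minutes')
ON CONFLICT (k) DO UPDATE
SET v = EXCLUDED.v, expires_at = EXCLUDED.expires_at;
```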
The "Gotchas" (The part you should actually read)
There is no such thing as a free lunch. Here are the three things that will bite you if you aren't careful:
1. The Crash Wipe
If the Postgres service crashes (a segfault, a power loss, a kill -9), all data in your UNLOGGED tables is automatically truncated. When the DB comes back up, the table structure remains, but the rows are gone. A clean shutdown and restart preserves the data; only crash recovery triggers the wipe. This is intentional: since there was no WAL, Postgres has no way to guarantee the data isn't corrupted, so it clears it for safety.
2. No Standby Replication
This is the big one. UNLOGGED tables do not replicate to read replicas. Since physical replication is driven by the WAL, and UNLOGGED tables don't write to the WAL, your replicas will have the table definition, but querying it on a hot standby fails with an error (Postgres refuses to read unlogged relations during recovery). If you rely on read replicas for scaling your staging data, this won't work.
3. Vacuum is still a thing
Even though you're bypassing the WAL, you aren't bypassing the way Postgres handles MVCC (Multi-Version Concurrency Control). If you are constantly updating or deleting rows in an UNLOGGED table, you still need to worry about bloat and autovacuum.
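For churn-heavy staging tables, you can tighten autovacuum per table instead of globally (the thresholds here are illustrative, not a recommendation):

```sql
-- Vacuum after ~1% of rows are dead instead of the default 20%
ALTER TABLE staging_events SET (
    autovacuum_vacuum_scale_factor = 0.01,
    autovacuum_analyze_scale_factor = 0.02
);

-- Or just vacuum explicitly at the end of each batch
VACUUM (ANALYZE) staging_events;
```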
A Practical Workflow Example
Here is how I typically use this in a data pipeline. Imagine we are importing a massive log file.
-- 1. Create the fast landing zone
CREATE UNLOGGED TABLE temp_log_ingest (
    data text,
    ingested_at timestamp DEFAULT now()
);

-- 2. Bulk load (this will be blazing fast)
-- Each raw line goes into the single "data" column; FORMAT text means
-- commas inside the line aren't treated as column delimiters here
COPY temp_log_ingest (data) FROM '/path/to/data.csv' WITH (FORMAT text);
-- 3. Run your transformations and move to the permanent home
INSERT INTO permanent_logs (event_type, user_id, payload)
SELECT
    (split_part(data, ',', 1))::int,
    (split_part(data, ',', 2))::uuid,
    -- naive split: assumes the JSON payload itself contains no commas
    (split_part(data, ',', 3))::jsonb
FROM temp_log_ingest;

-- 4. Clean up
TRUNCATE temp_log_ingest;

By keeping the "messy" work in the UNLOGGED space, you drastically reduce the I/O pressure on your database. Your WAL stays clean, your backups stay smaller, and your ingestion scripts finish before you have time to go grab a coffee.
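One refinement worth considering: run the move and the cleanup as a single transaction, so a failure mid-transform never leaves the pipeline half-done. TRUNCATE is transactional in Postgres and rolls back cleanly:

```sql
BEGIN;

INSERT INTO permanent_logs (event_type, user_id, payload)
SELECT
    (split_part(data, ',', 1))::int,
    (split_part(data, ',', 2))::uuid,
    (split_part(data, ',', 3))::jsonb
FROM temp_log_ingest;

TRUNCATE temp_log_ingest;

COMMIT;
```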
Is it worth it?
If you're struggling with I/O wait or your WAL partition is growing faster than a sourdough starter, yes. UNLOGGED tables are one of the most underutilized tools in the Postgres performance toolkit. They give you the speed of an in-memory store with the convenience of SQL, as long as you're okay with the data being "easy come, easy go."


