loke.dev
Header image for The Fillfactor Is a Performance Insurance Policy

The Fillfactor Is a Performance Insurance Policy

Learn how the Heap-Only Tuple (HOT) optimization turns intentional disk-space waste into your best defense against catastrophic table bloat.

· 9 min read

Wasting disk space on purpose is one of the smartest things you can do for a high-traffic database. It sounds counter-intuitive—we are trained to normalize, compress, and prune—but in PostgreSQL, leaving "empty" holes in your data pages is the only way to prevent your indexes from turning into a fragmented, bloated mess.

If you leave your table's fillfactor at the default of 100, you are essentially telling Postgres: "Pack every page to the brim; I don't care if updates become five times more expensive later." For read-only data, that’s fine. For a table that sees any significant volume of UPDATE statements, it’s a recipe for a performance collapse that usually happens at the worst possible time.

The Original Sin of PostgreSQL Updates

To understand why we need to waste space, we have to look at how Postgres handles updates. Postgres uses Multi-Version Concurrency Control (MVCC). This means it never actually "edits" a row in place.

When you run an UPDATE, Postgres marks the old row (the tuple) as dead and inserts a brand-new version of that row. In the eyes of the storage engine, an UPDATE is just a DELETE followed by an INSERT.

This creates a massive secondary problem: Index Maintenance.

Every row in a table has a physical address called a TIDs (Tuple Identifier), which looks like (page_number, item_offset). Your indexes don’t store the data; they store the value and a pointer to that TID. When a row moves to a new page because the old page was full, every single index on that table must be updated to point to the new location.

If you have a table with 10 indexes and you update one non-indexed column, you might expect a tiny bit of I/O. Instead, Postgres has to write the new row version and then update all 10 indexes. This is called Write Amplification, and it is the silent killer of database performance.

Enter the Heap-Only Tuple (HOT)

In version 8.3, the Postgres team introduced a "god-mode" optimization called Heap-Only Tuples, or HOT.

The goal of HOT is simple: if the new version of a row can fit on the exact same physical page as the old version, and none of the columns being updated are part of an index, Postgres can skip the index updates entirely.

Instead of updating the index, Postgres creates a "redirect" inside the page. The index still points to the old TID. When Postgres follows that pointer, it sees a little signpost that says, "Hey, the data you're looking for actually moved three inches to the left on this same page."

This is a massive win. You save the I/O of updating indexes, you reduce WAL (Write Ahead Log) volume, and you drastically reduce index bloat.

But there is a catch. HOT only works if there is room on the page.

Fillfactor: The Insurance Policy

By default, the fillfactor for a Postgres table is 100. This means when you INSERT data or VACUUM FULL a table, Postgres packs the 8KB pages as tightly as possible.

If a page is 100% full, the very first UPDATE to any row on that page *must* move the new version to a different page. You’ve just broken the HOT optimization before it even had a chance to work.

By setting a fillfactor of, say, 90, you are telling Postgres to leave 10% of every page empty during initial inserts. You are prepaying a "space tax" to ensure that when you update those rows later, there is a landing pad available on the same page.

How to Detect if You’re Missing Out

Before you start messing with your table settings, you need to know if you're actually benefiting from HOT updates. Postgres tracks this in the pg_stat_user_tables view.

Run this query to see your hit rate:

SELECT 
    relname AS table_name, 
    n_tup_upd AS total_updates, 
    n_tup_hot_upd AS hot_updates,
    CASE 
        WHEN n_tup_upd > 0 
        THEN (n_tup_hot_upd::float / n_tup_upd::float) * 100 
        ELSE 0 
    END AS hot_ratio
FROM 
    pg_stat_user_tables
WHERE 
    n_tup_upd > 0
ORDER BY 
    hot_ratio ASC;

If you see a table with a high number of updates but a hot_ratio near zero, you are bleeding performance. Every update is forcing a cross-page move and triggering a cascade of index writes.

Setting the Fillfactor

You can set the fillfactor at table creation:

CREATE TABLE orders (
    id uuid PRIMARY KEY,
    status text,
    updated_at timestamptz,
    payload jsonb
) WITH (fillfactor = 80);

Or, more likely, you'll want to adjust it on an existing table:

ALTER TABLE orders SET (fillfactor = 80);

Wait! Running that ALTER TABLE statement does not magically move your existing data to create gaps. It only affects *future* inserts. To apply the fillfactor to existing data, you have to rewrite the table.

The "nuclear" option is VACUUM FULL, but that takes an exclusive lock (meaning your app can't read or write to it). A better way is to use pg_repack or simply wait for the table to naturally churn through its data if it's a high-velocity table.

If you can afford the brief lock, you can also use:

CLUSTER orders USING orders_pkey;

This rewrites the table and respects the new fillfactor.

Real-World Example: The "Status" Column Nightmare

Imagine a tasks table. You have an id, a name, a description, and a status. You probably have an index on id and status.

You update the description (not indexed) frequently.

1. With Fillfactor 100: Every time you update the description, the row moves to a new page. The index on id and the index on status must both be updated. Even though the status didn't change, its pointer (the TID) did.
2. With Fillfactor 80: The updated row stays on the same page. The indexes on id and status remain untouched. The CPU and Disk I/O required for the update drop by 60-80%.

The "Broken" HOT Update

It is vital to remember: HOT updates cannot happen if you update a column that is part of an index.

If I have an index on (user_id, status) and I update status, HOT is off the table for that specific write. The index entry itself must change because the value changed. This is why you should be surgical with your indexing. Don't index everything "just in case." Every index you add is a potential anchor that prevents HOT from saving your life.

Finding the Sweet Spot

What should your fillfactor be? There’s no magic number, but here are my rules of thumb:

- 100 (Default): For static tables, logs, or append-only data. If you never UPDATE, don't waste the space.
- 90-95: For tables with light update activity. This provides just enough breathing room for the occasional row versioning.
- 70-80: For "hot" tables. If a row is updated multiple times throughout its lifecycle (like a shopping_cart or user_session), 70 is often better.
- < 50: Rarely necessary and usually indicates a schema design problem. If you need this much space, your rows are probably too wide.

The Cost of Insurance

Nothing is free. When you lower the fillfactor, you are essentially increasing the size of your table on disk.

- Memory Pressure: Larger tables mean fewer rows fit into the shared_buffers (Postgres's internal cache).
- Scan Speed: Sequential scans will take longer because they have to read more (partially empty) pages to get the same amount of data.
- Disk Space: Obviously, your AWS bill or hardware requirements go up.

However, in the context of modern infrastructure, disk space is cheap. IOPS and CPU cycles are expensive. Bloated indexes that lead to "Stop the World" vacuuming events are *extremely* expensive.

Monitoring Bloat Post-Optimization

Once you've adjusted your fillfactor, you should monitor the "bloat" of your indexes. Since HOT updates reduce the number of new index entries, your index growth should slow down significantly.

I use this snippet to check for index bloat (it's a rough approximation, but it works):

SELECT
  current_database(), schemaname, tablename, /* etc */
  ROUND(CASE WHEN otta=0 THEN 0 ELSE ipl_fillfactor*relpages/otta END,1) AS ipl_bloat,
  CASE WHEN relpages < otta THEN 0 ELSE relpages-otta END AS ipl_is_bloat
FROM (
  /* This is a complex internal query, usually provided by scripts like check_postgres.pl */
) AS foo;

Actually, skip the complex SQL for a second. The easiest way to see if it’s working is to watch your n_tup_hot_upd again. If that ratio moves from 5% to 90%, you've won.

Is it worth it for small tables?

You might think, "My table is only 500MB, who cares about bloat?"

I care. Because 500MB is small enough to fit in RAM, but if that 500MB table has 2GB of bloated, fragmented indexes, your cache hit rate will tank. The database will start hitting the disk for index lookups that *should* have been in memory. Small tables become slow tables when they are messy.

Summary Checklist for Implementation

If you're looking at a table and wondering if you should pull the fillfactor trigger, follow this mental model:

1. Check the HOT ratio: n_tup_hot_upd / n_tup_upd.
2. Check the Update frequency: Is this table getting hundreds of updates per second?
3. Check the Index count: Does the table have 5+ indexes? (More indexes = higher penalty for non-HOT updates).
4. Identify the columns updated: Are you mostly updating non-indexed columns? (If yes, fillfactor is your best friend).
5. Test and Apply: Start with 90. If the HOT ratio doesn't improve enough, try 80.

A Final Word on "Wasted" Space

We spend so much time trying to make systems efficient that we sometimes forget that slack is a feature.

A highway that is 100% full is a parking lot. A filesystem that is 100% full is a disaster. A Postgres page that is 100% full is a performance bottleneck waiting to happen.

Treat fillfactor as your buffer. It’s the breathing room your database needs to handle the churn of everyday life without collapsing under the weight of its own indexes. It isn't wasted space; it’s an insurance policy against the inevitable overhead of MVCC.

Go check your hot_ratio. Your future self will thank you when the database doesn't lock up during your next traffic spike.