
What Nobody Tells You About the Visibility Map: Why Your 'Index-Only Scan' Is Still Hitting the Disk
A technical deep dive into why your covering indexes fail to prevent heap access and how to tune autovacuum to guarantee zero-I/O queries.
Have you ever spent hours crafting the "perfect" covering index, only to watch your production logs show thousands of random I/O reads against the main table heap anyway?
It’s one of the most frustrating experiences in Postgres performance tuning. You’ve done everything right. You identified the slow query, you saw it was doing a Sequential Scan, and you added an index that includes every single column in the SELECT clause. You expect an Index-Only Scan—the holy grail of Postgres performance—where the database never has to touch the heavy, bloated table data (the "heap") and gets everything it needs from the lean, sorted index.
But then you run EXPLAIN (ANALYZE, BUFFERS) and see that dreaded line:
Heap Fetches: 4329
Wait, if it’s an *Index-Only* Scan, why on earth is it fetching thousands of rows from the heap? The answer lies in a tiny, often overlooked structure called the Visibility Map. If you don't understand how this map works, your covering indexes are basically just expensive decorative items for your storage volume.
The MVCC Blind Spot
To understand why Index-Only Scans fail, we have to talk about how Postgres handles concurrency. Postgres uses Multi-Version Concurrency Control (MVCC). When you update a row, Postgres doesn't actually overwrite the old data in place. Instead, it marks the old version as "deleted" and inserts a brand-new version of the row.
This is great for performance because readers never block writers. But it creates a massive problem for indexes.
Postgres indexes do not store visibility information.
In the index, a pointer exists for every version of a row that hasn't been vacuumed yet. The index knows the value is 42, but it has no idea if the transaction that created that 42 has actually committed, or if another transaction has since deleted it. Only the "heap" (the main table) knows the truth, because the heap stores the xmin and xmax metadata—the transaction IDs that tell Postgres if a row version is visible to you right now.
Historically, this meant Postgres *always* had to visit the heap to check visibility, even if the index had all the data you needed.
Enter the Visibility Map
In Postgres 9.2, the developers introduced a clever workaround: The Visibility Map (VM).
The Visibility Map is a simple bitmask stored on disk alongside your table. It uses two bits for every page (an 8KB block of data) in the heap.
1. Bit 1: Is every row on this page visible to all current and future transactions? (The "All-Visible" bit).
2. Bit 2: Is every row on this page "frozen"? (The "All-Frozen" bit, mostly used to speed up VACUUM FREEZE).
When you run an Index-Only Scan, Postgres checks the Visibility Map first. If the "All-Visible" bit for a page is set to 1, Postgres knows that every pointer in the index referring to that page is definitely valid. It can skip the heap entirely.
If that bit is 0, Postgres cannot guarantee that the data in the index is actually "live." It has to take a trip to the heap to check the xmin and xmax for that specific row. This is what a "Heap Fetch" is.
Seeing the Failure in Action
Let's look at a concrete example. Suppose we have a table of user accounts.
CREATE TABLE users (
id SERIAL PRIMARY KEY,
username TEXT NOT NULL,
email TEXT NOT NULL,
last_login TIMESTAMP
);
-- Fill it with 1 million rows
INSERT INTO users (username, email, last_login)
SELECT
'user_' || i,
'user_' || i || '@example.com',
now() - (random() * interval '30 days')
FROM generate_series(1, 1000000) s(i);
-- Create a covering index for a specific query
CREATE INDEX idx_users_username_email ON users(username, email);
-- Ensure the stats are up to date and the Visibility Map is populated
VACUUM ANALYZE users;Now, let's run a query that should, in theory, be an Index-Only Scan.
EXPLAIN (ANALYZE, BUFFERS)
SELECT username, email FROM users WHERE username = 'user_500000';In the output, you’ll likely see something like this:
Index Only Scan using idx_users_username_email on users (cost=0.42..8.44 rows=1 width=44) (actual time=0.021..0.022 rows=1 loops=1)
Index Cond: (username = 'user_500000'::text)
Heap Fetches: 0
Buffers: shared hit=4Heap Fetches: 0. Success. But watch what happens when we start messing with the data.
-- Update a few rows NEAR our target
UPDATE users SET last_login = now() WHERE id BETWEEN 499900 AND 500100;
-- Run the query again
EXPLAIN (ANALYZE, BUFFERS)
SELECT username, email FROM users WHERE username = 'user_500000';Now look at the results:
Index Only Scan using idx_users_username_email on users (cost=0.42..8.44 rows=1 width=44) (actual time=0.045..0.047 rows=1 loops=1)
Index Cond: (username = 'user_500000'::text)
Heap Fetches: 1
Buffers: shared hit=5 read=1Even though we only updated one row, the entire *page* that row lives on has been marked as "not all-visible" in the Visibility Map. The next time we query any row on that page via the index, Postgres has to go to the disk (or at least the buffer cache) to check the heap.
If you have a high-churn table, your Visibility Map is constantly being shredded, and your Index-Only Scans are essentially becoming regular Index Scans with extra steps.
The "Dirty" Truth About Autovacuum
The only thing that sets the All-Visible bit in the Visibility Map is VACUUM.
This is the part that nobody tells you: Index-Only Scans are only as effective as your Autovacuum settings. If your table is being updated frequently and Autovacuum is lagging behind, your "Covering Index" is a lie.
By default, Autovacuum is quite conservative. It triggers when 20% of a table has been changed. On a 100-million-row table, that’s 20 million changes before the Visibility Map gets a refresh. During those 20 million changes, your Index-Only Scans are hitting the heap for almost every query.
To fix this, we need to make Autovacuum more aggressive on tables where we rely on Index-Only Scans.
Tuning for Visibility
You can tune Autovacuum at the table level. For a table where I need guaranteed Index-Only Scan performance, I’ll often drop the vacuum_scale_factor significantly.
ALTER TABLE users SET (
autovacuum_vacuum_scale_factor = 0.01, -- Vacuum every 1% of changes
autovacuum_vacuum_cost_limit = 1000, -- Give it more "budget" to work faster
autovacuum_vacuum_cost_delay = 10 -- Don't sleep as much between chunks
);By dropping the scale factor to 1%, we ensure the Visibility Map is updated much more frequently. This keeps the "Heap Fetches" low and the "Index-Only Scans" actually "Index-Only."
How to Audit Your Visibility Map
If you’re suspicious that your Visibility Map is the bottleneck, you don't have to guess. Postgres provides an extension called pg_visibility.
CREATE EXTENSION pg_visibility;
-- See how many pages are NOT all-visible in our table
SELECT count(*)
FROM pg_visibility('users')
WHERE NOT all_visible;If that count is high relative to the total number of pages in your table, your Index-Only Scans are underperforming. You can even see which specific rows are causing the "dirtiness."
SELECT blkno, all_visible
FROM pg_visibility('users')
LIMIT 10;The "Append-Only" Misconception
You might think, "I only insert data into this table, I never update it. So my Visibility Map should always be clean, right?"
Actually, no. When you INSERT a row, the page is not immediately marked as All-Visible. Why? Because the transaction that inserted the row is still "in progress" until it commits. Even after it commits, the Visibility Map doesn't update itself magically. VACUUM still needs to run to scan those new pages and verify that there are no uncommitted transactions left.
For high-volume logging tables where you do 100% inserts, you might notice that Index-Only Scans on the "tail" of the data (the most recent entries) always show Heap Fetches. This is because Autovacuum hasn't caught up to the end of the table yet.
For these scenarios, I often recommend a more frequent autovacuum_vacuum_insert_threshold or even a manual VACUUM scheduled during off-peak hours if the data pattern is predictable.
When Index-Only Scans Are Still the Wrong Choice
I've seen developers go overboard with covering indexes, adding 10 columns to an index just to avoid a heap fetch. You have to remember the trade-off.
Every column you add to an index:
1. Bloats the index: Larger indexes take longer to scan and consume more of your shared_buffers.
2. Slows down writes: Every INSERT and UPDATE now has to update a much larger index structure.
3. Increases "HOT" update failure: Postgres has an optimization called "Heap Only Tuple" (HOT) updates. If you update a row and *none* of the indexed columns change, Postgres can avoid updating the index. If you include every column in your index, *every* update becomes an index update.
If your query returns 5,000 rows, a "Heap Fetch" for each one is a disaster. But if your query returns 1 or 2 rows, a single heap fetch is almost free. Don't sacrifice your write performance and disk space to fix a "Heap Fetch" that isn't actually hurting your latency.
The Checklist for Index-Only Scans
If you’re looking at an EXPLAIN plan and wondering why your Index-Only Scan is slow, follow this checklist:
1. Check the Buffer usage: Run EXPLAIN (ANALYZE, BUFFERS). If shared hit is high but Heap Fetches is also high, the Visibility Map is your problem.
2. Look at the Map: Use pg_visibility to see how much of your table is "dirty."
3. Check Autovacuum Health: Check pg_stat_user_tables to see when the last autovacuum ran on that table.
`sql
SELECT relname, last_autovacuum, last_vacuum
FROM pg_stat_user_tables
WHERE relname = 'users';
`
4. Tune the Table: Decrease autovacuum_vacuum_scale_factor for that specific table.
5. Consider FILLFACTOR: If you have frequent updates, set a FILLFACTOR (e.g., 80 or 90) on the table. This leaves free space on each page for updates, which helps keep updates "local" and can sometimes interact better with how vacuum cleans up pages.
Summary
The Visibility Map is the invisible glue that makes modern Postgres indexing work. It is the bridge between the version-heavy reality of MVCC and the high-speed requirements of modern applications.
An Index-Only Scan isn't a guarantee; it's a request. Postgres will fulfill that request if—and only if—your maintenance routine is keeping the Visibility Map clean. Stop looking only at your index definitions and start looking at how your table is being cleaned. Your disk I/O will thank you.


