
Fixing Core Web Vitals Regressions After a Code Deployment

Stop guessing why your Core Web Vitals dropped. Learn to isolate LCP, CLS, and INP regressions using field data versus Lighthouse to fix performance issues fast.


Lighthouse scores are vanity metrics. If you’re optimizing for the green badge in local DevTools while your Search Console traffic is bleeding out, you’re looking at the wrong data.

I spent three weeks chasing a Core Web Vitals regression last quarter. The lab score was a perfect 100. The field data in the Chrome User Experience Report was a catastrophe. Users on budget Android devices were bouncing at a rate 40% higher than the previous month. Here is how we mapped that gap and fixed a production disaster.

The Lighthouse Trap

Your Lighthouse score can show 100 while production fails for one reason: Lighthouse is a simulation. It runs in a controlled sandbox, typically on a fast developer machine, and applies simulated throttling that doesn't replicate real-world network contention and device jank.

When we pushed the release that triggered our last regression, the team looked at the Lighthouse CI report. It showed a passing score. But the field data, pulled via the PageSpeed Insights API, showed our Largest Contentful Paint (LCP) ballooning from 1.8s to 3.2s for the 75th percentile of mobile users.
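That "75th percentile" matters: field tools report the p75 of all user samples, not an average, which is why a handful of fast sessions can't mask a slow long tail. A minimal sketch of what that aggregation means (the sample values are hypothetical; CrUX does this server-side):

```javascript
// Sketch: p75 aggregation over field samples (values are hypothetical).
// Sort the samples, then take the value at the 75th-percentile rank.
function percentile(samples, p) {
  const sorted = [...samples].sort((a, b) => a - b);
  const index = Math.ceil((p / 100) * sorted.length) - 1;
  return sorted[Math.max(0, index)];
}

const lcpSamplesMs = [1200, 1800, 2400, 3100, 3300, 3500, 4200, 5000];
console.log(percentile(lcpSamplesMs, 75)); // → 3500
```

Three fast sessions in that list do nothing for the reported number; only fixing the slow tail moves p75.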

The lab environment wasn't executing the third-party marketing tags that triggered when a user had a specific cookie set. It also ignored the background processing task we introduced in our new analytics service. Lighthouse sees the skeleton. Your users see the meat.

Troubleshooting LCP After Third-Party Updates

Our LCP went from 1.8s to 3.2s after we updated a third-party chat widget. Before the update, the LCP element was our hero image, rendered via an optimized img tag. After it, the chat widget's iframe container became the LCP element.

The fix wasn't removing the widget. The fix was prioritizing the browser's view of the critical path. We moved from a standard script tag to an intersection-observer-based lazy load. Instead of the browser fighting for bandwidth between our hero image and the heavy chat widget, we forced the widget to wait.

// Before: Loaded eagerly, competing with the hero image for bandwidth
<script src="https://third-party-chat.js" async></script>

// After: Load only when the chat container scrolls into view
const chatObserver = new IntersectionObserver((entries) => {
  if (entries[0].isIntersecting) {
    const script = document.createElement('script');
    script.src = 'https://third-party-chat.js';
    script.async = true;
    document.body.appendChild(script);
    chatObserver.disconnect();
  }
});
chatObserver.observe(document.querySelector('#chat-container'));

By deferring that script, our LCP returned to 1.9s within the next CrUX window. If you aren't seeing improvements, check your waterfall in the Network tab. If a third-party script has a higher priority than your image, your LCP will never be stable.

The Cumulative Layout Shift Fix for Dynamic Content

We had a persistent CLS score of 0.18. The culprit was a font loader that swapped fonts after the layout had already painted.

Conventional advice pushes font-display: swap. That’s bad advice. It forces a layout shift the moment the system font is replaced by your brand font because the line heights and glyph widths rarely match perfectly.

Instead of waiting for the font to load, we calculated the aspect ratio of the text block and reserved space for it using a placeholder with a similar ratio.

/* Before: font-display: swap causes a shift */
.body-text {
  font-family: 'BrandFont', sans-serif;
}

/* After: Reserve the text block's space so the swap can't shift layout */
.body-text {
  font-family: 'BrandFont', sans-serif;
  font-size: 16px;
  line-height: 1.5;
  min-height: 1.5em; /* matches the line-height so the box can't collapse */
  background-color: #f0f0f0; /* temporary, to visualize the reserved box while debugging */
  display: block;
}

By locking the min-height, the container remained fixed regardless of whether the font was the system fallback or the custom web font. Our CLS dropped from 0.18 to 0.04 overnight.
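To see a number like that move without waiting on field data, you can watch layout-shift entries locally. A simplified sketch (real CLS groups shifts into session windows with 1s gaps and a 5s cap; this version just sums shifts not caused by user input):

```javascript
// Simplified CLS accumulator: sum layout-shift values, ignoring
// shifts that happened right after user input (those don't count).
// Note: production CLS uses session windows; this is an approximation.
function cumulativeShift(entries) {
  return entries
    .filter((entry) => !entry.hadRecentInput)
    .reduce((sum, entry) => sum + entry.value, 0);
}

// Browser wiring (commented so the sketch stays self-contained):
// new PerformanceObserver((list) => {
//   console.log('CLS so far:', cumulativeShift(list.getEntries()));
// }).observe({ type: 'layout-shift', buffered: true });
```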

Identifying INP Bottlenecks

Interaction to Next Paint (INP) replaced FID. FID only measured the input delay of the first interaction. INP measures the latency of interactions across the entire page lifecycle and reports roughly the worst one.

When we saw our INP spike to 450ms, we found the "Add to Cart" button was the culprit. It wasn't the API call that was slow. It was the JavaScript task updating the cart UI. We had a massive forEach loop inside the event handler that re-calculated the entire cart price whenever a single item was updated.

We identified the block using the Long Tasks API.

const observer = new PerformanceObserver((list) => {
  for (const entry of list.getEntries()) {
    console.log('Long Task detected:', entry.duration);
    console.log('Attribution:', entry.attribution); // which frame/container hosted the task
  }
});
// `buffered: true` replays long tasks that fired before the observer existed
observer.observe({ type: 'longtask', buffered: true });

We refactored the cart logic to only update the DOM node for the item changed. We stopped re-rendering the entire list. The INP dropped from 450ms to 95ms.
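The delta-update idea can be sketched like this (the cart shape and names are hypothetical, not our production code): instead of re-summing every item on each change, adjust the running total by the difference for the one item that changed, then touch only that item's DOM node.

```javascript
// Delta update: O(1) per change instead of a full-cart forEach.
// Cart shape and selectors are illustrative assumptions.
function updateItemQuantity(cart, itemId, newQty) {
  const item = cart.items.find((i) => i.id === itemId);
  if (!item) return cart.total;
  const oldLineTotal = item.price * item.qty;
  item.qty = newQty;
  const newLineTotal = item.price * newQty;
  cart.total += newLineTotal - oldLineTotal;
  // In the browser, update only the changed row, e.g.:
  // document.querySelector(`[data-item="${itemId}"] .line-total`)
  //   .textContent = newLineTotal.toFixed(2);
  return cart.total;
}
```

The event handler now does a constant amount of work per click, so the interaction's processing time stays well under the 200ms INP "good" threshold.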

Validating Performance Fixes Without 28-Day Lag

The biggest pain in handling Core Web Vitals regressions is the 28-day rolling window in Google Search Console. You make a fix and wait a month. If it fails, you’ve lost 30 days of SEO authority.

Don't wait for Google to tell you if you won. Implement Real User Monitoring (RUM). Use a lightweight script that captures LCP, CLS, and INP as performance entries and beacons them back to your own analytics endpoint.

If your RUM data shows a positive trend in the 24 hours following your deploy, your GSC data will eventually follow. If it doesn't, you can ship another fix within a day, long before the regression bakes into the 28-day aggregate.
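A minimal hand-rolled sketch of that beaconing, assuming a `/rum` endpoint on your own backend (in production the GoogleChrome web-vitals library is the usual choice, since it handles session windows and page-unload edge cases):

```javascript
// Minimal RUM sketch. The /rum endpoint and payload shape are
// assumptions for illustration, not a standard.
function buildBeacon(metric, value, page) {
  return JSON.stringify({
    metric,                     // 'LCP' | 'CLS' | 'INP'
    value: Math.round(value),   // milliseconds (or score * 1000 for CLS)
    page,
    ts: Date.now(),
  });
}

// Browser wiring (commented so the sketch stays self-contained):
// new PerformanceObserver((list) => {
//   const last = list.getEntries().at(-1); // latest LCP candidate wins
//   navigator.sendBeacon('/rum', buildBeacon('LCP', last.startTime, location.pathname));
// }).observe({ type: 'largest-contentful-paint', buffered: true });
```

`navigator.sendBeacon` is the right transport here because it survives page unload, unlike a plain fetch fired from a pagehide handler.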

Why Your "Fix" Might Be a False Positive

Be careful with preload tags. I see developers preloading every image on the page in a desperate attempt to boost LCP. When you preload everything, you preload nothing. You end up starving the main document of bandwidth.

LCP is a moving target. If you preload a secondary image that isn't the LCP element, you are stealing bandwidth from the one resource that actually matters. Only preload the image that constitutes the LCP element, and set the fetchpriority attribute directly on that specific img tag rather than scattering preload links across the head.
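Only the LCP element should get elevated priority. A minimal sketch, with placeholder filenames:

```html
<!-- Only the LCP image gets priority; hero.jpg is a placeholder name.
     Explicit width/height also prevents a layout shift when it loads. -->
<img src="hero.jpg" fetchpriority="high" width="1200" height="600" alt="Hero">

<!-- If the image is discovered late (e.g. CSS background), a single
     targeted preload is the fallback, not one per image: -->
<link rel="preload" as="image" href="hero.jpg" fetchpriority="high">
```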

If you are struggling with a persistent regression, stop guessing. Open the Performance tab in your browser. Record a profile of a user interaction and look at the main thread usage. If you see a sea of red blocks, you have too much JavaScript execution. If you see white space, you have a network priority issue.

The green light in Lighthouse is a goal. The numbers in your analytics dashboard are the truth. Manage the truth, not the badge.
