Why most sites end up slow
Sites get slow the same way most projects fall apart: nobody is watching the number, so the number drifts.
A team ships a feature. The bundle goes from 180KB to 220KB. Nobody notices. Next sprint, somebody adds a date picker library, and the bundle goes from 220KB to 310KB. Nobody notices. Six months later, the home page takes four seconds to become interactive on a mid-range Android, and the team is in a meeting trying to figure out what happened.
What happened is they never set a budget. So this post is mostly about budgets, and then about the four or five techniques that actually move the number.
What a performance budget actually is
A performance budget is a number you commit to. That is the whole concept. Examples:
- JavaScript bundle for the main route stays under 150KB compressed.
- Largest Contentful Paint under 2.5 seconds on a 4G connection.
- Time to Interactive under 3.5 seconds.
- No single long task over 200ms during page load.
The number lives somewhere a CI job can read it. When a pull request pushes the number over the limit, the build fails. The author either makes the feature smaller, removes a dependency, or argues for a budget change. All three are useful conversations. The bad outcome is the silent drift.
A simple version with size-limit:
{
  "size-limit": [
    {
      "path": "dist/assets/index-*.js",
      "limit": "150 KB"
    }
  ]
}
Then run npm run size in CI, assuming a size script in package.json that calls size-limit. If the bundle grows past 150KB, the job fails. That is the whole budget. You do not need a fancy dashboard to start.
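If you want to see what a budget check boils down to, here is a minimal sketch in plain JavaScript. It is not how size-limit works internally; the BUDGETS table, the function names, and the 1 KB = 1024 bytes convention are all assumptions made for illustration. The idea is only: compare a measured number to a committed number, and report every route that is over.

```javascript
// Hypothetical budget table: route name -> max compressed size in bytes.
// Assumes 1 KB = 1024 bytes; your tooling may count differently.
const BUDGETS = {
  main: 150 * 1024,
};

// Compare measured sizes against budgets and list every violation.
function findBudgetViolations(measuredBytes, budgets) {
  const violations = [];
  for (const [name, limit] of Object.entries(budgets)) {
    const actual = measuredBytes[name];
    if (actual === undefined) {
      // A budget with no measurement is treated as a failure, not a pass.
      violations.push({ name, limit, actual: null });
    } else if (actual > limit) {
      violations.push({ name, limit, actual });
    }
  }
  return violations;
}

// Turn one violation into a message a pull request author can act on.
function formatViolation(v) {
  if (v.actual === null) {
    return `${v.name}: no measurement found`;
  }
  const overKB = ((v.actual - v.limit) / 1024).toFixed(1);
  const limitKB = (v.limit / 1024).toFixed(0);
  return `${v.name}: ${overKB} KB over the ${limitKB} KB budget`;
}
```

In a CI script, a non-empty result from findBudgetViolations would be printed with formatViolation and turned into a non-zero exit code. Treating a missing measurement as a failure is a deliberate choice here: a renamed output file should not quietly disable the budget.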
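For runtime metrics, Lighthouse CI does the same thing in the browser. It can assert on LCP directly. INP needs real user input, so in the lab you budget Total Blocking Time as a stand-in.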
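A Lighthouse CI budget can be sketched as a lighthouserc.js file like the one below. The URL and the run count are placeholders, and the Total Blocking Time and CLS thresholds are illustrative rather than recommendations. Only the 2.5 second LCP limit comes from the budget list above.

```javascript
// lighthouserc.js
module.exports = {
  ci: {
    collect: {
      // Placeholder URL: point this at wherever CI serves the built site.
      url: ['http://localhost:4173/'],
      numberOfRuns: 3,
    },
    assert: {
      assertions: {
        // Fail the job if LCP exceeds 2.5 seconds (value is in milliseconds).
        'largest-contentful-paint': ['error', { maxNumericValue: 2500 }],
        // Lab runs have no real interactions, so Total Blocking Time
        // stands in for responsiveness. 200 is an illustrative limit.
        'total-blocking-time': ['error', { maxNumericValue: 200 }],
        // Warn, but do not fail, on layout shift.
        'cumulative-layout-shift': ['warn', { maxNumericValue: 0.1 }],
      },
    },
  },
};
```

Starting with error on one or two metrics and warn on the rest keeps the first few pull requests from failing on everything at once.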
Long tasks: the thing that makes pages feel broken
When users click a button and nothing happens for half a second, that is usually a long task on the main thread. The browser has one main thread that handles JavaScript, layout, paint, and input. While JavaScript is running, the browser cannot respond to the click. Anything over 50ms is officially a "long task" and starts to feel laggy. Anything over 200ms feels broken.
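You can watch for long tasks on your own page with PerformanceObserver. This is a minimal sketch, assuming a browser that reports the longtask entry type (Chromium-based browsers do, and the feature check covers the rest). The function names and the 200ms "feels broken" cutoff are illustrative.

```javascript
// Summarize a batch of long-task entries. Each entry only needs a
// `duration` in milliseconds, which is what the browser provides.
function summarizeLongTasks(entries, brokenMs = 200) {
  let longestMs = 0;
  let brokenCount = 0;
  for (const entry of entries) {
    if (entry.duration > longestMs) longestMs = entry.duration;
    if (entry.duration > brokenMs) brokenCount++;
  }
  return { count: entries.length, longestMs, brokenCount };
}

// Start observing long tasks, calling `report` with a summary per batch.
// Returns the observer, or null where long tasks are not reported.
function watchLongTasks(report) {
  const supported =
    typeof PerformanceObserver !== 'undefined' &&
    (PerformanceObserver.supportedEntryTypes || []).includes('longtask');
  if (!supported) return null;

  const observer = new PerformanceObserver((list) => {
    report(summarizeLongTasks(list.getEntries()));
  });
  // buffered: true also delivers tasks recorded before observation began.
  observer.observe({ type: 'longtask', buffered: true });
  return observer;
}
```

Calling watchLongTasks(console.log) during development is usually enough to see which interactions produce the worst offenders.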
Common causes:
- Parsing a large JSON blob synchronously.
- Looping over thousands of items to build a derived list.
- Heavy work inside a React render or a useEffect that runs on every state change.
- A library doing initialization work the moment you import it.
The fix is not always to make the work faster. Often it is to break the work into pieces so the browser can breathe between them.
// Bad: blocks the main thread for the full duration
function processItems(items) {
  return items.map(heavyTransform);
}
// Better: yield to the browser between chunks
async function processItems(items) {
  const result = [];
  for (let i = 0; i < items.length; i++) {
    result.push(heavyTransform(items[i]));
    // Yield after every 50th item (not at i = 0, before any real work is done)
    if ((i + 1) % 50 === 0) {
      // Let the browser handle clicks, paints, animations
      await new Promise(resolve => setTimeout(resolve, 0));
    }
  }
  return result;
}
In modern browsers there is also scheduler.yield(), which is cleaner than setTimeout(0) and tells the browser to handle pending input first.
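A sketch of how the two can be combined: use scheduler.yield() where it exists and fall back to setTimeout elsewhere. The helper names and the 40ms slice are assumptions for illustration. This version also yields by elapsed time rather than item count, which helps when items vary a lot in cost.

```javascript
// Prefer scheduler.yield() where the browser has it; otherwise fall back
// to a zero-delay timeout.
function yieldToMain() {
  if (typeof scheduler !== 'undefined' && typeof scheduler.yield === 'function') {
    return scheduler.yield();
  }
  return new Promise((resolve) => setTimeout(resolve, 0));
}

// Process items, yielding whenever the current slice has used up its time.
// sliceMs is an illustrative default kept under the 50ms long-task threshold.
async function processInSlices(items, transform, sliceMs = 40) {
  const result = [];
  let sliceStart = performance.now();
  for (const item of items) {
    result.push(transform(item));
    if (performance.now() - sliceStart >= sliceMs) {
      await yieldToMain();
      sliceStart = performance.now();
    }
  }
  return result;
}
```

Usage would look like const rows = await processInSlices(items, heavyTransform). A single item that takes longer than the slice still blocks for its full duration; slicing only helps between items.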
Web workers: when the work is genuinely heavy
If you are processing a 5MB CSV, decoding a video frame, or running a search index over 50,000 documents, no amount of yielding will save you. That work belongs on a worker thread.
A worker is a separate JavaScript context that runs in parallel and cannot touch the DOM. You send it data, it sends back a result. The main thread stays free for clicks and animations.
// main.js
const worker = new Worker(new URL('./search-worker.js', import.meta.url), {
  type: 'module'
});
worker.postMessage({ query: 'astro performance' });
worker.onmessage = (e) => {
  renderResults(e.data.results);
};

// search-worker.js
import { buildIndex, search } from './lunr-helpers.js';
const index = buildIndex(); // expensive, but we are not blocking the UI
self.onmessage = (e) => {
  const results = search(index, e.data.query);
  self.postMessage({ results });
};
The user types, the main thread stays smooth, the worker does the work. For things like search, image filters, parsers, and crypto, this is the right answer. For "I have a 50-item list to sort", it is overkill.
Libraries like Comlink make the messaging look like normal function calls if you find raw postMessage clunky.
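If you would rather not add a dependency, a small promise wrapper gets you most of the way. The sketch below is not Comlink's API. It assumes a message shape of { id, payload } that the worker echoes back, which is a convention invented here, and it leaves out error handling and timeouts. The id matters once the user types quickly: it lets each response resolve the request that caused it.

```javascript
// Wrap a worker-like object (anything with postMessage and onmessage)
// so each request returns a promise for its own response.
function createWorkerClient(worker) {
  let nextId = 0;
  const pending = new Map(); // request id -> resolve function

  worker.onmessage = (e) => {
    const { id, payload } = e.data;
    const resolve = pending.get(id);
    if (resolve) {
      pending.delete(id);
      resolve(payload);
    }
  };

  return function request(payload) {
    return new Promise((resolve) => {
      const id = nextId++;
      pending.set(id, resolve);
      worker.postMessage({ id, payload });
    });
  };
}

// Worker side, under the same assumed message shape:
// self.onmessage = (e) => {
//   const { id, payload } = e.data;
//   self.postMessage({ id, payload: search(index, payload.query) });
// };
```

On the main thread this reads as const search = createWorkerClient(worker); const results = await search({ query }).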
Lazy loading and code splitting in plain words
Lazy loading means: do not download it until you need it. Code splitting means: ship the page in pieces instead of one giant file.
Most routers in modern frameworks do route-level splitting automatically. You navigate to /dashboard, the dashboard bundle downloads. You never visit /admin, the admin bundle never loads. Free win.
The wins you have to ask for are component-level. The classic example is a heavy editor or a chart library that only appears after a user clicks something.
import { lazy, Suspense, useState } from 'react';
const RichEditor = lazy(() => import('./RichEditor'));
export function CommentForm() {
  const [editing, setEditing] = useState(false);
  if (!editing) {
    return <button onClick={() => setEditing(true)}>Write a comment</button>;
  }
  return (
    <Suspense fallback={<p>Loading editor...</p>}>
      <RichEditor />
    </Suspense>
  );
}
Now the editor only ships to users who click the button. On most pages, that is a small fraction of users, and the editor is often a hundred KB or more. This is the cheapest way to cut bundle size in a mature codebase: find the biggest components that load by default, and check whether they need to.
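One refinement, if the loading fallback is noticeable: start the download when the user shows intent, for example on hover, and reuse that same request on click. The helper below is a sketch with a made-up name (once). The usage lines are assumptions about how you might wire it up, not something React requires.

```javascript
// Memoize a loader so the chunk is requested at most once, however many
// callers ask for it. A failed load is forgotten so a later call can retry.
function once(loader) {
  let promise = null;
  return () => {
    if (promise === null) {
      promise = Promise.resolve(loader()).catch((err) => {
        promise = null;
        throw err;
      });
    }
    return promise;
  };
}

// Usage sketch:
// const loadEditor = once(() => import('./RichEditor'));
// const RichEditor = lazy(loadEditor);
// <button onPointerEnter={loadEditor} onClick={() => setEditing(true)}>
//   Write a comment
// </button>
```

The trade-off is that some users who hover never click, so they download a chunk they do not use. For a hundred-KB editor behind a prominent button, that is usually worth it.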
In Astro, the equivalent is the client: directives, which control when an island's JavaScript loads and hydrates. client:visible waits for the component to scroll into view. client:idle waits for the browser to be idle. client:load hydrates as soon as the page loads. Picking the right one for each island is the entire performance story for a content site.
Measuring with real users, not just lab tools
Lighthouse and WebPageTest are useful, but they run on a clean machine with one fixed, simulated network profile. Your users are on a five-year-old phone in a cafe with three bars of LTE. Lab tools tell you what is possible. Real-user monitoring tells you what is happening.
The simplest way to collect real numbers is the web-vitals library. It hooks into the browser APIs that report LCP, INP, and CLS, and gives you a callback when the values are final.
import { onLCP, onINP, onCLS } from 'web-vitals';
function sendToAnalytics(metric) {
  const body = JSON.stringify({
    name: metric.name,
    value: metric.value,
    id: metric.id,
    page: location.pathname
  });
  // sendBeacon survives page unloads, fetch does not always
  if (navigator.sendBeacon) {
    navigator.sendBeacon('/api/vitals', body);
  } else {
    fetch('/api/vitals', { body, method: 'POST', keepalive: true });
  }
}
onLCP(sendToAnalytics);
onINP(sendToAnalytics);
onCLS(sendToAnalytics);
That is the whole instrumentation. The endpoint stores the numbers, and you build a dashboard that shows the 75th percentile per page over the last week. Now you know which routes are slow for your actual users, on their actual devices, on their actual networks.
The 75th percentile matters because the median user often has a fine experience. It is the slowest quarter that churns. If your p75 LCP is 4 seconds, a quarter of your users wait 4 seconds or more.
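Computing p75 from the stored records is a few lines. This sketch uses the nearest-rank method, which is one of several percentile definitions, and assumes records shaped like the payload sent above (name, value, page). The function names are made up for illustration.

```javascript
// Nearest-rank percentile: the smallest sample such that at least
// p percent of all samples are less than or equal to it.
function percentile(values, p) {
  if (values.length === 0) return null;
  const sorted = [...values].sort((a, b) => a - b);
  const rank = Math.ceil((p / 100) * sorted.length);
  return sorted[Math.max(rank, 1) - 1];
}

// Group raw records by page and report p75 for one metric.
function p75ByPage(records, metricName) {
  const byPage = new Map();
  for (const record of records) {
    if (record.name !== metricName) continue;
    if (!byPage.has(record.page)) byPage.set(record.page, []);
    byPage.get(record.page).push(record.value);
  }
  const result = {};
  for (const [page, values] of byPage) {
    result[page] = percentile(values, 75);
  }
  return result;
}

// Eight LCP samples in milliseconds for one page.
const samples = [1200, 1800, 2100, 2500, 3000, 3400, 4000, 5200];
console.log(percentile(samples, 75)); // 3400
```

In that sample the middle of the pack sits around 2.5 to 3 seconds, which looks fine. The p75 of 3.4 seconds is the number that tells you a quarter of visits are past the budget.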
A reasonable workflow
If you are starting from a site that has never had performance attention, this is roughly the order to do things:
- Set a bundle size budget. Wire it into CI. Let the next pull request feel the friction.
- Run web-vitals in production for two weeks. Look at p75 LCP and INP per route.
- For the worst route, open the Performance tab in Chrome DevTools and record a load. Look for the long tasks. Most slow pages have two or three obvious offenders.
- Pick one fix per week. Lazy load a heavy component. Move work to a worker. Replace a 90KB date library with a 5KB one. Shrink an oversized hero with the image resizer. Measure the impact in real-user data, not in Lighthouse.
After a couple of months, the budget keeps you from regressing, and the metrics tell you when something gets worse. That is the whole job.
For more on related topics, see our image format guide and the INP guide.