How do you set performance guardrails during a framework migration without hiding real regressions?

What’s up everyone? I’m mid-migration from a homegrown SPA to a framework setup, and I’m trying to keep a hard performance budget while we swap routing/state patterns and old code paths linger.

If we loosen the budget, we risk shipping slow screens and never clawing it back; if we tighten it, the dashboards light up with “regressions” that are just measurement noise or missing instrumentation. What’s a pragmatic way to set guardrails and observability so we catch real performance hits without blocking the migration on false alarms?

Sarah

I’d split the routes first, honestly. We did something similar and the cleanest thing was tagging each screen as legacy or migrated in the perf dashboard so we weren’t comparing a half-new checkout page to the old one like they’re the same thing.

Then keep one hard budget, but only fail on a real delta over a few deploys, not one noisy run. Missing data should be its own alert, not counted as a slowdown.
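To make the "real delta over a few deploys, missing data is its own alert" idea concrete, here's a minimal sketch in Python. Everything here is illustrative: the `Verdict`/`check_route` names, the 10% delta, and the 3-deploy window are assumptions, not anything the thread prescribes.

```python
# Sketch: fail only on a sustained delta across deploys, and report
# missing instrumentation as its own status rather than a pass or a fail.
from dataclasses import dataclass

@dataclass
class Verdict:
    status: str   # "pass", "fail", or "missing-data"
    reason: str

def check_route(baseline_p95_ms: float,
                recent_p95_ms: list,
                budget_delta_pct: float = 10.0,
                window: int = 3) -> Verdict:
    """Fail only if the last `window` deploys ALL exceed the budget delta.
    A None sample means instrumentation was missing for that deploy."""
    if len(recent_p95_ms) < window:
        return Verdict("missing-data", f"need {window} deploys of samples")
    tail = recent_p95_ms[-window:]
    if any(s is None for s in tail):
        return Verdict("missing-data", "instrumentation gap in window")
    limit = baseline_p95_ms * (1 + budget_delta_pct / 100)
    if all(s > limit for s in tail):
        return Verdict("fail", f"p95 above {limit:.0f}ms for {window} deploys")
    return Verdict("pass", "within budget or not sustained")
```

The point of the shape: a `None` sample can never count toward "pass", so a route that silently loses instrumentation pages someone instead of quietly staying green.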

Look — the “only fail on a real delta over a few deploys” part is where people accidentally hide regressions. I’ve had better luck failing fast on a small set of canary flows (login/checkout/search) with the budget tied to p95/p99 and a fixed traffic slice, then letting the broader dashboard be trend-only so you’re not arguing with noise.
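A rough sketch of that fast-fail canary check, with the budget expressed as p95/p99 limits per flow. The flow names match the ones above, but the budget numbers, the nearest-rank percentile choice, and the `canary_violations` API are all made up for illustration.

```python
# Sketch: hard p95/p99 budgets on a few canary flows; any breach (or any
# flow with no samples at all) is a deploy-blocking violation.
import math

def percentile(samples: list, p: float) -> float:
    """Nearest-rank percentile (no interpolation)."""
    ranked = sorted(samples)
    k = max(0, math.ceil(p / 100 * len(ranked)) - 1)
    return ranked[k]

CANARY_BUDGETS_MS = {          # illustrative numbers, not recommendations
    "login":    {"p95": 800,  "p99": 1500},
    "checkout": {"p95": 1200, "p99": 2500},
    "search":   {"p95": 600,  "p99": 1200},
}

def canary_violations(latencies_ms: dict) -> list:
    """Return budget violations; fail the deploy if the list is non-empty."""
    violations = []
    for flow, budget in CANARY_BUDGETS_MS.items():
        samples = latencies_ms.get(flow, [])
        if not samples:
            violations.append(f"{flow}: no samples (missing instrumentation)")
            continue
        for pname, limit in budget.items():
            p = float(pname[1:])        # "p95" -> 95.0
            observed = percentile(samples, p)
            if observed > limit:
                violations.append(f"{flow}: {pname}={observed:.0f}ms > {limit}ms")
    return violations
```

Note the same asymmetry as before: a flow with zero samples is a violation, not a skip, so the canary can't pass by losing its own telemetry.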

Tying the canary budget to a fixed traffic slice is the part that gets me, because it quietly turns into "only fast for the lucky cohort." I’ve seen teams add a “canary-only” optimization path and then act surprised when the full rollout tanks.

Yeah, fixed-slice canaries can turn into a “VIP lane” if the new stack only ever sees the easy traffic (warm caches, logged-in users, modern devices). The one guardrail that’s saved me is forcing the canary to mirror the long-tail mix—slow devices, cold starts, first-time visitors—so you’re measuring the same potholes you’ll hit at 100%.
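One cheap way to enforce "the canary mirrors the long-tail mix" is a drift check on cohort composition. This is a hedged sketch: the segment names and the 5-percentage-point tolerance are assumptions; in practice you'd derive the overall shares from your analytics.

```python
# Sketch: flag segments (slow devices, cold starts, first-time visitors)
# where the canary's traffic share has drifted from the overall mix.
def mix_drift(overall: dict, canary: dict,
              tolerance_pct_points: float = 5.0) -> list:
    """Compare the canary cohort's segment shares (in %) against overall
    traffic; return the segments that drifted past the tolerance."""
    drifted = []
    for segment, overall_share in overall.items():
        canary_share = canary.get(segment, 0.0)
        if abs(canary_share - overall_share) > tolerance_pct_points:
            drifted.append(
                f"{segment}: canary {canary_share:.1f}% vs overall {overall_share:.1f}%"
            )
    return drifted
```

Run it alongside the latency budget: if `mix_drift` flags anything, the canary's numbers are measuring a VIP lane and shouldn't count as a pass.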

“VIP lane canary” is such a good phrase — when that cold-cache, first-time-user path faceplanted, were you tagging those sessions at the edge or the app level, or inferring it later from analytics? And how did you keep the tags honest so the canary couldn’t quietly drift back to the easy traffic? ngl I might be wrong here.

We tagged it at the edge and carried it through as a signed header into the app logs, because analytics-only inference gets gamed by “helpful” retries/caching and you lose the cold-start truth. Then we had a dumb, strict rule: canary assignment is deterministic (hash of user/device id + salt) and only the edge can set it, so the app can’t “accidentally” route the scary paths back to the easy lane.
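The two rules described above (deterministic assignment from a hash of a stable id + salt, and an edge-signed header the app verifies but never mints) can be sketched like this. Names, the header shape, and key handling are all simplified assumptions; in a real setup the signing key lives only at the edge (or you use asymmetric signatures) so the app genuinely cannot forge assignments.

```python
# Sketch: deterministic canary bucketing + an HMAC-signed assignment the
# edge attaches as a header and the app can verify but should not mint.
import hashlib
import hmac

EDGE_SIGNING_KEY = b"edge-only-secret"   # illustrative; keep edge-side only
SALT = "migration-salt"                  # rotate to reshuffle the cohort

def in_canary(stable_id: str, canary_pct: int = 5) -> bool:
    """Deterministic: the same user/device id always lands in the same bucket."""
    digest = hashlib.sha256(f"{SALT}:{stable_id}".encode()).digest()
    bucket = int.from_bytes(digest[:4], "big") % 100
    return bucket < canary_pct

def sign_assignment(stable_id: str) -> str:
    """Edge computes this and ships it, e.g. as an `x-canary` header value."""
    flag = "1" if in_canary(stable_id) else "0"
    sig = hmac.new(EDGE_SIGNING_KEY, f"{stable_id}:{flag}".encode(),
                   hashlib.sha256).hexdigest()
    return f"{flag}.{sig}"

def verify_assignment(stable_id: str, header: str) -> bool:
    """App side: trust the flag only if the edge signature checks out."""
    flag, _, sig = header.partition(".")
    expected = hmac.new(EDGE_SIGNING_KEY, f"{stable_id}:{flag}".encode(),
                        hashlib.sha256).hexdigest()
    return hmac.compare_digest(sig, expected)
```

The signature binds the flag to the id, so a "helpful" retry layer can't replay someone else's canary header, and the salted hash means nobody can hand-pick easy users into the cohort.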