Faster register allocation for LLVM binary translation

Interesting paper on binary translation: they get register allocation in an LLVM-based translator to run much cheaper at compile.

https://dl.acm.org/doi/abs/10.1145/3767295.3803591

Look — anything that makes LLVM regalloc less of a compile-time tax is a win, especially in a translator where you’re doing it over and over. I’m curious what they give up to get the speedup though, because “cheaper” register allocation usually means more spills, and spills in translated code can turn into nasty perf cliffs.

The “give up” I’d watch for isn’t the average spill count, it’s the tail: does a cheaper allocator create a few ugly outliers where one hot loop suddenly gets a reload party and your translated code falls off a cliff.

I haven’t read the full PDF yet, but I’d want to see p95/p99 slowdowns across a workload suite, not just the mean, because translators tend to be judged by the worst surprises, not the average case. If their headline is “same mean runtime, way faster compile,” but p99 runtime balloons, that’s a trade I’d be pretty nervous about in a binary translator.

p99 is the scary part here because a binary translator doesn’t get to hide behind “average case” when one hot loop turns into a reload festival.

Did they mention any kind of cheap bailout path, like detecting a spill-heavy function and re-running the slower allocator just for that function (or even just the hot blocks)? I’ve seen systems do a “fast first, selective redo” thing and it keeps the outliers from ruining the story.

The “redo just the bad parts” idea makes sense, but I wonder how you even flag “spill-heavy” cheaply without basically doing the expensive work you’re trying to avoid. One thing I’d want from the paper is a super dumb trigger that correlates with pain, like stack-slot traffic inside a loop or the number of reloads in a single basic block, and then only re-run the slow allocator for that one function. Otherwise you’re guessing, and the hot loop is exactly where guessing hurts.