How do you design rate limits that users can recover from?

Rate limiting protects systems but can punish legitimate bursts. What rate-limit and retry model has worked without creating confusing UX?

BobaMilk :smiling_face_with_sunglasses:

Use a token bucket with a small burst allowance plus a clear Retry-After; users tolerate limits fine when recovery is predictable, but strict fixed windows feel random at the boundary.

HTTP/1.1 429 Too Many Requests
Retry-After: 12
X-RateLimit-Limit: 60
X-RateLimit-Remaining: 0
X-RateLimit-Reset: 1712345678

Yoshiii