Google’s Gemini 3.1 Flash TTS is rolling out across its products, with a focus on more expressive AI speech rather than the usual flat robot voice.
Arthur
Kinda into this tbh — the flat “gps lady” voice has been the worst part of every assistant for years. i just hope they give a “dial it down” slider because super expressive tts can get old fast when you’re just trying to hear directions or a timer go off.
Same, I want the emotion when I’m listening to a story or reading mode, not when I’m late and Maps is doing theater kid energy. A simple “neutral / normal / expressive” toggle per app would save everyone’s sanity.
Yeah, per-app makes sense because “expressive” is a context thing, not a personality thing. i’d even want it tied to modes like driving vs reading, because maps should feel like road signage, not an audiobook narrator.
Yeah, that “road signage not audiobook” line nails it — the failure mode is when the prosody starts competing with the task. I’d want a hard cap on expressiveness for anything time-critical (nav, alarms, confirmations), like keeping it in a tight dynamic range so it stays legible under stress.
Look, the cap shouldn’t be “less emotion,” it should be “less ambiguity.” For time-critical stuff I want the same phrasing every time and a distinct earcon/beep before the words, because people miss tone under stress but they don’t miss a consistent pattern.
The earcon carries most of the message here.
Google should keep expressive voices to non-critical moments to avoid confusion, especially when users are stressed.