~380 ms latency
A five-run Qwen 3.5 Flash benchmark in Beijing on April 24, 2026 recorded a p50 time to first token (TTFT) of 379 ms. Separate GPT-5.4 nano relay runs averaged 329 ms in SJC and 395 ms in IAD. These figures measure only the time until the model's first token arrives, not the entire spoken-question pipeline.
Download benchmark results
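
For readers who want to reproduce this kind of number, here is a minimal sketch of how a p50 TTFT can be computed over a handful of runs. It is not the harness behind the figures above. The names `measure_ttft_ms`, `p50`, and `fake_stream` are hypothetical, and `fake_stream` is a stand-in for a real streaming model response, so the delays it produces are placeholders rather than benchmark data.

```python
import statistics
import time
from typing import Callable, Iterable, Iterator


def measure_ttft_ms(open_stream: Callable[[], Iterable[str]]) -> float:
    """Return milliseconds from issuing the request to receiving the first token."""
    start = time.perf_counter()
    stream: Iterator[str] = iter(open_stream())
    next(stream)  # block until the first token arrives; later tokens are ignored
    return (time.perf_counter() - start) * 1000.0


def p50(samples_ms: list[float]) -> float:
    """Median (p50) of the collected samples."""
    return statistics.median(samples_ms)


def fake_stream(first_token_delay_s: float = 0.01) -> Iterator[str]:
    """Hypothetical stand-in for a streaming model response (assumption)."""
    time.sleep(first_token_delay_s)  # simulated wait before the first token
    yield "Hello"
    time.sleep(0.05)  # remaining tokens; not counted toward TTFT
    yield " world"


if __name__ == "__main__":
    # Five runs, matching the run count described above.
    runs = [measure_ttft_ms(fake_stream) for _ in range(5)]
    print(f"p50 TTFT: {p50(runs):.1f} ms")
```

Because the timer stops at the first token, anything that happens before the request is sent (speech capture, transcription) or after the first token (the rest of the generation, speech synthesis) falls outside the measurement.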