The Metric Your Users Feel Before You Measure It
Working on a streaming chat product taught me something: the standard latency metrics don't really describe what users experience. They're not waiting for a page to load or an API to return a JSON blob. They're watching tokens appear — and what they feel before anything appears is the thing most teams aren't measuring.
That thing is time-to-first-token. TTFT.