When OpenAI releases a new version of GPT, or when Anthropic ships an update to Claude, the headlines focus on benchmark ...