Theo just released commentary on this very subject:
Watch 14:26-15:36.
TL;DR: "Low" or "Medium" reasoning is appropriate. "High" just makes the request 3x more expensive.
In other words: The quality of the output changes very little, but the cost increase is substantial.
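To put rough numbers on that, here's a quick sketch. The request volume and per-request price are made up for illustration; only the 3x multiplier comes from the video, and I'm assuming it applies to the whole request cost relative to a lower setting.

```python
# Hypothetical numbers: only the 3x multiplier comes from the video's claim.
HIGH_MULTIPLIER = 3.0


def monthly_cost(requests_per_month: int,
                 base_cost_per_request: float,
                 multiplier: float = 1.0) -> float:
    """Total monthly spend for a per-request cost scaled by an effort multiplier."""
    return requests_per_month * base_cost_per_request * multiplier


# Assumed usage: 1000 requests a month at an assumed $0.01 each on a lower setting.
baseline = monthly_cost(1000, 0.01)
high = monthly_cost(1000, 0.01, HIGH_MULTIPLIER)
extra = high - baseline

print(f"baseline: ${baseline:.2f}, high: ${high:.2f}, extra: ${extra:.2f}")
```

Whatever the real prices are, the extra spend is always twice the baseline, for output that barely changes.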
@Vlad If that's true, then Assistant could have a toggle for reasoning, like the one we already have for web search. Turning the toggle on would be the equivalent of "Low" or "Medium" mode at other LLM providers, so that we're not burning tokens. Assistant could also have a separate setting to enable reasoning by default under https://kagi.com/settings/assistant > Custom Assistants.
Would love to see this ASAP. We don't even have a toggle to activate "Low" reasoning for a model as cheap as Gemini 2.5 Flash, and I'd never use Claude 4 Opus, as it's expensive and overkill for my use case.
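Here's a rough sketch of the behavior I have in mind. All names are hypothetical (this is not Kagi's actual settings model or API): a per-assistant default, plus a per-request toggle that overrides it.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AssistantSettings:
    """Hypothetical per-assistant settings; not Kagi's real data model."""
    model: str
    reasoning_by_default: bool = False  # the proposed Custom Assistants setting
    reasoning_effort: str = "low"       # "low" or "medium"; "high" is left out on purpose


def resolve_reasoning_effort(settings: AssistantSettings,
                             toggle: Optional[bool] = None) -> Optional[str]:
    """Return the effort to request, or None when reasoning is off.

    The per-request toggle wins when it is set; otherwise the
    assistant's default applies.
    """
    enabled = settings.reasoning_by_default if toggle is None else toggle
    return settings.reasoning_effort if enabled else None


flash = AssistantSettings(model="Gemini 2.5 Flash")
print(resolve_reasoning_effort(flash))               # toggle untouched, default is off
print(resolve_reasoning_effort(flash, toggle=True))  # toggle switched on for one request
```

That way reasoning stays off unless you ask for it, and turning it on never goes past "Medium".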
Article discussed in the video, on the future of subscription-based LLM services: https://ethanding.substack.com/p/ai-subscriptions-get-short-squeezed