When Kagi Assistant models are asked for detailed explanations, they often run multiple rounds of thinking and web searching, sometimes outputting text early to tell the user what they are doing. However, Kagi Assistant's software limits the number of these work blocks, so the model can use all of them up before it actually writes an answer for the user.
An example of such an AI workflow:
> Original query from user
> Status update
Web Searching (hidden)
Thinking (hidden)
Web Searching (hidden)
Thinking (hidden)
Web Searching (hidden)
Thinking (hidden)
> Empty response given to user
I expect the Assistant to be given clearer limits on thinking and web searching, so that it can think and search multiple times to get better answers while always keeping one work block in reserve for the final answer it outputs.
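
To make the expected behaviour concrete, here is a minimal Python sketch of a work-block budget that always reserves the last block for the final answer. I do not know how Kagi Assistant is actually implemented, so everything here is hypothetical: the names `WorkBlockBudget` and `run_workflow`, the step labels, and the limit of 7 blocks (chosen only to match the example workflow above) are assumptions for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class WorkBlockBudget:
    """Hypothetical budget of work blocks; not Kagi's actual implementation."""
    max_blocks: int
    used: list[str] = field(default_factory=list)

    def remaining(self) -> int:
        return self.max_blocks - len(self.used)

    def can_do_intermediate(self) -> bool:
        # Intermediate work (status, search, think) is allowed only while
        # at least one block would still be left over for the final answer.
        return self.remaining() > 1

    def record(self, kind: str) -> None:
        if self.remaining() <= 0:
            raise RuntimeError("no work blocks left")
        if kind != "answer" and not self.can_do_intermediate():
            raise RuntimeError("last block is reserved for the answer")
        self.used.append(kind)


def run_workflow(budget: WorkBlockBudget, wanted_steps: list[str]) -> list[str]:
    """Run as many intermediate steps as the budget allows, then answer."""
    for step in wanted_steps:
        if not budget.can_do_intermediate():
            break  # stop searching/thinking early instead of running out
        budget.record(step)
    budget.record("answer")  # the reserved block is spent on the final answer
    return budget.used


if __name__ == "__main__":
    # Same sequence as the example workflow, with an assumed limit of 7 blocks.
    steps = ["status", "search", "think", "search", "think", "search", "think"]
    print(run_workflow(WorkBlockBudget(max_blocks=7), steps))
```

With this kind of reservation, the last thinking step in the example would be skipped and the final block would go to the answer, instead of the user receiving an empty response.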