It seems that all of a sudden (over the weekend?) Claude Opus output started being cut off somewhere around 6500 tokens, with a message:
> The response is incomplete because it exceeded the model's output limit.
Of course, if it goes blithering on forever when this isn't desired, this could get expensive.
But in this case I was trying to generate some mid-length stories, and the extra output is exactly what I was going for.
If I understand correctly, Kagi is setting its own output length limit, which is different for each model.
E.g., a while back the output limit for DeepSeek was set too low, triggering this error frequently: https://kagifeedback.org/d/6117-deepseek-r1-token-generation-stops-abruptly-reaching-around-5800-total-tokens
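If that's what's happening, the behavior would look roughly like the sketch below: a per-model cap that truncates the response and flags it as incomplete. The model names and limits here are assumed for illustration only; this is not Kagi's actual code.

```python
# Hypothetical sketch of a per-model output cap (assumed values,
# not Kagi's real configuration or implementation).
MODEL_OUTPUT_LIMITS = {
    "claude-opus": 6500,   # assumed: roughly where my output was cut off
    "deepseek-r1": 5800,   # assumed: matches the earlier DeepSeek report
}

def generate(model: str, tokens: list[str]) -> tuple[list[str], bool]:
    """Return (output, truncated): cut the output at the model's cap."""
    limit = MODEL_OUTPUT_LIMITS.get(model, 8192)  # assumed default
    if len(tokens) > limit:
        return tokens[:limit], True   # incomplete: hit the cap
    return tokens, False

out, truncated = generate("claude-opus", ["tok"] * 7000)
# truncated is True; the story is cut off at 6500 tokens
```

The point being: if the cap is Kagi's own setting rather than the model's hard limit, raising it (or making it configurable) would let long generations run to completion.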
This limit probably made more sense back when Kagi used all-you-can-eat pricing, as opposed to now, when the API costs are passed through to the user.
What I'd like: output continues to its natural conclusion rather than being stopped midway through.