Agreed. Longer inputs with examples enable better results, as does being able to toggle temperature and top_p. Depending on the use case, different models respond better or worse to few-shot examples, structured inputs, and particular sampler settings.
I understand limiting max output tokens for cost reasons, but at least exposing these settings, and allowing different assistants to be created with different settings, fits exactly with what made Kagi the only platform that managed to get me to move off Google.
There's a common misconception that temp=0, for example, is the best way to use LLMs for code. But well-structured, longer inputs and higher entropy (temperature) constrained by nucleus sampling (top_p) are really important for getting good outputs, and no two models have the exact same optimal values (though temperatures between 0.5 and 1 with a top_p of 0.95 tend to work best with thinking models, and often higher with non-thinking ones).
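For concreteness, this is roughly what exposing those knobs looks like at the API level. A minimal sketch against an OpenAI-compatible chat completions endpoint; the model name and the exact values are illustrative starting points, not recommendations for any specific model:

```python
from openai import OpenAI

client = OpenAI()  # OpenAI-compatible endpoint; swap base_url/api_key for other providers

# Illustrative values only: ~0.5-1.0 temperature with top_p ~0.95 is a common
# starting point for "thinking" models; non-thinking models often tolerate or
# benefit from higher temperatures. Tune per model and per task.
response = client.chat.completions.create(
    model="gpt-4o",  # placeholder model name
    messages=[
        {"role": "system", "content": "You are a careful coding assistant."},
        {"role": "user", "content": "Refactor this recursive function into an iterative one: ..."},
    ],
    temperature=0.7,  # entropy of the token distribution
    top_p=0.95,       # nucleus sampling: restrict choices to the top 95% probability mass
    max_tokens=1024,  # output-length cap (the cost lever)
)
print(response.choices[0].message.content)
```

Letting each saved assistant carry its own values for these parameters is all that's being asked for here.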
I understand limiting the main assistant screens to defaults, but not babying advanced users here would make me recommend this even more to people who still use Perplexity.