Currently the Assistant has a hidden system message of around 570(?) tokens that is added to every single thread the user creates. In short conversations this can inflate token usage to well beyond the size of the conversation itself. You can test this by typing anything into the Assistant and looking at the info section: you'll see a discrepancy between the end-to-end time, the tokens per second and the running total of tokens.
This makes any fair-use token amount misleading, and in general makes the Assistant more expensive for Kagi when the user doesn't want a pre-prompted AI. There is already a custom instructions area in the settings.
If the user adds any custom instructions, the base token usage drops to around 500 for some reason, but that's still a huge overhead.
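To put rough numbers on the overhead, here is a minimal sketch of the arithmetic. The 570 and 500 figures are just my estimates from the info section, not official numbers, and `overhead_share` and the sample conversation sizes are made up purely for illustration.

```python
def overhead_share(hidden_tokens: int, conversation_tokens: int) -> float:
    """Fraction of a thread's total tokens taken up by the hidden system message."""
    total = hidden_tokens + conversation_tokens
    return hidden_tokens / total


# Estimates read off the Assistant's info section (assumptions, not official values).
HIDDEN_DEFAULT = 570      # no custom instructions set
HIDDEN_WITH_CUSTOM = 500  # with custom instructions set

# Hypothetical conversation sizes, in tokens, excluding the hidden message.
for conversation in (50, 200, 1000):
    default_pct = overhead_share(HIDDEN_DEFAULT, conversation) * 100
    custom_pct = overhead_share(HIDDEN_WITH_CUSTOM, conversation) * 100
    print(f"{conversation:>5} tokens: {default_pct:.1f}% hidden (default), "
          f"{custom_pct:.1f}% hidden (custom instructions)")
```

If those estimates are anywhere near right, a quick 50-token exchange would be more than 90% hidden tokens either way.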
It seems to exist mostly so the Assistant knows some basic things like the date, the available formatting, and that it's being used via Kagi. However, I feel the user should be in control of this. Right now it's forced, adding possibly useless tokens to every message.
You could have it on by default, but allow turning it off.
This gives the user more control and allows more precise and less costly prompting. It also makes things cheaper for Kagi by not having the redundant tokens in every thread (if the user so chooses). There should be no overhead if the user doesn't want any, as they're essentially using the API of whichever model, but currently forced "invisible" tokens are added on top.