I would love to be able to chat with the assistant by talking through my mic instead of typing. It would make chatting a lot easier.
ChatGPT (still) does not have speech-to-text on the web, so this would be another feature that sets Kagi Assistant apart.
The user would either press a microphone icon to start recording, with their words going automatically into the text input field, or hold a button down (like ChatGPT does in VS Code) to dictate.
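To make the two interaction styles concrete, here is a rough sketch of how one controller could support both the click-to-toggle and the hold-to-talk behavior. All names here (`Recognizer`, `TextField`, `DictationController`) are hypothetical and only for illustration; they are not part of any Kagi or browser API, and the speech engine is left abstract on purpose.

```typescript
// Hypothetical sketch: none of these names come from Kagi or a browser API.
type Mode = "toggle" | "hold";

/** Stand-in for whatever speech engine ends up being used. */
interface Recognizer {
  start(onText: (text: string) => void): void;
  stop(): void;
}

/** Stand-in for the chat text input. */
interface TextField {
  value: string;
}

class DictationController {
  private recognizer: Recognizer;
  private field: TextField;
  private mode: Mode;
  private recording = false;

  constructor(recognizer: Recognizer, field: TextField, mode: Mode) {
    this.recognizer = recognizer;
    this.field = field;
    this.mode = mode;
  }

  get isRecording(): boolean {
    return this.recording;
  }

  /** Button pressed: starts in hold mode, flips state in toggle mode. */
  press(): void {
    if (this.mode === "hold" || !this.recording) {
      this.begin();
    } else {
      this.end();
    }
  }

  /** Button released: only ends the recording in hold mode. */
  release(): void {
    if (this.mode === "hold") {
      this.end();
    }
  }

  private begin(): void {
    if (this.recording) return;
    this.recording = true;
    this.recognizer.start((text) => {
      // Append recognized words to the input, separated by one space.
      const current = this.field.value;
      const needsSpace = current.length > 0 && current.slice(-1) !== " ";
      this.field.value = current + (needsSpace ? " " : "") + text;
    });
  }

  private end(): void {
    if (!this.recording) return;
    this.recording = false;
    this.recognizer.stop();
  }
}
```

In a real page, `press()` and `release()` would be wired to the microphone button's pointer-down and pointer-up events, and the mode could be a user setting.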
Kagi has a few options for implementation:
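- Browser API - Chrome has a built-in speech-to-text API (the Web Speech API), although by default it sends audio to a server for recognition rather than working offline. Orion might be able to offer something similar on-device (like embedding Whisper or similar in the browser). An on-device approach would ensure privacy and responsiveness, but might sacrifice quality.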
- Kagi API - Kagi could add an API on their server that proxies speech-to-text requests to another service like Deepgram or OpenAI Whisper (like how Kagi does it now with the AI assistant).
- Integration / Addon - I don't think this is really Kagi's style, but Kagi could always just build out the basic client-side functionality and then ask the user to bring their own Deepgram / OpenAI API key to enable dictation. I mention this because I've built dictation into a few apps: it takes only a few hours and offers a ton of value to the user.
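As a rough illustration of how the three options could coexist, the sketch below picks a backend at runtime. It feature-detects the Web Speech API constructors (`SpeechRecognition` or the prefixed `webkitSpeechRecognition`) and otherwise falls back to a user-supplied key or a server-side proxy. The names `Backend`, `Env`, and `chooseBackend`, and the priority order itself, are my own assumptions and not anything Kagi has described.

```typescript
// Hypothetical sketch: the type names and the fallback order are assumptions.
type Backend = "browser" | "user-key" | "kagi-proxy";

interface Env {
  // The page's global object in a real browser; a plain object works in tests.
  globals: Record<string, unknown>;
  // Optional key the user brought for a third-party speech-to-text service.
  userApiKey?: string;
}

/** True if the browser exposes a Web Speech API recognition constructor. */
function hasBrowserRecognition(globals: Record<string, unknown>): boolean {
  return (
    typeof globals["SpeechRecognition"] === "function" ||
    typeof globals["webkitSpeechRecognition"] === "function"
  );
}

/** Prefer the browser API, then the user's own key, then a server proxy. */
function chooseBackend(env: Env): Backend {
  if (hasBrowserRecognition(env.globals)) {
    return "browser";
  }
  if (env.userApiKey !== undefined && env.userApiKey.trim() !== "") {
    return "user-key";
  }
  return "kagi-proxy";
}
```

With something like this, a browser that does not expose either constructor would still get dictation through one of the other two paths.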
EDIT - As I mentioned in a comment down below, I later realized that Kagi does have dictation, just not on Firefox. This feedback request is now specifically about adding dictation on Firefox.