Qwen3-Omni is a new model that natively understands voice and video and generates speech in real time.
https://github.com/QwenLM/Qwen3-Omni
Adding support for it would allow natural spoken conversation with Assistant, using real-time video as context. For example, you could point your phone at an ingredients list and ask questions about it as you go.
This feature would be comparable to ChatGPT Advanced Voice and Gemini Live.