
The “Arena (side-by-side)” feature of lmarena.ai's Chatbot Arena, previously lmsys.org, lets you submit one prompt and have two different LLMs answer it simultaneously, with the output shown in two columns.

Could Kagi Assistant get a similar feature, having two different AI language models next to each other replying to the same question? Of course without the "A is better" feedback buttons, since Kagi doesn't train on user data.

It regularly happens that Kagi adds a new LLM provider or that a third-party model receives an update (like Sonnet 3.5 in October 2024). Then it's useful to compare the responses side-by-side to see for yourself which one fits your specific use case better.

A while back, when configuring Custom Assistants with Response Instructions, I noticed that Claude had more trouble using ResearchAgent correctly than an OpenAI LLM. It would have been easier to compare such differences if Kagi had a side-by-side feature. It's difficult to use external tools for this, since Kagi doesn't expose an official API for the Assistant. And when calling the third-party LLM directly, you can't experiment with Kagi-specific features such as ResearchAgent when choosing between AI models.

    I can see it making sense, although obviously it will increase costs for Kagi to double-run prompts (you're guaranteed to have people using such a UI for every conversation, doubling their usage).

    And again, while it could definitely be useful in the situation of a custom assistant, I wonder how much this wouldn't work because of the random nature of LLMs (maybe one of the two just by random chance happens to give a bad output).

    But it's an interesting idea!

      Thibaultmol

      Kagi could maybe implement it so that the prompt is not run automatically in the other window by default; instead, there would be a suggestion button to send the same prompt to the other column.

      I'm thinking that we could use that double window in case we want to talk to two different AIs, or have two separate chats at the same time (with the same LLM).

      And put this behind an "Advanced mode" toggle, obviously.

        miicat_47 Kagi could maybe implement it so that the prompt is not run automatically in the other window by default; instead, there would be a suggestion button to send the same prompt to the other column.

        That would be a bit like Poe's multi-bot chat feature, which has suggestion buttons to compare responses. When you ask for a second opinion that way, it confusingly keeps everything in one column, so it isn't intuitive what context gets passed to the LLM. Visually, it looks as if the second model also sees the reply of the first model.

          Thibaultmol And again, while it could definitely be useful in the situation of a custom assistant, I wonder how much this wouldn't work because of the random nature of LLMs (maybe one of the two just by random chance happens to give a bad output).

          Yes, it might indeed lead to cherry picking if you don't repeat the experiment often enough.

          The topic of randomness got me thinking about something else.
          On lmarena.ai, when you ask a follow-up question, both models keep their own context. For example, if you ask them to think of a random number, and then in the next prompt ask them to multiply that number, they will each multiply a different number. And that behavior is often what you want, I guess.
          Sometimes, when the conversation diverges too much, one might want to synchronize the chat history between the columns. Though you could also start a new thread and copy-paste the important context.

          If the prompt is not run automatically for both columns, and you have already submitted multiple queries in a thread before asking for the "second opinion" of a different LLM, what happens? Should it submit all the queries from the start? Or should it copy the chat history of the first column and only run one additional query against the other model?
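          To make the two options concrete, here is a minimal sketch in Python. It assumes a hypothetical chat API where a model is a callable that takes a full message history (role/content dicts, as in common chat APIs); none of this is Kagi's actual implementation.

```python
def replay_from_start(queries, other_model):
    """Option 1: re-run every user query from the start against the
    other model, so it builds its own independent context (its answers
    may diverge from the first column's)."""
    history = []
    for query in queries:
        history.append({"role": "user", "content": query})
        reply = other_model(history)
        history.append({"role": "assistant", "content": reply})
    return history


def copy_then_ask(first_column_history, new_query, other_model):
    """Option 2: copy the first column's chat history verbatim and only
    send the newest query to the other model, so both columns share the
    same context up to that point."""
    history = list(first_column_history)  # shared context so far
    history.append({"role": "user", "content": new_query})
    reply = other_model(history)
    history.append({"role": "assistant", "content": reply})
    return history
```

          Option 1 gives the second model a "clean" run but is slower and costs one extra call per earlier query; option 2 is cheaper but means the second model inherits the first model's replies as context.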
