- I was using a custom LLM based on Opus for a writing task in assistant v2.
- I kept the chat saved, then hopped onto a different chat with Sonnet.
- When I went back to the Opus chat, it had switched to Sonnet. It also started running internet searches, even though they were turned off. The token count shown by the info button jumped as well: after a lengthy chat with Claude involving detailed text analysis, it was at around 100k. After getting kicked over to Sonnet and asking a single question, it was over 300k.
The chat should stay on the model specified for it. The token count should accurately reflect the tokens used.