It would be nice to have a way to configure sampling parameters such as top-p, top-k, and temperature for the LLMs behind custom assistants.
Model output quality can vary with these sampler settings depending on the requested task; for example, a temperature of 0.7 can lead to better results than a temperature of 1 on some tasks, since lower temperatures make the output more focused and deterministic while higher ones make it more varied. Exposing these settings would let users get more out of the assistant.
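For context, here is a minimal sketch of what these three settings actually do to a model's next-token distribution. This is purely illustrative (plain numpy, not tied to any platform's actual API), but it shows why they are useful knobs to expose:

```python
import numpy as np

def sample_next_token(logits, temperature=1.0, top_k=0, top_p=1.0, rng=None):
    """Illustrative sampler: temperature scaling, then top-k, then top-p."""
    rng = rng or np.random.default_rng()
    logits = np.asarray(logits, dtype=np.float64)

    # Temperature: <1 sharpens the distribution, >1 flattens it.
    logits = logits / max(temperature, 1e-8)

    # Top-k: keep only the k highest-scoring tokens (0 = disabled).
    if top_k > 0:
        cutoff = np.sort(logits)[-top_k]
        logits = np.where(logits < cutoff, -np.inf, logits)

    # Softmax to probabilities.
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()

    # Top-p (nucleus): keep the smallest set of tokens whose cumulative
    # probability reaches p, then renormalize over that set.
    if top_p < 1.0:
        order = np.argsort(probs)[::-1]
        cumulative = np.cumsum(probs[order])
        # Include each token whose preceding cumulative mass is below p.
        keep = (cumulative - probs[order]) < top_p
        mask = np.zeros_like(probs, dtype=bool)
        mask[order[keep]] = True
        probs = np.where(mask, probs, 0.0)
        probs /= probs.sum()

    return int(rng.choice(len(probs), p=probs))

# Example: the 0.7-vs-1.0 comparison mentioned above, with nucleus sampling.
logits = [2.0, 1.0, 0.5, -1.0]
print(sample_next_token(logits, temperature=0.7, top_p=0.9))
```

Even a simple per-assistant config exposing these three values (with the current behavior as the default) would cover most use cases.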