A request refers to a single interaction sent to a large language model (LLM) from Botpress. It includes the input data, configuration parameters, and any relevant context needed for the model to process the request and return a response. Each request represents one call to an LLM, such as generating text, answering a question, or performing other tasks.
The data in the charts above shows the number of individual requests that Botpress users made to each LLM.
What does speed refer to?
Speed refers to the average number of tokens an LLM generates per second when processing a request. Tokens are units of text, such as words or parts of words, that the model reads or produces. This measurement reflects the output performance of the model, indicating how quickly it can return a response.
The data in the charts above shows the average number of tokens per second that each model generates when queried from Botpress.
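As a rough illustration of how to read the speed figure, the sketch below turns a tokens-per-second value into an approximate generation time for one response. The function name and the example numbers are hypothetical, not values taken from the charts, and the result ignores network latency and the time the model spends reading the input.

```python
def estimate_generation_seconds(output_tokens: int, tokens_per_second: float) -> float:
    """Approximate time to generate a response of a given length.

    Both arguments are illustrative inputs: output_tokens is the expected
    response length, tokens_per_second is a speed value read off a chart.
    """
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return output_tokens / tokens_per_second


# Hypothetical example: a 300-token reply from a model averaging 60 tokens/second.
seconds = estimate_generation_seconds(300, 60.0)
print(f"Estimated generation time: {seconds:.1f} s")
```

Because the charted speed is an average, individual requests may be faster or slower than this estimate.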
What does cost refer to?
Cost refers to the average price in US dollars for processing 1,000 requests to a specific LLM from Botpress. This metric helps demonstrate the relative expense of using different models, providing insight into their cost efficiency when handling large volumes of requests.
A single conversation may contain multiple requests. Based on how many requests your conversations typically involve, you can use the data in the charts above to roughly estimate your monthly AI spend.
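A minimal sketch of that estimate, assuming you know roughly how many conversations you handle per month and how many LLM requests a typical conversation triggers. The function name, parameter names, and example figures are all hypothetical; substitute the cost per 1,000 requests shown in the chart for your model.

```python
def estimate_monthly_spend_usd(
    conversations_per_month: int,
    requests_per_conversation: float,
    cost_per_1000_requests_usd: float,
) -> float:
    """Rough monthly LLM spend in US dollars.

    cost_per_1000_requests_usd is the charted average price for 1,000
    requests; the other two inputs are your own usage estimates.
    """
    total_requests = conversations_per_month * requests_per_conversation
    return total_requests * cost_per_1000_requests_usd / 1000


# Hypothetical example: 10,000 conversations, 4 requests each, $2.50 per 1,000 requests.
spend = estimate_monthly_spend_usd(10_000, 4, 2.50)
print(f"Estimated monthly spend: ${spend:.2f}")
```

The charted cost is an average, so conversations with unusually long inputs or outputs may cost more per request than this estimate suggests.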
How often is the data on this page updated?
Although information on LLM usage in Botpress is collected in real time, the charts on this page are updated every 48 hours.