Benchmarks

Compare the performance of LLMs across endpoint providers to find the best configuration for your speed, latency, and cost requirements. Our objective benchmarks are continuously updated with the newest models and endpoints.
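Numbers like these are typically gathered by repeatedly calling each provider's endpoint and recording end-to-end latency and output throughput. The sketch below illustrates the idea; `call_endpoint` is a hypothetical stand-in (a real benchmark would issue an HTTP request to the provider's API), and the token count is a crude whitespace split, not the provider's tokenizer.

```python
import time
import statistics

def call_endpoint(prompt: str) -> str:
    """Hypothetical endpoint stub; a real benchmark would call a provider API."""
    time.sleep(0.01)          # simulate network + generation time
    return "word " * 50       # simulate a 50-token completion

def benchmark(call, prompt: str, runs: int = 5) -> dict:
    """Measure median end-to-end latency and rough output throughput."""
    latencies, throughputs = [], []
    for _ in range(runs):
        start = time.perf_counter()
        completion = call(prompt)
        elapsed = time.perf_counter() - start
        tokens = len(completion.split())   # crude token count for illustration
        latencies.append(elapsed)
        throughputs.append(tokens / elapsed)
    return {
        "median_latency_s": statistics.median(latencies),
        "median_tokens_per_s": statistics.median(throughputs),
    }

result = benchmark(call_endpoint, "Hello")
```

Running the same harness against every provider serving a given model makes the speed/cost trade-offs directly comparable; production benchmarks additionally track time to first token and cost per million tokens.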

Filters: Modality, Task
Models

Select from the list of models below.

llama-2-70b-chat (text-generation)
llama-2-13b-chat (text-generation)
llama-2-7b-chat (text-generation)
codellama-34b-instruct (text-generation)
gemma-7b-it (text-generation)
codellama-13b-instruct (text-generation)
codellama-7b-instruct (text-generation)
yi-34b-chat (text-generation)
pplx-7b-chat (text-generation)
mistral-medium (text-generation)
gpt-4 (text-generation)
pplx-70b-chat (text-generation)
gpt-3.5-turbo (text-generation)
gemma-2b-it (text-generation)
gpt-4-turbo (text-generation)
mistral-small (text-generation)
mistral-large (text-generation)
claude-3-haiku (text-generation)
claude-3-opus (text-generation)
claude-3-sonnet (text-generation)