Special Offer ✨  $50 Free Credits for All New Sign Ups 💸

The Best LLM on Every Prompt

Combine All Models to Create a Faster, Cheaper, and Higher Performing Solution Than Any Single Model
Trusted by engineers at DeepMind, Amazon, Tesla, X (Twitter), Salesforce, EzDubs, Oxford, MIT, Stanford, Imperial College, and Cambridge.
It Starts with Your Query
api

All Models, All Providers, One API

Access all LLMs across all providers with a single API key and a standard API.

import requests

url = "https://api.unify.ai/v0/inference"
headers = {
    "Authorization": "Bearer YOUR_UNIFY_KEY",
}

payload = {
    "model": "mixtral-8x7b-instruct-v0.1",
    "provider": "anyscale",
    "arguments": {
        "messages": [{
            "role": "user",
            "content": "YOUR_MESSAGE"
        }],
        "temperature": 0.5,  # sampling temperature; typical range is 0-2
        "max_tokens": 500,
        "stream": True,      # stream tokens back as they are generated
    }
}

response = requests.post(
    url, json=payload, headers=headers, stream=True
)
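With `stream=True`, the body arrives incrementally rather than as one JSON document. Below is a minimal helper for pulling payloads out of such a stream, assuming the endpoint emits server-sent-event style `data: ...` lines, which is a common convention for LLM streaming APIs but an assumption here; check the API docs for the exact chunk format.

```python
def parse_sse_chunks(lines):
    """Extract payloads from server-sent-event style lines.

    Hypothetical helper: assumes chunks arrive as "data: ..." lines,
    terminated by a "data: [DONE]" sentinel. Verify against the real
    wire format before relying on this.
    """
    chunks = []
    for raw in lines:
        line = raw.strip()
        if line.startswith("data: ") and line != "data: [DONE]":
            chunks.append(line[len("data: "):])
    return chunks
```

You would feed it `response.iter_lines(decode_unicode=True)` from the request above.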
router

Get the Most from Models with Expert Routing

Automatically send your queries to the most appropriate model and get the best output, with the fastest providers, at the lowest cost.
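Conceptually, routing means mapping each query to the model best placed to answer it. The sketch below uses deliberately naive keyword rules to make the idea concrete; the real router relies on learned quality estimates rather than keywords, and the model names are only examples.

```python
def choose_model(prompt):
    """Toy per-query model selector (illustrative only).

    Routes code-flavored prompts to a code-tuned model and everything
    else to a general chat model. Unify's actual router predicts output
    quality per model; it does not use keyword rules like these.
    """
    code_words = ("code", "function", "bug", "compile", "python")
    if any(word in prompt.lower() for word in code_words):
        return "codellama-34b-instruct"   # example model name
    return "mixtral-8x7b-instruct-v0.1"   # example model name
```

The point is the shape of the interface: the caller sends a prompt, the router picks the model.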

Unify Router Preview
performance

Constantly Achieve
Peak Performance

Systematically send your queries to the fastest provider, based on the very latest benchmark data for your region of the world, refreshed every 10 minutes.
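The selection step itself is simple once the benchmark data exists. Here is a sketch of the idea, assuming a mapping from provider name to its latest measured tokens-per-second; this is illustrative only, not Unify's implementation, and the numbers are invented.

```python
def fastest_provider(benchmarks):
    """Pick the provider with the highest measured tokens/sec.

    `benchmarks` maps provider name -> tokens-per-second from the most
    recent measurement window. Illustrative sketch of the selection the
    router performs on fresh benchmark data.
    """
    return max(benchmarks, key=benchmarks.get)

# Invented example numbers:
speeds = {"anyscale": 95.0, "together.ai": 120.0, "replicate": 60.0}
fastest_provider(speeds)  # -> "together.ai"
```

In practice the same lookup can rank on TTFT, end-to-end latency, or ITL instead of raw throughput.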

Mixtral8x7B Instruct v0.1 (Mistral)
    Tokens / Sec: +102.56%
    TTFT: +406.66%
    E2E Latency: +138.37%
    ITL: +206.85%

LLaMa2 70B Chat (Meta)
    Tokens / Sec: +83.97%
    TTFT: +262.96%
    E2E Latency: +95.02%
    ITL: +155.66%
Providers compared: anyscale, replicate, together.ai, octoai, mistral-ai, and the Unify router.
Unify Benchmarks Mistral Preview
modular

Your Query, Your Needs, Custom Routing

Set up your own cost, latency, and output-speed constraints. Define a custom quality metric. Personalize your router for your requirements.
Throughput Scatter Graph
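One way to picture custom routing: drop the candidates that violate your cost and latency budgets, then rank whatever remains with your own quality function. The sketch below is hypothetical; the field names and numbers are made up for illustration, and the real router's constraint API may differ.

```python
def route(candidates, max_cost, max_latency, quality):
    """Choose the best candidate under user-defined constraints.

    `candidates` is a list of dicts with "name", "cost", "latency",
    plus any stats consumed by `quality`, a user-supplied scoring
    callable. Returns None when no candidate satisfies the budgets.
    Illustrative sketch only.
    """
    feasible = [c for c in candidates
                if c["cost"] <= max_cost and c["latency"] <= max_latency]
    if not feasible:
        return None
    return max(feasible, key=quality)

# Invented example: both endpoints fit the budget, so the custom
# quality metric (tokens/sec here) breaks the tie.
best = route(
    [{"name": "mixtral@anyscale", "cost": 0.5, "latency": 0.8, "tps": 110},
     {"name": "mixtral@together", "cost": 0.9, "latency": 0.4, "tps": 140}],
    max_cost=1.0,
    max_latency=1.0,
    quality=lambda c: c["tps"],
)
# -> picks "mixtral@together" (higher tokens/sec among feasible options)
```

Swapping the `quality` callable is what "define a custom quality metric" amounts to in this picture.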

Frequently Asked Questions

Do I need to create an account with each provider?
Do you charge anything on top of the upstream providers?
How do you determine what the best model is?