Otari Gateway routes requests to LLM providers through any-llm. any-llm is the Python library Otari Gateway uses internally to talk to providers. If you do not need a gateway, see the any-llm GitBook.
Model format
Models are specified as provider:model_name.
openai:gpt-4o
anthropic:claude-sonnet-4-6
mistral:mistral-large-latest
vertexai:gemini-2.0-flash
The provider prefix tells the gateway which backend to route to. The model_name is passed directly to that provider's API.
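As a rough illustration of the format above, the sketch below splits a model string into its two parts. The names `ModelRef` and `parse_model` are hypothetical helpers, not part of Otari Gateway or any-llm, and splitting on the first colon only is an assumption made so that model names containing `/` or further colons pass through unchanged.

```python
from typing import NamedTuple


class ModelRef(NamedTuple):
    provider: str    # provider prefix / config key, e.g. "openai"
    model_name: str  # forwarded to the provider's API as-is


def parse_model(model: str) -> ModelRef:
    """Split 'provider:model_name' on the first colon only (hypothetical helper)."""
    provider, sep, model_name = model.partition(":")
    if not sep or not provider or not model_name:
        raise ValueError(f"expected 'provider:model_name', got {model!r}")
    return ModelRef(provider, model_name)


print(parse_model("openai:gpt-4o"))
print(parse_model("openrouter:openai/gpt-4o"))
```

Client code could use a check like this to reject a bare model name such as gpt-4o before sending a request. How the gateway itself validates the string may differ.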
Connected mode note
In connected mode, otari.ai resolves the final provider and model choice for each request, so the local providers configuration is not used for routing.
Provider support
Capability columns show common support patterns by provider family. Exact support still depends on the model you choose.
Hosted providers
Direct hosted provider APIs. These are the most common standalone gateway integrations.
Provider | Config key | Example model | Chat | Embed | Image | Audio | Rerank | Notes |
Anthropic | anthropic | anthropic:claude-sonnet-4-6 | Chat: supported | Embeddings: not listed | Images: not listed | Audio: not listed | Rerank: not listed | Also supports /v1/messages and batches. |
Cerebras | cerebras | cerebras:llama3.1-8b | Chat: supported | Embeddings: not listed | Images: not listed | Audio: not listed | Rerank: not listed | |
Cohere | cohere | cohere:command-r-plus | Chat: supported | Embeddings: supported | Images: not listed | Audio: not listed | Rerank: supported | Also supports rerank. |
DeepInfra | deepinfra | deepinfra:meta-llama/Llama-3-70b | Chat: supported | Embeddings: not listed | Images: not listed | Audio: not listed | Rerank: not listed | |
DeepSeek | deepseek | deepseek:deepseek-chat | Chat: supported | Embeddings: not listed | Images: not listed | Audio: not listed | Rerank: not listed | |
Fireworks | fireworks | fireworks:llama-v3-70b | Chat: supported | Embeddings: not listed | Images: not listed | Audio: not listed | Rerank: not listed | |
Gemini | gemini | gemini:gemini-2.0-flash | Chat: supported | Embeddings: not listed | Images: not listed | Audio: not listed | Rerank: not listed | Capability coverage depends on the specific Gemini model. |
Groq | groq | groq:llama3-70b-8192 | Chat: supported | Embeddings: not listed | Images: not listed | Audio: not listed | Rerank: not listed | |
HuggingFace | huggingface | huggingface:meta-llama/Llama-3-70b | Chat: supported | Embeddings: not listed | Images: not listed | Audio: not listed | Rerank: not listed | |
Inception | inception | inception:mercury-coder-small | Chat: supported | Embeddings: not listed | Images: not listed | Audio: not listed | Rerank: not listed | |
MiniMax | minimax | minimax:abab5.5-chat | Chat: supported | Embeddings: not listed | Images: not listed | Audio: not listed | Rerank: not listed | |
Mistral | mistral | mistral:mistral-large-latest | Chat: supported | Embeddings: not listed | Images: not listed | Audio: not listed | Rerank: not listed | |
Moonshot | moonshot | moonshot:moonshot-v1-8k | Chat: supported | Embeddings: not listed | Images: not listed | Audio: not listed | Rerank: not listed | |
Nebius | nebius | nebius:llama-3-70b | Chat: supported | Embeddings: not listed | Images: not listed | Audio: not listed | Rerank: not listed | |
OpenAI | openai | openai:gpt-4o | Chat: supported | Embeddings: supported | Images: supported | Audio: supported | Rerank: not listed | Also supports /v1/responses, /v1/moderations, images, audio, and batches. |
OpenRouter | openrouter | openrouter:openai/gpt-4o | Chat: supported | Embeddings: not listed | Images: not listed | Audio: not listed | Rerank: not listed | Acts as an upstream router. Exact capabilities vary by model. |
Perplexity | perplexity | perplexity:llama-3-sonar-large | Chat: supported | Embeddings: not listed | Images: not listed | Audio: not listed | Rerank: not listed | |
SambaNova | sambanova | sambanova:llama3-70b | Chat: supported | Embeddings: not listed | Images: not listed | Audio: not listed | Rerank: not listed | |
Together | together | together:meta-llama/Llama-3-70b | Chat: supported | Embeddings: not listed | Images: not listed | Audio: not listed | Rerank: not listed | |
Voyage | voyage | voyage:voyage-large-2 | Chat: not listed | Embeddings: supported | Images: not listed | Audio: not listed | Rerank: not listed | Embeddings only. |
xAI | xai | xai:grok-2 | Chat: supported | Embeddings: not listed | Images: not listed | Audio: not listed | Rerank: not listed | |
Cloud platform providers
Platform-specific integrations that usually require project, account, or service credentials.
Provider | Config key | Example model | Chat | Embed | Image | Audio | Rerank | Notes |
AWS Bedrock | bedrock | bedrock:anthropic.claude-v2 | Chat: supported | Embeddings: not listed | Images: not listed | Audio: not listed | Rerank: not listed | AWS credentials required. |
Azure OpenAI | azureopenai | azureopenai:gpt-4o | Chat: supported | Embeddings: supported | Images: supported | Audio: supported | Rerank: not listed | Requires api_base. |
Azure Anthropic | azureanthropic | azureanthropic:claude-sonnet-4-6 | Chat: supported | Embeddings: not listed | Images: not listed | Audio: not listed | Rerank: not listed | Requires api_base. |
DashScope | dashscope | dashscope:qwen-turbo | Chat: supported | Embeddings: not listed | Images: not listed | Audio: not listed | Rerank: not listed | Alibaba Cloud. |
Databricks | databricks | databricks:dbrx-instruct | Chat: supported | Embeddings: not listed | Images: not listed | Audio: not listed | Rerank: not listed | Requires api_base. |
SageMaker | sagemaker | sagemaker:my-endpoint | Chat: supported | Embeddings: not listed | Images: not listed | Audio: not listed | Rerank: not listed | AWS credentials required. |
Vertex AI | vertexai | vertexai:gemini-2.0-flash | Chat: supported | Embeddings: supported | Images: supported | Audio: not listed | Rerank: not listed | Requires service account credentials. |
Vertex AI Anthropic | vertexaianthropic | vertexaianthropic:claude-sonnet-4-6 | Chat: supported | Embeddings: not listed | Images: not listed | Audio: not listed | Rerank: not listed | Anthropic models through Vertex AI. |
WatsonX | watsonx | watsonx:ibm/granite-13b | Chat: supported | Embeddings: not listed | Images: not listed | Audio: not listed | Rerank: not listed | |
Local providers
OpenAI-compatible local runtimes and self-hosted backends that stay inside your own environment.
Provider | Config key | Example model | Chat | Embed | Image | Audio | Rerank | Notes |
Llama.cpp | llamacpp | llamacpp:default | Chat: supported | Embeddings: not listed | Images: not listed | Audio: not listed | Rerank: not listed | Local server. |
Llamafile | llamafile | llamafile:default | Chat: supported | Embeddings: not listed | Images: not listed | Audio: not listed | Rerank: not listed | Local server. |
LM Studio | lmstudio | lmstudio:local-model | Chat: supported | Embeddings: not listed | Images: not listed | Audio: not listed | Rerank: not listed | Local server. |
Ollama | ollama | ollama:llama3 | Chat: supported | Embeddings: not listed | Images: not listed | Audio: not listed | Rerank: not listed | Local server. |
vLLM | vllm | vllm:my-model | Chat: supported | Embeddings: not listed | Images: not listed | Audio: not listed | Rerank: not listed | Self-hosted OpenAI-compatible runtime. |
Listing available models
In standalone mode, query the gateway to see which models are currently exposed.
curl http://localhost:8000/v1/models \
-H "Authorization: Bearer <your-api-key>"
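The same query can be sketched in Python using only the standard library. This is a minimal sketch under stated assumptions, not a documented client: the helper names (`build_models_request`, `group_by_provider`, `list_models`) are hypothetical, and the response is assumed to follow the OpenAI-style list shape, `{"data": [{"id": "provider:model_name"}, ...]}`. Check the actual response of your gateway before relying on it.

```python
import json
import urllib.request
from collections import defaultdict


def build_models_request(base_url: str, api_key: str) -> urllib.request.Request:
    """Build the GET /v1/models request with the same header as the curl call."""
    return urllib.request.Request(
        f"{base_url.rstrip('/')}/v1/models",
        headers={"Authorization": f"Bearer {api_key}"},
    )


def group_by_provider(payload: dict) -> dict[str, list[str]]:
    """Group model IDs by provider prefix.

    ASSUMPTION: payload is {"data": [{"id": "provider:model_name"}, ...]}.
    """
    grouped: defaultdict[str, list[str]] = defaultdict(list)
    for entry in payload.get("data", []):
        provider, _, model_name = entry["id"].partition(":")
        grouped[provider].append(model_name)
    return dict(grouped)


def list_models(base_url: str, api_key: str) -> dict[str, list[str]]:
    """Query a running gateway and return its models grouped by provider."""
    request = build_models_request(base_url, api_key)
    with urllib.request.urlopen(request, timeout=10) as response:
        return group_by_provider(json.load(response))


# Offline demonstration with a made-up payload in the assumed shape.
sample = {"data": [{"id": "openai:gpt-4o"}, {"id": "mistral:mistral-large-latest"}]}
print(group_by_provider(sample))
```

Against a running standalone gateway, the call would be `list_models("http://localhost:8000", "<your-api-key>")`.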
