Models & providers

Otari Gateway routes requests to LLM providers through any-llm, the Python library it uses internally to talk to those providers. If you do not need a gateway, see the any-llm GitBook.

Model format

Models are specified as provider:model_name.

openai:gpt-4o
anthropic:claude-sonnet-4-6
mistral:mistral-large-latest
vertexai:gemini-2.0-flash

The provider prefix tells the gateway which backend to route to. The model_name is passed directly to that provider's API.
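As a rough illustration of the format above, the sketch below splits a model string into its two parts. The names `ModelRef` and `parse_model` are hypothetical, and splitting on the first colon only is an assumption about how such strings are best handled, not a description of the gateway's own parser.

```python
from typing import NamedTuple


class ModelRef(NamedTuple):
    provider: str    # routing prefix, e.g. "openai"
    model_name: str  # forwarded to the provider unchanged


def parse_model(spec: str) -> ModelRef:
    """Split 'provider:model_name' on the first colon only (assumed behavior)."""
    provider, sep, model_name = spec.partition(":")
    if not sep or not provider or not model_name:
        raise ValueError(f"expected 'provider:model_name', got {spec!r}")
    return ModelRef(provider, model_name)


# Everything after the first colon stays intact, including slashes.
print(parse_model("openrouter:openai/gpt-4o"))
```

Splitting on the first colon keeps model names that contain slashes or further colons in one piece, which matters for entries such as openrouter:openai/gpt-4o.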

Connected mode note

In connected mode, otari.ai resolves the final provider and model choice for each request, so the local providers configuration is not used for routing.

Provider support

Capability columns show common support patterns by provider family. Exact support still depends on the model you choose.

Hosted providers

Direct hosted provider APIs. These are the most common standalone gateway integrations.

| Provider | Config key | Example model | Chat | Embed | Image | Audio | Rerank | Notes |
|---|---|---|---|---|---|---|---|---|
| Anthropic | anthropic | anthropic:claude-sonnet-4-6 | supported | not listed | not listed | not listed | not listed | Also supports /v1/messages and batches. |
| Cerebras | cerebras | cerebras:llama3.1-8b | supported | not listed | not listed | not listed | not listed | |
| Cohere | cohere | cohere:command-r-plus | supported | supported | not listed | not listed | supported | Also supports rerank. |
| DeepInfra | deepinfra | deepinfra:meta-llama/Llama-3-70b | supported | not listed | not listed | not listed | not listed | |
| DeepSeek | deepseek | deepseek:deepseek-chat | supported | not listed | not listed | not listed | not listed | |
| Fireworks | fireworks | fireworks:llama-v3-70b | supported | not listed | not listed | not listed | not listed | |
| Gemini | gemini | gemini:gemini-2.0-flash | supported | not listed | not listed | not listed | not listed | Capability coverage depends on the specific Gemini model. |
| Groq | groq | groq:llama3-70b-8192 | supported | not listed | not listed | not listed | not listed | |
| HuggingFace | huggingface | huggingface:meta-llama/Llama-3-70b | supported | not listed | not listed | not listed | not listed | |
| Inception | inception | inception:mercury-coder-small | supported | not listed | not listed | not listed | not listed | |
| MiniMax | minimax | minimax:abab5.5-chat | supported | not listed | not listed | not listed | not listed | |
| Mistral | mistral | mistral:mistral-large-latest | supported | not listed | not listed | not listed | not listed | |
| Moonshot | moonshot | moonshot:moonshot-v1-8k | supported | not listed | not listed | not listed | not listed | |
| Nebius | nebius | nebius:llama-3-70b | supported | not listed | not listed | not listed | not listed | |
| OpenAI | openai | openai:gpt-4o | supported | supported | supported | supported | not listed | Also supports /v1/responses, /v1/moderations, images, audio, and batches. |
| OpenRouter | openrouter | openrouter:openai/gpt-4o | supported | not listed | not listed | not listed | not listed | Acts as an upstream router. Exact capabilities vary by model. |
| Perplexity | perplexity | perplexity:llama-3-sonar-large | supported | not listed | not listed | not listed | not listed | |
| SambaNova | sambanova | sambanova:llama3-70b | supported | not listed | not listed | not listed | not listed | |
| Together | together | together:meta-llama/Llama-3-70b | supported | not listed | not listed | not listed | not listed | |
| Voyage | voyage | voyage:voyage-large-2 | not listed | supported | not listed | not listed | not listed | Embeddings only. |
| xAI | xai | xai:grok-2 | supported | not listed | not listed | not listed | not listed | |

Cloud platform providers

Platform-specific integrations that usually require project, account, or service credentials.

| Provider | Config key | Example model | Chat | Embed | Image | Audio | Rerank | Notes |
|---|---|---|---|---|---|---|---|---|
| AWS Bedrock | bedrock | bedrock:anthropic.claude-v2 | supported | not listed | not listed | not listed | not listed | AWS credentials required. |
| Azure OpenAI | azureopenai | azureopenai:gpt-4o | supported | supported | supported | supported | not listed | Requires api_base. |
| Azure Anthropic | azureanthropic | azureanthropic:claude-sonnet-4-6 | supported | not listed | not listed | not listed | not listed | Requires api_base. |
| DashScope | dashscope | dashscope:qwen-turbo | supported | not listed | not listed | not listed | not listed | Alibaba Cloud. |
| Databricks | databricks | databricks:dbrx-instruct | supported | not listed | not listed | not listed | not listed | Requires api_base. |
| SageMaker | sagemaker | sagemaker:my-endpoint | supported | not listed | not listed | not listed | not listed | AWS credentials required. |
| Vertex AI | vertexai | vertexai:gemini-2.0-flash | supported | supported | supported | not listed | not listed | Requires service account credentials. |
| Vertex AI Anthropic | vertexaianthropic | vertexaianthropic:claude-sonnet-4-6 | supported | not listed | not listed | not listed | not listed | Anthropic models through Vertex AI. |
| WatsonX | watsonx | watsonx:ibm/granite-13b | supported | not listed | not listed | not listed | not listed | |

Local providers

OpenAI-compatible local runtimes and self-hosted backends that stay inside your own environment.

| Provider | Config key | Example model | Chat | Embed | Image | Audio | Rerank | Notes |
|---|---|---|---|---|---|---|---|---|
| Llama.cpp | llamacpp | llamacpp:default | supported | not listed | not listed | not listed | not listed | Local server. |
| Llamafile | llamafile | llamafile:default | supported | not listed | not listed | not listed | not listed | Local server. |
| LM Studio | lmstudio | lmstudio:local-model | supported | not listed | not listed | not listed | not listed | Local server. |
| Ollama | ollama | ollama:llama3 | supported | not listed | not listed | not listed | not listed | Local server. |
| vLLM | vllm | vllm:my-model | supported | not listed | not listed | not listed | not listed | Self-hosted OpenAI-compatible runtime. |

Listing available models

In standalone mode, query the gateway to see which models are currently exposed.

curl http://localhost:8000/v1/models \
-H "Authorization: Bearer <your-api-key>"
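
The same request can be made from Python. This is a minimal sketch using only the standard library; the helper names `build_models_request` and `list_model_ids` are hypothetical, and the response shape (an OpenAI-style body with a top-level "data" list of objects carrying an "id") is an assumption that this page does not confirm.

```python
import json
import urllib.request


def build_models_request(base_url: str, api_key: str) -> urllib.request.Request:
    """Build the GET /v1/models request with a bearer token."""
    url = base_url.rstrip("/") + "/v1/models"
    return urllib.request.Request(url, headers={"Authorization": f"Bearer {api_key}"})


def list_model_ids(base_url: str, api_key: str) -> list[str]:
    """Return the model ids exposed by the gateway.

    Assumption: the body looks like {"data": [{"id": "..."}, ...]}.
    """
    request = build_models_request(base_url, api_key)
    with urllib.request.urlopen(request, timeout=10) as response:
        payload = json.load(response)
    return [item["id"] for item in payload.get("data", [])]
```

Calling list_model_ids("http://localhost:8000", "<your-api-key>") against a running standalone gateway should return the exposed model strings, provided the response matches the assumed shape.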