Models we support
The model families and runtimes we work with, and how we pick.
We're model-agnostic: we match the model to each task and budget, so you're never locked in to one vendor.
Open / self-hostable
- Llama
- Mistral
- Qwen
- DeepSeek
- Gemma
- Phi
- GLM
- Kimi
API providers
- OpenAI GPT
- Anthropic Claude
- Google Gemini
- xAI Grok
- Cohere
Runtime & infrastructure
- vLLM
- Ollama
- llama.cpp
- Hugging Face TGI
- RunPod
For a given use case, we benchmark a few candidate options and recommend the one that best balances quality, speed, and cost.
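One way to make that trade-off concrete is a weighted score over measured results. The sketch below is illustrative only and does not describe our actual benchmarking tooling: the `Candidate` fields, the `rank` helper, the default weights, and the sample numbers are all assumptions chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str
    quality: float    # task accuracy in [0, 1]; higher is better
    latency_s: float  # median seconds per request; lower is better
    cost_usd: float   # cost per 1,000 requests; lower is better


def _normalize(values, higher_is_better):
    """Min-max scale values to [0, 1] so that 1.0 is always the best option."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All candidates tie on this metric, so it should not affect the ranking.
        return [1.0] * len(values)
    scaled = [(v - lo) / (hi - lo) for v in values]
    return scaled if higher_is_better else [1.0 - s for s in scaled]


def rank(candidates, w_quality=0.5, w_speed=0.25, w_cost=0.25):
    """Return (name, score) pairs sorted from best to worst weighted score."""
    quality = _normalize([c.quality for c in candidates], higher_is_better=True)
    speed = _normalize([c.latency_s for c in candidates], higher_is_better=False)
    cost = _normalize([c.cost_usd for c in candidates], higher_is_better=False)
    scores = [
        (c.name, w_quality * quality[i] + w_speed * speed[i] + w_cost * cost[i])
        for i, c in enumerate(candidates)
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Placeholder measurements for three hypothetical models (not real results).
candidates = [
    Candidate("model-a", quality=0.90, latency_s=2.0, cost_usd=10.0),
    Candidate("model-b", quality=0.80, latency_s=0.5, cost_usd=2.0),
    Candidate("model-c", quality=0.70, latency_s=1.0, cost_usd=1.0),
]

for name, score in rank(candidates):
    print(f"{name}: {score:.2f}")
```

Changing the weights changes the recommendation: a quality-first use case might set `w_quality=1.0` and the other weights to `0.0`, while a high-volume, cost-sensitive one would shift weight toward `w_cost`.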