# Bring your own AI
Cabinet doesn't host inference. You connect the providers you already pay for, and Cabinet routes your agents' calls to them. There's no Cabinet middleman, no inference markup, no opaque quota. You pay your own bill, you use your own quota.
## Supported providers today
Cabinet runs through a provider adapter layer. As of v0.4.0, eight CLI providers ship in the default runtime — all local, so your prompts go through a tool you've already installed and authenticated:
| Adapter | Provider | Install |
|---|---|---|
| `claude_local` | Anthropic Claude (via Claude Code CLI) | `npm install -g @anthropic-ai/claude-code` |
| `codex_local` | OpenAI GPT (via Codex CLI) | `npm install -g @openai/codex` or `brew install --cask codex` |
| `gemini_local` | Google Gemini (via Gemini CLI) | See provider docs |
| `opencode_local` | OpenCode | See provider docs |
| `pi_local` | Pi (Inflection) | See provider docs |
| + 3 more CLI providers | — | See provider settings in app |
You need at least one CLI installed. The defaults Cabinet boots with are `claude_local` and `codex_local`; everything else turns on per-agent. Every adapter gets the same runtime controls — model picker, effort sliders, dynamic `listModels()`, brand icons.
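Cabinet's internal adapter API isn't published here, but conceptually the layer looks something like the following TypeScript sketch. All names (`ProviderAdapter`, `AdapterRegistry`) are illustrative assumptions, not Cabinet's actual interfaces:

```typescript
// Hypothetical sketch of a CLI provider adapter layer.
// Names and shapes are illustrative — not Cabinet's real API.
interface ProviderAdapter {
  id: string;                                        // e.g. "claude_local"
  listModels(): Promise<string[]>;                   // dynamic model discovery
  complete(model: string, prompt: string): Promise<string>;
}

// A registry that routes an agent's call to whichever adapter it's configured for.
class AdapterRegistry {
  private adapters = new Map<string, ProviderAdapter>();

  register(adapter: ProviderAdapter): void {
    this.adapters.set(adapter.id, adapter);
  }

  resolve(id: string): ProviderAdapter {
    const adapter = this.adapters.get(id);
    if (!adapter) throw new Error(`No adapter registered for "${id}"`);
    return adapter;
  }
}
```

The key property this shape gives you: adding a ninth provider means registering one more adapter, with no changes to the agents that call it.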
## Coming next
Direct API access (no CLI required) and more open-weights models are on the way:
| Provider | How it'll connect | Status |
|---|---|---|
| Hermes (Nous) | Local via Ollama | Planned |
| Llama / Mistral / Qwen | Local via Ollama | Planned |
| LM Studio | Local, no key | Planned |
| Anthropic API (direct) | API key, no CLI required | Planned |
| OpenAI API (direct) | API key, no CLI required | Planned |
| xAI Grok (direct) | API key | Planned |
| Image generation | FLUX, DALL·E, local SD | Planned |
Track progress on the Roadmap.
## Why a mix of models
Most AI products lock you to one model behind one API. Cabinet's bet is the opposite: your team should be a mix of models, picked per job. The strategy lead runs on Opus because it writes carefully. The triage agent runs on Haiku because it's cheap and fast. The local agent runs on a local model because the data is sensitive. Same cabinet, different brains.
## What each provider is good for
| Provider | Strengths | When to pick |
|---|---|---|
| Anthropic Claude | Long careful writing, strong reasoning, big context | Defaults for leads, editors, anything where tone matters |
| OpenAI GPT | Fast, cheap, strong tool use | Triage, dispatchers, high-volume routine work |
| Google Gemini | Long context, low cost per token, vision | Research synthesis, document QA, long-form summaries |
| xAI Grok | Fresh news access, lighter touch | Trend analysts, social listening |
| Local (Ollama / LM Studio) | Offline, private, free | Sensitive data, air-gapped work, hobby experiments |
## Setting an agent's model
In `persona.md`:

```yaml
---
name: GTM Lead
model: claude-opus-4-7   # the default for this agent
---
```
Or override per task in the composer:

```text
┌───────────────────────────────────┐
│ → GTM Lead ▾    gpt-4.1 ▾         │ ← override the model here
│ ◯ low   ● medium   ◯ high         │
└───────────────────────────────────┘
```
Or per heartbeat:

```yaml
heartbeats:
  - cron: "0 9 * * 1-5"
    model: claude-haiku-4-5   # cheap for the daily pulse
    prompt: "Inbox triage."
  - cron: "0 17 * * 5"
    model: claude-opus-4-7    # Opus for the weekly synthesis
    prompt: "Friday wrap."
```
## Routing rules
In `.cabinet` you can set global routing:

```yaml
providers:
  defaults:
    lead: claude-opus-4-7
    specialist: claude-sonnet-4-6
    triage: gpt-4.1
  fallbacks:
    claude-opus-4-7:
      - claude-sonnet-4-6
      - gpt-4.1
  budgets:
    daily:
      maxCostUsd: 25
    perTask:
      maxCostUsd: 5
```
If a request hits a budget cap or a 429, Cabinet falls back to the next model in the list and writes a note in the run transcript.
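The fallback walk can be sketched in a few lines of TypeScript. This is an assumption about the shape of the logic, not Cabinet's actual implementation — `callWithFallback` and its types are hypothetical names:

```typescript
// Hypothetical sketch of fallback routing: try the configured model,
// then each entry in its fallback list, recording a transcript note per hop.
type FallbackMap = Record<string, string[]>;

interface RunResult {
  model: string;     // the model that actually served the request
  output: string;
  notes: string[];   // transcript notes for each fallback taken
}

function callWithFallback(
  model: string,
  fallbacks: FallbackMap,
  attempt: (m: string) => string,  // throws on a 429 or budget cap
): RunResult {
  const chain = [model, ...(fallbacks[model] ?? [])];
  const notes: string[] = [];
  for (const m of chain) {
    try {
      return { model: m, output: attempt(m), notes };
    } catch (err) {
      // Record the hop in the run transcript, then try the next model.
      notes.push(`${m} unavailable (${(err as Error).message}); falling back`);
    }
  }
  throw new Error(`all models exhausted: ${chain.join(", ")}`);
}
```

With the config above, a 429 from `claude-opus-4-7` would land the request on `claude-sonnet-4-6` with one note in the transcript.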
## What Cabinet never does
- Forward your prompts to a Cabinet-hosted backend.
- Cache your prompts for "model improvement."
- Charge you for inference.
- Lock you to a provider after install.
If you delete your API key, the provider disappears. Your prompts and outputs stay in your cabinet folder, on your disk.
## Local-only mode
Want every call to stay on your machine? Set `providers.localOnly: true` in `.cabinet`:

```yaml
providers:
  localOnly: true
  defaults:
    lead: ollama/llama-3.3-70b
    specialist: ollama/qwen-2.5-coder
```
Cabinet then refuses any non-local provider. If an agent's persona points at `claude-opus-4-7`, the run errors with a clear "local-only" message instead of silently falling through to a remote model.
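The guard itself is simple. A minimal sketch, assuming local models are identified by their adapter prefix (the prefix list and function name here are illustrative, not Cabinet's actual code):

```typescript
// Hypothetical local-only guard. Assumes local models are namespaced
// by provider prefix, as in the config above (e.g. "ollama/llama-3.3-70b").
const LOCAL_PREFIXES = ["ollama/", "lmstudio/"];

function assertLocalAllowed(model: string, localOnly: boolean): void {
  if (!localOnly) return;
  if (!LOCAL_PREFIXES.some((prefix) => model.startsWith(prefix))) {
    throw new Error(
      `local-only mode: "${model}" is not a local provider; ` +
      `point this agent at a local model instead`,
    );
  }
}
```

Failing loudly at dispatch time is the design choice that matters here: a silent fallback to a remote model would defeat the whole point of the flag.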
## Read on
- Persona — where you set an agent's model.
- Routines — per-routine model overrides.
- Tips → Performance & cost — the cheap-fast / careful-slow recipe.