Bring your own AI

Cabinet doesn't host inference. You connect the providers you already pay for, and Cabinet routes your agents' calls to them. There's no Cabinet middleman, no inference markup, no opaque quota. You pay your own bill, you use your own quota.

Supported providers today

Cabinet runs through a provider adapter layer. As of v0.4.0, eight CLI providers ship in the default runtime — all local, so your prompts go through a tool you've already installed and authenticated:

Adapter                | Provider                                | Install
claude_local           | Anthropic Claude (via Claude Code CLI)  | npm install -g @anthropic-ai/claude-code
codex_local            | OpenAI GPT (via Codex CLI)              | npm install -g @openai/codex or brew install --cask codex
gemini_local           | Google Gemini (via Gemini CLI)          | See provider docs
opencode_local         | OpenCode                                | See provider docs
pi_local               | Pi (Inflection)                         | See provider docs
+ 3 more CLI providers |                                         | See provider settings in app

You need at least one CLI installed. Cabinet boots with claude_local and codex_local enabled by default; every other adapter is enabled per agent. All adapters share the same runtime controls: model picker, effort sliders, dynamic listModels(), and brand icons.
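
As a rough, illustrative sketch, enabling a non-default adapter for a single agent might look like the front matter below. The provider field and the Research Analyst persona are hypothetical placeholders for this page (the examples further down only show a model field); check the provider settings in the app for the real field names.

---
name: Research Analyst
provider: gemini_local   # hypothetical field: turn on a non-default adapter for this agent
model: gemini-2.5-pro    # illustrative model ID; the adapter's listModels() shows the real options
---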

Coming next

Direct API access (no CLI required) and more open-weights models are on the way:

Provider               | How it'll connect         | Status
Hermes (Nous)          | Local via Ollama          | Planned
Llama / Mistral / Qwen | Local via Ollama          | Planned
LM Studio              | Local, no key             | Planned
Anthropic API (direct) | API key, no CLI required  | Planned
OpenAI API (direct)    | API key, no CLI required  | Planned
xAI Grok (direct)      | API key                   | Planned
Image generation       | FLUX, DALL·E, local SD    | Planned

Track progress on the Roadmap.

Why a mix of models

Most AI products lock you to one model behind one API. Cabinet's bet is the opposite: your team should be a mix of models, picked per job. The strategy lead runs on Opus because it writes carefully. The triage agent runs on Haiku because it's cheap and fast. The local agent runs on a local model because the data is sensitive. Same cabinet, different brains.
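
Concretely, that mix might look like three persona.md files, each pinning a different model. The agent names here are illustrative; the model IDs are the same ones used in the examples below.

# Strategy lead: careful, long-form writing
---
name: Strategy Lead
model: claude-opus-4-7
---

# Triage agent: cheap and fast
---
name: Inbox Triage
model: claude-haiku-4-5
---

# Sensitive data never leaves the machine (local model ID from the local-only example below)
---
name: Records Clerk
model: ollama/llama-3.3-70b
---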

What each provider is good for

Provider                   | Strengths                                            | When to pick
Anthropic Claude           | Long careful writing, strong reasoning, big context  | Defaults for leads, editors, anything where tone matters
OpenAI GPT                 | Fast, cheap, strong tool use                         | Triage, dispatchers, high-volume routine work
Google Gemini              | Long context, low cost per token, vision             | Research synthesis, document QA, long-form summaries
xAI Grok                   | Fresh news access, lighter touch                     | Trend analysts, social listening
Local (Ollama / LM Studio) | Offline, private, free                               | Sensitive data, air-gapped work, hobby experiments

Setting an agent's model

In persona.md:

---
name: GTM Lead
model: claude-opus-4-7      # the default for this agent
---

Or override per task at the composer:

┌───────────────────────────────────┐
│  → GTM Lead   ▾  gpt-4.1   ▾     │   ← override the model here
│  ◯ low  ● medium  ◯ high          │
└───────────────────────────────────┘

Or per heartbeat:

heartbeats:
  - cron: "0 9 * * 1-5"
    model: claude-haiku-4-5    # cheap for the daily pulse
    prompt: "Inbox triage."
  - cron: "0 17 * * 5"
    model: claude-opus-4-7     # opus for the weekly synthesis
    prompt: "Friday wrap."

Routing rules

In .cabinet you can set global routing:

providers:
  defaults:
    lead: claude-opus-4-7
    specialist: claude-sonnet-4-6
    triage: gpt-4.1
  fallbacks:
    claude-opus-4-7:
      - claude-sonnet-4-6
      - gpt-4.1
  budgets:
    daily:
      maxCostUsd: 25
    perTask:
      maxCostUsd: 5

If a request hits a budget cap or a 429, Cabinet falls back to the next model in the list and writes a note in the run transcript. With the config above, a capped claude-opus-4-7 call retries on claude-sonnet-4-6, then on gpt-4.1.

What Cabinet never does

  • Forwards your prompts to a Cabinet-hosted backend.
  • Caches your prompts for "model improvement."
  • Charges you for inference.
  • Locks you to a provider after install.

If you delete your API key, the provider disappears. Your prompts and outputs stay in your cabinet folder, on your disk.

Local-only mode

Want every call to stay on your machine? Set providers.localOnly: true in .cabinet:

providers:
  localOnly: true
  defaults:
    lead: ollama/llama-3.3-70b
    specialist: ollama/qwen-2.5-coder

Cabinet refuses any non-local provider. If an agent's persona points at claude-opus-4-7, the run errors with a clear "local-only" message instead of silently routing the call to a remote provider.

Read on