
Model Routing

[Figure: Model cost tier routing]

Model IDs are dynamic. Never hardcode display names or model IDs in source code. Use role classes (free, cheap, strong, reviewer) in code; the router resolves the actual model at runtime from orchestration.toml.

// ✅ Correct — use role class
const model = modelRouter.resolveForClass("strong");
// ❌ Wrong — hardcoded model name
const model = "claude-sonnet-4-6"; // will break when model roster changes

| Class | Use case | Cost multiplier |
| --- | --- | --- |
| free | Broad drafting, fan-out, template fill, triage | |
| cheap | Aggregation, fast classification, merge, routing | ~0.33× |
| strong | Synthesis, physics, security, final judgment | |
| reviewer | Cross-model audit, independent critique | |

The orchestration.toml file at .mcp-ai-agent-guidelines/config/orchestration.toml is the single source of truth for:

  • Which physical model maps to which role class
  • Which capability tags are required/preferred per workload profile
  • Fan-out counts per profile
  • Human-in-the-loop toggles

The built-in defaults (src/config/orchestration-defaults.ts) are an explicit fallback only — they are not the normal runtime authority. Strict mode fails fast if the primary file cannot be loaded.
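
The fallback behaviour described above could look roughly like the following. This is a minimal sketch, not the project's loader: the `OrchestrationConfig` shape, the `loadOrchestrationConfig` function, the injected `readPrimary` callback, and the placeholder model IDs are all assumptions made for illustration.

```typescript
type RoleClass = "free" | "cheap" | "strong" | "reviewer";

// Hypothetical shape; the real orchestration.toml schema is not shown here.
interface OrchestrationConfig {
  models: Record<RoleClass, string[]>;
}

// Stand-in for the built-in defaults (placeholder IDs, not real models).
const BUILT_IN_DEFAULTS: OrchestrationConfig = {
  models: {
    free: ["placeholder-free"],
    cheap: ["placeholder-cheap"],
    strong: ["placeholder-strong"],
    reviewer: ["placeholder-reviewer"],
  },
};

interface LoadResult {
  config: OrchestrationConfig;
  source: "primary" | "fallback";
}

// `readPrimary` is a hypothetical loader that parses orchestration.toml
// and throws when the file is missing or invalid.
function loadOrchestrationConfig(
  readPrimary: () => OrchestrationConfig,
  strict: boolean,
): LoadResult {
  try {
    return { config: readPrimary(), source: "primary" };
  } catch (err) {
    if (strict) {
      // Strict mode: fail fast instead of silently using the defaults.
      const reason = err instanceof Error ? err.message : String(err);
      throw new Error(`orchestration.toml could not be loaded: ${reason}`);
    }
    // Non-strict mode: fall back explicitly and record where the config came from.
    return { config: BUILT_IN_DEFAULTS, source: "fallback" };
  }
}
```

Returning the `source` alongside the config keeps the fallback explicit, so callers can log or reject a run that is not using the primary file.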

flowchart TD
    A([Request]) --> B[Workload Profile\nfrom workflow spec]
    B --> C[Capability Tags\nrequired / preferred]
    C --> D[Model Candidates\nfrom orchestration.toml]
    D --> E[Filter by Available Models\nfrom model-discover]
    E --> F[Select by Class + Availability]
    F --> G([Physical Model ID])
    style A fill:#334155,color:#e2e8f0,stroke:#475569
    style G fill:#1e40af,color:#fff,stroke:#1d4ed8
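
The stages in the flowchart can be approximated in a few lines. The sketch below is illustrative only: `ModelEntry`, `WorkloadProfile`, `resolvePhysicalModel`, and the tie-breaking rule (most preferred tags wins, registry order otherwise) are assumptions, not the ModelRouter's actual implementation.

```typescript
type RoleClass = "free" | "cheap" | "strong" | "reviewer";

// Hypothetical registry entry, as it might look after parsing orchestration.toml.
interface ModelEntry {
  id: string;
  roleClass: RoleClass;
  capabilities: string[];
}

// Hypothetical workload profile derived from a workflow spec.
interface WorkloadProfile {
  roleClass: RoleClass;
  requiredTags: string[];
  preferredTags: string[];
}

function resolvePhysicalModel(
  profile: WorkloadProfile,
  registry: ModelEntry[],
  availableIds: ReadonlySet<string>,
): string {
  // Keep entries of the right class that carry every required tag
  // and are advertised by the host.
  const candidates = registry.filter(
    (m) =>
      m.roleClass === profile.roleClass &&
      profile.requiredTags.every((tag) => m.capabilities.includes(tag)) &&
      availableIds.has(m.id),
  );
  if (candidates.length === 0) {
    throw new Error(`No available model for class "${profile.roleClass}"`);
  }
  // Count how many preferred tags a candidate satisfies.
  const score = (m: ModelEntry): number =>
    profile.preferredTags.filter((tag) => m.capabilities.includes(tag)).length;
  // Highest score wins; ties keep the earlier registry entry.
  return candidates.reduce((best, m) => (score(m) > score(best) ? m : best)).id;
}
```

Passing the set of available IDs in as an argument keeps discovery separate from selection, which mirrors the split between the model-discover step and the class-based selection step in the diagram.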

From the project’s model roster (.copilot-models):

| Tier | Role class | Example models | Usage |
| --- | --- | --- | --- |
| Zero-Cost | free | GPT-4.1, GPT-5 mini | Saturate first: fan-out, drafting, triage |
| Efficient | cheap | Claude Haiku 4.5, GPT-5.4 mini | Aggregation, merge, fast classification |
| Advanced | strong | Claude Sonnet 4.6, Claude Opus 4.6, GPT-5.4 | Synthesis, physics, security, final judgment |
| Cross-Model | reviewer | Gemini 2.5 Pro, Gemini 3.1 Pro | Cross-model audit only |

Core rule: saturate the free tier first. Pay exactly once for synthesis/review. Never run strong end-to-end on a task where free lanes can draft.
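
One way to picture the core rule is as a lane plan: several free drafting lanes followed by exactly one strong synthesis step. The sketch below assumes hypothetical names (`LaneStep`, `planLanes`, `countStrongCalls`) and does not reflect how the project actually schedules work.

```typescript
type RoleClass = "free" | "cheap" | "strong" | "reviewer";

// Hypothetical description of one step in an orchestration plan.
interface LaneStep {
  stage: "draft" | "synthesize";
  roleClass: RoleClass;
}

// Build a plan with `fanOut` free drafting lanes and one strong synthesis step.
function planLanes(fanOut: number): LaneStep[] {
  if (!Number.isInteger(fanOut) || fanOut < 1) {
    throw new RangeError("fanOut must be a positive integer");
  }
  const drafts = Array.from(
    { length: fanOut },
    (): LaneStep => ({ stage: "draft", roleClass: "free" }),
  );
  // The strong class appears exactly once, at the end.
  return [...drafts, { stage: "synthesize", roleClass: "strong" }];
}

// Count how many steps in a plan use the strong class.
function countStrongCalls(plan: LaneStep[]): number {
  return plan.filter((step) => step.roleClass === "strong").length;
}
```

A check such as `countStrongCalls(plan) === 1` could serve as a cheap guard against plans that run the strong class end-to-end.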

Two strong models are kept as peers, not primary/backup:

| Dimension | Strong model A | Strong model B |
| --- | --- | --- |
| Long-context coherence | Excellent | Excellent |
| Independent adversarial critique | May confirm own prior plan | Preferred (lower self-agreement bias) |
| Physics / math symbolic reasoning | Preferred for qm-* | Also strong |
| Security threat modeling | Strong | Preferred as first-pass gov-* reviewer |
| Tie-breaking escalation | Final call | First escalation |

The value lies in the disagreement surface between the two: whenever model A generates a plan, model B runs as the critique lane. That critique step is mandatory, not optional.
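
The pairing can be enforced structurally by never returning a plan without its critique. The following sketch assumes a hypothetical synchronous `Invoke` callback standing in for a model call, and it treats the pairing as symmetric (whichever peer plans, the other critiques), which goes slightly beyond what is stated here.

```typescript
type StrongPeer = "A" | "B";

// Hypothetical stand-in for a model call; a real call would be asynchronous.
type Invoke = (peer: StrongPeer, prompt: string) => string;

interface ReviewedPlan {
  plan: string;
  plannedBy: StrongPeer;
  critique: string;
  critiquedBy: StrongPeer;
}

// Generate a plan with one peer and always critique it with the other.
function planWithCritique(
  task: string,
  planner: StrongPeer,
  invoke: Invoke,
): ReviewedPlan {
  // The critic is always the peer that did not write the plan.
  const critic: StrongPeer = planner === "A" ? "B" : "A";
  const plan = invoke(planner, `Plan: ${task}`);
  const critique = invoke(critic, `Critique this plan independently: ${plan}`);
  return { plan, plannedBy: planner, critique, critiquedBy: critic };
}
```

Because the return type requires both `plan` and `critique`, a caller cannot obtain a plan that skipped the critique lane.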

src/models/model-router.ts
// Resolve model for a role class
const model = await modelRouter.profileForClass("strong");
// Discover available models
const models = await modelRouter.discoverAvailableModels();
// Check if a capability tag is available
const canDoPhysics = modelRouter.supportsCapability("math_physics");

To add a new model:

  1. Add it to the model registry in orchestration.toml under the appropriate class
  2. Call model-discover to verify it is advertised by the host
  3. The ModelRouter picks it up automatically on next initialization — no code changes needed
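
Step 2 amounts to comparing the registry against what the host advertises. The sketch below shows that comparison under stated assumptions: `ModelRegistry`, `findUnadvertisedModels`, and the plain list of advertised IDs are hypothetical and are not the output format of model-discover.

```typescript
type RoleClass = "free" | "cheap" | "strong" | "reviewer";

// Hypothetical registry shape: model IDs grouped by role class.
type ModelRegistry = Record<RoleClass, string[]>;

const ROLE_CLASSES: RoleClass[] = ["free", "cheap", "strong", "reviewer"];

// Return registered model IDs that the host does not advertise.
function findUnadvertisedModels(
  registry: ModelRegistry,
  advertisedIds: string[],
): string[] {
  const advertised = new Set(advertisedIds);
  const missing: string[] = [];
  for (const roleClass of ROLE_CLASSES) {
    for (const id of registry[roleClass]) {
      if (!advertised.has(id)) {
        missing.push(id);
      }
    }
  }
  return missing;
}
```

An empty result would suggest that every registered model, including the newly added one, is visible to the host.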