arch-scalability
Domain: arch · Model class: strong
Description
Use this skill when the user wants to design AI systems for inference-heavy scalability, latency budgets, and cost efficiency. Triggers include “how do I scale my AI system”, “inference scalability”, and “latency budget design”. Do NOT use it when the user wants to design the initial system (use core-system-design).
Purpose
Guide the design of AI systems for inference-heavy scalability, latency budgets, and cost efficiency. This skill provides structured guidance, references, and worked examples to help produce actionable outputs.
Trigger Phrases
- “how do I scale my AI system”
- “inference scalability”
- “latency budget design”
- “cost-aware agent architecture”
Anti-Triggers
- design the initial system (use core-system-design)
- analyze runtime performance (use core-performance-review)
Intake Questions
- What is the user’s goal and current state?
- What constraints (time, team, compliance) apply?
- Are there existing artifacts (specs, code, benchmarks) to reference?
Output Contract
- architecture recommendation
- tradeoff summary
- system component framing
- risk and next-step guidance
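The four items above could be captured as a small data structure so that a response can be checked for completeness before it is returned. The following is a minimal sketch, not part of the skill itself: the class name `ScalabilityOutput`, its field names, and the `missing_sections` helper are all hypothetical and chosen only to mirror the list.

```python
from dataclasses import dataclass, field


@dataclass
class ScalabilityOutput:
    """Hypothetical container mirroring the four Output Contract items."""

    architecture_recommendation: str = ""
    tradeoff_summary: list[str] = field(default_factory=list)
    system_component_framing: list[str] = field(default_factory=list)
    risk_and_next_steps: list[str] = field(default_factory=list)

    def missing_sections(self) -> list[str]:
        """Return the contract items that are still empty, in contract order."""
        sections = {
            "architecture recommendation": self.architecture_recommendation,
            "tradeoff summary": self.tradeoff_summary,
            "system component framing": self.system_component_framing,
            "risk and next-step guidance": self.risk_and_next_steps,
        }
        # Empty strings and empty lists are both falsy, so one check covers both.
        return [name for name, value in sections.items() if not value]
```

For example, a draft that so far contains only a recommendation and one tradeoff would report the remaining two items as missing, which makes gaps visible before the output is delivered.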