Soham Shah

Designing the interface between human intent and machine autonomy.

Building agent systems that compress context, coordinate specialists, and sustain coherence across long-horizon tasks.

Active / SF Bay / Agent Architecture
Research

Routing the Routers

A technical survey of where LLM routing actually sits in production. Seven research streams (learned routers, cascades, capability matrices, self-routing, cache-aware scheduling, agent dispatch, vendor gateways) mapped onto one stack, with an off-policy estimation playbook for evaluating routers on logged traffic, an article-by-article EU AI Act mapping for the audit surface, and a closing list of experiments worth running.

research / routing / cascades / capability-matrix / causal-inference / eu-ai-act / inference
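[Figure: capability matrix M[MODEL × SKILL]. Skill labels: CODE, MATH, TOOL, LCTX, COST. Model labels: CLAUDE, GPT, GEMINI, HAIKU. Sample values: 0.91, 0.84, 0.76, 0.69. Caption: ROUTE ON THE COLUMN, NOT THE QUERY]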
Research

The Shape of a Working Multi-Agent System

"Should I use a multi-agent system?" is the wrong question. A long, opinionated technical synthesis from ~200 primary sources — eight parts on context, write authority, orchestration, protocols, control plane, and a reference architecture you can build today. Plus six staked bets for May 2026 → May 2027.

research / multi-agent / agent-systems / context-engineering / control-plane
[Figure labels: VERIFY, REFUTE, TEST, EXTERNALIZED STATE, WRITER, ORCHESTRATOR · SCHEDULER, 1 / TASK]
Research

Self-Evolving Skills in Multi-Agent Systems

Skills became the npm of agentic AI in eight weeks. A survey of that explosion — and what an optimal architecture looks like.

research / agent-skills / multi-agent
[Figure: SKILL.v0 → EVOLVED VARIANTS. Labels: v0, v1, v2, v3, v4, SELECTED]
Research

Context Compaction for Long-Running Agents

Agent sessions degrade as context grows -- not at the limit, but continuously. A research-grounded design for keeping context lean and high-signal, synthesizing the latest work from 40+ papers.

research / context-engineering / agent-systems
[Figure: QUALITY (HI to LO) plotted against CONTEXT FILL % (0% to 100%), with a COMPACT HERE marker]
Architecture

Specialized Subagent Architecture

From generic delegation to context-aware multi-agent coordination -- how to decompose, delegate, persist, and train specialized agents.

architecture / multi-agent / systems-design
[Figure labels: ORCH, EXPLORE, CODE, VERIFY, FAST]