Why spec-driven tools
The ICMobile API has 339 GET operations. A naive MCP server would expose each one as a separate tool. mcsinglewire deliberately doesn’t. This page explains why — and what it costs.
The naive design and why it fails
If you walked the OpenAPI spec and emitted one MCP tool per operation, you’d get something like:
- getAlarms
- getAlarm
- getIpSpeakers
- getIpSpeaker
- getNotifications
- …339 of these, with parameter schemas of varying complexity
Three things go wrong:
1. Context flooding
MCP tool definitions are sent to the LLM at session start as part of the system prompt. A typical generated tool definition for a GET with a path param and a couple of query params is on the order of 100–200 tokens once you include the parameter descriptions and types. At 339 tools that’s roughly 34–68k tokens before the user’s first message, and before any other MCP server you have registered (most Claude Code users have several). On a model with a 200k context window that’s 17–34% of the total budget, gone, every session.
Worse, most of those tools are never called in any given session. The spec-driven approach pays the spec-search cost only when a tool is actually used.
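The estimate above is plain multiplication, and a short sketch makes it easy to redo with your own numbers. The per-tool token range and the 200k window come from the text; the function name is made up for illustration, and real token counts depend on the tokenizer and on how verbose the generated schemas are.

```python
# Back-of-envelope cost of sending one tool definition per operation.
OPERATIONS = 339               # GET operations in the spec
TOKENS_PER_TOOL = (100, 200)   # rough low/high estimate per tool definition
CONTEXT_WINDOW = 200_000       # tokens


def tool_definition_cost(n_tools, tokens_per_tool, window):
    """Return (total_tokens, fraction_of_window) for n_tools definitions."""
    total = n_tools * tokens_per_tool
    return total, total / window


low = tool_definition_cost(OPERATIONS, TOKENS_PER_TOOL[0], CONTEXT_WINDOW)
high = tool_definition_cost(OPERATIONS, TOKENS_PER_TOOL[1], CONTEXT_WINDOW)

print(f"{low[0]:,} to {high[0]:,} tokens")            # 33,900 to 67,800 tokens
print(f"{low[1]:.0%} to {high[1]:.0%} of the window")  # 17% to 34% of the window
```

Swapping in a different per-tool estimate or window size shows how quickly the fixed overhead grows with the number of tools.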
2. Discovery is harder, not easier
You’d think more tools means more capability. In practice, the LLM
has to scan a long list to find the right one. Naming conventions are
not uniform across the 339 — some endpoints are named after the
resource (getAlarms), some after the action
(listIncidentsByLocation), some after both. A free-text search over
a structured spec beats LLM-side string matching against a flat list.
3. Audit and safety surface explodes
Every tool is a place a regression can hide. Every parameter schema is a place a typo can become a runtime error. Every tool definition is a place the read-only constraint needs to be re-asserted (or, worse, forgotten). With 339 tools, the audit work scales linearly; with 5, it’s bounded.
The spec-driven alternative
mcsinglewire bundles the OpenAPI 3.1 spec as a 3.3 MB JSON file inside
the package (src/mcsinglewire/data/openapi.json) and exposes five
generic tools that operate on it:
| Tool | Reads the spec | Talks to Singlewire |
|---|---|---|
| health | no | no |
| list_tags | yes | no |
| openapi_search | yes | no |
| openapi_describe | yes | no |
| api_call | yes | yes (GET) |
The natural workflow falls out of the design:
    list_tags()
    # → ["Alarms", "Sites", "Users", ...]

    openapi_search("alarms")
    # → [{"operation_id": "getAlarms", ...},
    #    {"operation_id": "getAlarm", ...}]

    openapi_describe("getAlarm")
    # → {"parameters": [{"name": "alarmId", "in": "path", "required": True, ...}]}

    api_call("getAlarm", path_params={"alarmId": "..."})
    # → {"status": 200, "data": {...}}

The LLM does the matching. It’s good at this: searching a spec is exactly the kind of fuzzy-but-structured retrieval LLMs handle well.
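To make the search step concrete, here is a minimal sketch of what a free-text search over a parsed OpenAPI document could look like. It is not the mcsinglewire implementation: `search_operations`, the toy `SPEC`, and its paths and summaries are all illustrative assumptions, and the matching is a plain case-insensitive substring test.

```python
def search_operations(spec, query, limit=10):
    """Case-insensitive substring search over the GET operations of an OpenAPI dict."""
    needle = query.lower()
    hits = []
    for path, item in spec.get("paths", {}).items():
        op = item.get("get")
        if not op:
            continue  # only GET operations are considered
        # Search across the fields a caller is likely to name.
        haystack = " ".join([
            path,
            op.get("operationId", ""),
            op.get("summary", ""),
            " ".join(op.get("tags", [])),
        ]).lower()
        if needle in haystack:
            hits.append({
                "operation_id": op.get("operationId"),
                "path": path,
                "summary": op.get("summary", ""),
            })
    return hits[:limit]


# Toy spec for illustration only; paths and summaries are invented.
SPEC = {"paths": {
    "/alarms": {"get": {"operationId": "getAlarms", "summary": "List alarms", "tags": ["Alarms"]}},
    "/alarms/{alarmId}": {"get": {"operationId": "getAlarm", "summary": "Get one alarm", "tags": ["Alarms"]}},
    "/sites": {"get": {"operationId": "getSites", "summary": "List sites", "tags": ["Sites"]}},
}}

results = search_operations(SPEC, "alarms")
print([r["operation_id"] for r in results])  # ['getAlarms', 'getAlarm']
```

Because the search returns only a handful of compact records, the LLM sees a small slice of the spec per query rather than every operation up front.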
What this buys us
Bounded context cost
Five tool definitions, each with a small, well-typed signature. The full set fits in a few hundred tokens. The 3.3 MB spec lives on the server and is queried lazily; the LLM only sees the relevant slice for each operation it inspects.
Bounded audit surface
Read-only enforcement, denylist enforcement, path-quoting, and
query-parameter validation all live in one function
(_validate_call). New endpoints added to the spec inherit all of
those guarantees automatically — there’s no per-tool place to forget
to apply them.
Forward compatibility with make refresh-spec
When Singlewire adds a new endpoint, refreshing the bundled spec is enough to make it reachable. No code changes, no schema authoring, no regenerated tool list. The refresh how-to covers the routine; the denylist how-to covers the audit step.
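The reason a refresh is sufficient is that the set of reachable operations is derived from the spec data rather than written out in code. A minimal sketch of that idea follows; the helper name and both toy specs are assumptions made for illustration.

```python
import copy


def index_get_operations(spec):
    """Map operationId -> path template for every GET operation in the spec."""
    return {
        item["get"]["operationId"]: path
        for path, item in spec.get("paths", {}).items()
        if "get" in item and "operationId" in item["get"]
    }


# Toy "before" spec; a refresh is simulated by adding one path to a copy.
old_spec = {"paths": {"/alarms": {"get": {"operationId": "getAlarms"}}}}
new_spec = copy.deepcopy(old_spec)
new_spec["paths"]["/sites"] = {"get": {"operationId": "getSites"}}

print(sorted(index_get_operations(old_spec)))  # ['getAlarms']
print(sorted(index_get_operations(new_spec)))  # ['getAlarms', 'getSites']
```

The same function handles both specs unchanged; only the data differs, which is why the review effort can focus on the spec diff and the denylist.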
A natural denylist
_GET_DENYLIST is a dict[str, str] mapping operationId to
rationale. Adding an entry takes one line in code and one test.
Removing an entry takes the same. The audit trail lives in git log.
Compare to a tool-per-endpoint design, where denylisting means
deleting (or feature-flagging, or commenting out) a whole tool.
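As a rough picture of "one line in code and one test", the sketch below pairs a denylist entry with the test that pins it. The `_GET_DENYLIST` name and type come from the text above; the entry `getExampleSecrets`, its rationale, and the `denial_reason` helper are hypothetical.

```python
# operationId -> rationale. The entry below is a made-up example.
_GET_DENYLIST: dict[str, str] = {
    "getExampleSecrets": "hypothetical: response would include credentials",
}


def denial_reason(operation_id, denylist=_GET_DENYLIST):
    """Return the rationale if the operation is denylisted, else None."""
    return denylist.get(operation_id)


def test_denylisted_operation_is_refused():
    # One test per entry keeps the rationale and the behavior in sync.
    assert denial_reason("getExampleSecrets") is not None


def test_ordinary_operation_is_allowed():
    assert denial_reason("getAlarms") is None


test_denylisted_operation_is_refused()
test_ordinary_operation_is_allowed()
print(denial_reason("getExampleSecrets"))
```

Keeping the rationale as the dict value means the reason for each exclusion travels with the entry and shows up in the same diff.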
What we trade away
This design has costs. They’re worth naming:
- Less type safety at the MCP boundary. Per-tool definitions
could enforce parameter types at the MCP schema level, with the
LLM seeing strict schemas at session start. Spec-driven means the
LLM must call openapi_describe before each unfamiliar call. In practice this is a small cost, since the LLM does this naturally, but it’s a real tradeoff.
- Worse static introspection for humans. Tools like Claude’s “list available functions” view show only the five generic tools, not the 339 underlying operations. Discoverability for humans is slightly degraded; for LLMs it’s improved (because they can search the spec textually).
- One coarse permission grain. Some MCP clients expose per-tool authorization. With five generic tools, there is one big switch (“can this client call api_call?”) rather than 339 small ones. We compensate with the OAuth scope set, which is the right place for fine-grained authorization anyway.
Why bundle the spec instead of fetching it
The spec is committed to the repo
(src/mcsinglewire/data/openapi.json) instead of being fetched at
startup. Reasons:
- Reproducibility. A given commit of mcsinglewire serves a known, frozen set of operations. Auditors can review the spec and the denylist together at a single point in time.
- Resilience. The server starts even if Singlewire’s spec endpoint is down or changes shape.
- Determinism in CI. Tests run against the same spec the production server uses.
The cost is that refreshing requires make refresh-spec and a code
review. That cost is the feature: it forces an audit on every change
to the reachable surface area.