Why spec-driven tools

The ICMobile API has 339 GET operations. A naive MCP server would expose each one as a separate tool. mcsinglewire deliberately doesn’t. This page explains why — and what it costs.

If you walked the OpenAPI spec and emitted one MCP tool per operation, you’d get something like:

  • getAlarms
  • getAlarm
  • getIpSpeakers
  • getIpSpeaker
  • getNotifications
  • …339 of these, with parameter schemas of varying complexity

Three things go wrong:

MCP tool definitions are sent to the LLM at session start as part of the system prompt. A typical generated tool definition for a GET with a path param and a couple of query params is on the order of 100–200 tokens once you include the parameter descriptions and types. At 339 tools that’s roughly 34–68k tokens before the user’s first message, not counting any other MCP servers you have registered (most Claude Code users have several). On a model with a 200k context window that’s 17–34% of the total budget, gone, every session.

Worse, most of those tools are never called in any given session. The spec-driven approach pays the spec-search cost only when a tool is actually used.
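
The estimate can be reproduced with a few lines of arithmetic. This is a back-of-envelope sketch that uses only the figures quoted above (339 operations, 100–200 tokens per tool definition, a 200k-token window); the function name `tool_budget` is made up for illustration, and none of the inputs are measurements.

```python
# Back-of-envelope cost of exposing one MCP tool per operation.
# All inputs are the estimates quoted in the text, not measured values.
OPERATIONS = 339
TOKENS_PER_TOOL = (100, 200)  # low and high estimate per tool definition
CONTEXT_WINDOW = 200_000


def tool_budget(n_tools: int, tokens_per_tool: int, window: int) -> tuple[int, float]:
    """Return (total tokens, fraction of the context window consumed)."""
    total = n_tools * tokens_per_tool
    return total, total / window


for per_tool in TOKENS_PER_TOOL:
    total, share = tool_budget(OPERATIONS, per_tool, CONTEXT_WINDOW)
    print(f"{per_tool} tokens/tool: {total:,} tokens, {share:.1%} of the window")
```

Swapping in a measured per-tool token count, or a different window size, shows how sensitive the conclusion is to those assumptions.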

You’d think more tools means more capability. In practice, the LLM has to scan a long list to find the right one. Naming conventions are not uniform across the 339 — some endpoints are named after the resource (getAlarms), some after the action (listIncidentsByLocation), some after both. A free-text search over a structured spec beats LLM-side string matching against a flat list.
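
To make "free-text search over a structured spec" concrete, here is a minimal sketch of a case-insensitive substring search over GET operations. It is not the project's implementation: the function name `search_operations`, the result fields, and the paths in the tiny `SPEC` stand-in are all assumptions for illustration. Only the operation ids come from the list above.

```python
from typing import Any

# Tiny stand-in for the bundled spec. Paths and summaries are hypothetical.
SPEC: dict[str, Any] = {
    "paths": {
        "/alarms": {
            "get": {"operationId": "getAlarms", "summary": "List alarms", "tags": ["Alarms"]}
        },
        "/alarms/{alarmId}": {
            "get": {"operationId": "getAlarm", "summary": "Get one alarm", "tags": ["Alarms"]}
        },
        "/ip-speakers": {
            "get": {"operationId": "getIpSpeakers", "summary": "List IP speakers", "tags": ["IP Speakers"]}
        },
    }
}


def search_operations(spec: dict[str, Any], query: str, limit: int = 10) -> list[dict[str, str]]:
    """Return GET operations whose path, id, summary, or tags contain the query."""
    needle = query.casefold()
    hits: list[dict[str, str]] = []
    for path, item in spec.get("paths", {}).items():
        op = item.get("get")
        if not isinstance(op, dict):
            continue  # no GET operation defined on this path
        haystack = " ".join(
            [path, op.get("operationId", ""), op.get("summary", ""), *op.get("tags", [])]
        ).casefold()
        if needle in haystack:
            hits.append(
                {
                    "operation_id": op.get("operationId", ""),
                    "path": path,
                    "summary": op.get("summary", ""),
                }
            )
    return hits[:limit]


print([hit["operation_id"] for hit in search_operations(SPEC, "alarms")])
```

Because the search looks at paths, summaries, and tags as well as operation ids, inconsistent naming across operations matters less than it would in a flat tool list.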

Every tool is a place a regression can hide. Every parameter schema is a place a typo can become a runtime error. Every tool definition is a place the read-only constraint needs to be re-asserted (or, worse, forgotten). With 339 tools, the audit work scales linearly; with 5, it’s bounded.

mcsinglewire bundles the OpenAPI 3.1 spec as a 3.3 MB JSON file inside the package (src/mcsinglewire/data/openapi.json) and exposes five generic tools that operate on it:

Tool              Reads the spec   Talks to Singlewire
health            no               no
list_tags         yes              no
openapi_search    yes              no
openapi_describe  yes              no
api_call          yes              yes (GET)

The natural workflow falls out of the design:

list_tags()
# → ["Alarms", "Sites", "Users", ...]
openapi_search("alarms")
# → [{"operation_id": "getAlarms", ...},
# {"operation_id": "getAlarm", ...}]
openapi_describe("getAlarm")
# → {"parameters": [{"name": "alarmId", "in": "path", "required": True, ...}]}
api_call("getAlarm", path_params={"alarmId": "..."})
# → {"status": 200, "data": {...}}

The LLM does the matching. It’s good at this — searching a spec is exactly the kind of fuzzy-but-structured retrieval LLMs handle well.

Five tool definitions, each with a small, well-typed signature. The full set fits in a few hundred tokens. The 3.3 MB spec lives on the server and is queried lazily; the LLM only sees the relevant slice for each operation it inspects.
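
The "lazy query, small slice" idea can be sketched as follows. This is an illustration under stated assumptions, not the project's code: `load_spec` and `describe_operation` are hypothetical names, and the shape of the returned slice is a guess at what a describe-style tool might return.

```python
import json
from functools import lru_cache
from pathlib import Path
from typing import Any


@lru_cache(maxsize=4)
def load_spec(path: str) -> dict[str, Any]:
    """Parse the spec file on first use and cache the result (hypothetical loader)."""
    return json.loads(Path(path).read_text(encoding="utf-8"))


def describe_operation(spec: dict[str, Any], operation_id: str) -> dict[str, Any]:
    """Return only the slice of the spec that describes one operation."""
    for path, item in spec.get("paths", {}).items():
        for method, op in item.items():
            # Path items can also hold non-operation keys such as "parameters".
            if isinstance(op, dict) and op.get("operationId") == operation_id:
                return {
                    "operation_id": operation_id,
                    "method": method.upper(),
                    "path": path,
                    "parameters": op.get("parameters", []),
                }
    raise KeyError(f"unknown operationId: {operation_id}")
```

The full document stays in server memory; each call hands back a few hundred bytes instead of the whole file.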

Read-only enforcement, denylist enforcement, path-quoting, and query-parameter validation all live in one function (_validate_call). New endpoints added to the spec inherit all of those guarantees automatically — there’s no per-tool place to forget to apply them.
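
A single chokepoint of this kind might look roughly like the sketch below. The real `_validate_call` signature and behavior are not shown on this page, so everything here is an assumption for illustration: the function name `validate_call`, the `CallRejected` exception, the shape of the `operation` dict, and the order of the checks.

```python
from typing import Any, Optional
from urllib.parse import quote


class CallRejected(ValueError):
    """Raised when a requested call fails validation (hypothetical type)."""


def validate_call(
    operation: dict[str, Any],
    denylist: dict[str, str],
    path_params: Optional[dict[str, Any]] = None,
    query: Optional[dict[str, Any]] = None,
) -> tuple[str, dict[str, str]]:
    """Apply all four checks and return (concrete path, validated query)."""
    path_params = path_params or {}
    query = query or {}
    op_id = operation["operation_id"]

    # 1. Read-only enforcement: anything other than GET is refused.
    if operation["method"].upper() != "GET":
        raise CallRejected(f"{op_id}: only GET operations are allowed")

    # 2. Denylist enforcement: refuse and surface the recorded rationale.
    if op_id in denylist:
        raise CallRejected(f"{op_id} is denylisted: {denylist[op_id]}")

    declared = operation.get("parameters", [])

    # 3. Path quoting: percent-encode every value, including "/".
    path = operation["path"]
    for param in declared:
        if param.get("in") != "path":
            continue
        name = param["name"]
        if name not in path_params:
            raise CallRejected(f"{op_id}: missing path parameter {name!r}")
        path = path.replace("{" + name + "}", quote(str(path_params[name]), safe=""))

    # 4. Query validation: only parameters declared in the spec pass through.
    allowed = {p["name"] for p in declared if p.get("in") == "query"}
    unknown = sorted(set(query) - allowed)
    if unknown:
        raise CallRejected(f"{op_id}: unknown query parameters {unknown}")

    return path, {key: str(value) for key, value in query.items()}
```

Since the checks are driven by the operation's spec entry and not by per-tool code, an operation added by a spec refresh passes through the same four gates without any new code.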

Forward compatibility with make refresh-spec

When Singlewire adds a new endpoint, refreshing the bundled spec is enough to make it reachable. No code changes, no schema authoring, no regenerated tool list. The refresh how-to covers the routine; the denylist how-to covers the audit step.

_GET_DENYLIST is a dict[str, str] mapping operationId to rationale. Adding an entry takes one line in code and one test. Removing an entry takes the same. The audit trail lives in git log. Compare to a tool-per-endpoint design, where denylisting means deleting (or feature-flagging, or commenting out) a whole tool.

This design has costs. They’re worth naming:

  • Less type safety at the MCP boundary. Per-tool definitions could enforce parameter types at the MCP schema level, with the LLM seeing strict schemas at session start. Spec-driven means the LLM must read openapi_describe before each unfamiliar call. In practice this is a small cost — the LLM does this naturally — but it’s a real tradeoff.
  • Worse static introspection for humans. Tools like Claude’s “list available functions” view show only the five generic tools, not the 339 underlying operations. Discoverability for humans is slightly degraded; for LLMs it’s improved (because they can search the spec textually).
  • One coarse permission grain. Some MCP clients expose per-tool authorization. With five generic tools, there is one big switch (“can this client call api_call?”) rather than 339 small ones. We compensate with the OAuth scope set, which is the right place for fine-grained authorization anyway.

Why bundle the spec instead of fetching it

The spec is committed to the repo (src/mcsinglewire/data/openapi.json) instead of being fetched at startup. Reasons:

  • Reproducibility. A given commit of mcsinglewire serves a known, frozen set of operations. Auditors can review the spec and the denylist together at a single point in time.
  • Resilience. The server starts even if Singlewire’s spec endpoint is down or changes shape.
  • Determinism in CI. Tests run against the same spec the production server uses.

The cost is that refreshing requires make refresh-spec and a code review. That cost is the feature: it forces an audit on every change to the reachable surface area.