Why spec-driven tools

The ICMobile API has 339 GET operations. A naive MCP server would expose each one as a separate tool. mcsinglewire deliberately doesn’t. This page explains why — and what it costs.

The naive design and why it fails

If you walked the OpenAPI spec and emitted one MCP tool per operation, you’d get something like:

getAlarms
getAlarm
getIpSpeakers
getIpSpeaker
getNotifications
…339 of these, with parameter schemas of varying complexity
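A minimal sketch of that naive generator, assuming the spec is a parsed OpenAPI document with the usual `paths` → method → `operationId` layout. The function name `naive_tool_names` and the three-path sample spec are illustrative and not taken from mcsinglewire:

```python
from typing import Any


def naive_tool_names(spec: dict[str, Any]) -> list[str]:
    """Return one tool name per GET operation found in an OpenAPI document."""
    names: list[str] = []
    for path, path_item in spec.get("paths", {}).items():
        get_op = path_item.get("get")
        if get_op is None:
            continue  # only GET operations are considered here
        # Fall back to the path itself when an operation has no operationId.
        names.append(get_op.get("operationId", f"GET {path}"))
    return names


# Tiny stand-in for the real spec (hypothetical paths and operations).
sample_spec = {
    "paths": {
        "/alarms": {"get": {"operationId": "getAlarms"}},
        "/alarms/{alarmId}": {"get": {"operationId": "getAlarm"}},
        "/notifications": {"post": {"operationId": "createNotification"}},
    }
}

print(naive_tool_names(sample_spec))  # ['getAlarms', 'getAlarm']
```

Run against the full spec, a loop like this would return one name per GET operation, and each name would still need its own parameter schema.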

Three things go wrong:

1. Context flooding

MCP tool definitions are sent to the LLM at session start as part of the system prompt. A typical generated tool definition for a GET with a path param and a couple of query params is on the order of 100–200 tokens once you include the parameter descriptions and types. At 339 tools that’s roughly 34–68k tokens before the user’s first message, not counting any other MCP servers you have registered (most Claude Code users have several). On a model with a 200k context window that’s 17–34% of the total budget, gone, every session.
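The arithmetic behind that estimate, as a quick check. The per-tool token range is the rough figure from the paragraph above, not a measurement, and the 200k window is the example model size used there:

```python
TOOL_COUNT = 339                # GET operations in the spec
TOKENS_PER_TOOL = (100, 200)    # rough per-definition estimate, low and high
CONTEXT_WINDOW = 200_000        # example context window size


def definition_cost(tool_count: int, tokens_per_tool: int) -> int:
    """Total tokens spent on tool definitions alone."""
    return tool_count * tokens_per_tool


low, high = (definition_cost(TOOL_COUNT, t) for t in TOKENS_PER_TOOL)
print(low, high)  # 33900 67800
print(f"{low / CONTEXT_WINDOW:.0%} to {high / CONTEXT_WINDOW:.0%}")  # 17% to 34%
```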

Worse, most of those tools are never called in any given session. The spec-driven approach defers that cost: an operation’s schema enters the context only when the LLM actually looks it up.

2. Discovery is harder, not easier

You’d think more tools means more capability. In practice, the LLM has to scan a long list to find the right one. Naming conventions are not uniform across the 339 — some endpoints are named after the resource (getAlarms), some after the action (listIncidentsByLocation), some after both. A free-text search over a structured spec beats LLM-side string matching against a flat list.

3. Audit and safety surface explodes

Every tool is a place a regression can hide. Every parameter schema is a place a typo can become a runtime error. Every tool definition is a place the read-only constraint needs to be re-asserted (or, worse, forgotten). With 339 tools, the audit work scales linearly; with 5, it’s bounded.

The spec-driven alternative

mcsinglewire bundles the OpenAPI 3.1 spec as a 3.3 MB JSON file inside the package (src/mcsinglewire/data/openapi.json) and exposes five generic tools that operate on it:

Tool	Reads the spec	Talks to Singlewire
`health`	no	no
`list_tags`	yes	no
`openapi_search`	yes	no
`openapi_describe`	yes	no
`api_call`	yes	yes (GET)

The natural workflow falls out of the design:

list_tags()
  # → ["Alarms", "Sites", "Users", ...]

openapi_search("alarms")
  # → [{"operation_id": "getAlarms", ...},
  #    {"operation_id": "getAlarm", ...}]

openapi_describe("getAlarm")
  # → {"parameters": [{"name": "alarmId", "in": "path", "required": True, ...}]}

api_call("getAlarm", path_params={"alarmId": "..."})
  # → {"status": 200, "data": {...}}

The LLM does the matching. It’s good at this — searching a spec is exactly the kind of fuzzy-but-structured retrieval LLMs handle well.
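The search step in that workflow could be sketched as a plain substring match over a few fields of each operation. This is an assumption about how such a tool might work, not the actual `openapi_search` implementation; the name `search_operations`, the fields searched, and the sample spec are all illustrative:

```python
from typing import Any


def search_operations(spec: dict[str, Any], query: str) -> list[dict[str, str]]:
    """Case-insensitive substring search over GET operations in a spec."""
    needle = query.lower()
    hits: list[dict[str, str]] = []
    for path, path_item in spec.get("paths", {}).items():
        op = path_item.get("get")
        if op is None:
            continue
        # Search the fields a caller is most likely to describe in free text.
        haystack = " ".join(
            [path, op.get("operationId", ""), op.get("summary", ""), *op.get("tags", [])]
        ).lower()
        if needle in haystack:
            hits.append({"operation_id": op.get("operationId", ""), "path": path})
    return hits


# Tiny stand-in for the bundled spec (hypothetical paths).
sample_spec = {
    "paths": {
        "/alarms": {"get": {"operationId": "getAlarms", "tags": ["Alarms"]}},
        "/alarms/{alarmId}": {"get": {"operationId": "getAlarm", "tags": ["Alarms"]}},
        "/sites": {"get": {"operationId": "getSites", "tags": ["Sites"]}},
    }
}

print([hit["operation_id"] for hit in search_operations(sample_spec, "alarms")])
# ['getAlarms', 'getAlarm']
```

The server narrows the spec to a handful of candidates; choosing among them is left to the LLM.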

What this buys us

Bounded context cost

Five tool definitions, each with a small, well-typed signature. The full set fits in a few hundred tokens. The 3.3 MB spec lives on the server and is queried lazily; the LLM only sees the relevant slice for each operation it inspects.

Bounded audit surface

Read-only enforcement, denylist enforcement, path-quoting, and query-parameter validation all live in one function (_validate_call). New endpoints added to the spec inherit all of those guarantees automatically — there’s no per-tool place to forget to apply them.
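What such a single choke point might look like, as a minimal sketch. The real `_validate_call` signature and internals are not shown on this page; `validate_call`, the operation dict layout, and the denylist entry below are all assumptions made for illustration:

```python
from urllib.parse import quote

# Hypothetical denylist entry; the real list lives in the package.
DENYLIST = {"getExampleSecrets": "illustrative entry only"}


def validate_call(operation: dict, path_params: dict[str, str], query: dict[str, str]) -> str:
    """Return the request path for a permitted GET call, or raise ValueError."""
    if operation["method"].upper() != "GET":
        raise ValueError("only GET operations are allowed")
    if operation["operation_id"] in DENYLIST:
        raise ValueError(f"denied: {DENYLIST[operation['operation_id']]}")

    declared = {p["name"]: p for p in operation.get("parameters", [])}
    # Reject any query parameter the spec does not declare for this operation.
    unknown = [k for k in query if declared.get(k, {}).get("in") != "query"]
    if unknown:
        raise ValueError(f"unknown query parameters: {unknown}")

    path = operation["path"]
    for name, param in declared.items():
        if param["in"] != "path":
            continue
        if name not in path_params:
            raise ValueError(f"missing path parameter: {name}")
        # safe="" also escapes "/", so a value cannot add path segments.
        path = path.replace("{" + name + "}", quote(str(path_params[name]), safe=""))
    return path


get_alarm = {
    "operation_id": "getAlarm",
    "method": "get",
    "path": "/alarms/{alarmId}",
    "parameters": [{"name": "alarmId", "in": "path", "required": True}],
}
print(validate_call(get_alarm, {"alarmId": "a/b"}, {}))  # /alarms/a%2Fb
```

Because every call passes through one function, an operation that appears after a spec refresh is subject to the same four checks without any new code.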

Forward compatibility with `make refresh-spec`

When Singlewire adds a new endpoint, refreshing the bundled spec is enough to make it reachable. No code changes, no schema authoring, no regenerated tool list. The refresh how-to covers the routine; the denylist how-to covers the audit step.

A natural denylist

_GET_DENYLIST is a dict[str, str] mapping operationId to rationale. Adding an entry takes one line in code and one test. Removing an entry takes the same. The audit trail lives in git log. Compare to a tool-per-endpoint design, where denylisting means deleting (or feature-flagging, or commenting out) a whole tool.
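A sketch of the "one line in code and one test" pattern. The `dict[str, str]` shape comes from the description above; the operationIds, rationales, and the `denial_reason` helper are hypothetical and do not reflect the real contents of `_GET_DENYLIST`:

```python
from typing import Optional

# Hypothetical entries: these operationIds and rationales are made up.
_GET_DENYLIST: dict[str, str] = {
    "getExampleCredentials": "response would expose secrets",
    "getExampleExport": "response is unbounded in size",
}


def denial_reason(operation_id: str) -> Optional[str]:
    """Return the recorded rationale if the operation is denied, else None."""
    return _GET_DENYLIST.get(operation_id)


# pytest-style tests: one per denylist entry keeps the rationale under review.
def test_denied_operation_reports_its_rationale() -> None:
    assert denial_reason("getExampleCredentials") == "response would expose secrets"


def test_other_operations_are_allowed() -> None:
    assert denial_reason("getAlarm") is None
```

Keeping the rationale as the dict value means the reason for each denial travels with the entry and shows up in the same diff.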

What we trade away

This design has costs. They’re worth naming:

Less type safety at the MCP boundary. Per-tool definitions could enforce parameter types at the MCP schema level, with the LLM seeing strict schemas at session start. Spec-driven means the LLM must call openapi_describe and read its output before each unfamiliar call. In practice this is a small cost, since the LLM does this naturally, but it’s a real tradeoff.
Worse static introspection for humans. Tools like Claude’s “list available functions” view show only the five generic tools, not the 339 underlying operations. Discoverability for humans is slightly degraded; for LLMs it’s improved (because they can search the spec textually).
One coarse permission grain. Some MCP clients expose per-tool authorization. With five generic tools, there is one big switch (“can this client call api_call?”) rather than 339 small ones. We compensate with the OAuth scope set, which is the right place for fine-grained authorization anyway.

Why bundle the spec instead of fetching it

The spec is committed to the repo (src/mcsinglewire/data/openapi.json) instead of being fetched at startup. Reasons:

Reproducibility. A given commit of mcsinglewire serves a known, frozen set of operations. Auditors can review the spec and the denylist together at a single point in time.
Resilience. The server starts even if Singlewire’s spec endpoint is down or changes shape.
Determinism in CI. Tests run against the same spec the production server uses.

The cost is that refreshing requires make refresh-spec and a code review. That cost is the feature: it forces an audit on every change to the reachable surface area.
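A minimal sketch of loading a committed spec from disk and pinning a property of it in a test. The helper names are hypothetical, and the demo writes a throwaway file rather than reading the real src/mcsinglewire/data/openapi.json:

```python
import json
import tempfile
from pathlib import Path
from typing import Any


def load_bundled_spec(spec_path: Path) -> dict[str, Any]:
    """Read the committed spec from disk; no network access is involved."""
    with spec_path.open(encoding="utf-8") as fh:
        return json.load(fh)


def count_get_operations(spec: dict[str, Any]) -> int:
    """Count path items that define a GET operation."""
    return sum(1 for item in spec.get("paths", {}).values() if "get" in item)


# Demo against a throwaway file standing in for the bundled spec.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = Path(tmp) / "openapi.json"
    demo_path.write_text(
        json.dumps({"paths": {"/alarms": {"get": {}}, "/x": {"post": {}}}}),
        encoding="utf-8",
    )
    print(count_get_operations(load_bundled_spec(demo_path)))  # 1
```

A CI test that asserts the expected count against the committed file would make any change to the reachable surface show up as a failing test alongside the reviewed diff.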

Tools reference: The exact signatures and return shapes of all five tools.

Why read-only: The constraint that motivates the spec-driven design’s bounded audit surface.

Why prompts: The same “less is more” reasoning applied a layer up: 14 deliberate prompts over the five tools, not 339 operation-named ones.