Trusting the interpreter

Every call this tool makes is recorded. Every write is refused at three independent layers. Every question is bounded by what the logged-in user is allowed to see. None of that extends to what the AI says about the data it got back.

This page sits with that gap honestly — what’s locked down, what isn’t, and why we still ship.

┌─────────────────────────────────────────────────────────────┐
│ Your Singlewire system                                      │
│ Trusted: Yes (your vendor's product)                        │
│ Returns: the actual data                                    │
└─────────────────────────────────────────────────────────────┘
                              │  read-only lookup
┌─────────────────────────────────────────────────────────────┐
│ mcsinglewire (this tool)                                    │
│ Trusted: Yes (small, audited, three-layer read-only)        │
│ Records: every call, every refusal                          │
└─────────────────────────────────────────────────────────────┘
                              │  question + answer
┌─────────────────────────────────────────────────────────────┐
│ The AI (Claude or similar)                                  │
│ Trusted: No, structurally                                   │
│ Can: misread, miscount, paraphrase wrong, hallucinate       │
│ Cannot: change anything (the layers below catch that)       │
└─────────────────────────────────────────────────────────────┘
                              │  chat
┌─────────────────────────────────────────────────────────────┐
│ You                                                         │
│ Final say on what the answer means                          │
└─────────────────────────────────────────────────────────────┘

The third band, the AI, is the part we can’t audit cleanly. It takes your question, looks up some data, reads what came back, and writes you a summary. That summary is the AI’s own interpretation of the underlying data, and the interpretation step is the one we can’t see into.

In the course of a routine audit, the AI might:

  • Misread a field. “Last seen 2 hours ago” is the AI’s math on a raw timestamp. Get the timezone wrong or hit a daylight-saving edge and the answer’s off, even if the underlying data was correct.
  • Miss a page. Big result sets come back in chunks. If the AI stops at the first chunk when there are five, the count is wrong even though each line it showed was real.
  • Confuse similar entities. Two devices with similar names, one stale and one healthy. The summary can attach one device’s status to the other; a careful read of the raw records would keep them apart.
  • Confidently invent context. “The site is undergoing scheduled maintenance” — even though that’s nowhere in the actual data. Pure hallucination.
  • Drop a result it found inconvenient. Less common with modern models, but worth being aware of.
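The time-math failure is easy to reproduce. The sketch below assumes the raw field is an ISO 8601 timestamp carrying a UTC offset; this page doesn't specify the real field format, and `hours_since` is a hypothetical helper, not part of the tool. It shows how the same raw value reads as 2 hours ago when the offset is honoured and 7 hours ago when it is treated as UTC.

```python
from datetime import datetime, timedelta, timezone


def hours_since(last_seen_iso: str, now: datetime) -> float:
    """Elapsed hours between an ISO 8601 timestamp and `now`.

    Both values must be timezone-aware; naive values are rejected
    rather than silently assumed to be in some zone.
    """
    last_seen = datetime.fromisoformat(last_seen_iso)
    if last_seen.tzinfo is None or now.tzinfo is None:
        raise ValueError("refusing to compare naive timestamps")
    return (now - last_seen) / timedelta(hours=1)


# Hypothetical raw value: 11:00 at UTC-5, which is 16:00 UTC.
raw = "2024-03-10T11:00:00-05:00"
now = datetime(2024, 3, 10, 18, 0, tzinfo=timezone.utc)

correct = hours_since(raw, now)  # 2.0

# The misread: same wall-clock digits, but treated as if they were UTC.
misread_ts = datetime(2024, 3, 10, 11, 0, tzinfo=timezone.utc)
misread = (now - misread_ts) / timedelta(hours=1)  # 7.0

print(correct, misread)
```

Both answers come from a correct underlying record; only the interpretation differs, which is why the raw timestamp is the thing to check.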

None of this is hypothetical. All of it can happen on any given day, and the audit trail won’t catch it — the trail records the calls, not the summary.
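The missed-page case can be sketched the same way. The page shape here (`items` plus a `next` cursor) and the `fake_fetch` function are assumptions for illustration only; the real lookup's pagination format isn't described on this page. The point is that a count is only trustworthy once the cursor has run out.

```python
from typing import Callable, Iterator, Optional

# Hypothetical page shape: {"items": [...], "next": "<cursor>" or None}
Page = dict


def iter_all(fetch_page: Callable[[Optional[str]], Page]) -> Iterator[dict]:
    """Yield every item, following cursors until the last page."""
    cursor: Optional[str] = None
    while True:
        page = fetch_page(cursor)
        yield from page["items"]
        cursor = page.get("next")
        if cursor is None:
            break


def fake_fetch(cursor: Optional[str]) -> Page:
    """Stand-in for a paged lookup: five pages of ten devices each."""
    index = 0 if cursor is None else int(cursor)
    start = index * 10
    items = [{"id": f"dev-{n}"} for n in range(start, start + 10)]
    next_cursor = str(index + 1) if index < 4 else None
    return {"items": items, "next": next_cursor}


first_chunk_only = len(fake_fetch(None)["items"])  # 10: every row real, count wrong
full_count = sum(1 for _ in iter_all(fake_fetch))  # 50

print(first_chunk_only, full_count)
```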

The system tolerates an unreliable interpreter because the architecture bounds what unreliability can do:

  • It can’t change anything. Three independent layers refuse writes. Even if the AI concludes “we should clear this alarm”, it can’t.
  • It can’t lie about which calls were made. The audit trail is authoritative. If the AI claims it pulled fresh data and the trail shows no recent call, you have evidence.
  • It can’t suppress a refusal. When a call is refused, the refusal lands in the trail independently of what the AI says about it.
  • It can’t see more than the logged-in user can. Your existing Singlewire permissions bound what the AI can ask for.
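Checking a "fresh data" claim against the trail might look like the sketch below. The record shape (`ts`, `tool`, `outcome`), the tool names, and the five-minute window are all hypothetical; this page doesn't document the trail's actual format, so treat it as an illustration of the check rather than a description of the tool.

```python
from datetime import datetime, timedelta, timezone


def has_recent_call(trail, tool, now, max_age=timedelta(minutes=5)):
    """True if the trail holds a successful call to `tool` within `max_age` of `now`."""
    for record in trail:
        if record["tool"] != tool or record["outcome"] != "ok":
            continue
        age = now - datetime.fromisoformat(record["ts"])
        if timedelta(0) <= age <= max_age:
            return True
    return False


# Hypothetical trail records, timestamps in UTC.
trail = [
    {"ts": "2024-03-10T17:58:00+00:00", "tool": "list_devices", "outcome": "ok"},
    {"ts": "2024-03-10T12:00:00+00:00", "tool": "get_device", "outcome": "ok"},
]
now = datetime(2024, 3, 10, 18, 0, tzinfo=timezone.utc)

fresh_list = has_recent_call(trail, "list_devices", now)  # True: called 2 minutes ago
fresh_get = has_recent_call(trail, "get_device", now)     # False: called 6 hours ago

print(fresh_list, fresh_get)
```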

So the worst case isn’t “the AI changed something in a hospital system” — that case is structurally impossible. The worst case is “the AI gave a wrong summary of read-only data”. That’s a regular software-correctness problem, not a safety one, and it’s recoverable: re-ask, look at the raw record, or check the InformaCast admin console.

The architecture doesn’t relieve you of judgement. For anything that will inform a real decision:

  • Spot-check. Pick one device the AI flagged and verify it in the InformaCast admin console. Over time you’ll get a feel for where it’s reliable and where it isn’t.
  • Trust raw records, doubt soft summaries. If the AI shows you a structured row of values, that came from the system. If it gives you a sentence of interpretation on top, that’s the AI’s read.
  • Be more sceptical near edges. Time math, pagination, “no results found” — classic failure modes.
  • Compare across runs. If “Offline devices” returns 3 results today and 47 tomorrow with no obvious reason, ask why. The recorded call has the answer.
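Comparing runs is simplest on the raw rows rather than on the summaries. A minimal sketch, assuming you have saved the device identifiers from each run (the IDs below are made up, and `diff_runs` is a hypothetical helper):

```python
def diff_runs(previous, current):
    """Split two runs' device IDs into appeared, cleared, and unchanged."""
    prev, curr = set(previous), set(current)
    return {
        "appeared": sorted(curr - prev),
        "cleared": sorted(prev - curr),
        "unchanged": sorted(prev & curr),
    }


# Hypothetical IDs taken from the raw rows of two "Offline devices" runs.
yesterday = ["spk-101", "spk-102", "spk-103"]
today = ["spk-102", "spk-103", "spk-204", "spk-205"]

changes = diff_runs(yesterday, today)
print(changes)
```

A large "appeared" list with no matching event on site is the cue to look at the recorded calls for both runs before acting on the number.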

The honest summary: we trust the AI as little as we need to, and the architecture pays for the rest.

You can see this pattern through the whole design:

  • We don’t trust the AI to refuse writes — three independent layers do.
  • We don’t trust the AI to log its own actions — the trail does, before any data leaves.
  • We don’t trust the AI to scope itself — the Singlewire login does.
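The pattern can be pictured as a guard that sits between the AI and the system. The sketch below is a single illustrative guard, not the tool's actual three layers; the allowlist, tool names, record shape, and `fake_backend` are all hypothetical. It shows the two properties the list describes: a write is refused regardless of what the AI asked for, and the record is written before any result is handed back.

```python
from typing import Any, Callable

# Hypothetical allowlist of read-only calls.
READ_ONLY_TOOLS = frozenset({"list_devices", "get_device"})


class Refused(Exception):
    """Raised when a call is not on the read-only allowlist."""


def guarded_call(
    tool: str,
    args: dict,
    backend: Callable[[str, dict], Any],
    trail: list,
) -> Any:
    """Run a read-only call, recording the outcome before returning anything."""
    if tool not in READ_ONLY_TOOLS:
        # The refusal is recorded here, independently of what the caller reports.
        trail.append({"tool": tool, "args": args, "outcome": "refused"})
        raise Refused(f"{tool} is not a read-only call")
    result = backend(tool, args)
    # Recorded before the result leaves this function.
    trail.append({"tool": tool, "args": args, "outcome": "ok"})
    return result


def fake_backend(tool: str, args: dict) -> list:
    """Stand-in for the real system."""
    return [{"id": "spk-101", "status": "offline"}]


trail: list = []
rows = guarded_call("list_devices", {"status": "offline"}, fake_backend, trail)

try:
    guarded_call("clear_alarm", {"id": "spk-101"}, fake_backend, trail)
except Refused:
    pass

outcomes = [record["outcome"] for record in trail]  # ["ok", "refused"]
print(outcomes)
```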

What’s left for the AI is navigating the system and summarising results, which is what AIs are genuinely good at. The structure around it covers what they aren’t.