Why generic AI fails at oncology reporting
Large language models are fluent, confident, and wrong in ways that matter clinically. Ask a general-purpose LLM about a BRAF V600E mutation and it will correctly cite dabrafenib + trametinib. Ask it about BRAF V600K — a different variant at the same position — and it will often give the same answer. In reality, the drug class is similar but the approved indications and combinations differ.
This failure mode is what destroyed trust in IBM Watson for Oncology. It's what makes lab CTOs and pathologists rightly skeptical of any vendor saying “we use AI.”
Generic LLMs vs. domain-specific oncology AI
| Capability | Generic LLM (ChatGPT, Claude, etc.) | UNMIRI GraphRAG |
|---|---|---|
| Clinical reasoning source | Training data (opaque) | Knowledge graph (OncoKB, ClinVar, openFDA) |
| Variant-level precision | Conflates near-miss variants | Each variant is a distinct graph node |
| Citation fidelity | Fabricates plausible citations | Every claim → specific KB entry |
| Contraindication detection | Inconsistent | Rule-based traversal, deterministic |
| Auditability | Black-box output | Full reasoning chain per output |
| BAA availability | Consumer tier: no; enterprise: varies | Yes (built only on enterprise-tier LLM APIs) |
| Zero-retention | Only on enterprise tiers | Enforced on every API call |
| CAP/CLIA alignment | Not built for it | Version-pinned KBs, audit logs, override support |
What “domain-specific” actually means in our architecture
The LLM never makes a clinical call. It only formats prose. All reasoning — variant → drug → evidence tier → contraindication — happens in a graph traversal over structured oncology knowledge bases. The LLM sees the traversal result and writes it up in readable language. If the graph returns no match, the LLM says so; it does not invent one.
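To make the division of labor concrete, here is a minimal sketch of the pattern, with an illustrative in-memory graph rather than UNMIRI's real schema or complete OncoKB content. The node keys, edge fields, and citation IDs are assumptions for illustration; the point is that each variant is a distinct node, the lookup is deterministic, and a miss produces an explicit "no match" instead of an invented answer.

```python
from dataclasses import dataclass

# Hypothetical, minimal knowledge graph: (gene, variant) -> therapy edges,
# each carrying an evidence tier and a source citation. Illustrative only.
@dataclass(frozen=True)
class TherapyEdge:
    drug: str
    evidence_tier: str
    citation: str  # e.g. an OncoKB entry identifier

GRAPH = {
    ("BRAF", "V600E"): [
        TherapyEdge("dabrafenib + trametinib", "Level 1", "OncoKB:BRAF-V600E"),
    ],
    # V600K is its own node with its own evidence; nothing is inherited
    # from the near-miss variant at the same position.
    ("BRAF", "V600K"): [
        TherapyEdge("dabrafenib + trametinib", "Level 1", "OncoKB:BRAF-V600K"),
    ],
}

def traverse(gene: str, variant: str) -> list[TherapyEdge]:
    """Deterministic lookup: exact node match or nothing."""
    return GRAPH.get((gene, variant), [])

def render(gene: str, variant: str) -> str:
    """Stand-in for the formatting step: writes up the traversal result,
    and says so explicitly when the graph returns no match."""
    edges = traverse(gene, variant)
    if not edges:
        return f"No knowledge-base match for {gene} {variant}."
    lines = [f"{gene} {variant}:"]
    for e in edges:
        lines.append(f"  {e.drug} ({e.evidence_tier}; source {e.citation})")
    return "\n".join(lines)
```

In the real system the formatting step is an LLM constrained to the traversal output; the sketch substitutes a plain function to show that the clinical content is fixed before any prose is generated.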
The full technical explanation is in our engineering post — Why Vector RAG Fails for Oncology — and What to Build Instead.
What this buys you operationally
- Explainability your inspectors accept. Every tier, recommendation, and contraindication traces back to a specific OncoKB entry, FDA label, or published trial.
- Oncologist trust. When UNMIRI recommends “osimertinib first-line,” the report shows the FLAURA trial data that supports it. Oncologists verify in seconds.
- Institutional continuity. Lab-specific overrides persist across reports. Your interpretation committee's decisions apply automatically.
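The override behavior in the last bullet can be sketched in a few lines. The lookup order is the whole mechanism: committee decisions are checked before the base knowledge base, so they apply to every subsequent report. The dictionary shapes and entry text here are hypothetical, not UNMIRI's actual data model.

```python
# Illustrative sketch: a lab's interpretation-committee decision, keyed by
# variant, takes precedence over the base KB recommendation on every report.
BASE_KB = {("KRAS", "G12C"): "sotorasib (Level 1)"}
LAB_OVERRIDES = {("KRAS", "G12C"): "adagrasib preferred per committee 2024-03"}

def recommendation(gene: str, variant: str) -> str:
    key = (gene, variant)
    # Overrides first, so institutional decisions persist automatically.
    return LAB_OVERRIDES.get(key, BASE_KB.get(key, "no KB match"))
```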
Compliance posture
BAA available. Zero-retention is enforced via Anthropic's HIPAA-ready API tier, the only LLM access path UNMIRI uses, and that only for narrow extraction and long-tail variant fallback. US-only data residency. Full compliance architecture on the security page, and the practical build-out is covered in Building a HIPAA-Ready Architecture for Clinical Decision Support.
How UNMIRI actually does this
UNMIRI extracts structured variant data, traverses a typed knowledge graph grounded in OncoKB, ClinVar, ClinicalTrials.gov, and openFDA drug labels, and renders the 2-page output through deterministic templates. LLMs assist only with extraction edge cases and long-tail variant fallback. Every claim is cited, and every citation resolves to a specific KB entry. More on the architecture.
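The "every citation resolves" guarantee is checkable mechanically: before a report ships, each claim's citation must match an entry in the version-pinned knowledge bases. A hedged sketch of such a check, with hypothetical entry IDs and claim structure (the `Trial:FLAURA` key is an illustrative placeholder, not a real registry identifier):

```python
# Illustrative citation-resolution check. KB_ENTRIES stands in for the
# version-pinned knowledge bases; real entry IDs and schemas will differ.
KB_ENTRIES = {
    "OncoKB:EGFR-L858R",
    "FDA:osimertinib-label",
    "Trial:FLAURA",
}

def unresolved_citations(claims: list[dict]) -> list[str]:
    """Return every citation that does not resolve to a known KB entry."""
    return [c["citation"] for c in claims if c["citation"] not in KB_ENTRIES]

report_claims = [
    {"text": "Osimertinib is first-line for EGFR L858R NSCLC.",
     "citation": "FDA:osimertinib-label"},
    {"text": "Supported by the FLAURA trial.",
     "citation": "Trial:FLAURA"},
]
```

A report with an empty `unresolved_citations` result is one where every claim traces to a specific KB entry; a nonempty result blocks release.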