Credibility kit — reliability, citations, human review (anti–“AI slop” / hallucination)
Date: 2026-04-16
Audience: CSOs, solutions architects, procurement-facing leads
Purpose: Give buyers and internal teams a repeatable proof story against fear of generic outputs, wrong citations, and “vibe-coded” demos—especially in forums where users complain that models ignore or invent fine-grained rules.
A. One-page “how we know it’s right”
| Pillar | What we do | What the client sees |
|---|---|---|
| Scope | Task and corpus boundaries per agent | “This agent only answers from approved sources and cannot take these actions.” |
| Grounding | Retrieval design, freshness SLAs, source priority rules | Citations or insufficient evidence responses—not invention. |
| Evaluation | Offline suites + periodic regression; domain-specific negative tests | Eval report at onboarding + on major releases |
| Production monitoring | Drift, refusal rate, tool-error rate, human override rate | Dashboard + alert thresholds tied to runbooks |
| Human policy | Human-in-the-loop for high-risk classes; escalation paths | RACI on exceptions |
| Incident response | Rollback, prompt/model version pinning, forensic replay | Postmortem template and SLA |
B. Minimum artifact checklist
- Model / agent card — Intended use, prohibited use, data classes handled, retention, known failure modes, owner.
- Source registry — Approved corpora, business owners, refresh cadence, PII and licensing posture.
- Eval bible — Dozens to hundreds of golden cases: expected answers, must-cite, must-refuse, and adversarial prompts.
- Citation standard — When to cite, chunk granularity, behavior when sources disagree or are stale.
- Release gate — No production change without eval delta within agreed tolerance.
- Live demo script — Three adversarial scenarios for joint review (contradictory docs, out-of-date policy, policy jailbreak-lite).
- Pilot success criteria — Written before pilot kickoff; signed by workflow owner and security/designee.
C. Buyer-forum language (short, non-defensive)
- “We do not optimize for plausible; we optimize for verifiable against your approved sources.”
- “If the evidence is not there, the correct behavior is ‘I don’t know’—not a confident guess.”
- “Every automated action has a human owner, a budget, and a rollback.”
D. Optional “red team” half-day (high leverage)
- Morning: Joint session with IT/legal—adversarial prompts on real (sanitized) document bundles.
- Afternoon: Prioritized fix list: retrieval gaps vs policy vs model limits vs UX.
- Output: A limitations appendix procurement can file with the vendor packet—reduces surprises at go-live.
E. How this pairs with enterprise strategy work
Use alongside:
knowledge/executive/strategy/client-ai-strategy-pressure-test-bougie-gartner-pattern.md— Moves buyers from chatbots to embedded agents with a data and org plan.knowledge/executive/strategy/positioning-memo-ai-replaces-consulting-brainforge.md— Clarifies irreplaceable vs productized services.
Notes for delivery teams
- Do not promise “zero hallucinations.” Promise measurement, containment, and accountability.
- Prefer evidence-plane investments (logging, replay, citations) early—they amortize across workflows.