Data service line — documentation suite blueprint
Status: Proposal (ideation output)
Prepared by: ce-ideate (grounded in repo templates, Google Drive deliverables, vault transcripts)
Last updated: 2026-05-01
Suite architecture
Every Data service line deliverable should originate from a canonical template stored in this repo. The template is the source of truth; Google Docs are downstream convenience copies. Each template is paired with:
- A playbook: agent-executable instructions for filling the template (where data comes from, how to validate)
- A scaffold skill: a `{doc-type}-init` skill that copies the template, sets metadata, and asks for inputs
- A QA checklist: embedded in the template (Appendix B pattern) as a pre-handoff gate
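As a rough illustration of what a `{doc-type}-init` skill might do, the sketch below copies a template, fills the metadata it knows, and reports the placeholders still waiting for input. The function name `init_document`, the lowercase `[placeholder]` pattern, and the metadata keys are assumptions for illustration, not the actual skill implementation.

```python
import re
import tempfile
from datetime import date
from pathlib import Path

# Hypothetical placeholder syntax: lowercase words in square brackets, e.g. [client].
PLACEHOLDER = re.compile(r"\[([a-z][a-z0-9 _-]+)\]")


def init_document(template: Path, dest: Path, metadata: dict[str, str]) -> list[str]:
    """Copy `template` to `dest`, fill known placeholders, return the unfilled ones."""
    text = template.read_text(encoding="utf-8")
    values = {"last updated": date.today().isoformat(), **metadata}

    def fill(match: re.Match) -> str:
        # Unknown placeholders stay in place so the author is still prompted for them.
        return values.get(match.group(1), match.group(0))

    filled = PLACEHOLDER.sub(fill, text)
    dest.write_text(filled, encoding="utf-8")
    # Deduplicate while preserving first-seen order.
    return list(dict.fromkeys(PLACEHOLDER.findall(filled)))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        template = Path(tmp) / "example-template.md"
        template.write_text(
            "Status: [status]\nPrepared for: [client]\nLast updated: [last updated]\n",
            encoding="utf-8",
        )
        missing = init_document(template, Path(tmp) / "example.md", {"status": "Draft"})
        print(missing)  # placeholders the skill would still need to ask about
```

Returning the unfilled placeholders, rather than failing on them, matches the "asks for inputs" step: the caller can loop over the list and prompt for each value.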
Guiding principle (from Uttam, Apr 2026 transcript)
“There shouldn’t be a document that comes out without coming out of some template. I feel like the types of documents we produce are pretty defined. And if they’re not, we should define them.”
Document taxonomy
14 document types organized into 5 categories (A–E). Each entry: frequency, current state, gap, template status.
Category A: Discovery
| # | Document type | Frequency | Current state | Gap | Template |
|---|---|---|---|---|---|
| A1 | Data Source Discovery Memo (with Changelog + Schema Audit appendix + SLA) | Every new source | Has canonical template + init skill (PR #1026). Absorbed A2 (changelog), A4 (schema audit), and SLA/Data Contract into one document. | Changelog section, Schema Audit appendix, and SLA table need to be added to A1 template | 🟡 Upgrade |
| A3 | Executive-style Data Source Memo (one-pager) | Leadership briefs | Has playbook (data-source-memo-executive-style-playbook.md), no template | No canonical template for the compressed format | 🔴 Gap |
Category B: Investigation & quality
| # | Document type | Frequency | Current state | Gap | Template |
|---|---|---|---|---|---|
| B1 | Data Findings Memo (accuracy/quality investigation) | Per issue | Has template (data-findings-memo-template.md, 117 lines) | Missing “About this document” conventions, filename standard, QA checklist, init skill. Must prioritize readability and self-service. | 🟡 Upgrade |
| B2 | RCA Memo (incidents + KPI anomalies) | Per incident / anomaly | Has template (rca-memo-template.md). Consolidated B4 (KPI anomaly) as a subtype within B2 using Type metadata field. | Upgrade needed: conventions, QA checklist, plus expand to cover anomaly investigations | 🟡 Upgrade |
| B3 | Data Quality Assessment Report (periodic health check) | Monthly/quarterly | Nothing | Entirely ad-hoc | 🔴 Gap |
Category C: Technical design & architecture
| # | Document type | Frequency | Current state | Gap | Template |
|---|---|---|---|---|---|
| C1 | Modeling Design Doc (technical model design) | Per project | Nothing canonical | Ad-hoc Google Docs; no consistent format across clients. Vendor-agnostic. | 🔴 Gap |
| C2 | Semantic View Design Doc [Snowflake] | Per view | Has skill (data-improve-semantic-view), no template | Skill has instructions but no standalone fill-in template. Must be client-facing. Lives under vendor/snowflake/. | 🔴 Gap |
| C3 | Warehouse Architecture Assessment (platform recommendation) | Per assessment | Nothing canonical | Mixed formats; e.g., LMNT Warehouse Platform Assessment in Drive follows no repo template | 🔴 Gap |
| C4 | Technical Design Document (TDD) — Data | Per project | Has template (Technical Design Document Template (TDD) Data Template.md) | Obscure path; not in templates registry; needs upgrade to dashboard-spec quality | 🟡 Upgrade |
Category D: Definition & evaluation
| # | Document type | Frequency | Current state | Gap | Template |
|---|---|---|---|---|---|
| D1 | Metrics Definition Document | Per metric | Has template (metrics-definition-doc-template.md, 125 lines) | Missing “About this document” conventions, filename standard, QA checklist, init skill | 🟡 Upgrade |
| D2 | Golden Dataset Spec (evaluation datasets) | Per NL2SQL project | Has skill (data-create-golden-dataset), no template | Skill has question catalog structure but no canonical fill-in template | 🔴 Gap |
Category E: Migration & operations
| # | Document type | Frequency | Current state | Gap | Template |
|---|---|---|---|---|---|
| E1 | ETL Migration Plan (tool-to-tool migration) | Per migration | Nothing canonical | Migration plans differ per client; no template for cutover checklists, rollback plans | 🔴 Gap |
| E2 | Data Warehouse Migration Plan (platform-to-platform) | Per migration | Nothing canonical | Larger scope than ETL — schemas, permissions, historical data, consumers | 🔴 Gap |
| E3 | Data Platform Documentation (Google Sheet + companion markdown) | Every client | Has guides + skills (data-platform-doc). Hub that links to all other Data deliverables. | Sheet-based; needs a companion repo markdown template for the locked-metrics file | 🟡 Upgrade |
Consolidation notes
- A2 (Discovery Memo Update) → merged into A1 as a Changelog section (date-stamped entries for source changes)
- A4 (Schema Audit) → merged into A1 as Appendix C (column-level per-table catalog)
- B4 (KPI Anomaly Detection) → consolidated into B2 (RCA) as a subtype, differentiated by a Type metadata field `[Incident / KPI Anomaly]`
- D2 (Data Contract/SLA) → consolidated into A1 §2 (Source and lineage) and §6 (Joins and caveats). No standalone SLA template.
- E2 (Data SOW) → removed from Data suite. Belongs to Sales/GTM. Referenced only in Related Artifacts tables.
- E1/E2 after consolidation: ETL migration and warehouse migration are distinct enough for separate templates under the `migration/` subfolder.
Gap summary
| Status | Count | Types |
|---|---|---|
| ✅ Done (canonical template + init skill) | 1 | A1 (Discovery Memo; temporarily 🟡 until the Changelog, Schema Audit appendix, and SLA table are added) |
| 🟡 Upgrade needed (has template, needs conventions/QA/skill) | 6 | A1 (partial, also counted under ✅), B1, B2, C4, D1, E3 |
| 🔴 Gap (no template at all) | 9 | A3, B3, C1, C2, C3, D2, E1, E2 (8 document types), plus the E3 companion template |
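Because the summary counts are derived from the category tables, they can be recomputed instead of maintained by hand. The sketch below is one way to do that; the `TEMPLATE_STATUS` mapping is transcribed from the Template column of the tables above, and the function name `summarize` is an assumption for illustration.

```python
# Template status per document type, transcribed from the Template column
# of the category tables in this blueprint.
TEMPLATE_STATUS = {
    "A1": "upgrade", "A3": "gap",
    "B1": "upgrade", "B2": "upgrade", "B3": "gap",
    "C1": "gap", "C2": "gap", "C3": "gap", "C4": "upgrade",
    "D1": "upgrade", "D2": "gap",
    "E1": "gap", "E2": "gap", "E3": "upgrade",
}


def summarize(status_by_type: dict[str, str]) -> dict[str, list[str]]:
    """Group document type IDs by template status, sorted within each group."""
    groups: dict[str, list[str]] = {}
    for doc_id, status in sorted(status_by_type.items()):
        groups.setdefault(status, []).append(doc_id)
    return groups


if __name__ == "__main__":
    for status, ids in summarize(TEMPLATE_STATUS).items():
        print(f"{status}: {len(ids)} ({', '.join(ids)})")
```

Run against the Template column as written, this yields 6 upgrade types and 8 gap types, 14 in total. The E3 companion is an extra template rather than a separate document type, so it is not counted here.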
Template quality principles
Every template in the suite should satisfy these checks. The principles are rooted in three proven traditions: the Minto Pyramid Principle (McKinsey’s framework for structured communication), MBB consulting slide/doc standards (action titles, SCR, MECE), and Stripe/Amazon developer docs (example-first, error docs as top-level content).
- About this document: titling convention, filename pattern, single source of truth declared
- When to use vs when NOT to use: differentiation from adjacent templates
- Document metadata: status, warehouse/platform, prepared for/by, last updated
- Related artifacts table: Data Platform, Linear, legacy, parent docs
- Numbered sections with action-title headings: stable, append-only numbering (new sections are added at the end; existing numbers never change)
- Fill-in-the-blank: `[placeholders]` with example values showing what good looks like
- Dual-audience calibration: executive summary at top, technical detail in dedicated sections
- Tables for structured data: scannable reference, not dense paragraphs
- Agent guardrails appendix: banned phrases, style rules, anti-fluff
- Pre-handoff QA checklist appendix: 8-12 checkboxes to verify before stakeholder delivery
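Several of these principles are mechanical enough to check automatically before review. The following is a minimal sketch of such a check, assuming templates use section titles close to the wording in the list above and Markdown task-list checkboxes for the QA appendix. The `REQUIRED_SECTIONS` strings and the `check_template` function are hypothetical, not an existing tool in the repo.

```python
import re

# Section titles are assumptions derived from the principles above; real
# templates may word their headings differently.
REQUIRED_SECTIONS = (
    "About this document",
    "When to use",
    "Document metadata",
    "Related artifacts",
    "Agent guardrails",
    "Pre-handoff QA checklist",
)
# Markdown task-list items such as "- [ ] item" or "* [x] item".
CHECKBOX = re.compile(r"^[ \t]*[-*] \[[ xX]\]", re.MULTILINE)


def check_template(text: str) -> list[str]:
    """Return human-readable problems; an empty list means all checks passed."""
    lowered = text.lower()
    problems = [
        f"missing section: {name}"
        for name in REQUIRED_SECTIONS
        if name.lower() not in lowered
    ]
    boxes = len(CHECKBOX.findall(text))
    if not 8 <= boxes <= 12:
        problems.append(f"QA checklist has {boxes} checkboxes, expected 8-12")
    return problems


if __name__ == "__main__":
    sample = "\n".join(REQUIRED_SECTIONS) + "\n" + "\n".join(["- [ ] item"] * 8)
    print(check_template(sample))        # no problems reported
    print(check_template("About this document"))
```

The check only covers presence of sections and checkbox count. Principles such as dual-audience calibration or anti-fluff still need a human or agent review.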
Template directory structure
knowledge/delivery/service-lines/data/templates/
├── README.md # registry
├── DATA_DOCUMENTATION_SUITE_BLUEPRINT.md # this file
├── data-source-discovery-memo-template.md # A1 (with changelog + schema appendix + SLA)
├── executive-discovery-memo-template.md # A3
├── data-findings-memo-template.md # B1
├── rca-memo-template.md # B2 (incidents + anomalies)
├── data-quality-assessment-template.md # B3
├── modeling-design-doc-template.md # C1
├── warehouse-architecture-assessment-template.md # C3
├── tdd-data-template.md # C4
├── metrics-definition-doc-template.md # D1
├── golden-dataset-spec-template.md # D2
├── data-platform-locked-metrics-template.md # E3 companion
├── vendor/
│   └── snowflake/
│       └── semantic-view-design-doc-template.md # C2 [Snowflake]
└── migration/
    ├── etl-migration-plan-template.md # E1
    └── warehouse-migration-plan-template.md # E2
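The target layout can double as a progress check while the suite is being built. The sketch below maps each document type ID to its path relative to the templates directory, as listed in the tree above, and reports which files do not exist yet. The names `TEMPLATE_PATHS` and `missing_templates` are assumptions for illustration.

```python
import tempfile
from pathlib import Path

# Paths relative to the templates directory, copied from the tree above.
TEMPLATE_PATHS = {
    "A1": "data-source-discovery-memo-template.md",
    "A3": "executive-discovery-memo-template.md",
    "B1": "data-findings-memo-template.md",
    "B2": "rca-memo-template.md",
    "B3": "data-quality-assessment-template.md",
    "C1": "modeling-design-doc-template.md",
    "C2": "vendor/snowflake/semantic-view-design-doc-template.md",
    "C3": "warehouse-architecture-assessment-template.md",
    "C4": "tdd-data-template.md",
    "D1": "metrics-definition-doc-template.md",
    "D2": "golden-dataset-spec-template.md",
    "E1": "migration/etl-migration-plan-template.md",
    "E2": "migration/warehouse-migration-plan-template.md",
    "E3": "data-platform-locked-metrics-template.md",  # E3 companion
}


def missing_templates(root: Path) -> list[str]:
    """Return the IDs whose template file does not exist under `root`."""
    return [
        doc_id
        for doc_id, rel in TEMPLATE_PATHS.items()
        if not (root / rel).is_file()
    ]


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / TEMPLATE_PATHS["A1"]).write_text("placeholder\n", encoding="utf-8")
        print(missing_templates(root))  # every ID except A1
```

In practice `root` would point at `knowledge/delivery/service-lines/data/templates/` in a checkout of the repo.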
Implementation plan (5 phases, 17 units)
Phase 1 — Foundation (U1–U3)
Blueprint finalization + consolidation, SLA/changelog/schema-audit in discovery memo, full 14-doc README registry rewrite. Gates Phases 2–5.
Phase 2 — Priority gaps (U5, U6, U9)
Modeling Design Doc (C1), Data Quality Assessment (B3), Executive Discovery Memo (A3) — the 3 most frequently produced gap types. Each with template + init skill.
Phase 3 — Upgrade existing (U10–U12)
Data Findings Memo + Metrics Definition Doc — bring to quality bar with init skills. RCA Memo — upgrade + absorb KPI anomaly subtype. Plus legacy Technical Assessment Memo upgrade.
Phase 4 — Technical & contract (U13–U16)
Semantic View (vendor/snowflake/, client-facing), Warehouse Assessment, Golden Dataset, TDD Data upgrade.
Phase 5 — Migration + cross-cutting (U17–U20)
ETL Migration, Warehouse Migration, Data Platform companion, 8 remaining init skills.
Connectivity
Data Platform Documentation (E3) — hub
└─ links to ─┐
├── A1: Discovery Memo (with changelog, schema audit, SLA)
├── A3: Executive Discovery Memo
├── B1: Data Findings Memos
├── B2: RCA Memos (incidents + anomalies)
├── B3: DQ Assessments
├── C1: Modeling Design Docs
├── C2: Semantic View Docs [Snowflake]
├── C3: Warehouse Assessments
├── C4: TDDs
├── D1: Metric Definitions
├── D2: Golden Datasets
├── E1: ETL Migration Plans
└── E2: Warehouse Migration Plans
Related
- Dashboard specification template: the exemplar (`strategy-analytics/templates/dashboard-specification-template.md`)
- Discovery memo template: first Data template following the pattern (`data/templates/data-source-discovery-memo-template.md`)
- Transcript evidence: `2026-04-08_brainforge_dashboard_spec_review`, `2026-04-15_sls_-_playbooks`, `2026-04-29_eom_service_line_review`
- Implementation plan: `docs/plans/2026-05-01-001-feat-data-documentation-suite-plan.md`