Data service line — documentation suite blueprint

Status: Proposal (ideation output)
Prepared by: ce-ideate (grounded in repo templates, Google Drive deliverables, vault transcripts)
Last updated: 2026-05-01


Suite architecture

Every Data service line deliverable should originate from a canonical template stored in this repo. The template is the source of truth; Google Docs are downstream convenience copies. Each template is paired with:

  1. A playbook — agent-executable instructions for filling the template (where data comes from, how to validate)
  2. A scaffold skill: a {doc-type}-init skill that copies the template, sets metadata, and asks for inputs
  3. A QA checklist — embedded in the template (Appendix B pattern) as a pre-handoff gate
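
As a rough illustration of the scaffold step, the sketch below copies a template, fills metadata placeholders, and writes the new document. The function name `init_document`, the bracketed placeholder keys, and the output filename pattern are all hypothetical; the actual init skill may work differently.

```python
from datetime import date
from pathlib import Path


def init_document(template_path: Path, out_dir: Path, doc_type: str,
                  client: str, prepared_by: str) -> Path:
    """Copy a template and fill its metadata placeholders (hypothetical keys)."""
    text = template_path.read_text(encoding="utf-8")
    today = date.today().isoformat()

    # Assumed placeholder names; a real template defines its own.
    metadata = {
        "[client]": client,
        "[prepared-by]": prepared_by,
        "[last-updated]": today,
    }
    for placeholder, value in metadata.items():
        text = text.replace(placeholder, value)

    # Assumed filename pattern: <date>-<client-slug>-<doc-type>.md
    slug = client.lower().replace(" ", "-")
    out_path = out_dir / f"{today}-{slug}-{doc_type}.md"
    out_dir.mkdir(parents=True, exist_ok=True)
    out_path.write_text(text, encoding="utf-8")
    return out_path
```

A real skill would also prompt for any inputs that are still missing before writing the file.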

Guiding principle (from Uttam, Apr 2026 transcript)

“There shouldn’t be a document that comes out without coming out of some template. I feel like the types of documents we produce are pretty defined. And if they’re not, we should define them.”


Document taxonomy

14 document types organized into 5 categories (A–E). Each entry lists frequency, current state, gap, and template status.

Category A: Discovery

| # | Document type | Frequency | Current state | Gap | Template |
| --- | --- | --- | --- | --- | --- |
| A1 | Data Source Discovery Memo (with Changelog + Schema Audit appendix + SLA) | Every new source | Has canonical template + init skill (PR #1026). Absorbed A2 (changelog), A4 (schema audit), and SLA/Data Contract into one document. | Changelog section, Schema Audit appendix, and SLA table need to be added to A1 template | 🟡 Upgrade |
| A3 | Executive-style Data Source Memo (one-pager) | Leadership briefs | Has playbook (data-source-memo-executive-style-playbook.md), no template | No canonical template for the compressed format | 🔴 Gap |

Category B: Investigation & quality

| # | Document type | Frequency | Current state | Gap | Template |
| --- | --- | --- | --- | --- | --- |
| B1 | Data Findings Memo (accuracy/quality investigation) | Per issue | Has template (data-findings-memo-template.md, 117 lines) | Missing “About this document” conventions, filename standard, QA checklist, init skill. Must prioritize readability and self-service. | 🟡 Upgrade |
| B2 | RCA Memo (incidents + KPI anomalies) | Per incident / anomaly | Has template (rca-memo-template.md). Consolidated B4 (KPI anomaly) as a subtype within B2 using Type metadata field. | Upgrade needed: conventions, QA checklist, plus expand to cover anomaly investigations | 🟡 Upgrade |
| B3 | Data Quality Assessment Report (periodic health check) | Monthly/quarterly | Nothing | Entirely ad-hoc | 🔴 Gap |

Category C: Technical design & architecture

| # | Document type | Frequency | Current state | Gap | Template |
| --- | --- | --- | --- | --- | --- |
| C1 | Modeling Design Doc (technical model design) | Per project | Nothing canonical | Ad-hoc Google Docs; no consistent format across clients. Vendor-agnostic. | 🔴 Gap |
| C2 | Semantic View Design Doc [Snowflake] | Per view | Has skill (data-improve-semantic-view), no template | Skill has instructions but no standalone fill-in template. Must be client-facing. Lives under vendor/snowflake/. | 🔴 Gap |
| C3 | Warehouse Architecture Assessment (platform recommendation) | Per assessment | Nothing canonical | Mixed formats; e.g., LMNT Warehouse Platform Assessment in Drive follows no repo template | 🔴 Gap |
| C4 | Technical Design Document (TDD) — Data | Per project | Has template (Technical Design Document Template (TDD) Data Template.md) | Obscure path; not in templates registry; needs upgrade to dashboard-spec quality | 🟡 Upgrade |

Category D: Definition & evaluation

| # | Document type | Frequency | Current state | Gap | Template |
| --- | --- | --- | --- | --- | --- |
| D1 | Metrics Definition Document | Per metric | Has template (metrics-definition-doc-template.md, 125 lines) | Missing “About this document” conventions, filename standard, QA checklist, init skill | 🟡 Upgrade |
| D2 | Golden Dataset Spec (evaluation datasets) | Per NL2SQL project | Has skill (data-create-golden-dataset), no template | Skill has question catalog structure but no canonical fill-in template | 🔴 Gap |

Category E: Migration & operations

| # | Document type | Frequency | Current state | Gap | Template |
| --- | --- | --- | --- | --- | --- |
| E1 | ETL Migration Plan (tool-to-tool migration) | Per migration | Nothing canonical | Migration plans differ per client; no template for cutover checklists, rollback plans | 🔴 Gap |
| E2 | Data Warehouse Migration Plan (platform-to-platform) | Per migration | Nothing canonical | Larger scope than ETL: schemas, permissions, historical data, consumers | 🔴 Gap |
| E3 | Data Platform Documentation (Google Sheet + companion markdown) | Every client | Has guides + skills (data-platform-doc). Hub that links to all other Data deliverables. | Sheet-based; needs a companion repo markdown template for the locked-metrics file | 🟡 Upgrade |

Consolidation notes

  • A2 (Discovery Memo Update) → merged into A1 as a Changelog section (date-stamped entries for source changes)
  • A4 (Schema Audit) → merged into A1 as Appendix C (column-level per-table catalog)
  • B4 (KPI Anomaly Detection) → consolidated into B2 (RCA) as a subtype, differentiated by a Type metadata field [Incident / KPI Anomaly]
  • Former D2 (Data Contract/SLA) → consolidated into A1 §2 (Source and lineage) and §6 (Joins and caveats). No standalone SLA template. The D2 slot now holds the Golden Dataset Spec.
  • Former E2 (Data SOW) → removed from the Data suite. Belongs to Sales/GTM. Referenced only in Related Artifacts tables.
  • E1/E2 in the current numbering: ETL migration and warehouse migration are distinct enough to get separate templates under the migration/ subfolder.

Gap summary

| Status | Count | Types |
| --- | --- | --- |
| Done (canonical template + init skill) | 1 | A1 (Discovery Memo; needs Changelog, Schema appendix, SLA upgrade; 🟡 temporary) |
| 🟡 Upgrade needed (has template, needs conventions/QA/skill) | 5 | B1, B2, C4, D1, E3 (plus A1, partial) |
| 🔴 Gap (no template at all) | 8 | A3, B3, C1, C2, C3, D2, E1, E2 (plus the E3 companion template) |
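
One way to keep this summary in step with the category tables is to derive it from a single status map. The sketch below is illustrative only: the `STATUS` mapping is transcribed by hand from the Template column of the tables above, with A1 treated as done (as the summary row does) even though its taxonomy entry is flagged for an upgrade. The names `STATUS` and `group_by_status` are hypothetical.

```python
# Template status per document ID, transcribed from the category tables.
# Assumption: A1 is counted as "done", following the summary row.
STATUS = {
    "A1": "done", "A3": "gap",
    "B1": "upgrade", "B2": "upgrade", "B3": "gap",
    "C1": "gap", "C2": "gap", "C3": "gap", "C4": "upgrade",
    "D1": "upgrade", "D2": "gap",
    "E1": "gap", "E2": "gap", "E3": "upgrade",
}


def group_by_status(status: dict[str, str]) -> dict[str, list[str]]:
    """Group document IDs by their template status, preserving table order."""
    grouped: dict[str, list[str]] = {}
    for doc_id, state in status.items():
        grouped.setdefault(state, []).append(doc_id)
    return grouped


if __name__ == "__main__":
    for state, ids in group_by_status(STATUS).items():
        print(f"{state}: {len(ids)} ({', '.join(ids)})")
```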

Template quality principles

Every template in the suite should satisfy these checks. The principles are rooted in three proven traditions: the Minto Pyramid Principle (McKinsey’s framework for structured communication), MBB consulting slide/doc standards (action titles, SCR, MECE), and Stripe/Amazon developer docs (example-first, error docs as top-level content).

  1. About this document — titling convention, filename pattern, single source of truth declared
  2. When to use vs when NOT to use — differentiation from adjacent templates
  3. Document metadata — status, warehouse/platform, prepared for/by, last updated
  4. Related artifacts table — Data Platform, Linear, legacy, parent docs
  5. Numbered sections with action-title headings — stable, append-only numbering (existing section numbers never change)
  6. Fill-in-the-blank [placeholders] with example values showing what good looks like
  7. Dual-audience calibration — executive summary at top, technical detail in dedicated sections
  8. Tables for structured data — scannable reference, not dense paragraphs
  9. Agent guardrails appendix — banned phrases, style rules, anti-fluff
  10. Pre-handoff QA checklist appendix — 8-12 checkboxes to verify before stakeholder delivery
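
Several of these checks are mechanical enough to automate. The following is a minimal sketch, assuming templates are markdown files, that the listed section names appear verbatim somewhere in the text, and that QA items use `- [ ]` checkbox syntax. The `REQUIRED_SECTIONS` wording and the `check_template` helper are hypothetical, not an existing tool in the repo.

```python
import re

# Assumed heading text for a subset of the ten principles; real templates
# may word these differently.
REQUIRED_SECTIONS = [
    "About this document",
    "When to use",
    "Document metadata",
    "Related artifacts",
    "Agent guardrails",
    "Pre-handoff QA checklist",
]

# Matches markdown task-list items such as "- [ ] item" or "* [x] item".
CHECKBOX = re.compile(r"^\s*[-*] \[[ xX]\]", re.MULTILINE)


def check_template(text: str) -> list[str]:
    """Return a list of problems; an empty list means every check passed."""
    lowered = text.lower()
    problems = [
        f"missing section: {name}"
        for name in REQUIRED_SECTIONS
        if name.lower() not in lowered
    ]
    # Counts checkboxes across the whole file, not only the QA appendix.
    boxes = len(CHECKBOX.findall(text))
    if not 8 <= boxes <= 12:
        problems.append(f"QA checklist has {boxes} checkboxes, expected 8-12")
    return problems
```

Judgment-based principles such as dual-audience calibration and action-title headings still need a human or agent review; this sketch covers only the structural ones.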

Template directory structure

knowledge/delivery/service-lines/data/templates/
├── README.md                                    # registry
├── DATA_DOCUMENTATION_SUITE_BLUEPRINT.md        # this file
├── data-source-discovery-memo-template.md       # A1 (with changelog + schema appendix + SLA)
├── executive-discovery-memo-template.md         # A3
├── data-findings-memo-template.md               # B1
├── rca-memo-template.md                         # B2 (incidents + anomalies)
├── data-quality-assessment-template.md          # B3
├── modeling-design-doc-template.md              # C1
├── warehouse-architecture-assessment-template.md # C3
├── tdd-data-template.md                         # C4
├── metrics-definition-doc-template.md           # D1
├── golden-dataset-spec-template.md              # D2
├── data-platform-locked-metrics-template.md     # E3 companion
├── vendor/
│   └── snowflake/
│       └── semantic-view-design-doc-template.md # C2 [Snowflake]
└── migration/
    ├── etl-migration-plan-template.md           # E1
    └── warehouse-migration-plan-template.md     # E2
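
Because most of these files do not exist yet, it may help to track which planned templates are still missing. The sketch below transcribes the tree above into a mapping from document ID to relative path and reports the IDs with no file on disk. The names `TEMPLATE_PATHS` and `missing_templates` are hypothetical, and the E3 key refers to the companion template only.

```python
from pathlib import Path

TEMPLATES_ROOT = Path("knowledge/delivery/service-lines/data/templates")

# Document ID -> path relative to TEMPLATES_ROOT, copied from the tree above.
TEMPLATE_PATHS = {
    "A1": "data-source-discovery-memo-template.md",
    "A3": "executive-discovery-memo-template.md",
    "B1": "data-findings-memo-template.md",
    "B2": "rca-memo-template.md",
    "B3": "data-quality-assessment-template.md",
    "C1": "modeling-design-doc-template.md",
    "C2": "vendor/snowflake/semantic-view-design-doc-template.md",
    "C3": "warehouse-architecture-assessment-template.md",
    "C4": "tdd-data-template.md",
    "D1": "metrics-definition-doc-template.md",
    "D2": "golden-dataset-spec-template.md",
    "E1": "migration/etl-migration-plan-template.md",
    "E2": "migration/warehouse-migration-plan-template.md",
    "E3": "data-platform-locked-metrics-template.md",  # companion template
}


def missing_templates(root: Path = TEMPLATES_ROOT) -> list[str]:
    """Return the document IDs whose template file is not present under root."""
    return [
        doc_id
        for doc_id, rel_path in TEMPLATE_PATHS.items()
        if not (root / rel_path).is_file()
    ]
```

Run from the repo root, `missing_templates()` returns the IDs that still lack a file, which could feed the README registry or a CI check.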

Implementation plan (5 phases, 17 units)

Phase 1 — Foundation (U1–U3)

Blueprint finalization + consolidation, SLA/changelog/schema-audit in discovery memo, full 14-doc README registry rewrite. Gates Phases 2–5.

Phase 2 — Priority gaps (U5, U6, U9)

Modeling Design Doc (C1), Data Quality Assessment (B3), Executive Discovery Memo (A3) — the 3 most frequently produced gap types. Each with template + init skill.

Phase 3 — Upgrade existing (U10–U12)

Data Findings Memo + Metrics Definition Doc — bring to quality bar with init skills. RCA Memo — upgrade + absorb KPI anomaly subtype. Plus legacy Technical Assessment Memo upgrade.

Phase 4 — Technical & contract (U13–U16)

Semantic View (vendor/snowflake/, client-facing), Warehouse Assessment, Golden Dataset, TDD Data upgrade.

Phase 5 — Migration + cross-cutting (U17–U20)

ETL Migration, Warehouse Migration, Data Platform companion, 8 remaining init skills.


Connectivity

Data Platform Documentation (E3) — hub
  └─ links to ─┐
               ├── A1: Discovery Memo (with changelog, schema audit, SLA)
               ├── A3: Executive Discovery Memo
               ├── B1: Data Findings Memos
               ├── B2: RCA Memos (incidents + anomalies)
               ├── B3: DQ Assessments
               ├── C1: Modeling Design Docs
               ├── C2: Semantic View Docs [Snowflake]
               ├── C3: Warehouse Assessments
               ├── C4: TDDs
               ├── D1: Metric Definitions
               ├── D2: Golden Datasets
               ├── E1: ETL Migration Plans
               └── E2: Warehouse Migration Plans

References

  • Dashboard specification template — the exemplar (strategy-analytics/templates/dashboard-specification-template.md)
  • Discovery memo template — first Data template following the pattern (data/templates/data-source-discovery-memo-template.md)
  • Transcript evidence: 2026-04-08_brainforge_dashboard_spec_review, 2026-04-15_sls_-_playbooks, 2026-04-29_eom_service_line_review
  • Implementation plan: docs/plans/2026-05-01-001-feat-data-documentation-suite-plan.md