Data Platform Projects Technical Design Document Template (TDD)

0. Document Control

Version | Date | Author | Reviewer(s) | Linear Project | Status
01 | 2025-07-23 | Caio | | |

Artifact | Link | Owner | Purpose in this TDD
Data Platform Documentation | Here | | Canonical platform context, current-state diagrams, data contracts
Project Charter | | | Business goals, scope, success metrics, stakeholders
Project Management Plan | | | Timeline, resourcing, risks, comms/rituals
Linear Project / Team | | | Ticket breakdown, execution tracking, status

1. Stakeholders

Name | Department

2. Overview

  • Purpose – why does this project exist?
  • Scope & Goals – what’s in / out, success metrics.
  • Stakeholders & Roles – engineering, product, client, ops.
  • Context diagram – one-slide “where this fits” view.

3. Problem Statement & Requirements

  • Business Problem / User Stories – narrative of current pain; tie to measurable outcomes.
  • Analytics Use Cases – questions dashboards/AI agents must answer; link to PRDs/OKRs.
  • Functional Requirements – e.g., ingest events, model customer identity, expose metrics API.
  • Non-Functional Requirements (NFRs) – latency/throughput, security, compliance, observability, cost constraints, SLAs.

4. Current State Assessment

  • Data Sources & Ingestion – existing ETL/ELT (Fivetran, Airbyte, custom), frequencies, volumes.
  • Warehouses / Lakes – current tech (Snowflake, BigQuery, Redshift, Databricks, etc.), schemas, pain points.
  • Modeling Layer – dbt project structure, test coverage, naming conventions.
  • BI / Activation Layer – tools (Looker, Power BI, Tableau, Hightouch, Segment, Rudderstack, mParticle, etc.) and gaps.
  • Data Quality & Governance – checks, lineage, ownership, documentation gaps.
  • Known Constraints / Debt – legacy models, brittle pipelines, permissions issues.
  • [ARCHITECTURE DIAGRAM]

5. Research & Discovery

Goal: expose all due diligence so reviewers trust the design decisions.

Link | Type (spike, RFC, benchmark, article) | Key takeaway
  • Competitive / prior-art summary
  • Assumptions & constraints
  • Open questions

6. Target Architecture & Design

  • High-level Architecture Diagram – ingestion → storage → modeling → serving/activation layers.
  • Component Breakdown – purpose, inputs/outputs, dependencies (ETL jobs, dbt models, marts, APIs).
  • Data Model – ERD / star schemas / semantic layer definitions (link to dbt docs).
  • Sequence / Activity Diagrams – critical flows (identity stitching, metric computation).
  • Interfaces & Contracts – schemas (JSON, Avro), API specs (OpenAPI/GraphQL), file formats/locations.
  • Deployment View – environments, IaC repos, CI/CD for dbt/ETL, feature flags.
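The interface-and-contract bullets above can be backed by an executable check. The sketch below validates records against a hypothetical `orders` event contract; the field names and types are illustrative placeholders, not a real source system's schema.

```python
# Minimal data-contract check: verify required fields and types per record.
# The ORDERS_CONTRACT fields are illustrative, not from a real source system.
ORDERS_CONTRACT = {
    "order_id": str,
    "customer_id": str,
    "amount_usd": float,
    "created_at": str,  # ISO-8601 timestamp, validated downstream
}

def violations(record: dict, contract: dict) -> list[str]:
    """Return a list of human-readable contract violations for one record."""
    errors = []
    for field, expected_type in contract.items():
        if field not in record:
            errors.append(f"missing field: {field}")
        elif not isinstance(record[field], expected_type):
            errors.append(f"{field}: expected {expected_type.__name__}, "
                          f"got {type(record[field]).__name__}")
    return errors

good = {"order_id": "o1", "customer_id": "c1", "amount_usd": 9.99,
        "created_at": "2025-07-23T00:00:00Z"}
bad = {"order_id": "o2", "amount_usd": "9.99",
       "created_at": "2025-07-23T00:00:00Z"}

print(violations(good, ORDERS_CONTRACT))  # []
print(violations(bad, ORDERS_CONTRACT))   # missing customer_id, wrong type
```

The same rules can be enforced in CI or at ingestion time; a schema registry or dbt contracts can replace this hand-rolled check once tooling is selected.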

A. Key Questions & Answers (KQA) Matrix

Business Question | Metric(s) / Output | Needed Grain & Dimensions | Data Path (Src → Model → Surface) | Freshness / Latency SLA | Accuracy Rule(s) & Test(s)

7. Data Modeling Strategy

  • Modeling Paradigm – dimensional (Kimball), Data Vault, medallion/lakehouse, semantic layer, etc.
  • Naming & Folder Conventions – e.g., staging/intermediate/marts in dbt.
  • Metric Definitions – canonical list tied to business goals.
Metric | Business Definition | Formula / SQL Logic | Grain | Source of Truth Table
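The canonical metric list can also live in code so every surface reads the same definition. This is an illustrative sketch: the metric names, formulas, and table names are placeholders, not real project definitions.

```python
# Canonical metric registry sketch. Metric names, formulas, and source tables
# are hypothetical placeholders; replace with the project's real definitions.
METRICS = {
    "net_revenue": {
        "definition": "Gross revenue minus refunds",
        "sql": "SUM(amount_usd) - SUM(refund_usd)",
        "grain": "order",
        "source_of_truth": "marts.fct_orders",
    },
    "active_customers": {
        "definition": "Distinct customers with at least one order in period",
        "sql": "COUNT(DISTINCT customer_id)",
        "grain": "day",
        "source_of_truth": "marts.fct_orders",
    },
}

def metric_sql(name: str) -> str:
    """Render a SELECT for one canonical metric against its source of truth."""
    m = METRICS[name]
    return f"SELECT {m['sql']} AS {name} FROM {m['source_of_truth']}"

print(metric_sql("net_revenue"))
```

A semantic layer (MetricFlow, LookML) would supersede this registry; the point is one definition per metric, referenced everywhere.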

8. Tooling & Technology Decisions Matrix

Evaluate and justify tools for:

  • Ingestion / ETL
  • Vendor lock-in (assess as a criterion for each domain)
  • Transformation / Modeling (dbt, Python notebooks)
  • Storage / Warehouse / Lakehouse
  • BI / Visualization
  • CDP / Activation / Reverse ETL
  • Orchestration & Scheduling
  • Observability & Cost Monitoring
Domain | Option(s) Considered | Selected | Criteria / Score | Notes

9. Decision Log & Rationale (ADR Index)

Record each significant trade-off as an Architecture Decision Record (ADR). Link them here or inline.

# | Choice | Options Considered | Decision | Rationale | Impact / Risks

10. Implementation & Migration Plan

  • Milestones / phases (POC → pilot → full migration)
  • Workstream ownership
  • Linear linkage to Project
  • Migration / cutover strategy – data backfill, parallel run, validation windows, rollback plans
  • Decommission plan – what legacy assets retire and when

11. Testing, Data Quality & Validation

  • Test matrix – unit (dbt tests), integration (pipeline), end-to-end (dashboards), load/perf, security
  • Data quality rules – null checks, schema drift, freshness, anomalies (define monitors/alerts)
  • Validation plan – sampling queries, reconciliations vs. legacy numbers
  • Observability – logs, metrics, lineage tools (OpenLineage, Monte Carlo, Databand, etc.)
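The freshness and null-check rules above can be prototyped in a few lines before they become dbt tests or monitor configs. This sketch assumes illustrative thresholds and column names.

```python
from datetime import datetime, timedelta, timezone

# Data-quality monitor sketch: freshness and null-rate rules like those listed
# above. Thresholds and column names are illustrative assumptions.
def freshness_ok(latest_loaded_at: datetime, max_lag: timedelta,
                 now: datetime) -> bool:
    """True if the newest row is within the allowed lag; otherwise alert."""
    return now - latest_loaded_at <= max_lag

def null_rate(rows: list[dict], column: str) -> float:
    """Fraction of rows where `column` is missing or None."""
    if not rows:
        return 0.0
    nulls = sum(1 for r in rows if r.get(column) is None)
    return nulls / len(rows)

now = datetime(2025, 7, 23, 12, 0, tzinfo=timezone.utc)
assert freshness_ok(now - timedelta(minutes=30), timedelta(hours=1), now)

rows = [{"email": "a@x.com"}, {"email": None}, {}, {"email": "b@x.com"}]
print(null_rate(rows, "email"))  # 0.5
```

In production these rules map directly to dbt `not_null` tests and source-freshness checks, with alerts routed per the observability tooling chosen in Section 8.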


B. Before / After Impact Scorecard

Dimension | Current State | Target State | % / Δ Improvement | Evidence / Benchmark Plan
Time to answer “X?” | 2 days manual SQL | <15 min self-serve | 90% faster | Query log baseline + post-rollout tracking
Accuracy of metric Y | ±5% variance vs. finance | <±0.5% | 10× better | Reconciliation scripts, dbt tests
Data freshness | Daily | Hourly | 24× | Pipeline SLA

15. Operational Considerations

  • Runbooks / on-call playbooks
  • SLA/SLO targets
  • Cost estimates & governors (if cloud usage matters)

16. Review Meetings, RFC & Sign-offs

  • Design Review Meeting – date/time, attendees, key outcomes
  • RFC Process – distribution list, comment window, approval criteria (link to RFC doc/thread)
  • Sign-off Checklist – required approvals (Architecture Lead, Data Lead, Security, Product)

17. Meeting Notes

Paste minutes or link to recordings for every design review.

Action-item table drives accountability.

18. Agent Handoff Brief

Purpose: Condense the TDD into agent-ready context for AI coding agents (Cursor Plan Mode, Codex, etc.). Fill this out after the design is approved. For each implementation workstream or milestone, create one of these briefs. You can paste just this section as agent context, or pair it with the relevant architecture section.

Codebase Orientation

  • Repo(s): <repo name and path>
  • Key files / entry points:
    • path/to/relevant/file – what it does
    • path/to/another/file – what it does
  • Reference implementations: Point to existing features or patterns in the codebase that this work should follow.

Scope for This Workstream

What specifically is being built in this phase. Reference the milestone from the Implementation Plan.

Implementation Sequence

Ordered list of what to build, in what order. Each step should be independently testable.

Acceptance Criteria

Concrete, verifiable statements. An agent (or a human) should be able to check each one off.

Key Decisions to Carry Forward

Pull the most important decisions from the ADR Index and Tooling Matrix that the agent needs to respect. Don’t make the agent read the whole TDD — surface the decisions that matter for implementation.

Constraints and Guardrails

  • Don’t touch: files, services, or areas that are out of scope for this workstream
  • Must use: specific libraries, frameworks, patterns, or conventions required
  • Must not: hard constraints (e.g., don’t modify production schemas without migration, don’t change shared dbt macros)
  • Style / conventions: naming patterns, file organization, dbt/SQL style to follow
  • Data-specific: warehouse to target, schema/dataset naming, test coverage requirements

19. Appendices

  • Glossary
  • Reference material
  • Change log

Appendix A — Section 6 Variants (Target Architecture & Design)

Replace your standard Section 6 with one of the variants below, whichever best matches the project type.

6A. Data Warehouse / Lakehouse Migration or Re-Architecture

  • Context & Drivers: Current platform, pain points (cost, performance, governance).
  • Target Stack Diagram: Source → Ingestion → Staging → Modeling → Serving/BI.
  • Environment Strategy: Dev / QA / Prod, branch strategy, CI/CD for dbt/ETL.
  • Migration Path: Parallel run vs. big bang, data backfill plan, cutover criteria.
  • Workload Segmentation: What runs where (compute clusters, warehouses, job queues).
  • Cost/Performance Guardrails: Warehouse sizing, auto-suspend rules, caching strategy.
  • Decommission Plan: Legacy assets, timelines, owners.

6B. CDP / Activation (Reverse ETL) Implementation

  • Identity & Audience Architecture: Identity graph, stitching logic, primary keys.
  • Event & Profile Schemas: Required attributes, PII handling, consent flags.
  • Activation Flows: Data paths from warehouse → CDP → downstream tools (email, ads).
  • Segmentations & Audiences: Definition storage (dbt, CDP UI), refresh cadence.
  • Governance & Opt-Out: Compliance hooks, audit logs, data retention.
  • Performance & Sync SLAs: Latency windows, retry/backoff patterns.
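The retry/backoff pattern noted above for activation syncs is commonly implemented as exponential backoff with jitter. This is a sketch with illustrative defaults (1 s base, 60 s cap), not a prescribed configuration.

```python
import random

# Exponential backoff with full jitter for activation syncs. The base delay,
# cap, and attempt count are illustrative defaults, not requirements.
def backoff_schedule(attempts: int, base: float = 1.0, cap: float = 60.0,
                     seed: int = 0) -> list[float]:
    """Return per-attempt sleep times: uniform(0, min(cap, base * 2^n))."""
    rng = random.Random(seed)  # seeded here only so the sketch is repeatable
    delays = []
    for attempt in range(attempts):
        ceiling = min(cap, base * (2 ** attempt))
        delays.append(rng.uniform(0, ceiling))  # full jitter
    return delays

print(backoff_schedule(5))
```

Jitter matters because many failed syncs retrying in lockstep can re-overload the destination; randomizing within the growing ceiling spreads the load.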

6C. dbt-Centric Modeling & Semantic Layer Build

  • Project Structure: Staging / intermediate / marts folders, packages, macros.
  • Model Dependency Graph: Key lineage blocks; link to dbt docs site.
  • CI/CD & Testing: Gate merges on dbt test success, code owners.
  • Semantic Layer / Metrics: Tooling (MetricFlow, dbt metrics, LookML), how metrics are exposed.
  • Deployment & Scheduling: Orchestrator, run ordering, recovery on failure.

6D. ETL/ELT Pipeline (New or Redesign)

  • Ingestion Patterns: Batch vs. streaming, connectors (Fivetran, Airbyte, custom).
  • Transformation Strategy: Where transforms occur (warehouse vs. Spark vs. Python).
  • Error Handling / Replay: Dead-letter queues, idempotency, late-arriving data.
  • Data Contracts & Schemas: Versioning, schema drift detection, contract enforcement.
  • Observability: Metrics (freshness, row counts), alert routing, lineage capture.
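Idempotency and late-arriving data, called out above, usually reduce to a keyed merge where the newest version wins, so replays are no-ops. The record shape (`id`, `updated_at`) is illustrative.

```python
# Idempotent merge sketch for late-arriving data: re-delivered or out-of-order
# records converge to one row per key, keeping the latest update time.
# Field names (`id`, `updated_at`) are illustrative.
def merge(target: dict, batch: list[dict]) -> dict:
    """Upsert batch rows into target keyed by `id`; newest `updated_at` wins."""
    for row in batch:
        current = target.get(row["id"])
        if current is None or row["updated_at"] >= current["updated_at"]:
            target[row["id"]] = row
    return target

target = {}
batch = [
    {"id": "a", "updated_at": 2, "status": "shipped"},
    {"id": "a", "updated_at": 1, "status": "pending"},  # late arrival, ignored
    {"id": "b", "updated_at": 5, "status": "pending"},
]
merge(target, batch)
merge(target, batch)  # replaying the same batch leaves the state unchanged
print(target["a"]["status"])  # shipped
```

In a warehouse this is a `MERGE` statement or dbt incremental model with a unique key; the property to preserve is that reprocessing a batch never changes the result.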

6E. BI / Visualization Tool Rollout or Migration

  • Semantic Layer Strategy: Central model vs. tool-native models.
  • Dashboard & Report Taxonomy: Core dashboards, ownership, refresh SLAs.
  • Access & Governance: Roles, row-level security, certified vs. ad-hoc content.
  • Performance Optimization: Extracts, caching, aggregate tables, query governance.
  • Change Management & Enablement: Training plan, office hours, documentation hub.

6F. Large Technical Migration / Platform Consolidation

  • Scope of Migration: Systems/processes impacted; what stays vs. moves.
  • Interim Architecture: Transitional states, feature flags, phased cutovers.
  • Data Parity Strategy: Reconciliation processes, golden datasets, sign-off criteria.
  • Risk Matrix: Technical/operational risks, mitigations, contingency plans.
  • Sunset Timeline: Dependencies, communication plan, audit requirements.

6.X Question-Driven Design Lens

  • Primary Questions Enabled: (reference KQA matrix rows this section covers)
  • Speed/Accuracy Impact: What improves and how it’s measured (link to Scorecard).
  • Failure Mode on Questions: If this component fails/degrades, which questions break and what’s the fallback?

6.Y Question-to-Component Mapping

Component / Service | Questions Unlocked | Data / Metric Dependencies | SLA / Perf Target | Owner

Appendix B — Section 7 Variants (Data Modeling Strategy)

Pick the paradigm that matches the project. You can also combine (e.g., Kimball marts on top of a Data Vault).

7A. Dimensional (Kimball) / Star Schema Refresh

  • Design Principles: Conformed dimensions, slowly changing dimensions (SCD type selection).
  • Grain Definition: For each fact table (event, order, session), specify the grain explicitly.
  • Dim & Fact Mapping Table:
Fact/Dim | Business Purpose | Grain | Keys | Source Tables | Notes
  • SCD Handling: Type 1/2 logic, change capture sources.
  • Metric Layer Alignment: How facts feed canonical metrics.
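The Type 2 logic above (close the current row, open a new one) can be sketched compactly. Column names (`valid_from`, `valid_to`, `tier`) are illustrative; `valid_to = None` marks the current version.

```python
from datetime import date

# SCD Type 2 sketch: on an attribute change, close the current row and open a
# new one. `valid_to=None` marks the current row. Column names are illustrative.
def apply_scd2(history: list[dict], key: str, new_attrs: dict,
               effective: date) -> list[dict]:
    """Append-only dimension update; returns the updated history list."""
    for row in history:
        if row["key"] == key and row["valid_to"] is None:
            if all(row.get(k) == v for k, v in new_attrs.items()):
                return history  # no attribute change -> no new version
            row["valid_to"] = effective  # close the current version
            break
    history.append({"key": key, **new_attrs,
                    "valid_from": effective, "valid_to": None})
    return history

hist = []
apply_scd2(hist, "c1", {"tier": "free"}, date(2025, 1, 1))
apply_scd2(hist, "c1", {"tier": "pro"}, date(2025, 6, 1))
current = [r for r in hist if r["valid_to"] is None]
print(current)  # one open row, tier 'pro'
```

dbt snapshots implement the same behavior declaratively; this sketch just makes the close/open mechanics explicit for review.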

7D. Metrics Layer / Semantic Model First

  • Canonical Metrics Inventory: Table of metric name, owner, formula, grain, dimensions.
Metric | Owner | Formula | Grain | Dimensions | Source | Test(s)
  • Tooling Choice & Governance: dbt metrics, Transform, MetricFlow, LookML, etc.
  • Change Control: Versioning metrics, review process, backfill impact.
  • Exposure Mechanisms: SQL, APIs, BI tools, AI agents.

7E. Identity Graph / Customer 360 (CDP/Data Activation)

  • Entity Definitions: Person, account, device, session; how they relate.
  • Resolution Rules: Deterministic vs. probabilistic matching, priority of identifiers.
  • Golden Record Strategy: Where the truth lives, refresh cadence, conflict resolution.
  • Privacy Bucketing & Consent Flags: Data classification embedded in the model.
  • Downstream Contract: What fields are required by activation tools, update SLAs.

7F. Event-Centric Product Analytics Model

  • Event Taxonomy: Naming, required properties, common dimensions (user_id, session_id).
  • Sessionization & Attribution Logic: Windowing rules, referrers, campaign joins.
  • Aggregations & Rollups: Daily active users, retention cohorts, funnels.
  • Schema Evolution Plan: Adding properties, deprecating events, versioning.
  • Testing & QA: Event coverage, volume anomalies, property null rates.
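The windowing rule behind sessionization can be sketched directly. The 30-minute inactivity window below is a common default, not a requirement; pick the window that matches the product.

```python
from datetime import datetime, timedelta

# Sessionization sketch using a 30-minute inactivity window (a common default;
# tune per product). Events must be sorted by timestamp.
def sessionize(timestamps: list[datetime],
               gap: timedelta = timedelta(minutes=30)) -> list[int]:
    """Assign a session index to each event: a new session starts whenever
    the gap since the previous event exceeds the inactivity window."""
    sessions = []
    session_id = 0
    for i, ts in enumerate(timestamps):
        if i > 0 and ts - timestamps[i - 1] > gap:
            session_id += 1
        sessions.append(session_id)
    return sessions

t0 = datetime(2025, 7, 23, 9, 0)
events = [t0, t0 + timedelta(minutes=10), t0 + timedelta(hours=2),
          t0 + timedelta(hours=2, minutes=5)]
print(sessionize(events))  # [0, 0, 1, 1]
```

The warehouse equivalent is a window function (`LAG` over timestamp, cumulative sum of gap flags); testing both against the same fixture events catches drift between definitions.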

Appendix C — Section 6 Variants (Target Architecture & Design for AI Projects)

Pick one of the variants below.

6A. Retrieval-Augmented Generation (RAG) Pipeline

  • Context & Drivers – why pure LLM won’t meet accuracy/compliance; need grounded answers.
  • High-Level Diagram – Query → Retriever → Ranker → LLM (w/ context) → Post-processor.
  • Knowledge Store – vector DB choice, chunking strategy, embeddings model, update cadence.
  • Latency Budget – e.g., 300 ms retrieval, 700 ms generation (90-percentile).
  • Grounding & Citations – how sources are surfaced, confidence scoring, fallback if < threshold.
  • Failure Modes on Questions – stale index, long-tail queries, retrieval miss; mitigation.
  • Eval Hooks – automated factuality/hallucination tests every nightly index build.
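The "fallback if < threshold" behavior above is the key grounding decision: answer only when retrieval confidence clears a bar. This sketch assumes illustrative similarity scores and threshold values.

```python
# Grounding-threshold sketch: use retrieved context only when the best
# similarity score clears a confidence bar, otherwise signal the fallback
# path. Scores, chunk texts, and the 0.75 threshold are illustrative.
def select_context(scored_chunks: list[tuple[str, float]], k: int = 3,
                   threshold: float = 0.75):
    """Return top-k chunk texts if the best score clears the threshold,
    else None (caller routes to the fallback response)."""
    ranked = sorted(scored_chunks, key=lambda c: c[1], reverse=True)
    if not ranked or ranked[0][1] < threshold:
        return None
    return [text for text, _ in ranked[:k]]

chunks = [("refund policy...", 0.91), ("pricing tiers...", 0.64),
          ("sso setup...", 0.80), ("release notes...", 0.40)]
print(select_context(chunks))                   # top-3 grounded context
print(select_context(chunks, threshold=0.95))   # None -> fallback path
```

The threshold itself should come out of the eval harness (recall vs. hallucination trade-off), not be hand-picked.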

6B. Agentic System with Tool Invocation

  • Agent Loop Diagram – Planner → Tool-calling → Memory Update → Critic → Next Action.
  • Tool Registry – JSON schema of actions (name, args, guardrails, auth scope).
  • Memory & Long-Term Context – vector store vs. relational store; eviction strategy.
  • Orchestration Runtime – LangChain, LlamaIndex, custom; concurrency, timeout rules.
  • Safety Layer – output filtering, rate limiting, red-teaming hooks.
  • Question-to-Component Map – which tools answer which business questions.
  • Observability – token/sec, cost per call, success/rollback metrics.
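The tool registry above (name, args, guardrails, auth scope) can be validated before any call executes. The tool names and argument specs below are hypothetical examples, not a real registry.

```python
# Tool-registry sketch matching the "JSON schema of actions" bullet above.
# Tool names, argument specs, and auth scopes are hypothetical examples.
REGISTRY = {
    "lookup_order": {
        "description": "Fetch one order by id",
        "args": {"order_id": str},
        "auth_scope": "orders:read",
    },
    "refund_order": {
        "description": "Issue a refund (guarded action)",
        "args": {"order_id": str, "amount_usd": float},
        "auth_scope": "orders:write",
    },
}

def validate_call(tool: str, args: dict) -> list[str]:
    """Check a proposed tool call against the registry before execution."""
    if tool not in REGISTRY:
        return [f"unknown tool: {tool}"]
    spec = REGISTRY[tool]["args"]
    errors = [f"missing arg: {a}" for a in spec if a not in args]
    errors += [f"unexpected arg: {a}" for a in args if a not in spec]
    errors += [f"bad type for {a}" for a, t in spec.items()
               if a in args and not isinstance(args[a], t)]
    return errors

print(validate_call("refund_order", {"order_id": "o1", "amount_usd": 5.0}))  # []
print(validate_call("refund_order", {"order_id": "o1"}))  # missing amount_usd
```

Rejecting malformed calls at this layer, before auth and execution, keeps guardrail logic in one auditable place.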

6C. Embedded AI Copilot in SaaS Product

  • User Journey Diagram – entry point, context capture, backend calls, UI surfacing.
  • Context Window Assembly – telemetry, recent actions, role/permissions.
  • Personalization Logic – per-user embeddings, on-device vs. server storage.
  • Real-Time Constraints – ≤ 1 s perceived latency; streaming vs. single shot.
  • Fallback UX – graceful degradation when model confidence low.
  • Telemetry for Product Questions – how “Time-to-Answer” & adoption are logged.

6D. Fine-Tuned / Custom LLM Service

  • Base Model Selection – criteria (license, capability, cost).
  • Fine-Tuning Dataset Flow – collection, filtering, dedup, weighting, holdout split.
  • Training Infra – GPUs, parameter-efficient tuning (LoRA, QLoRA), MLOps pipeline.
  • Serving Stack – quantization, tensor parallelism, autoscaling triggers.
  • Versioning & Rollback – traffic shadowing, canary %, safety nets.
  • Evaluation Harness – regression suite on KQA, toxicity, bias, latency, cost.

6E. Multi-Modal AI Assistant

  • Input Modalities – text, image, voice; pre-processing pipelines.
  • Fusion Strategy – encoder sharing, late fusion, routing logic.
  • Real-Time Transcription / OCR – latency and cost gates.
  • Output Rendering – rich snippets, highlighted source images.
  • Accessibility Compliance – captions, alt-text generation.
  • Question Failure Map – how missing modality affects specific questions.

Appendix D — Section 7 Variants (Model & Knowledge Strategy for AI Projects)

Pick one of the variants below.

7A. RAG Knowledge-Base Design

  • Corpus Scope & Owners – docs, tickets, wikis, db records; TTL rules.
  • Chunking & Embeddings Strategy – overlap, window size, model choice, re-embed frequency.
  • Metadata Schema – source, confidence, permissions tag.
  • Index Update Workflow – CDC, webhook triggers, nightly rebuild.
  • Evaluation – recall@k, answer faithfulness, citation coverage.
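The recall@k metric above is worth pinning down precisely: the share of relevant documents that appear in the top-k retrieved results. Document ids here are illustrative.

```python
# recall@k sketch for the evaluation bullet above: fraction of relevant
# documents that appear among the top-k retrieved results.
def recall_at_k(retrieved: list[str], relevant: set[str], k: int) -> float:
    """Size of (relevant intersect top-k retrieved) divided by size of relevant."""
    if not relevant:
        return 0.0
    hits = len(set(retrieved[:k]) & relevant)
    return hits / len(relevant)

retrieved = ["d3", "d1", "d7", "d2", "d9"]
relevant = {"d1", "d2", "d5"}
print(recall_at_k(retrieved, relevant, 3))  # 0.333... (only d1 in top 3)
print(recall_at_k(retrieved, relevant, 5))  # 0.666... (d1 and d2 found)
```

Tracking recall@k per index rebuild catches regressions from chunking or embedding-model changes before they surface as answer-quality complaints.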

7B. Prompt & System-Message Library

  • Prompt Catalog – table of prompt ID, purpose, target model, guardrails.
  • Parameterization Rules – slots, variable escape/escaping, defaults.
  • Versioning & A/B Testing – traffic split strategy, success metrics.
  • Observability – prompt-level latency, token usage, cost.

7C. Tool / Action Schema for Agents

  • JSON/YAML Contract – name, description, args, required auth, rate limits.
  • Registration & Discovery – dynamic loading vs. static registry.
  • Safety Scopes – sandboxing, dry-run, audit logging.
  • Mapping to Business Questions – which tool enables which KQA row.

7D. Evaluation & Feedback Dataset Design

  • Golden Dataset – representative user questions, expected answers, edge cases.
  • Scoring Rubrics – factuality, helpfulness, brevity, style.
  • Automated Graders – LLM-as-judge vs. deterministic scripts.
  • Human-in-the-Loop Pipeline – sampling %, UI for raters, adjudication.
  • Continuous Learning Loop – when data promotes to fine-tuning set.

7E. Vector-Store & Embedding Governance

  • Embedding Model Lifecycle – upgrade cycle, backfill policy.
  • Similarity Metrics – cosine vs. IP vs. dot-product; threshold tuning.
  • Namespace & ACL Strategy – per-tenant isolation, encryption at rest.
  • Cost Controls – shard sizing, cold-storage tiers, deletion hooks.
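For the similarity-metric bullet above, one useful fact for threshold tuning: cosine similarity is the dot product divided by vector magnitudes, so on unit-normalized embeddings the two metrics agree exactly. The vectors below are toy values for illustration.

```python
import math

# Similarity-metric sketch: cosine similarity equals the dot product once
# vectors are unit-normalized, so both metrics rank results identically on
# normalized embeddings. Vectors are toy examples.
def dot(a: list[float], b: list[float]) -> float:
    return sum(x * y for x, y in zip(a, b))

def cosine(a: list[float], b: list[float]) -> float:
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))

def unit(v: list[float]) -> list[float]:
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v]

a, b = [3.0, 4.0], [4.0, 3.0]
print(cosine(a, b))           # 24 / 25 = 0.96
print(dot(unit(a), unit(b)))  # same 0.96 once vectors are normalized
```

Practical consequence: if the embedding model emits normalized vectors, a cheaper dot-product index gives the same ranking as cosine, and thresholds transfer between the two.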

7F. Fine-Tuning / RLHF Dataset Curation

  • Source Mix – customer chats, docs, synthetic Q&A.
  • Filtering – PII removal, toxicity filter, language detection.
  • Label Schema – preference pairs, scorecards, multi-choice.
  • Data Weighting & Sampling – boost rare intents, down-weight low quality.
  • Ethics & Bias Review – checklists, external audit steps.

Key Questions & Answers (KQA) Matrix & Impact Scorecard

(Mandatory in every variant – paste near the top of Section 6 or 7)

Business Question | Metric / Output | Latency SLA | Accuracy Target | Component Path | Eval Test

Dimension | Current | Target | Δ Improvement | Evidence Plan
Time-to-Answer | | | |
Answer Accuracy | | | |
Cost / 1k Tokens | | | |