[Client] — Golden dataset spec — [Domain / scope]
About this document (Brainforge)
Internal conventions for how this file works in the repo. Strip this section (or export the document without it) before sharing with a client.
Titling and filename
Use [Client] — Golden dataset spec — [Domain or use case] for the document title. Examples: LMNT — Golden dataset spec — Omnichannel revenue · Acme — Golden dataset spec — Wholesale order-to-cash.
Filename: {client}-golden-dataset-{domain}.md under knowledge/clients/{client}/resources/.
When to use this template
Use this when building an evaluation dataset for AI-powered natural-language querying (e.g., Cortex Analyst, custom NL2SQL). The golden dataset defines a set of questions with verified correct answers that serve as the acceptance test for the semantic layer.
Do not use this template when:
- designing the semantic view itself (use the Semantic View Design Doc)
- profiling a new data source (use the Discovery Memo)
Document metadata
Status: [Draft / In review / Approved / Locked]
Warehouse: [platform] — Account/region: [details]
Semantic view: [view name or path to C2 doc]
Version: [1.0 / increment when questions are added or answers change]
Prepared by: Brainforge
Last updated: [YYYY-MM-DD]
Related artifacts
| Artifact | Link / path | Notes |
|---|---|---|
| Semantic View Design Doc | [path to C2 doc] | The semantic view this dataset tests |
| Discovery Memo | [path to A1 memo] | Source profiling reference |
| Data Platform Documentation | [Google Sheet link] | Source catalog, metric definitions |
1. Dataset purpose
[2–4 sentences. What questions should this golden dataset cover? What use cases or user personas does it represent? What makes a passing vs. failing result?]
2. Question catalog
Each row is one test case. Placeholder values are shown; replace with actual questions and answers.
| # | Natural language question | Expected SQL logic | Expected result type | Result value | Tolerance | Status |
|---|---|---|---|---|---|---|
| 1 | [e.g., "What was total revenue last month?"] | SUM(revenue) WHERE month = CURRENT_MONTH | [Scalar / Row / Table] | [$X] | [±% or exact] | [Active] |
| 2 | [e.g., "Which product had the most growth?"] | TOP 1 product ORDER BY growth DESC | [Row] | [Product name] | [exact] | [Active] |
| 3 | [e.g., "Show me revenue by state for last quarter"] | SELECT state, SUM(revenue) WHERE quarter = PREVIOUS | [Table] | [State: X, Revenue: Y; ...] | [±5%] | [Active] |
| 4 | [Edge case: "What was revenue last month?" when table is empty] | — | [Scalar] | [null or 0] | [exact] | [Active] |
| 5 | [Edge case: date range with no data] | — | [Scalar] | [0 or empty] | [exact] | [Active] |
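How a catalog row translates into a runnable check is left to the runner, but a minimal sketch helps pin down the tolerance semantics. The sketch below assumes a hypothetical `GoldenQuestion` dataclass and a percentage tolerance applied to scalar answers; the field names mirror the catalog columns above and are not a required schema:

```python
from dataclasses import dataclass

@dataclass
class GoldenQuestion:
    """One row of the question catalog (hypothetical schema)."""
    id: int
    question: str           # natural-language question as the user would phrase it
    expected_sql: str       # directional SQL logic, not necessarily executable
    result_type: str        # "scalar", "row", or "table"
    expected_value: float   # verified answer pulled from the warehouse
    tolerance_pct: float    # 0.0 means exact match
    status: str = "active"

def scalar_passes(q: GoldenQuestion, actual: float) -> bool:
    """Check a scalar NLQ answer against the verified value, within tolerance."""
    if q.tolerance_pct == 0.0:
        return actual == q.expected_value
    allowed = abs(q.expected_value) * q.tolerance_pct / 100.0
    return abs(actual - q.expected_value) <= allowed
```

For row and table results, the comparison would key on the identifying columns (e.g., product name or state) before applying the tolerance to the numeric fields; null or empty edge cases need an exact-match branch rather than the percentage check.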
3. Source tables
| Table | Role | Verified answers query source |
|---|---|---|
| [database.schema.table] | [fact / dimension / reference] | [SQL used to generate verified answers] |
| [database.schema.table] | [fact / dimension / reference] | [SQL used to generate verified answers] |
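The verified answer for each catalog row should come from a query run directly against these tables, never from a hand calculation. A minimal sketch of pulling one scalar answer, assuming a Snowflake warehouse and the snowflake-connector-python package (adapt to the actual platform); credentials, table names, and the query text are illustrative placeholders:

```python
import snowflake.connector

# Illustrative connection; in practice load credentials from the environment or a secrets manager.
conn = snowflake.connector.connect(
    account="<account>", user="<user>", password="<password>",
    warehouse="<warehouse>", database="<database>", schema="<schema>",
)

# Example verification query for question 1 ("total revenue last month").
verified_sql = """
SELECT SUM(revenue)
FROM <database>.<schema>.<fact_table>
WHERE DATE_TRUNC('month', order_date) = DATE_TRUNC('month', DATEADD('month', -1, CURRENT_DATE))
"""

cur = conn.cursor()
cur.execute(verified_sql)
verified_value = cur.fetchone()[0]   # this number populates the "Result value" column
cur.close()
conn.close()
```

Keeping the exact query text in the "Verified answers query source" column makes the dataset reproducible when the underlying data changes.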
4. Edge cases
- [Edge case]: [What the edge case is. How the golden dataset handles it. What the expected behavior of the NLQ system should be.]
- [Edge case]: [...]
5. Known limitations
- [Limitation]: [What the golden dataset does not cover. Why. What follow-up work would address it.]
6. Runner manifest
| Attribute | Value |
|---|---|
| Question count | [N] active questions |
| Last run | [YYYY-MM-DD] |
| Pass rate | [N / N] (XX%) |
| Runner command | [e.g., python scripts/golden_audit.py --dataset {path}] |
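The runner command is a placeholder; what matters for the manifest is that the script replays every active question against the NLQ engine and records the resulting pass rate. A minimal sketch of that loop, reusing the hypothetical `GoldenQuestion` and `scalar_passes` helpers sketched in section 2 and an assumed `ask_nlq` callable that wraps the engine under test:

```python
def run_golden_dataset(questions, ask_nlq):
    """Replay every active question and report the pass rate for the manifest."""
    active = [q for q in questions if q.status == "active"]
    passed = 0
    for q in active:
        actual = ask_nlq(q.question)   # e.g., a call to the Cortex Analyst / NL2SQL endpoint
        if scalar_passes(q, actual):
            passed += 1
        else:
            print(f"FAIL #{q.id}: {q.question!r} -> {actual} (expected {q.expected_value})")
    rate = 100 * passed / len(active) if active else 0.0
    print(f"Pass rate: {passed} / {len(active)} ({rate:.0f}%)")
    return passed, len(active)
```

Logging each run's date and pass rate alongside the dataset version keeps the manifest table above auditable over time.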
Appendix — Pre-handoff QA checklist
- Every question has a verified answer from the warehouse (not hand-crafted)
- Questions cover: simple aggregations, filters, group-bys, time ranges, comparisons
- Edge cases are included (empty results, null handling, ambiguous phrasing)
- Tolerance is defined per question and justified
- Expected SQL logic is documented (directional, not executable — the NLQ engine may generate different SQL for the same answer)
- Runner manifest tracks pass rate over time
- Questions are written in the vocabulary the client actually uses