Self-Learning Data Agents Integration for Brainforge
Date: 2026-03-01
Owner: Data + AI Platform
Purpose
Capture the self-learning design patterns that should be added to Brainforge’s data-agent architecture, and define how to operationalize them safely.
Design Principles
1) Supervised autonomy over full autonomy
Use staged autonomy levels. Start with agent-assisted execution and move toward supervised autonomy only after quality and safety metrics are stable.
2) Trace-first learning
Every worker run should produce a structured trace:
- task type
- input artifacts
- decisions made
- outputs generated
- validation outcomes
- human feedback
No learning updates should be generated without trace evidence.
3) Evaluator-driven updates
Use evaluators to score run quality before promoting learned behavior:
- correctness and data quality
- policy and compliance checks
- cost and runtime efficiency
- reviewer acceptance
Learned updates are promotion candidates, not direct production changes.
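A minimal scoring sketch for the evaluator dimensions above; the weights and the policy hard-gate are assumptions, not tuned values:

```python
def score_run(correctness: float, policy_ok: bool,
              efficiency: float, reviewer_accepted: bool) -> float:
    """Combine evaluator dimensions into a promotion-candidate score in [0, 1].

    Policy/compliance failures disqualify the run outright; the remaining
    dimensions are blended with illustrative weights.
    """
    if not policy_ok:
        return 0.0  # a policy violation can never become a promotion candidate
    weights = {"correctness": 0.5, "efficiency": 0.2, "reviewer": 0.3}
    score = (weights["correctness"] * correctness
             + weights["efficiency"] * efficiency
             + weights["reviewer"] * (1.0 if reviewer_accepted else 0.0))
    return round(score, 3)
```

Treating policy as a gate rather than a weighted term keeps compliant behavior from being traded off against speed or cost.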
4) Human-gated policy promotion
For high-impact workflows (dbt model logic, Snowflake grants, production ETL behavior), learned changes require explicit human approval and canary release.
5) Reversible learning pipeline
All promoted behavior updates must be:
- versioned
- attributable to trace evidence
- reversible by rollback config or rule revert
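One way to make those three properties concrete is a promoted-update record that carries its own rollback snapshot. The shape below is a sketch, not a defined Brainforge schema:

```python
from dataclasses import dataclass

@dataclass
class PromotedUpdate:
    """A versioned behavior update, attributable and reversible by construction."""
    update_id: str
    version: int
    trace_ids: list[str]    # attribution: which run traces justified the change
    previous_config: dict   # snapshot taken before promotion, enabling rollback
    current_config: dict    # the promoted behavior

def rollback(update: PromotedUpdate) -> dict:
    """Revert to the pre-promotion snapshot; no external state is needed."""
    return dict(update.previous_config)
```

Because the pre-promotion config travels with the update, rollback never depends on reconstructing history from logs.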
Brainforge Architecture Additions
A) Learning Controller
Add a control component that:
- ingests run traces
- computes evaluator scores
- proposes rule/PRD/template updates
- routes updates through approval gates
- tracks post-promotion performance
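The controller's ingest-score-propose-gate flow can be sketched as a small state machine. Threshold and method names are illustrative:

```python
class LearningController:
    """Sketch of the control loop: trace -> score -> proposal -> human gate."""

    def __init__(self, threshold: float = 0.8):
        self.threshold = threshold      # assumed promotion-candidate bar
        self.pending_approval = []      # proposals awaiting human review
        self.promoted = []              # approved, live updates

    def ingest(self, trace_id: str, score: float, proposal: dict) -> str:
        """Route a scored trace: below the bar we only observe."""
        if score < self.threshold:
            return "observed"
        self.pending_approval.append((trace_id, proposal))
        return "proposed"

    def approve(self, proposal: dict) -> None:
        """Human gate: only explicitly approved proposals reach production."""
        self.pending_approval = [
            p for p in self.pending_approval if p[1] != proposal
        ]
        self.promoted.append(proposal)
```

Post-promotion tracking would hang off `promoted`, comparing live metrics against the evaluator scores that justified each update.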
B) Worker Memory Model
Separate memory into:
- run memory: short-term context for current execution
- pattern memory: stable, reviewed patterns from repeated evidence
This prevents one noisy run from changing default behavior.
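A sketch of the two-tier memory, with an assumed evidence threshold before anything graduates to pattern memory:

```python
class WorkerMemory:
    """Run memory is discarded per run; pattern memory requires repeated evidence."""

    def __init__(self, promotion_count: int = 3):
        self.run_memory: dict = {}       # short-term, current execution only
        self.pattern_memory: dict = {}   # stable, reviewed defaults
        self._observations: dict = {}    # key -> evidence count across runs
        self.promotion_count = promotion_count  # assumed threshold

    def observe(self, key: str, value) -> None:
        self.run_memory[key] = value
        self._observations[key] = self._observations.get(key, 0) + 1
        # One noisy run cannot set a default: require repeated evidence.
        if self._observations[key] >= self.promotion_count:
            self.pattern_memory[key] = value

    def end_run(self) -> None:
        """Short-term context does not survive the run."""
        self.run_memory.clear()
```

In a production version the graduation step would also pass through the evaluator and approval gates described above, not just a counter.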
C) Safety Guardrails
- hard-block unsupervised production writes from learned behavior
- enforce dual control for grants and destructive changes
- require rollback artifact before promotion
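The three guardrails can be expressed as a single pre-execution check. The action dict keys below are assumptions about how actions might be described:

```python
def check_guardrails(action: dict) -> list[str]:
    """Return the list of violated guardrails; an empty list means allowed."""
    blocks = []
    # Guardrail 1: learned behavior may never write to production unsupervised.
    if (action.get("target") == "production"
            and action.get("source") == "learned"
            and not action.get("supervised")):
        blocks.append("unsupervised production write")
    # Guardrail 2: grants and destructive changes need two approvers.
    if action.get("kind") in {"grant", "drop"} and len(action.get("approvers", [])) < 2:
        blocks.append("dual control required")
    # Guardrail 3: promotions must ship with a rollback artifact.
    if action.get("promotion") and not action.get("rollback_artifact"):
        blocks.append("missing rollback artifact")
    return blocks
```

Returning every violation at once (rather than failing fast) gives reviewers the full picture in one pass.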
dbt and Data PR Integration
Extend dbt PR checks with a learning loop:
- execute changed-scope compile/run/test
- generate smart data diff
- score output quality and risk
- record reviewer disposition (accepted, modified, rejected)
- update pattern confidence
Promotion rule:
- low confidence: observe only
- medium confidence: propose update in PR
- high confidence: enable default suggestion path
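The loop's last two steps, updating pattern confidence from reviewer disposition and mapping confidence to the promotion rule, can be sketched as follows. The learning rate and thresholds are assumptions:

```python
def update_confidence(confidence: float, disposition: str, lr: float = 0.2) -> float:
    """Nudge pattern confidence toward a target implied by reviewer disposition
    (EWMA-style update; the targets and learning rate are illustrative)."""
    target = {"accepted": 1.0, "modified": 0.5, "rejected": 0.0}[disposition]
    return round(confidence + lr * (target - confidence), 3)

def promotion_action(confidence: float) -> str:
    """Map confidence to the promotion rule; band edges are assumed."""
    if confidence < 0.5:
        return "observe"
    if confidence < 0.8:
        return "propose_in_pr"
    return "default_suggestion"
```

An incremental update means a single rejection lowers confidence without erasing a long acceptance history, matching the "one noisy run" principle from the memory model.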
Analyst Workflow Integration
Learning signals for analyst agents:
- insight accepted vs rejected
- evidence sufficiency score
- revision count before stakeholder acceptance
- narrative clarity score from reviewer rubric
Use these signals to improve:
- hypothesis templates
- evidence packaging
- deck structure and messaging order
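The four analyst signals above could be folded into one quality number per delivered insight; the weights and revision penalty here are assumptions:

```python
def analyst_quality_signal(accepted: bool, evidence_score: float,
                           revisions: int, clarity: float) -> float:
    """Aggregate analyst-agent learning signals into a score in [0, 1].

    evidence_score and clarity are assumed to already be normalized to [0, 1];
    each revision before acceptance discounts the signal, capped at 0.4.
    """
    revision_penalty = min(revisions * 0.1, 0.4)
    base = (0.4 * (1.0 if accepted else 0.0)
            + 0.3 * evidence_score
            + 0.3 * clarity)
    return round(max(base - revision_penalty, 0.0), 3)
```

A per-insight scalar like this makes it straightforward to compare hypothesis templates or deck structures by the average signal of the work they produced.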
Operating Metrics
Track:
- recommendation acceptance rate
- policy-violating recommendation rate
- rollback rate for learned updates
- time from pattern detection to safe promotion
- reviewer override rate
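The rate metrics above are simple aggregates over run records. A sketch, with an assumed record schema of boolean flags per run:

```python
def compute_metrics(records: list[dict]) -> dict:
    """Compute the operating rate metrics from run records (schema assumed)."""
    n = len(records)
    if n == 0:
        return {}
    return {
        "acceptance_rate": sum(r["accepted"] for r in records) / n,
        "policy_violation_rate": sum(r["policy_violation"] for r in records) / n,
        "rollback_rate": sum(r["rolled_back"] for r in records) / n,
        "override_rate": sum(r["overridden"] for r in records) / n,
    }
```

Time from pattern detection to safe promotion is omitted here since it needs timestamps rather than flags, but it would follow the same per-record aggregation shape.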
Implementation Sequence
- Start with one high-volume workflow: dbt PR impact.
- Add structured trace capture and evaluator scoring.
- Add manual review queue for learned update proposals.
- Add canary rollout path for approved updates.
- Expand to Snowflake governance and analyst workflows.
Source Alignment and Traceability
To keep this integration auditable across the team:
- keep the research report in a canonical repo location:
knowledge/engineering/data-platform/plans/research/self-learning-data-agents-source.md
- maintain a mapping table from research concepts to Brainforge implementation components
- update this doc and rollout docs when new findings materially change autonomy, safety, or evaluator strategy