Self-Learning Data Agents Integration for Brainforge

Date: 2026-03-01
Owner: Data + AI Platform


Purpose

Capture the self-learning design patterns that should be added to Brainforge’s data-agent architecture, and define how to operationalize them safely.


Design Principles

1) Supervised autonomy over full autonomy

Use staged autonomy levels. Start with agent-assisted execution and move toward supervised autonomy only after quality and safety metrics are stable.

2) Trace-first learning

Every worker run should produce a structured trace:

  • task type
  • input artifacts
  • decisions made
  • outputs generated
  • validation outcomes
  • human feedback

No learning updates should be generated without trace evidence.

3) Evaluator-driven updates

Use evaluators to score run quality before promoting learned behavior:

  • correctness and data quality
  • policy and compliance checks
  • cost and runtime efficiency
  • reviewer acceptance

Learned updates are promotion candidates, not direct production changes.
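One way to combine the four axes is a weighted composite with compliance as a hard gate. The weights and threshold below are placeholders for illustration, not tuned values.

```python
# Hypothetical evaluator weights; the four axes mirror the list above.
WEIGHTS = {
    "correctness": 0.4,
    "compliance": 0.3,
    "efficiency": 0.1,
    "reviewer_acceptance": 0.2,
}

def evaluator_score(scores: dict[str, float]) -> float:
    """Weighted composite of per-axis scores, each in [0, 1]."""
    return sum(WEIGHTS[axis] * scores.get(axis, 0.0) for axis in WEIGHTS)

def is_promotion_candidate(scores: dict[str, float],
                           threshold: float = 0.8) -> bool:
    """Learned updates become candidates only; compliance is a hard gate."""
    if scores.get("compliance", 0.0) < 1.0:
        return False
    return evaluator_score(scores) >= threshold
```

Treating compliance as pass/fail rather than a weighted term keeps a high correctness score from masking a policy violation.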

4) Human-gated policy promotion

For high-impact workflows (dbt model logic, Snowflake grants, production ETL behavior), learned changes require explicit human approval and canary release.

5) Reversible learning pipeline

All promoted behavior updates must be:

  • versioned
  • attributable to trace evidence
  • reversible by rollback config or rule revert

Brainforge Architecture Additions

A) Learning Controller

Add a control component that:

  1. ingests run traces
  2. computes evaluator scores
  3. proposes rule/PRD/template updates
  4. routes updates through approval gates
  5. tracks post-promotion performance
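The five steps above can be sketched as one controller pass with the concrete behaviors injected as callables; this is a shape sketch, not the real component interface.

```python
def learning_controller(traces, score_fn, propose_fn, approve_fn, track_fn):
    """One pass of the controller loop: steps 1-5 above (stubs injected)."""
    promoted = []
    for trace in traces:                      # 1. ingest run traces
        score = score_fn(trace)               # 2. compute evaluator scores
        proposal = propose_fn(trace, score)   # 3. propose an update, or None
        if proposal is None:
            continue
        if approve_fn(proposal):              # 4. route through approval gate
            promoted.append(proposal)
            track_fn(proposal)                # 5. track post-promotion performance
    return promoted
```

Injecting the approval gate keeps the human-gated promotion path testable independently of the scoring logic.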

B) Worker Memory Model

Separate memory into:

  • run memory: short-term context for current execution
  • pattern memory: stable, reviewed patterns from repeated evidence

This prevents one noisy run from changing default behavior.
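The separation can be made concrete with a repetition threshold before anything crosses into pattern memory; the class and threshold below are illustrative assumptions.

```python
from collections import Counter

class WorkerMemory:
    """Run memory is ephemeral; pattern memory only admits repeated evidence."""
    def __init__(self, promotion_threshold: int = 3) -> None:
        self.run_memory: dict[str, str] = {}   # short-term, per-execution
        self.pattern_memory: set[str] = set()  # stable, reviewed patterns
        self._observations: Counter = Counter()
        self._threshold = promotion_threshold

    def observe(self, pattern: str) -> None:
        """Record a candidate pattern; promote only after repeated evidence."""
        self._observations[pattern] += 1
        if self._observations[pattern] >= self._threshold:
            self.pattern_memory.add(pattern)

    def end_run(self) -> None:
        """Clearing run memory keeps one noisy run from persisting."""
        self.run_memory.clear()
```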

C) Safety Guardrails

  • hard block unsupervised production writes from learned behavior
  • enforce dual control for grants and destructive changes
  • require rollback artifact before promotion

dbt and Data PR Integration

Extend dbt PR checks with a learning loop:

  1. execute changed-scope compile/run/test
  2. generate smart data diff
  3. score output quality and risk
  4. record reviewer disposition (accepted, modified, rejected)
  5. update pattern confidence
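Step 5 can be as simple as an exponential-moving-average update driven by the reviewer disposition recorded in step 4; the disposition weights and learning rate are illustrative.

```python
def update_confidence(confidence: float, disposition: str) -> float:
    """EMA-style confidence update from reviewer disposition (weights illustrative)."""
    signal = {"accepted": 1.0, "modified": 0.5, "rejected": 0.0}[disposition]
    alpha = 0.2  # learning rate: one review shifts confidence only modestly
    return round((1 - alpha) * confidence + alpha * signal, 4)
```

A small `alpha` means no single accepted or rejected PR can swing a pattern across a promotion threshold on its own.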

Promotion rule:

  • low confidence: observe only
  • medium confidence: propose update in PR
  • high confidence: enable default suggestion path
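The tiering above maps directly to a threshold function; the 0.3/0.7 cutoffs are placeholders to be tuned against the operating metrics, not decided values.

```python
def promotion_mode(confidence: float,
                   low: float = 0.3, high: float = 0.7) -> str:
    """Map pattern confidence to a promotion tier (thresholds illustrative)."""
    if confidence < low:
        return "observe"            # low: log only, surface nothing
    if confidence < high:
        return "propose_in_pr"      # medium: surface as a PR suggestion
    return "default_suggestion"     # high: suggestion path enabled by default
```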

Analyst Workflow Integration

Learning signals for analyst agents:

  • insight accepted vs rejected
  • evidence sufficiency score
  • revision count before stakeholder acceptance
  • narrative clarity score from reviewer rubric

Use these signals to improve:

  • hypothesis templates
  • evidence packaging
  • deck structure and messaging order

Operating Metrics

Track:

  • recommendation acceptance rate
  • policy-violating recommendation rate
  • rollback rate for learned updates
  • time from pattern detection to safe promotion
  • reviewer override rate
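Most of these are simple rates over run records and could be computed in one pass; the record schema below is an assumption for illustration.

```python
def operating_metrics(runs: list[dict]) -> dict[str, float]:
    """Compute tracked rates from run records (record schema illustrative)."""
    n = len(runs) or 1  # avoid division by zero on an empty window
    return {
        "acceptance_rate": sum(r["accepted"] for r in runs) / n,
        "policy_violation_rate": sum(r["policy_violation"] for r in runs) / n,
        "rollback_rate": sum(r["rolled_back"] for r in runs) / n,
        "reviewer_override_rate": sum(r["overridden"] for r in runs) / n,
    }
```

Time-to-safe-promotion is the one metric that needs timestamps rather than booleans, so it is omitted from this sketch.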

Implementation Sequence

  1. Start with one high-volume workflow: dbt PR impact.
  2. Add structured trace capture and evaluator scoring.
  3. Add manual review queue for learned update proposals.
  4. Add canary rollout path for approved updates.
  5. Expand to Snowflake governance and analyst workflows.

Source Alignment and Traceability

To keep this integration auditable across the team:

  1. keep the research report in a canonical repo location:
    • knowledge/plans/agent-powered-data-environment/research/self-learning-data-agents-source.md
  2. maintain a mapping table from research concepts to Brainforge implementation components
  3. update this doc and rollout docs when new findings materially change autonomy, safety, or evaluator strategy