Agent Feedback Loop: Run → Prompt → Auto-Log → Analyze → Impact

Purpose: An automated feedback loop that prompts you after each agent run, captures your feedback, auto-generates run logs, analyzes patterns, and shows the impact of learnings. It uses the "thinking → summary" approach from the learn-extraction skill.

Related: CONTEXT_GRAPH_APPROACH.md, FEEDBACK_LOOP_PROCESS.md, RUN_LOG.md, AGENT_REGISTRY.md


Covered Agents

All agents in brainforge-vault are wired to this feedback loop. See AGENT_REGISTRY.md for the complete list.

Key agents:

  • Design-Ready Copy Agent
  • Campaign Brief Intake Agent
  • Ticket Creation Agent
  • Message Sequence Agent
  • Campaign Post Agent (via CC content system)
  • Slack Deployment Agent
  • Event Follow-Up Agent
  • LinkedIn Sequence Agent
  • ICP Analysis Agent
  • Metrics Teardown Agent
  • VP Partnerships Agent

Behavior: Every time you run one of these agents, you are prompted for structured feedback (2-3 minutes); the system then auto-generates the run log entry and pattern analysis from your answers.


The Flow

1. You run an agent
   ↓
2. System prompts for feedback (structured questions)
   ↓
3. You provide feedback (2-3 min)
   ↓
4. System auto-generates run log entry
   ↓
5. System analyzes patterns (thinking → summary)
   ↓
6. System shows impact of learnings
   ↓
7. System suggests agent improvements (when pattern is clear)

Step 1: Agent Run Completion

After an agent finishes, the system captures:

  • Run metadata: Agent name, timestamp, input files, output files
  • Decisions made: Archetype selected, sections included/excluded, any overrides
  • Output generated: File paths, content summary

Auto-captured (no user input):

run_id: design-ready-copy-2026-02-04-insurance-broker
agent: design-ready-copy-agent
timestamp: 2026-02-04T14:32:00Z
input: gtm/campaign-launch/campaigns/insurance-broker-lead-intake.md
output: gtm/marketing-assets/design-ready-copy/insurance-broker-lead-intake-2pager.md
archetype_selected: service_2pager
decisions: Single service → Service 2-pager (not Sprint, not Seasonal)

Step 2: Feedback Prompt (Structured Questions)

System prompts you with domain-specific questions based on agent type:

For Design-Ready Copy Agent

1. Outcome questions:

  • Was the output used? (Yes / No / Partially)
  • If used: Who used it? (Designer name, role)
  • If not used: Why not? (Wrong format, missing sections, quality issues)

2. Quality questions:

  • Rate the output quality (1-10)
  • What worked well? (Free text)
  • What didn’t work? (Free text)
  • Any sections that were deleted/ignored? (Which ones?)

3. Decision questions:

  • Was the archetype selection correct? (Yes / No / Not sure)
  • If incorrect: What archetype should it have been?
  • Any sections that should have been included/excluded?

4. Process questions:

  • How long did it take to review/edit? (Minutes)
  • Would you use this agent again for similar work? (Yes / No / Maybe)
  • What would make it better? (Free text)

Time: 2-3 minutes to answer.
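
One way to capture these answers as structured data is a small record type. This is a minimal sketch in Python; the AgentFeedback type and its field names are illustrative assumptions, not an existing schema:

# Sketch: structured feedback record (type and field names are illustrative)
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class AgentFeedback:
    outcome: str                      # "used" | "used_with_edits" | "not_used"
    quality: int                      # 1-10
    worked_well: str
    did_not_work: str
    sections_deleted: list[str] = field(default_factory=list)
    archetype_correct: Optional[bool] = None
    correct_archetype: Optional[str] = None
    review_minutes: int = 0
    would_use_again: str = "maybe"    # "yes" | "no" | "maybe"
    improvement_ideas: str = ""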


Step 3: Auto-Generate Run Log

System automatically creates log entry from run metadata + your feedback:

| Run ID | Campaign/Context | Archetype/Type | Input | Output | Decisions | Outcome | Quality | Time | Date |
|--------|------------------|----------------|-------|--------|-----------|---------|---------|------|------|
| design-ready-copy-2026-02-04-insurance-broker | insurance-broker-lead-intake | service_2pager | brief.md | 2pager.md | Single service → 2-pager | Used by Hannah, no edits | 9/10 | 5 min review | 2026-02-04 |

Saved to: gtm/agents/RUN_LOG.md (appended automatically)
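
A minimal sketch of how a row could be appended to the table above; the RUN_LOG_PATH constant and the column keys are assumptions, not existing code:

# Sketch: append one run to the RUN_LOG.md table (path and keys assumed)
RUN_LOG_PATH = "gtm/agents/RUN_LOG.md"

def append_to_run_log(entry: dict) -> None:
    # Column order mirrors the table header in RUN_LOG.md
    columns = ["run_id", "context", "archetype", "input", "output",
               "decisions", "outcome", "quality", "time", "date"]
    row = "| " + " | ".join(str(entry.get(c, "")) for c in columns) + " |\n"
    with open(RUN_LOG_PATH, "a", encoding="utf-8") as f:
        f.write(row)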


Step 4: Pattern Analysis (Thinking → Summary)

System analyzes the run log using the "thinking → summary" approach:

Thinking Phase (Internal Analysis)

Load context:

  • Read RUN_LOG.md (all past runs)
  • Read agent PRD (current rules)
  • Read taxonomy (archetype definitions)
  • Read PATTERNS.md (existing patterns)

Analyze for patterns:

  • Archetype patterns: Which archetypes get used most? When?
  • Outcome patterns: Which runs succeed? Which fail? What’s different?
  • Quality patterns: What scores high? What scores low?
  • Deviation patterns: When do we deviate from expected path? Why?

Match to existing patterns:

  • Does this reinforce an existing pattern? (Confidence increase)
  • Is this a new pattern? (New entry, LOW confidence)
  • Is this a variation? (Add as variant)
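
A rough sketch of this matching step, assuming each pattern is keyed by a short slug and carries an example count (all names here, including the is_variant helper, are hypothetical):

# Sketch: classify an observation against existing patterns (names assumed)
def classify_observation(observation: str, patterns: dict) -> str:
    """patterns maps a pattern slug -> {"examples": int, "confidence": str}."""
    if observation in patterns:
        patterns[observation]["examples"] += 1        # reinforcement
        return "reinforced"
    for slug in patterns:
        if is_variant(observation, slug):             # hypothetical fuzzy match
            patterns[slug].setdefault("variants", []).append(observation)
            return "variant"
    patterns[observation] = {"examples": 1, "confidence": "LOW"}  # new pattern
    return "new"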

Summary Phase (Output to User)

Pattern Summary Report:

# Pattern Analysis: Design-Ready Copy Agent
 
> Analyzed: 5 runs | Date: 2026-02-04
 
## Patterns Identified
 
### ✅ Reinforced Patterns (Confidence Increased)
 
**Pattern: Single service → Service 2-pager**
- **Confidence:** MEDIUM → HIGH (5th example reached)
- **Evidence:** 5/5 single-service campaigns used Service 2-pager
- **Impact:** Agent can auto-select Service 2-pager for single-service campaigns
 
**Pattern: Designers delete "Trusted by" when no case study**
- **Confidence:** LOW → MEDIUM (3rd example reached)
- **Evidence:** 3/3 runs with no matching case study → designer deleted section
- **Impact:** Agent should skip "Trusted by" section if no case study matches
 
### 🆕 New Patterns (LOW Confidence)
 
**Pattern: Insurance campaigns prefer Service 2-pager over Sprint**
- **Confidence:** LOW (2 examples)
- **Evidence:** 2/2 insurance campaigns used Service 2-pager (despite "sprint" language in brief)
- **Impact:** Agent should suggest Service 2-pager for insurance campaigns
- **Needs:** 3 more examples to reach MEDIUM confidence
 
### 📊 Quality Insights
 
- **Average quality score:** 8.2/10 (5 runs)
- **High-quality runs (9+):** 3/5 (60%)
- **Common quality issues:** "Trusted by" section noise (3 mentions)
 
### ⏱️ Time Savings
 
- **Average review time:** 5 minutes (vs 30 min manual drafting)
- **Time saved per run:** 25 minutes
- **Total time saved:** 125 minutes (5 runs)
 
## Recommended Agent Improvements
 
Based on these patterns, suggested PR updates:
 
1. **Make "Trusted by" optional** (Pattern: Designers delete it 3/3 times when no case study)
   - Update: Taxonomy → "Trusted by: optional; hide if no matching case study"
   - Impact: Cleaner output, less designer editing
 
2. **Auto-select Service 2-pager for insurance campaigns** (Pattern: 2/2 insurance → Service 2-pager)
   - Update: Agent logic → "If campaign contains 'insurance' → suggest Service 2-pager"
   - Impact: Faster archetype selection
 
## Next Steps
 
- **Run 3 more insurance campaigns** → Pattern reaches MEDIUM confidence
- **Monitor "Trusted by" deletion** → If continues, make it optional
- **Track quality scores** → If average drops below 7, investigate

Step 5: Impact Summary

System shows cumulative impact of learnings:

# Learning Impact Summary
 
## Agent: Design-Ready Copy Agent
 
**Runs analyzed:** 10  
**Patterns identified:** 3 (2 reinforced, 1 new)  
**Confidence promotions:** 1 (MEDIUM → HIGH)
 
### Time Impact
- **Time saved:** 250 minutes (10 runs × 25 min saved per run)
- **Time invested:** 30 minutes (feedback prompts)
- **ROI:** 8.3x time savings
 
### Quality Impact
- **Average quality:** 8.2/10 (trending up from 7.5)
- **High-quality runs:** 60% (up from 40%)
- **Common issues fixed:** "Trusted by" noise (3 runs → pattern identified)
 
### Process Impact
- **Archetype selection accuracy:** 90% (9/10 correct)
- **Designer satisfaction:** High (no major edits needed)
- **Agent improvement suggestions:** 2 actionable PRs ready
 
### Knowledge Impact
- **Patterns added to memory:** 1 new pattern (insurance → Service 2-pager)
- **Patterns reinforced:** 2 patterns (confidence increased)
- **Agent PRD updates suggested:** 2 improvements
 
## Suggested PRs
 
1. **Make "Trusted by" optional** (High impact, low effort)
2. **Auto-suggest Service 2-pager for insurance** (Medium impact, low effort)
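
The time figures in this summary reduce to simple arithmetic. Below is a sketch of the time portion of calculate_impact; the 30-minute manual baseline and 3-minute feedback cost are the assumptions the report above uses, and the function name is illustrative:

# Sketch: derive the time-impact numbers shown above (inputs assumed)
def calculate_time_impact(runs: int, review_min: float = 5,
                          manual_min: float = 30, feedback_min: float = 3):
    saved_per_run = manual_min - review_min          # 25 min in the example
    total_saved = runs * saved_per_run               # 10 × 25 = 250 min
    invested = runs * feedback_min                   # 10 × 3 = 30 min
    roi = total_saved / invested if invested else 0  # 250 / 30 ≈ 8.3x
    return {"saved": total_saved, "invested": invested, "roi": round(roi, 1)}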

Implementation: Agent Wrapper

When you run an agent, wrap it with feedback capture:

# Pseudo-code for agent wrapper. The helper functions (run_agent, prompt_feedback,
# create_log_entry, append_to_run_log, analyze_patterns, show_pattern_summary,
# suggest_improvements, calculate_impact, show_impact_summary, extract_decisions,
# get_run_count) are hooks implemented by the surrounding system.
from datetime import datetime, timezone

def run_agent_with_feedback(agent_name, inputs, context):
    # Step 1: Run agent
    output = run_agent(agent_name, inputs)

    # Step 2: Capture metadata
    timestamp = datetime.now(timezone.utc).isoformat()
    run_metadata = {
        'run_id': f"{agent_name}-{timestamp}-{context}",
        'agent': agent_name,
        'timestamp': timestamp,
        'input': inputs,
        'output': output,
        'decisions': extract_decisions(output),  # From agent logs
    }

    # Step 3: Prompt for feedback
    feedback = prompt_feedback(agent_name, run_metadata)

    # Step 4: Auto-generate log entry
    log_entry = create_log_entry(run_metadata, feedback)
    append_to_run_log(log_entry)

    # Step 5: Analyze patterns (once enough runs have accumulated)
    if get_run_count(agent_name) >= 5:
        pattern_analysis = analyze_patterns(agent_name)
        show_pattern_summary(pattern_analysis)
        suggest_improvements(pattern_analysis)

    # Step 6: Show impact summary
    impact = calculate_impact(agent_name)
    show_impact_summary(impact)

    return output

Feedback Prompt Templates (By Agent Type)

Design-Ready Copy Agent

## Feedback: Design-Ready Copy Agent Run
 
**Run:** design-ready-copy-2026-02-04-insurance-broker  
**Output:** insurance-broker-lead-intake-2pager.md
 
**1. Outcome:**
- [ ] Used as-is
- [ ] Used with edits
- [ ] Not used
- If not used: Why? _______________
 
**2. Quality (1-10):** [___]
 
**3. What worked well?**
_______________
 
**4. What didn't work?**
_______________
 
**5. Sections deleted/ignored?**
- [ ] Trusted by
- [ ] Case study highlight
- [ ] Other: _______________
 
**6. Archetype selection correct?**
- [ ] Yes
- [ ] No → Should have been: _______________
 
**7. Review time:** [___] minutes
 
**8. Would use again?**
- [ ] Yes
- [ ] No
- [ ] Maybe
 
**9. What would make it better?**
_______________
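
If the filled-in template is saved as markdown, the answers can be read back with a small parser. This is a sketch; the regexes assume the exact template format above and are not existing code:

# Sketch: read answers back out of a filled-in feedback template (format assumed)
import re

def parse_checked_boxes(markdown_text: str) -> list[str]:
    # Matches lines like "- [x] Used with edits"
    return re.findall(r"- \[[xX]\] (.+)", markdown_text)

def parse_quality_score(markdown_text: str) -> int | None:
    match = re.search(r"Quality \(1-10\):\*\*\s*\[?(\d+)", markdown_text)
    return int(match.group(1)) if match else None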

Pattern Analysis Logic (Thinking → Summary)

Thinking Phase

Load:

  • RUN_LOG.md (all runs for this agent)
  • Agent PRD (current rules)
  • PATTERNS.md (existing patterns)

Analyze:

  • Group runs by outcome (used/not used, quality scores)
  • Group runs by archetype (which archetypes succeed?)
  • Group runs by campaign type (insurance vs dbt vs other)
  • Identify deviations (when do we deviate from expected?)

Match:

  • Check if pattern exists in PATTERNS.md
  • If exists → reinforcement (increase confidence)
  • If new → create pattern (LOW confidence)
  • If variation → add as variant

Summary Phase

Output:

  • Pattern summary (reinforced, new, variations)
  • Quality insights (average scores, trends)
  • Time impact (saved vs invested)
  • Suggested improvements (PRs to create)

Format: Markdown report (see Step 4 example above)
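
The grouping described in the thinking phase can be as simple as bucketing parsed log rows by a key. A sketch, assuming each run record is a dict with fields like "archetype" and "outcome":

# Sketch: group parsed run-log rows for the analysis pass (field names assumed)
from collections import defaultdict

def group_runs(runs: list[dict], key: str) -> dict[str, list[dict]]:
    groups: dict[str, list[dict]] = defaultdict(list)
    for run in runs:
        groups[run.get(key, "unknown")].append(run)
    return groups

# Usage: by_archetype = group_runs(runs, "archetype")
#        by_outcome   = group_runs(runs, "outcome")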


Confidence Progression (Following Learn-Extraction Pattern)

| Confidence | Criteria | Action |
|------------|----------|--------|
| LOW | 1-2 examples | New pattern, track in PATTERNS.md |
| MEDIUM | 3-4 examples | Pattern is reliable, update agent PRD |
| HIGH | 5+ examples | Pattern is proven, auto-apply in agent |

Promotion rules:

  • LOW → MEDIUM: After 3rd successful example
  • MEDIUM → HIGH: After 5th successful example

Demotion rules:

  • If pattern fails 2 consecutive times → Investigate
  • If pattern not used in 3 months → Archive
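
These rules map directly to a couple of checks. A sketch, assuming each pattern record tracks its example count, consecutive failures, and months since last use (field names are assumptions):

# Sketch: apply the promotion/demotion rules to one pattern record (fields assumed)
def update_confidence(pattern: dict) -> dict:
    examples = pattern.get("examples", 0)
    if examples >= 5:
        pattern["confidence"] = "HIGH"      # promoted after 5th successful example
    elif examples >= 3:
        pattern["confidence"] = "MEDIUM"    # promoted after 3rd successful example
    else:
        pattern["confidence"] = "LOW"
    if pattern.get("consecutive_failures", 0) >= 2:
        pattern["status"] = "investigate"   # two failures in a row
    if pattern.get("months_since_last_use", 0) >= 3:
        pattern["status"] = "archived"      # unused for 3 months
    return pattern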

Integration with Existing Learn-Extraction

The agent feedback loop uses the same pattern extraction logic as your learn-extraction skill:

  1. Load context (run log, PRD, patterns)
  2. Extract patterns (what worked, what didn’t, reusable approaches)
  3. Classify (reinforcement, new, variation, no pattern)
  4. Update semantic memory (PATTERNS.md with confidence levels)
  5. Create episodic memory (detailed run record in RUN_LOG.md)
  6. Provide summary (pattern report + impact summary)

Difference: The agent feedback loop is triggered after each agent run (not daily) and focuses on agent-specific patterns (archetype selection, section usage, quality scores).


Example: Full Cycle

Run 1-4 (Baseline)

Runs logged: 4 insurance campaigns → Service 2-pager → All used, quality 8-9/10

Pattern analysis (after 4 runs):

  • Pattern: “Single service → Service 2-pager” (4/4, MEDIUM confidence)
  • No issues identified

Run 5 (Pattern Emerges)

Run: dbt campaign → Service 2-pager → Designer deleted “Trusted by” section

Feedback:

  • Quality: 8/10
  • Issue: “Trusted by section had no matching case study, deleted it”

Pattern analysis (after 5 runs):

  • New pattern: “Designers delete ‘Trusted by’ when no case study” (1/5, LOW confidence)
  • Impact: 1 run affected, but pattern identified

Suggestion: “Monitor this pattern. If it happens 2 more times, make ‘Trusted by’ optional.”


Run 6-7 (Pattern Confirms)

Runs: 2 more campaigns → Designer deleted “Trusted by” (no case study)

Pattern analysis (after 7 runs):

  • Pattern confirmed: “Designers delete ‘Trusted by’ when no case study” (3/7, MEDIUM confidence)
  • Impact: 3 runs affected, pattern is reliable

Suggestion: “PR ready: Make ‘Trusted by’ optional; hide if no matching case study.”


Run 8-10 (Pattern Applied)

After PR merge: Agent skips “Trusted by” if no case study

Runs: 3 campaigns → No “Trusted by” section (no case study) → Designer happy, no edits needed

Pattern analysis (after 10 runs):

  • Pattern proven: “Skip ‘Trusted by’ if no case study” (3/3 post-PR, HIGH confidence)
  • Impact: 0 designer edits needed (vs 3 edits before PR)
  • Time saved: 15 minutes (3 runs × 5 min saved per run)

Summary: Pattern identified → PR created → Agent improved → Time saved


Next Steps

  1. Build agent wrapper that captures run metadata and prompts for feedback
  2. Create feedback prompt templates for each agent type
  3. Build pattern analysis script (thinking → summary logic)
  4. Create impact summary generator (time, quality, knowledge metrics)
  5. Integrate with PR creation (auto-suggest PRs when patterns are clear)

For now: Start with manual feedback prompts (you answer questions) → system auto-generates logs → you review patterns monthly → create PRs when clear.

Future: Full automation (agent wrapper → auto-prompt → auto-log → auto-analyze → auto-suggest PRs).