Implementation Discipline: Execute, Meta-Reasoning & Save States
This guide explains how to implement from plans reliably, evolve your system through meta-reasoning, and build trust for faster iteration — covering the /execute command design, the Navigate→Implement→Verify pattern, research validation, save states, and speed progression — going deeper than the always-loaded overview in sections/02_piv_loop.md.
1. What This Guide Covers
The Journey from Methodology to Discipline
The System Foundations guide established the why — the system gap and learning architecture. The PIV Loop Practice guide taught the how — the PIV Loop in practice. The Global Rules Optimization guide taught how to build — modular CLAUDE.md and strategic context loading. The Command Design Framework guide taught how to automate — slash commands and the INPUT→PROCESS→OUTPUT framework. The Planning Methodology guide taught how to plan — the 6-phase planning methodology that produces implementation-ready plans. This guide teaches execution discipline — how to implement from plans reliably and evolve the system when things go wrong.
This is deliberately the shortest guide in the series. The brevity is the point — it proves the PIV Loop thesis: when planning is thorough, implementation becomes simple.
What You’ll Learn
- The execute command design — two critical components that prevent lazy AI behavior
- Navigate → Implement → Verify — the per-task pattern for reliable execution
- Custom workflow injection — extending the execute command with project-specific integrations
- PIV loops in action — two complete loops demonstrating deliberate vs fast execution
- Meta-reasoning — the 5-step system evolution technique for permanent fixes
- The WHERE-to-fix framework — deciding which part of the system needs the fix
- Research validation — verifying research completeness before building plans on it
- Save states & git workflow — commit strategy with three rollback scenarios
- Generic vs customized commands — how commands evolve per-project through meta-reasoning
- Speed & trust progression — when and how to accelerate your PIV loops
The Core Insight
“If you invest heavily in planning, implementation becomes simple.”
This guide’s length directly reflects this. All the intelligence lives in the structured plan — the execute command just orchestrates how tasks are handled.
The Sandwich Metaphor
PLANNING (thick layer — research, patterns, context)
↓
IMPLEMENTATION (thin layer — execute command)
↓
VALIDATION (thick layer — testing, review, system evolution)
When you get planning right, implementation becomes almost automatic. The execute command is simple because the structured plan contains all context the AI needs.
2. The Execute Command Deep Dive
Purpose
The execute command is the bridge between planning and code. The structured plan defines what tasks to execute; the execute command specifies how to handle each task. See .claude/commands/execute.md for the full implementation.
Two Critical Components
| Component | Purpose | Without It |
|---|---|---|
| Thoroughness Instructions | Force AI to read the entire plan, run all validation | AI reads part of plan, skips validation, asks permission at each step |
| Task Handling Process | Define Navigate→Implement→Verify per-task pattern | AI implements haphazardly, misses patterns, doesn’t verify |
Thoroughness Instructions
AI coding assistants have quirks where they try to be lazy. Without explicit instructions, they may:
- Only read part of the plan
- Set up a “phased approach” asking permission at each step
- Execute some validation commands and ask you to do the rest
The execute command addresses this with explicit directives:
- “Read the ENTIRE plan carefully”
- “Execute ALL validation commands from the plan in order”
- “Don’t skip validation steps”
This makes a considerable difference for feature implementations.
Task Handling Process: Navigate → Implement → Verify
For each task in the plan:
- Navigate — Identify the file and action required, read existing related files if modifying
- Implement — Follow the detailed specifications exactly, maintain consistency with existing patterns
- Verify — After each file change, check syntax, ensure imports are correct, verify types
This per-task cycle prevents the AI from implementing everything in a rush and then discovering issues at the end.
Current Execute Command Structure
The command follows seven sections (see .claude/commands/execute.md):
| Step | Purpose |
|---|---|
| 1. Read & Understand | Parse entire plan, note dependencies and validation commands |
| 1b. memory.md warnings (optional) | Check past gotchas before implementing |
| 1.5 Archon init (optional) | Create project and tasks for Kanban visibility |
| 2. Execute tasks | Navigate→Implement→Verify per task, with Archon status updates |
| 3. Testing strategy | Create test files, implement test cases from plan |
| 4. Run validation | Execute ALL validation commands; fix failures before continuing |
| 5. Final verification | Confirm all tasks complete, tests passing, conventions followed |
| 5.5 Archon completion (optional) | Mark all tasks done, update project status |
| 6. Update plan checkboxes | Check off acceptance criteria and completion checklist |
Output Report
The execute command specifies a structured output: completed tasks (files created/modified), tests added (files and results), validation results (command output), and readiness for /commit.
Golden Nuggets for the Execute Command
- Check environment changes: After implementation, ask “How can I run everything? Do I need to change anything in my environment?” — catches deployment requirements
- Concise summary: Include a summary request in the output format (what was implemented, files changed, next steps)
- Don’t skip validation: Add as an explicit instruction — AI assistants will try to shortcut validation if not told otherwise
3. Custom Workflow Injection
The WHAT vs HOW Distinction
The structured plan outlines what tasks to knock out. The execute command specifies how to handle them. Customizing the HOW is where you inject project-specific workflows.
Integration Examples
| Workflow Integration | What It Does | Where in Execute |
|---|---|---|
| Archon MCP | Status updates per task (doing → review → done) | Steps 1.5, 2a.5, 5.5 |
| memory.md warnings | Check past gotchas before implementing | Step 1b |
| Notion/Jira | Update external task boards | After each task completion |
| Research per task | Run tool-specific research before implementing | Start of each task |
| Progress tracking | Custom logging/reporting | Throughout |
Backward Compatibility Pattern
Use conditional checks — “If Archon available, use it” — for backward compatibility. Not everyone has MCP servers configured. The command works with or without integrations.
The execute command is the most customized command in your system. Every project evolves it differently based on workflows, task management tools, and validation requirements.
4. PIV Loops in Action
The Full Workflow
A complete PIV loop runs through six prompts:
/prime— Load context, orient the AI to the codebase- “Based on the PRD, what tool should we implement first?” — PRD-guided feature selection; even if you know what to build, let the PRD guide ordering
- Vibe planning research — “No implementation. Research only.” Reference on-demand context, fetch external docs, explore codebase patterns
/planningwith final considerations — Feed vibe planning context into the structured plan command- Meta-reasoning on plan quality — “This plan is WAY too long (1752 lines). Do some meta reasoning…” Leads to system evolution
/execute [plan-path]— Implement from the validated, right-sized plan
Golden nugget: After vibe planning, ask “Let me know what context you already have in this conversation” to verify all on-demand context was loaded.
Anatomy of Each Prompt
Prompt 2 — PRD-guided selection: The AI analyzes the PRD and recommends the safest starting tool (one that validates the entire integration stack). Even when you already know what to build, letting the PRD guide ordering ensures logical feature sequencing.
Prompt 3 — Vibe planning research: The key directive is “No implementation. Research only.” This prompt loads on-demand context (tool designs, VSA patterns), fetches external documentation (library docs with specific URLs), and explores the existing codebase. It gets both you and the AI on the same page before creating the structured plan.
Prompt 5 — Plan quality meta-reasoning: This is where system evolution happens. Rather than just compressing the plan, you ask WHY it happened and WHERE to fix it. The result: a permanent constraint added to the planning command that prevents the issue from recurring.
First Loop vs Second Loop
| Aspect | First Loop (Deliberate) | Second Loop (Fast) |
|---|---|---|
| Vibe planning | Extensive research, multiple prompts | Skipped entirely |
| Reason | First-ever tool implementation | Similar to previous tool |
| System evolution | Plan length constraint added | System already evolved |
| AI reference | No prior work to reference | Read previous implementation |
| Speed | Multiple hours, many prompts | Minutes, 3 prompts |
The AI References Its Own Work
A powerful pattern emerges in the second loop: when implementing similar features, the AI will:
- Read through the PRD to understand the next tool
- Find the previous tool implementation in the codebase
- Use it as a reference pattern for the new feature
This is why Layer 1 planning (PRD, on-demand context) is so powerful — it creates a compounding knowledge base that the AI leverages automatically.
Key Takeaway
The second loop was faster because the system had evolved (hard constraint in place), the AI could reference previous work, and trust was earned through the first loop. Speed is a consequence of system maturity, not a shortcut.
5. Meta-Reasoning: System Evolution Technique
What Meta-Reasoning Is
Asking the AI to analyze why something went wrong and where in the system the fix belongs — rather than just fixing the immediate problem. It’s the difference between patching a symptom and curing the disease.
The 5-Step Process
- Identify the problem — e.g., “This plan is 1700 lines, way too long”
- Ask for analysis — prompt: “Do some meta reasoning. Don’t make any changes yet. Help me understand WHY this happened and how I can adjust the system to avoid this in the future. Look at my commands, rules, and templates.”
- AI examines the system — analyzes global rules, on-demand context, commands, templates, vibe planning prompts to locate the root cause
- You decide which suggestion to implement (you may override the AI’s idea)
- Apply the system fix FIRST, then fix the immediate output
Step 4 is critical — don’t accept all AI suggestions blindly. Human judgment selects the right fix granularity.
The WHERE-to-Fix Decision Framework
| Fix Location | When to Use | Example |
|---|---|---|
| Global rules (CLAUDE.md/sections) | Convention applies to ALL tasks | ”Always use structured logging” |
| On-demand context (reference/) | Task-type-specific guidance | Streaming patterns guide |
| Commands (planning, execute) | Process/workflow issue | Add plan length constraint |
| Templates (structured plan, PRD) | Output format/structure issue | Add new section to plan template |
| Vibe planning (your prompts) | Research was incomplete or scope wrong | Ask better research questions |
One-Off Fix vs System Evolution
| Approach | Scope | Outcome |
|---|---|---|
| One-off fix | Fixes immediate output | Same issue recurs next time |
| System evolution | Updates command/template | Issue prevented for all future runs |
Always prefer system evolution. One-off fixes solve today; system updates solve forever. Every mistake is an opportunity to make the system better.
Concrete Example: Plan Length
Problem: Plan was 1,752 lines (target: 700-1000).
Meta-reasoning revealed: No hard constraint on plan length existed in the planning command.
AI suggestions: (1) Add hard constraint, (2) reduce detail in task subsections, (3) compress research sections, (4) limit validation detail.
Human decision: Accept #1 (add constraint), reject #2 (keep detailed tasks — they’re valuable). Simple fix: just add the line count constraint.
System fix applied:
CRITICAL CONSTRAINT: The final structured plan must be between 700-1000 lines long.Then fix the immediate output: “Make this plan between 500 and 700 lines long.”
System fix first. Immediate fix second. Now all future plans respect the constraint automatically.
The Streaming Bug Example (Human Override)
During the second PIV loop, streaming issues (missing first token, missing newlines) triggered meta-reasoning. The AI suggested creating extensive on-demand context for streaming patterns. The human chose a simpler, more general fix: research validation — validating research documents during planning before building plans on them.
Why the human override was better:
- More general (applies to all research, not just streaming)
- Won’t clutter
reference/with topics that may not recur - Addresses the systemic issue (incomplete research)
- Simpler to maintain
Common Anti-Patterns
| Anti-Pattern | Why It’s Harmful |
|---|---|
| One-off fix without system evolution | Solves today but not tomorrow |
| Applying all AI suggestions blindly | Sometimes simpler is better; human judgment matters |
| Ignoring plan length issues | Plans >1000 lines indicate insufficient granularity |
| Skipping research validation | Gaps discovered during implementation are expensive |
| Jumping to fix before analyzing | Miss the root cause; fix symptoms not disease |
When implementation fails repeatedly, use meta-reasoning to identify whether the root cause is in the plan, the system rules, or the execution approach. Ask: “Do some meta reasoning. Don’t make any changes yet. Help me understand WHY this happened and how I can adjust the system to avoid this in the future.”
6. Research Validation
The Problem
During the second PIV loop (note manager tool), implementation had streaming issues — missing first token, missing newlines. Meta-reasoning revealed the root cause was in Layer 2 planning: a research document created during vibe planning (pydantic-ai-streaming-research.md) was incomplete and didn’t cover critical streaming patterns.
The Solution
Validate research documents during the planning phase, before building the structured plan on top of them. This was added to the planning command as Phase 3b: Research Validation.
What Was Added
After creating any research reports or technical documents during vibe planning:
- Read each document thoroughly
- Verify completeness — Are all critical patterns covered? Are code examples complete? Are edge cases addressed?
- Flag contradictions between research sources
- If critical research is missing, do targeted follow-up before creating the structured plan
Why This Is Better Than On-Demand Context
| Creating on-demand context | Research validation |
|---|---|
| Specific to one topic (streaming) | General (applies to all research) |
Clutters reference/ with niche guides | No new files created |
| Maintenance burden | Self-maintaining (part of the planning flow) |
| Addresses symptom | Addresses systemic issue (incomplete research) |
Philosophy: Extend research to include research validation. Don’t just create documents — verify they’re complete before building the plan on top of them.
7. Save States & Git Workflow
Commit Strategy
Every stage gets its own commit, creating save points for rollback:
git commit -m "plan: {feature} structured plan" # After planning
git commit -m "feat: {feature} implementation" # After execution
Three Rollback Scenarios
| Scenario | Rollback | Then |
|---|---|---|
| Bad plan | git stash or git checkout . | Evolve command/template, retry planning |
| Bad implementation | git stash or git checkout . | Tweak plan, retry with /execute |
| After system evolution | Go to any save point | Retry to verify the fix works |
Value of Committing “Bad” Implementations
Sometimes commit buggy code intentionally:
- Creates a save state for teaching validation (the Validation Discipline guide uses the streaming bug)
- Shows before/after comparison for system evolution
- Demonstrates the retry workflow
Not all commits need to be perfect. The commit history shows exactly what changed and when. After evolving the system (fixing a command or template), you can retry from any save point to see if the fix works.
8. Generic vs Project-Customized Commands
The Evolution Pattern
As you evolve commands through meta-reasoning, they often become project-specific. The plan length constraint is universal; the streaming research validation example references Pydantic AI specifically.
Classification
| Evolution Type | Scope | Example | Keep in Template? |
|---|---|---|---|
| Universal fix | All projects | Plan length constraint (700-1000 lines) | Yes |
| Domain fix | Similar projects | Pydantic AI streaming validation | No (project-specific) |
| Project fix | Single project | Specific file naming convention | No (project-specific) |
Strategy
- Template repo: Generic commands, widely applicable
- Project repo: Customized commands, evolved through meta-reasoning for that specific codebase
The more you evolve, the more customized — which is good for that project but may not transfer to others. This divergence is healthy and expected. Maintain this template as your generic baseline; create project-customized versions in each project’s .claude/commands/.
9. Speed & Trust Progression
First PIV Loop (Deliberate)
Prime → extensive vibe planning → plan → validate plan → execute. Multiple prompts, hours of work. Encountered issues (plan too long), applied meta-reasoning, evolved the system.
Second PIV Loop (Fast)
Prime → plan → execute. Skipped vibe planning entirely. Similar feature, AI referenced previous implementation, system already evolved. Three prompts, minutes of work.
When to Speed Up
| Criterion | Why It Matters |
|---|---|
| 3+ similar features implemented successfully | Patterns are proven |
| Commands produce consistent output | System is reliable |
| System evolved through meta-reasoning | Known issues are fixed |
| AI can reference previous implementations | Patterns available in codebase |
What “Speeding Up” Actually Means
Speeding up doesn’t mean cutting corners — it means earning the right to skip redundant steps:
| Step | First Loop | After Trust Built |
|---|---|---|
/prime | Always | Always |
| Vibe planning | Extensive (multiple prompts) | Skip if similar feature |
| Research validation | Full verification | Quick scan |
/planning | Full 6-phase process | Full 6-phase process |
| Plan review | Detailed manual review | Quick scan |
/execute | Watch closely | Trust the process |
| Validation | Run everything | Run everything (never skip) |
Note: you never skip /planning or validation. The time savings come from reduced vibe planning and faster plan review — not from skipping safety gates.
Trust Progression
Manual Prompts → Commands → Chained Commands
↑ trust & verify ↑ ↑ trust & verify ↑
Warning: “Don’t skip stages. Prematurely speeding up leads to unpredictable results and difficult debugging.” Speed is earned through repetition and system evolution, not by skipping steps.
The Caveat
“I wouldn’t recommend going this fast. It’s a bit of an exaggeration, but you get the point I’m trying to drive home here.”
The second loop’s speed was partly for demonstration. In practice, even fast loops benefit from a brief vibe planning conversation to catch edge cases that “similar” features might not share.
10. Practical Exercises
Exercise 1: Run Your First PIV Loop
Challenge: Take a simple feature from your PRD and complete the full loop: prime → vibe plan → structured plan → execute → commit.
Focus: Note what works and what doesn’t. Identify one issue and apply meta-reasoning to determine where in the system the fix belongs.
Expected outcome: A committed feature AND at least one system improvement (command update, template change, or rule addition).
Key takeaway: The first loop is always slow — that’s by design. You’re building trust and evolving the system.
Exercise 2: Meta-Reasoning Practice
Challenge: After encountering an implementation issue, write the meta-reasoning prompt. Let the AI analyze. Don’t make changes yet.
The prompt to use:
Do some meta reasoning. Don't make any changes yet.
Help me understand WHY this happened and how I can adjust
the system to avoid this in the future. Look at my commands,
rules, and templates.
Focus: Which AI suggestion do you accept? Which do you override? Where in the system does the fix belong? Use the WHERE-to-fix framework from Section 5.
Key takeaway: Human judgment selects the right fix granularity — simpler is often better.
Exercise 3: System Evolution Audit
Challenge: Review your last 3 implementations. For each, ask: what went wrong? Was it fixed one-off or systemically?
Audit template:
| Implementation | Issue Found | Fix Type | System Location | Action Needed |
|---|---|---|---|---|
| Feature 1 | (describe) | One-off / System | (which part) | (what to update) |
| Feature 2 | (describe) | One-off / System | (which part) | (what to update) |
| Feature 3 | (describe) | One-off / System | (which part) | (what to update) |
Focus: If any were one-off fixes, apply the system fix now. Update the relevant command, template, or rule.
Key takeaway: One-off fixes solve today; system updates solve forever. Every missed system fix is compounding debt.
FAQ: Common Questions
”Is this really the shortest guide?”
Short answer: Yes, by design. Long answer: The brevity proves the PIV Loop thesis — thorough planning makes implementation simple. The execute command is straightforward because all the intelligence is in the structured plan. Implementation is the thin layer in the sandwich metaphor (thick planning → thin execution → thick validation).
”When should I use meta-reasoning vs just fixing the issue?”
Short answer: Always for recurring or systemic issues. Long answer: One-off fixes are acceptable for truly unique, unrepeatable problems. But if you’ve seen the issue before, or it feels like a category of problem (e.g., “plans are always too long”), that’s a system bug. Meta-reasoning finds the root cause and fixes it permanently.
”Should I commit buggy implementations?”
Short answer: Sometimes, intentionally. Long answer: Committing a known-buggy implementation creates a save state for comparison, validation practice (the Validation Discipline guide uses the streaming bug for this), and teaching. But don’t commit security issues or broken interfaces others depend on. Label intentional bug commits clearly.
”How do I know when to speed up (skip vibe planning)?”
Short answer: When you’ve done 3+ similar features with reliable results. Long answer: Speed is earned through repetition. Your commands must produce consistent output, the AI must be able to reference previous implementations, and your system must be evolved through meta-reasoning. If in doubt, do the vibe planning — thoroughness is always safer than speed.
”What if my execute command produces poor results?”
Short answer: Apply meta-reasoning to find the root cause. Long answer: The issue is rarely the execute command itself. Check: Is the plan quality sufficient? (Go back to /planning.) Are patterns documented with code examples? (Go back to research.) Are validation commands specified? (Update the plan template.) The execute command is only as good as the plan it consumes.
Next Steps
- Read
sections/02_piv_loop.mdfor the always-loaded PIV Loop overview - Read
.claude/commands/execute.mdto see the current execute command implementation - Try Exercise 1 — run a complete PIV loop on a small feature
- Apply meta-reasoning to one issue you encounter (Exercise 2)
- Review the planning-methodology skill for the planning phase that feeds into execution
- Move to the Validation Discipline guide — validation strategies, the 5-level validation pyramid,
/code-reviewand/execution-reportcommands, divergence analysis with/system-review, and how validation feeds back into system evolution
Related Resources
- PIV Loop overview:
sections/02_piv_loop.md— always-loaded overview of Plan → Implement → Validate - Command design:
reference/command-design-framework.md— on-demand command design with meta-reasoning overview - Execute command:
.claude/commands/execute.md— the actual execute command implementation - Structured Plan Template:
templates/STRUCTURED-PLAN-TEMPLATE.md— the plan format that/executeconsumes - Planning Methodology skill (prerequisite):
.claude/skills/planning-methodology/— planning phase that feeds into execution - Validation Discipline guide (next):
reference/validation-discipline.md— validation phase that follows execution
That’s the Implementation Discipline guide! You now understand:
- The execute command’s two critical components (thoroughness + task handling)
- Navigate → Implement → Verify per-task pattern
- Custom workflow injection (Archon, memory.md, task management)
- Meta-reasoning: the 5-step system evolution technique
- The WHERE-to-fix decision framework (5 locations)
- Research validation — verify before building the plan
- Save states and three rollback scenarios
- Generic vs project-customized command evolution
- Speed and trust progression — when to accelerate
Ready for the Validation Discipline guide? Learn validation discipline — the 5-level validation pyramid, /code-review and /execution-report commands, divergence analysis with /system-review, and how validation feeds back into system evolution.