ABC Andi April + May Project Plan
Canonical spine + LOE merge:
knowledge/clients/abchomeandcommercial/resources/andi-ai-project-plan.md (PR #854). This export stays as the long-form narrative; edit the resources file first for structural changes, then refresh this copy if you still need a standalone markdown mirror.
1. Engagement Overview
- SOW: https://drive.google.com/file/d/1KA71lPzoEYwrvoKfNg85m7N8bXIWz1XB/view?usp=drive_link
- Client: ABC
- CSO: Pranav
- Engagement period: Dec 1st 2025 - May 31st 2026
- Total revenue: tiered pricing -
- Tier 1: Up to 2,000 sessions/month at $4.00 per session
- Tier 2: Up to 5,000 sessions/month at $2.40 per session
- Tier 3: Up to 10,000 sessions/month at $1.50 per session
- Tier 4: Up to 15,000 sessions/month at $1.33 per session
- Total est. hours: 20 hrs/week
- Linear team: https://linear.app/brainforge/view/migration-ecb05c72b3f0
Business Outcomes (Executive Frame)
- Current business pain: long hold times, cancellations, and low trust in Andi due to prior downtime/inaccuracy periods.
- Primary outcomes for Apr-May:
- Increase trusted Andi adoption across CSR teams by combining technical reliability with trainer-led behavior change.
- Improve central doc update quality and speed through a structured triage → approval → batch-apply workflow.
- Convert transcript data into weekly, department-level adoption actions (not just usage reporting).
- Value narrative for sponsor/renewal:
- ABC investment is justified by lower support friction and higher CSR productivity.
- We should be able to tie outcomes to measurable levers: lower handling/hold time, fewer avoidable cancellations, and increased successful Andi-assisted resolutions.
- Exact ROI math and the session-tier business case should be explicitly grounded in SOW assumptions and updated baseline metrics before renewal discussions.
- ROI:
- At the 10k-session tier, break-even works out to roughly 500 hours/month saved across the org.
- With 50 CSRs, that's only 10 hours per CSR per month (~30 min/day) to break even on labor alone, as sketched below.
- Everything from lower cancellations is upside on top of that.
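A minimal sketch of the break-even arithmetic above, assuming a ~$30/hr fully loaded CSR cost (an illustrative assumption to validate against SOW baselines, not a confirmed figure):

```typescript
// Break-even sketch for the Tier 3 example above.
// ASSUMPTION: $30/hr fully loaded CSR cost is illustrative, not from the SOW.
const sessionsPerMonth = 10_000;
const pricePerSession = 1.5;      // Tier 3 rate ($/session)
const loadedCsrHourlyCost = 30;   // assumed
const csrCount = 50;
const workdaysPerMonth = 20;

const monthlyCost = sessionsPerMonth * pricePerSession;            // $15,000
const breakEvenHours = monthlyCost / loadedCsrHourlyCost;          // 500 hours/month
const hoursPerCsr = breakEvenHours / csrCount;                     // 10 hours/CSR/month
const minutesPerCsrPerDay = (hoursPerCsr * 60) / workdaysPerMonth; // ~30 min/day

console.log({ monthlyCost, breakEvenHours, hoursPerCsr, minutesPerCsrPerDay });
```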
2. Initiatives
Streamline updates to the central doc
- Business unlock: ABC and Brainforge get time back from regular updates to central docs; automation will streamline the updates while the docs retain structure.
- Delivery date: April 17th
- Services used: AI
Transcript Analysis
- Business unlock: Deep analysis of transcript data will give ABC trainers actionable insight into what CSRs are being asked on calls, driving better Andi usage and accuracy.
- Delivery date: May 22nd
- Services used: AI
3. Projects
Central Doc Copilot
- Parent initiative: Streamline updates to the central doc
- Service line: AI
- Service Lead: @Sam Roberts
- Start date: March 30th
Department based insights
- Parent initiative: Transcript Analysis
- Service line: AI
- Service Lead: @Sam Roberts
- Start date: March 30th
4. Milestones
M1: Data foundation
- Parent project: Department based insights
- What the client receives: Short readout (slide or doc): where transcripts live, how you identify a call / CSR / department, retention and PII constraints, and proof you can pull one complete week for a pilot group.
- Target Date: April 3rd
M2: TRIAGE Workflow Operationalized
- Parent project: Central Doc Copilot
- What the client receives: SOP published, team trained, and first live batch of tickets executed end-to-end via the new flow.
- Target Date: April 3rd
M3: Placement Automation Live
- Parent project: Central Doc Copilot
- What the client receives: The system recommends the exact insertion path (existing heading/subheading) or a new subheading when needed, and posts a preview of the proposed placement in Linear activity.
- Target Date: April 10th
M4: Duplicate/Conflict Engine Live
- Parent project: Central Doc Copilot
- What the client receives: Automation running in production, with duplicate and conflict results (including evidence and severity where applicable) pasted into Linear activity on the relevant tickets.
- Target Date: April 10th
M5: End-of-day memo show-and-tell (final process)
- Parent project: Central Doc Copilot
- What the client receives: A walkthrough of the full operational loop. The end-of-day trigger runs the process that builds the change memo and sends it to the client, so stakeholders see the same rhythm they will use in production: approved updates land in the central doc, then the memo is generated and distributed as the single record of what entered the doc that day.
- Target Date: April 17th
M6: Weekly ingest live
- Parent project: Department based insights
- What the client receives: Table or screenshot: transcript counts by week and department, plus first filtered queue for high-hold and repeat-call cohorts.
- Target Date: April 24th
M7: Questions + Andi answers (first end-to-end slice)
- Parent project: Department based insights
- What the client receives: Questions extracted from transcripts → Andi answer → match against past approved Q&A where available.
- Target Date: May 1st
M8: Trainer loop + early quality signal
- Parent project: Department based insights
- What the client receives: Results after trainers have reviewed a batch of questions and labeled each answer (% correct / incorrect). The results use the same format as M7, but with trainer-approved questions in the corpus we expect substantially more matches to answered questions.
- Target Date: May 8th
M9: “Easy wins” by department and topic
- Parent project: Department based insights
- What the client receives: Ranked list or chart: top question themes by department where trainer labels and Andi answers align, so CSRs can route these to an "ask Andi now" list.
- Target Date: May 15th
M10: Weekly operating rhythm + Rill dashboard
- Parent project: Department based insights
- What the client receives: Live Rill walkthrough: trends over weeks, covering the top 10 areas per department where CSRs can use Andi right now.
- Target Date: May 22nd
KPI Framework (Hold Time + Cancellations)
Why these KPIs
- Yvette called out hold-time variance and cancellation risk as high-priority experience and retention signals.
- Transcript analysis should prioritize these operational pain points before broad exploratory analytics.
KPI definitions (first pass; a computation sketch follows the list)
- KPI 1: Total Hold Time (monthly, by CSR and department)
- Definition: Total minutes customers spend on hold per CSR in the month.
- Directional goal: Decrease from baseline in high-hold cohorts.
- KPI 2: Hold Events per Call
- Definition: Number of hold events / total calls handled (same period, same cohort).
- Directional goal: Decrease where call mix is comparable.
- KPI 3: Repeat Call Rate for Same Resolution
- Definition: Percent of customer issues requiring follow-up calls because the first call did not resolve the issue.
- Directional goal: Decrease through better first-call answer quality.
- KPI 4: Cancellation-Linked Call Pattern
- Definition: Share of cancellation outcomes where transcripts show unresolved question loops, repeated holds, or handoff deferrals before cancellation.
- Directional goal: Decrease through targeted playbook updates.
- KPI 5: Andi Usage in High-Hold / Repeat-Call Segments
- Definition: Andi invocation rate for cohorts with highest hold-time and repeat-call incidence.
- Directional goal: Increase, with corresponding reductions in KPIs 1-4 in those cohorts.
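To keep these definitions unambiguous for the pipeline build, here is a hedged TypeScript sketch of KPIs 1-3 computed from per-call records; the `CallRecord` shape is hypothetical and will be replaced by the real warehouse schema confirmed in the M1 spike:

```typescript
// Hypothetical per-call record; real fields come from the M1 data foundation spike.
interface CallRecord {
  csrId: string;
  department: string;
  holdEvents: number;       // count of times the customer was placed on hold
  holdMinutes: number;      // total hold time on this call
  issueId: string;          // groups follow-up calls about the same issue
  resolvedOnFirstCall: boolean;
}

// KPI 1: total hold minutes per CSR for the period.
function totalHoldTimeByCsr(calls: CallRecord[]): Map<string, number> {
  const totals = new Map<string, number>();
  for (const c of calls) {
    totals.set(c.csrId, (totals.get(c.csrId) ?? 0) + c.holdMinutes);
  }
  return totals;
}

// KPI 2: hold events per call for a cohort.
function holdEventsPerCall(calls: CallRecord[]): number {
  const events = calls.reduce((sum, c) => sum + c.holdEvents, 0);
  return calls.length ? events / calls.length : 0;
}

// KPI 3: share of issues that needed more than one call.
function repeatCallRate(calls: CallRecord[]): number {
  const callsPerIssue = new Map<string, number>();
  for (const c of calls) {
    callsPerIssue.set(c.issueId, (callsPerIssue.get(c.issueId) ?? 0) + 1);
  }
  const issues = [...callsPerIssue.values()];
  const repeats = issues.filter((n) => n > 1).length;
  return issues.length ? repeats / issues.length : 0;
}
```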
Data dependencies
- Yvette scorecard data: hold counts, total hold-time, AHT context.
- Transcript corpus + Andi logs: intent extraction, answerability analysis, and usage correlation.
Milestone-to-KPI mapping
- M1-M6 (Data foundation + ingest): establish baselines for KPIs 1-5 and cohort filters.
- M7-M8 (Q/A + trainer loop): convert high-impact intents into trainer-reviewed guidance for high-hold cohorts.
- M9 (Easy wins): publish top intent actions specifically tied to hold-time and repeat-call reduction.
- M10 (Dashboard): weekly trendline view for KPIs 1-5 by department and CSR cohort.
5. Technical Approach
Central Doc Copilot
Relevant Playbook/s:
- N/A
Target Process (Step-by-Step)
Purpose: Evolve the triage-to-Linear pipeline so that proposed central-doc changes from ABC trainers are captured in a structured Linear comment, checked for duplicates and conflicts against the live central doc, approved by Janiece and Yvette, and batched into a daily human update, with an end-of-day memo summarizing what was added.
- Step 1 — Define Linear workflow: issue states and labels for ABC-assigned triage tickets, structured trainer comment template, lock/unlock behavior, where Janiece/Yvette approval fits in the state machine, and when automations do and do not trigger.
- Step 2 — Wire automation trigger to the existing triage pipeline: listen for the structured comment submission event (webhook or polling) and kick off the analysis job; post flags, severity rankings, and evidence quotes back to the Linear ticket as activity (a trigger sketch follows this list).
- Step 3 — Establish central doc access and indexing: define how the doc is exported or read via Google Docs API / gws, set the refresh cadence for the index, and document freshness expectations for the daily batch.
- Step 4 — Build duplicate detection: implement "already stated elsewhere" matching against the indexed doc; output must include quotable excerpts from both the doc and the proposal so reviewers can judge without re-reading the full doc.
- Step 5 — Build conflict detection (advisory): implement contradiction and tension signals with severity ranking and evidence quotes; ensure output is human-readable inside a Linear activity comment.
- Step 6 — Build placement and heading logic: resolve trainer-supplied heading (pass-through) vs model-suggested heading (ranked candidates); include logic to propose a new heading when no existing section fits.
- Step 7 — Build preview generation: define and render how approvers see the proposed insertion in context before approving.
- Step 8 — Build daily batch and memo: create the runbook for the human-led daily doc update (with optional gws-assisted edits), generate the change memo, and distribute to Janiece and Yvette.
- At 2pm CT, all TRIAGE tickets scheduled to be added into the central doc are locked. From that batch of tickets, we run a check for duplicates or conflicts among the TRIAGE tickets themselves. If there are any, Janiece/Yvette make a final decision to keep or toss.
- Step 9 — Run pilot on first ticket set: execute the first cohort of tickets end-to-end on the new flow, document gaps, and iterate.
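A minimal sketch of the Step 2 trigger, assuming the webhook path (not polling) and the Linear GraphQL `commentCreate` mutation for the write-back; the payload field names, template marker, and analysis stub are illustrative, not the final contract:

```typescript
import express from "express";

// Minimal sketch of the Step 2 trigger: listen for Linear comment webhooks,
// detect the structured trainer template, run the analysis job, and post
// results back to the ticket as activity.
// ASSUMPTIONS: payload field names and the template marker are illustrative;
// webhook signature verification and retry handling are omitted for brevity.

const app = express();
app.use(express.json());

const LINEAR_API = "https://api.linear.app/graphql";
const TEMPLATE_MARKER = "### Proposed Central Doc Change"; // hypothetical template header

async function postLinearComment(issueId: string, body: string): Promise<void> {
  // Linear GraphQL commentCreate mutation; expects LINEAR_API_KEY in the env.
  await fetch(LINEAR_API, {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: process.env.LINEAR_API_KEY ?? "",
    },
    body: JSON.stringify({
      query: `mutation($input: CommentCreateInput!) {
        commentCreate(input: $input) { success }
      }`,
      variables: { input: { issueId, body } },
    }),
  });
}

// Stub for the Step 4/5 engine: the real version checks the indexed central doc
// and returns flags, severity rankings, and evidence quotes.
async function runDuplicateAndConflictChecks(proposal: string): Promise<string> {
  return `Automated review (stub): no duplicate/conflict flags for a ${proposal.length}-char proposal.`;
}

app.post("/webhooks/linear", async (req, res) => {
  res.sendStatus(200); // ack fast; Linear retries on non-2xx responses
  const { action, type, data } = req.body ?? {};
  // React only to new comments that follow the structured trainer template.
  if (type !== "Comment" || action !== "create") return;
  if (!data?.body?.startsWith(TEMPLATE_MARKER)) return;

  const findings = await runDuplicateAndConflictChecks(data.body);
  await postLinearComment(data.issueId, findings);
});

app.listen(8080);
```

Acknowledging the webhook before running the analysis keeps the trigger inside Linear's response window; the production version should add signature verification and a dead-letter path for failed write-backs.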
Tools & Stack
- Primary tools: Linear (system of record), Google Docs (central doc), Mastra agent (duplicate/conflict/placement engine)
- Integrations: Linear webhook or polling trigger, Google Docs API / gws CLI for doc read and (optionally) write, email or Slack for memo distribution
- Infrastructure: Lightweight automation script triggered from Linear events on a cloud function (ABC's GCP?); indexed representation of the central doc stored for analysis
Architecture Notes / Dependencies
- Central doc must be indexed or exported on a known cadence (e.g. daily refresh) so duplicate and conflict checks reflect current content; freshness lag is a known constraint.
- Automation is additive only to the existing triage pipeline — no changes to how issues are created or how Janiece routes work.
- Linear is the single system of record for proposals, flags, discussion thread, and approval state; no parallel tracking system.
- Publishing the central doc remains a deliberate human step — automation only assists via gws; it does not auto-publish.
- Memo format and distribution channel (Doc vs email vs Slack) to be confirmed with Janiece and Yvette before build.
Performance / Accuracy Targets
- Duplicate detection recall ≥ 85% on pilot ticket set (false negatives are worse than false positives: a missed duplicate pollutes the doc, while a false flag is caught at review; a scoring sketch follows this list).
- Conflict detection false-positive rate < 25% after pilot calibration; severity ranking must be consistent enough that P1 flags are actioned before P2/P3 in Janiece/Yvette review.
- Placement heading match rate: target ≥ 80% agreement between model-suggested heading and human-selected heading, measured over the first 20 approved proposals.
- Linear write-back latency: analysis results posted to the ticket within 60 seconds of structured comment submission.
- Targets for duplicate recall and placement match rate to be re-evaluated and tightened after the pilot ticket set (Milestone 1).
- Daily memo sent on schedule.
- Copilot adoption % is not a KPI; Copilot is the mandated workflow.
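To make the recall and match-rate targets auditable during the pilot, a small scoring helper; hedged in that the label fields assume each pilot item is hand-labeled during Janiece/Yvette review:

```typescript
// Score the pilot ticket set against reviewer labels.
// ASSUMPTION: each pilot item gets a reviewer ground-truth label (isDuplicate).
interface PilotResult {
  flaggedDuplicate: boolean; // what the engine said
  isDuplicate: boolean;      // reviewer ground truth
}

function duplicateDetectionScores(results: PilotResult[]) {
  const tp = results.filter((r) => r.flaggedDuplicate && r.isDuplicate).length;
  const fp = results.filter((r) => r.flaggedDuplicate && !r.isDuplicate).length;
  const fn = results.filter((r) => !r.flaggedDuplicate && r.isDuplicate).length;
  return {
    precision: tp + fp ? tp / (tp + fp) : 1, // how many flags were real duplicates
    recall: tp + fn ? tp / (tp + fn) : 1,    // how many real duplicates we caught (the ≥85% target)
  };
}
```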
Resourcing
- 1-2 AI engineers: 20 total hours/week (Copilot-heavy April, transcript-heavy May)
Risks
- Trainer adoption of the structured comment format is a dependency → if trainers do not follow the template, automation cannot trigger reliably.
- Central doc indexing freshness: daily refresh means very recent doc edits may not be reflected in duplicate/conflict checks until the next index cycle.
- Conflict detection false-positive rate could create friction for Janiece/Yvette at the approval gate if too noisy → severity calibration needed during pilot.
- Memo distribution channel and format not yet confirmed with client.
- CSR behavior change can lag even when system quality improves; adoption risk is organizational, not only technical.
Department based insights
Relevant Playbook/s:
- N/A
Target Process (Step-by-Step)
Purpose: Ingest CSR call transcripts weekly, extract and normalize customer questions, evaluate each question against Andi, and produce a structured trainer review package that surfaces where Andi is already reliable (“easy wins”) and where it needs improvement.
- Step 1 — Run data foundation spike: document where transcripts live, how calls are keyed (by CSR and department), retention policy, PII handling constraints, and prove a complete weekly slice can be pulled for a pilot group. Define the pilot group (possibly based on hold time, if accessible).
- Step 2 — Build weekly ingest pipeline: automate a job that loads all relevant transcripts for the prior week, partitions by department (and optionally CSR), and writes structured records to the warehouse.
- Step 3 — Implement question extraction: identify customer questions and clear informational requests from each transcript; apply intent normalization to deduplicate and canonicalize near-duplicate questions across the weekly batch.
- Step 4 — Implement Andi evaluation: for each extracted question, invoke the same Andi path CSRs use, capture the answer text, and record a first-cut answerability heuristic (valid response / empty+error / similarity to past trainer-approved Q&A). A typed sketch of this loop follows the list.
- Step 5 — Build trainer review report: assemble a structured export (department, question, Andi answer, verdict fields) and deliver in a format trainers can use without additional tooling (CSV or shared Google Sheet).
- Step 6 — Build categorization and “easy win” views: classify questions by department, topic, and ticket type; surface clusters where Andi answers align with trainer labels as candidates for “CSRs can ask Andi this now.”
- Step 7 — Schedule full weekly automation and build Rill dashboard: wire all steps into a scheduled job run, refresh warehouse datasets, and build Rill views for trends, volumes, trainer verdict rates, and backlog of low-confidence or incorrect answers.
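As a working shape for Steps 2-4, a hedged sketch of the weekly record types and the per-question Andi evaluation loop; all type and field names are placeholders until the Step 1 spike confirms the real schema, and `askAndi` / `matchesApprovedQa` stand in for the real Andi API call and Q&A matcher:

```typescript
// Placeholder types for the weekly pipeline; real schema comes from the Step 1 spike.
interface TranscriptRow {
  callId: string;
  csrId: string;
  department: string;
  weekStart: string; // ISO date for the ingested week (partition key with department)
  text: string;
}

interface ExtractedQuestion {
  callId: string;
  department: string;
  rawQuestion: string;
  canonicalQuestion: string; // normalized form used to dedupe near-duplicates
}

type AnswerabilityVerdict = "valid_response" | "empty_or_error" | "matches_approved_qa";

interface AndiEvaluation extends ExtractedQuestion {
  andiAnswer: string;
  verdict: AnswerabilityVerdict; // first-cut heuristic; trainers supply ground truth
}

// Step 4 loop: evaluate each canonical question against the same Andi path CSRs use.
async function evaluateBatch(questions: ExtractedQuestion[]): Promise<AndiEvaluation[]> {
  const out: AndiEvaluation[] = [];
  for (const q of questions) {
    const andiAnswer = await askAndi(q.canonicalQuestion);
    const verdict: AnswerabilityVerdict = !andiAnswer.trim()
      ? "empty_or_error"
      : (await matchesApprovedQa(q.canonicalQuestion, andiAnswer))
        ? "matches_approved_qa"
        : "valid_response";
    out.push({ ...q, andiAnswer, verdict });
  }
  return out;
}

// Stubs for the real Andi API and the approved-Q&A similarity matcher.
declare function askAndi(question: string): Promise<string>;
declare function matchesApprovedQa(question: string, answer: string): Promise<boolean>;
```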
Not in Scope
- Real-time transcript analysis and insights
- Per-call analysis with an individual report per transcript; only weekly reports are generated
Tools & Stack
- Primary tools: Mastra (question extraction, normalization, answerability heuristic), Rill (dashboard and trend visualization), BigQuery (data warehouse)
- Integrations: Transcript source / storage system (TS scripts/job runner to process 8x8 and ingest data, store in BQ), Andi API (same path CSRs use), trainer review export (CSV or structured doc)
- Infrastructure: GCP Cloud Run, cron for the weekly scheduled pipeline; warehouse tables partitioned by week and department
Architecture Notes / Dependencies
- Transcript storage location, keying scheme, retention policy, and PII handling must be confirmed in the data foundation spike before pipeline build begins — this is a hard prerequisite. Some of this is known, not all.
- Weekly pipeline partitions data by department (and optionally CSR) and writes to the warehouse; Rill connects to warehouse for live dashboard updates.
- Answerability heuristic is a first-cut signal; ground truth comes from trainer labels; the heuristic improves over time as labeled data accumulates.
- Trainer review report format must be usable by ABC trainers without additional tooling, likely CSV or a shared Google Sheet (a minimal export sketch follows this list).
- “Easy wins” classification logic depends on alignment between Andi answer quality and trainer labels; first meaningful signal expected at Milestone 4 (~45-hour mark).
- Brainforge runs weekly transcript ingest and Rill refreshes for department-based Andi insights through May 31, 2026; anything after that needs a renewal or change order.
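Given the constraint that trainers work without additional tooling, a minimal CSV export sketch for the Step 5 review report; the column set and verdict vocabulary are assumptions to confirm with ABC trainers:

```typescript
import { writeFileSync } from "node:fs";

// Build the trainer review CSV from evaluated questions (see pipeline sketch above).
// ASSUMPTION: field order and verdict vocabulary are placeholders to confirm with ABC.
interface ReviewRow {
  department: string;
  question: string;
  andiAnswer: string;
  heuristicVerdict: string;
  trainerVerdict: string; // left blank for trainers to fill in
}

function toCsv(rows: ReviewRow[]): string {
  const header = "department,question,andi_answer,heuristic_verdict,trainer_verdict";
  const escape = (s: string) => `"${s.replaceAll('"', '""')}"`;
  const lines = rows.map((r) =>
    [r.department, r.question, r.andiAnswer, r.heuristicVerdict, r.trainerVerdict]
      .map(escape)
      .join(","),
  );
  return [header, ...lines].join("\n");
}

// Example: writeFileSync("trainer-review-week.csv", toCsv(rows));
```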
Performance / Accuracy Targets
- Question extraction recall ≥ 90% on pilot transcript set
- Answerability heuristic accuracy ≥ 80% agreement with trainer verdicts after the first reviewed batch; re-evaluated and tightened at Milestone 4 (~45-hour mark).
- Weekly pipeline SLA: full run complete and Rill dataset refreshed within 4 hours of transcript pull.
- Analyze 500 transcripts/week
Resourcing
- 1-2 AI engineers: 20 total hours/week (Copilot-heavy April, transcript-heavy May)
Risks
- PII restrictions on customer transcript text are not yet confirmed — this could constrain question extraction, storage, and trainer review exports. Must be resolved in spike.
- Customers commonly ask about multiple services in a single call — multi-intent transcripts increase normalization complexity and may inflate “easy wins” counts if not handled carefully.
- Andi API reliability under batch load has not been tested; LLM rate limits (an issue in the past) or latency at scale could extend pipeline runtime.
- Trainer review participation rate affects the quality of labeled signal; low participation delays meaningful “easy wins” and accuracy reporting.
Operating Cadence (Client Touchpoints)
- Daily (async artifact):
- End-of-day change memo that summarizes what was applied, what was rejected, and what needs follow-up.
- Purpose: create confidence in system behavior and avoid hidden state in internal threads.
- Weekly (live sync):
- Working session with Janiece/Yvette focused on operational blockers, rejected/appealed items, and next-week priorities.
- Transcript/adoption insight review with trainer-facing recommendations by department.
- Adoption-focused fieldwork:
- Ongoing interviews with high-usage and low-usage CSR cohorts to identify “why not Andi” drivers that transcripts alone cannot explain.
- Findings fed into trainer enablement and process updates each week.
Delivery Model (20 hrs/week) and Capacity Strategy
- Target model: 20 hrs/week sustained delivery with AI-first execution.
- How this is feasible:
- AI-assisted coding for implementation-heavy tasks (workflow scripts, analysis routines, formatting, and integration glue code).
- Clear ticket sequencing and milestone gates to reduce idle handoff time.
- Reuse of existing infrastructure and retrieval paths where possible, with explicit fallback paths to avoid rebuild loops.
- Capacity guardrails:
- Treat 20 hrs/week as a cap with buffer for incidents.
- Weekly review of planned vs actual hours and root causes for overage.
- SL + CSO alignment each week on whether scope, pace, or staffing needs adjustment.
6. Open Questions
CSO Questions (Client / Scope)
- What (if any) are the limitations to using customer text (PII) from the transcripts?
- This must be answered by April 17th.
- Pranav to reach out to Yvette
- Preferred memo format and channel (Doc vs email vs Slack) for Janiece and Yvette.
- Email or Slack message will suffice
- Defend the business case behind pricing tiers and expected ROI using the current usage/adoption baseline.
- Owner: Pranav (with Uttam context + prior SOW math)
- First pass: Frame pricing with a simple value equation: (avoidable call minutes + avoidable cancellations + trainer/QA time savings) x monthly volume, compared against tier cost; a hedged worked form follows below.
- First pass assumptions to validate: baseline monthly call volume, current Andi-assisted resolution rate, average CSR fully loaded hourly cost, cancellation-to-revenue impact, and expected adoption lift from transcript + trainer loop.
- First pass sponsor line: “We are pricing on measurable outcome lift, not implementation effort; each tier corresponds to a tested usage/adoption band.”
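A hedged worked form of that value equation; every input below is a placeholder to validate, per the assumptions list above:

```typescript
// First-pass value equation from above; all inputs are placeholders to validate.
const assumptions = {
  avoidableCallMinutesPerMonth: 0,   // from hold-time / repeat-call analysis
  csrLoadedCostPerMinute: 0.5,       // assumed ~$30/hr fully loaded
  avoidableCancellationsPerMonth: 0, // from KPI 4 cohort analysis
  revenuePerRetainedCustomer: 0,     // cancellation-to-revenue impact
  trainerQaHoursSavedPerMonth: 0,
  trainerLoadedHourlyCost: 0,
};

function monthlyValue(a: typeof assumptions): number {
  return (
    a.avoidableCallMinutesPerMonth * a.csrLoadedCostPerMinute +
    a.avoidableCancellationsPerMonth * a.revenuePerRetainedCustomer +
    a.trainerQaHoursSavedPerMonth * a.trainerLoadedHourlyCost
  );
}

// Compare against tier cost, e.g. Tier 3: 10,000 sessions x $1.50 = $15,000/month.
const tierCost = 10_000 * 1.5;
const value = monthlyValue(assumptions);
console.log({ value, tierCost, net: value - tierCost });
```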
- Clarify executive outcomes narrative for sponsor conversations (lead with outcomes, not implementation detail).
- Owner: Pranav
- Fewer avoidable escalations/cancellations, and a repeatable trainer-led adoption system by department.
- By end of May, ABC gets a stable Andi operating loop that improves CSR decision speed and turns transcript data into weekly adoption actions.
- Define measurable adoption KPIs and target deltas for Apr-May (e.g., successful Andi-assisted resolutions, avoidable non-use patterns, trainer-led behavior changes).
- Addressed in the KPI Framework section above.
SL Questions (Technical Blockers / Unknowns)
- Should we prioritize using ABC's GCP services, or since these are processes we're running, is Railway ok?
- Resolution: ABC’s GCP for all
- Validate the 20 hrs/week delivery model against the actual staffing plan and role split (including Casey allocation assumptions).
- Prioritize automation-heavy build in the first half of the week; reserve explicit buffer for support incidents, trainer enablement, and PR/review overhead.
- If actuals exceed 20 hrs/week for two consecutive weeks, raise a flag to CSO and HoD and potentially de-scope lower-priority items.
Head of Delivery
- Confirm approval conditions to avoid rework before sponsor renewal narrative.
- Owner: Uttam + Pranav
- Due: Next HoD follow-up
- First pass approval conditions: outcomes-first narrative is clear, KPI targets are explicit, technical path and fallback are documented, and capacity plan is realistic at 20 hrs/week.
7. Sign-offs
- CSO - Business scope, milestones, and client accountability
- Signed off?
- SL - Technical approach, effort estimates, and delivery accuracy
- Signed off?
- Head of Delivery - Feasibility, business case, and leadership alignment
- Signed off?
- Client sponsor name - Client alignment and acceptance of scope + timeline
- Signed off?