CTA Data Operations Q1 2026 — Statement of Work

Date: December 19, 2025
Version: 1.1
Client: Consumer Technology Association (CTA)
Author: Brainforge AI


1. Overview

This Statement of Work defines Brainforge’s continued engagement to scale and expand CTA’s modern DataOps platform throughout Q1 2026. Building on the foundation established in 2025 (dbt staging models, Snowflake infrastructure, and initial marts), this SOW focuses on expanding data coverage, building business-ready analytics datasets, and enabling self-service analytics across the organization. The work continues scaling data ingestion, expanding the marts layer, and making data more accessible to meet CTA’s growing analytics needs.


2. Objectives

  • Expand Data Coverage: Complete staging models for all priority data sources and establish pipelines for new data sources
  • Build Business-Ready Marts: Create dimensional models and fact tables that enable self-service analytics for key business entities
  • Enable Team Productivity: Onboard CTA team members (Kyle, Kai, and others) to dbt and Snowflake, enabling them to build and maintain models independently
  • Improve Data Accessibility: Establish role-based access control, documentation, and training to make data discoverable and usable
  • Support CES 2026: Deliver analytics-ready datasets and reporting capabilities for CES event operations and post-event analysis
  • Establish Operational Excellence: Complete CI/CD pipelines, testing frameworks, and monitoring to ensure reliable data operations

3. Scope of Work

3.1 In-Scope

Data Source Expansion (Phase 1)

  • Complete Remembers staging models for all remaining modules (exhibit, speaker, award, app, etc.)
  • Build staging models for Salesforce Marketing Cloud (SFMC) data
  • Build staging models for Salesforce CRM data (post-CES, Feb+)
  • Ingest historical S3 archive data (~300-400 tables from legacy SQL Server for entity resolution)
  • Establish data pipelines for new sources as identified (e.g., Polytomic event data, Formstack data, scanner data)
  • Create monthly backup automation for Remembers data (drop and clone process)
  • Migrate webhooks database data to proper Snowflake structures
  • Katherine’s SFTP Workflow Migration: Convert daily CES invite Python/Postgres workflow to dbt (8-10 flat files → transformations → 5 views → FTP upload to Marketing Cloud)
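The monthly backup for Remembers data can leverage Snowflake’s zero-copy cloning. A minimal sketch of the intended “drop and clone” process, with placeholder database names (the actual naming convention will be agreed with CTA):

```sql
-- Illustrative monthly "drop and clone" backup for Remembers data.
-- Database names are placeholders, not final conventions.
CREATE OR REPLACE DATABASE remembers_backup_2026_01  -- replaces the backup if re-run
  CLONE remembers_raw;                               -- zero-copy clone of the source

-- Optionally drop the oldest retained backup to cap storage growth
DROP DATABASE IF EXISTS remembers_backup_2025_10;
```

Because clones are zero-copy, storage is only consumed as the source and clone diverge, keeping the retention cost low.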

Marts Layer Development (Phase 2)

  • Build dimensional models (dim_member, dim_organization, dim_event, dim_country, etc.)
  • Create fact tables (fct_registrations, fct_purchases, fct_email_engagement, etc.)
  • Develop business reports:
    • Active membership report (replaces the membership team’s full-day manual Excel process)
    • Member engagement report
    • Event performance reports
  • Build CES-specific analytics datasets:
    • Registration funnels (track 29+ min avg registration time)
    • Badge scan analytics (lead retrieval + foot traffic)
    • Session scanner data (session attendance patterns)
    • Exhibitor ROI analysis
  • Entity Resolution Implementation: Build DataOps ID as canonical identifier across systems, starting with CES vendor integration
  • Create intermediate models for complex business logic and reusable transformations
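As a rough sketch of the marts pattern described above, a dim_member model might join Remembers staging models roughly as follows. Staging model and column names here are assumptions, to be confirmed against the actual dbt project:

```sql
-- models/marts/dim_member.sql (illustrative sketch; names are placeholders)
with members as (
    select * from {{ ref('stg_remembers__members') }}
),

organizations as (
    select * from {{ ref('stg_remembers__organizations') }}
)

select
    members.member_id,
    members.dataops_id,              -- canonical ID from entity resolution work
    members.full_name,
    members.email,
    organizations.organization_name,
    members.membership_status,
    members.joined_at
from members
left join organizations
    on members.organization_id = organizations.organization_id
```

Keeping joins and business logic in intermediate and mart models, with staging models as thin source-conformed layers, is the structure the team enablement work will teach.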

Infrastructure & Operations (Phase 3)

  • ETL Platform Finalization: Complete Polytomic evaluation and setup, or establish alternative approach for P0 data sources (SFMC, Merits FTP, etc.)
  • Orchestration Implementation: Evaluate and implement Snowflake native task orchestration (preferred) vs GitHub Actions for dbt execution
  • Complete GitHub Actions CI/CD pipeline for dbt (automated testing on PRs)
  • Set up Snowflake CLI integration for automated repo refresh on merges
  • Implement comprehensive dbt testing framework (data quality tests, schema tests, custom tests)
  • Establish Snowflake role-based access control (RBAC) with functional roles
  • Set up Snowflake warehouses for different workload types (ETL, Transform, BI/Reporting)
  • Create monitoring and alerting for pipeline health
  • Document Snowflake grants and access patterns
  • FTP Integration: Establish S3 → Snowflake → dbt → FTP workflow for Marketing Cloud data delivery
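If the Snowflake-native option is selected for orchestration, a scheduled task could drive the dbt run. This is a hedged sketch only: object names are placeholders, and the EXECUTE DBT PROJECT syntax (part of Snowflake’s “dbt Projects on Snowflake” capability) should be verified against current Snowflake documentation during the evaluation:

```sql
-- Illustrative Snowflake-native scheduling for dbt execution
-- (names and EXECUTE DBT PROJECT syntax are assumptions to verify).
CREATE OR REPLACE TASK dataops.admin.run_dbt_nightly
  WAREHOUSE = transform_wh
  SCHEDULE  = 'USING CRON 0 6 * * * America/New_York'  -- daily, before business hours
AS
  EXECUTE DBT PROJECT dataops.admin.cta_dbt_project ARGS = 'run';

ALTER TASK dataops.admin.run_dbt_nightly RESUME;  -- tasks are created suspended
```

The GitHub Actions alternative would run the same dbt commands from CI on a schedule; the evaluation will weigh Snowflake-native consolidation against CI flexibility.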

Team Enablement (Phase 4)

  • Onboard Kyle to dbt and Snowflake (training, access setup, workflow documentation)
  • Onboard Kai (new BI analyst) to Snowflake and reporting workflows
  • Record dbt/Snowflake onboarding sessions for future team members and “citizen data engineers”
  • Cursor Demo: Provide training on Cursor for Katherine and Jay (both expressed interest)
  • Create team training materials and documentation
  • Establish code review process and best practices
  • Enable team members to build marts from intermediate models
  • Support “citizen data engineers” (Anna P/K/R, Chris Deathloff, Quinn, JC) with a safe environment in which to learn

Documentation & Knowledge Management (Phase 5)

  • Complete schema.yml documentation for all models
  • Create data dictionary and business glossary
  • Document naming conventions and project structure
  • Establish data lineage documentation
  • Create runbooks for common operations
  • Integrate Snowflake catalog with Glean (exploration and implementation)

3.2 Out-of-Scope

  • Okta authentication optimization (separate discovery workstream - identified as causing 80% of customer support requests)
  • Shopify digital asset store evaluation (separate discovery workstream - authentication loop and download failures)
  • CES Registration Security Fix (DataOps ID vendor integration for lead retrieval - Katherine managing separately)
  • Full BI tool implementation (deferred to Q2 - Power BI available for interim use)
  • Long-term managed services beyond Q1 2026
  • Custom application development outside of data workflows
  • Data source migrations or system replacements
  • Marketing content creation or business strategy
  • Email deliverability fixes (DMARC/DKIM/SPF configuration)

4. Requirements & Inputs

Access & Permissions

  • Continued access to Snowflake, dbt Cloud, and GitHub repository
  • Access to new data sources as they are identified (Polytomic, Formstack, etc.)
  • Administrative permissions for Snowflake RBAC setup
  • Access to Glean for integration exploration

Documentation

  • Data source documentation and API specifications
  • Business requirements for marts and reports
  • Existing data dictionaries and business glossaries
  • Historical data quality issues and known data problems

Stakeholder Availability

  • Katherine Bayless (Data Operations): Bi-weekly strategic alignment, ad-hoc questions
  • Kyle (Data Analyst): Weekly onboarding sessions, model review, requirements gathering
  • Kai (Data Analyst): Training sessions, requirements for member engagement reports
  • Jay Heavner (IT): RBAC setup coordination, infrastructure approvals
  • Other CTA team members: Ad-hoc requirements gathering, testing, feedback

Data & Systems

  • Access to Remembers data (via Snowflake Share) - ✅ Active
  • Access to SFMC data (API key available in AWS Secrets Manager)
  • Access to Salesforce CRM data (post-CES, Feb+, currently in “do not disturb” mode)
  • Historical S3 archive access (CTA-DataOps-Archive bucket with ~300-400 tables from legacy SQL Server)
  • Historical CES scan data (S3 bucket access - session scanner files “too big to import manually”)
  • Formstack/webhooks data for migration - ✅ Working (webhook → S3 → Snowflake)
  • Merits registration data (FTP access, flat files only - no APIs)
  • EventPoint data (good APIs available, post-CES priority)
  • Polytomic connector availability (or alternative ETL approach if Polytomic doesn’t proceed)

5. Deliverables

Phase 1: Data Source Expansion

  • Complete staging models for all Remembers modules (accounting, app, award, crm, crm_v2, exhibit, purchase, shopping, speaker)
  • SFMC staging models with documentation (sends, opens, clicks, bounces, unsubscribes, jobs, lists)
  • Salesforce CRM staging models with documentation (post-CES, Feb+)
  • Historical S3 archive ingestion (one-time load of ~300-400 tables for entity resolution)
  • Monthly backup automation script and documentation
  • Webhooks data migration to proper Snowflake structures
  • Katherine’s SFTP workflow migrated to dbt (daily CES invite process automation)
  • New data source pipelines: Merits FTP, session scanners, lead retrieval data (as identified)

Phase 2: Marts Layer Development

  • Dimensional models (dim_member, dim_organization, dim_event, dim_country, etc.)
  • Fact tables (fct_registrations, fct_purchases, fct_email_engagement, fct_lead_scans, fct_session_attendance, etc.)
  • Business reports:
    • Active membership report (6-column spreadsheet replacement - “P Negative-1” priority per Katherine)
    • Member engagement report
    • Event performance reports
  • CES analytics datasets:
    • Registration funnels (track and optimize 29+ min registration time)
    • Badge scan analytics (lead retrieval + foot traffic for exhibitor ROI)
    • Session scanner data (attendance patterns, popular sessions)
    • Exhibitor engagement analysis
  • Entity resolution: DataOps ID implementation across historical + current data
  • Intermediate models for complex transformations and reusable business logic

Phase 3: Infrastructure & Operations

  • ETL platform setup (Polytomic or alternative approach) for P0 sources
  • Orchestration implementation (Snowflake native tasks preferred, or GitHub Actions backup)
  • GitHub Actions CI/CD pipeline (automated testing on PRs, repo → S3 deployment)
  • Snowflake CLI integration for automated repo refresh on merge
  • Comprehensive dbt testing framework (unique, not_null, relationships, custom tests)
  • Snowflake RBAC implementation with functional roles (data analyst, data engineer, read-only, etc.)
  • Snowflake warehouse configuration (ETL, Transform, BI/Reporting workloads)
  • FTP integration for Marketing Cloud data delivery workflow
  • Monitoring and alerting setup
  • Grants documentation
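The functional-role RBAC deliverable might take roughly the following shape. Role, database, and warehouse names below are illustrative placeholders to be finalized with Jay during infrastructure coordination:

```sql
-- Illustrative functional-role RBAC sketch (all names are placeholders)
CREATE ROLE IF NOT EXISTS data_engineer;
CREATE ROLE IF NOT EXISTS data_analyst;
CREATE ROLE IF NOT EXISTS reporter_ro;

-- Read-only reporting access to the marts layer
GRANT USAGE ON DATABASE analytics TO ROLE reporter_ro;
GRANT USAGE ON SCHEMA analytics.marts TO ROLE reporter_ro;
GRANT SELECT ON ALL TABLES IN SCHEMA analytics.marts TO ROLE reporter_ro;
GRANT SELECT ON FUTURE TABLES IN SCHEMA analytics.marts TO ROLE reporter_ro;

-- Analysts inherit read access and get a dedicated reporting warehouse
GRANT ROLE reporter_ro TO ROLE data_analyst;
GRANT USAGE ON WAREHOUSE reporting_wh TO ROLE data_analyst;
```

FUTURE grants keep access current as dbt creates new mart tables, which avoids re-granting after every deployment.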

Phase 4: Team Enablement

  • Kyle onboarding complete (access, training, first models built independently)
  • Kai onboarding complete (Snowflake access, reporting workflows)
  • Cursor training delivered for Katherine and Jay
  • Recorded onboarding sessions for future team members and “citizen data engineers”
  • Training materials and documentation (dbt, Snowflake, best practices)
  • Code review process documentation
  • Best practices guide for building marts and maintaining models
  • “Calculated neglect” approach: enable team independence through a safe environment to learn

Phase 5: Documentation & Knowledge Management

  • Complete schema.yml documentation for all models
  • Data dictionary and business glossary
  • Naming conventions and project structure documentation
  • Data lineage documentation
  • Operational runbooks
  • Glean integration assessment and implementation plan (if feasible)

6. Project Timeline

Q1 2026 (January - March 2026)

January 2026: Foundation & Expansion

  • Weeks 1-2: Complete Remembers staging models, begin SFMC staging
  • Weeks 3-4: SFMC and Salesforce CRM staging models, Kyle onboarding begins

February 2026: Marts & Infrastructure

  • Weeks 1-2: Begin dimensional models, complete RBAC setup, CI/CD pipeline
  • Weeks 3-4: Fact tables development, testing framework, team training

March 2026: Reports & Enablement

  • Weeks 1-2: Business reports, CES datasets, documentation completion
  • Weeks 3-4: Team enablement completion, Glean integration exploration, Q2 planning

Total Duration: 12 weeks (Q1 2026)


7. Assumptions

  • CTA stakeholders (Katherine, Kyle, Kai, Jay) are available for scheduled meetings and ad-hoc questions
  • Data sources remain accessible and stable throughout Q1
  • Remembers contract and data sharing agreement continues as expected
  • New data sources (Polytomic, Formstack) can be accessed within Q1 timeline
  • CTA team members can dedicate time for training and onboarding
  • Business requirements for marts and reports can be gathered within Q1
  • No major organizational changes that would impact data operations
  • Snowflake capacity and performance remain adequate for growing data volumes
  • GitHub repository access and permissions remain stable

8. Risks

  • Data source access delays (Impact: Medium). Mitigation: Identify access requirements early; establish backup plans for critical sources; prioritize sources with existing access; Salesforce CRM blocked until post-CES (Feb+)
  • Polytomic evaluation delays (Impact: Medium). Mitigation: Maintain alternative approach (AWS Glue, manual pipelines) if Polytomic doesn’t proceed; have backup ETL plan ready
  • Team onboarding slower than expected (Impact: Medium). Mitigation: Start onboarding early; provide recorded sessions; create self-service documentation; schedule regular check-ins; Katherine’s “calculated neglect” approach
  • Business requirements unclear (Impact: Medium). Mitigation: Schedule regular requirements gathering sessions; start with high-priority use cases (“P Negative-1” = Remembers); iterate based on feedback
  • CES timeline constraints (Impact: High). Mitigation: No major changes before CES (Jan 2026); Katherine’s SFTP workflow must keep running; post-CES window for changes (Feb+)
  • Data quality issues discovered (Impact: Medium). Mitigation: Build comprehensive testing framework; document known issues (e.g., Remembers dashboard showing 1,658 vs 1,100 actual members); create data quality reports; prioritize critical fixes
  • Entity resolution complexity (Impact: Medium). Mitigation: Start with historical data profiling; build DataOps ID as canonical identifier; test with one vendor before expanding
  • Infrastructure capacity constraints (Impact: Low). Mitigation: Monitor Snowflake usage; optimize queries; scale warehouses as needed; plan for growth; Katherine wants to be “poster child for leveraging all Snowflake features”
  • Scope creep from other workstreams (Impact: Medium). Mitigation: Maintain clear boundaries with Okta/Shopify discovery work; Katherine’s security fix separate; prioritize Q1 deliverables; document dependencies
  • Finance scrutiny on costs (Impact: Medium). Mitigation: Demonstrate ROI early; leverage frugal approach (“squeeze value out of tools we have”); consolidate into Snowflake where possible; use AWS Marketplace for procurement ease
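Known discrepancies like the Remembers member-count mismatch can be caught systematically with singular dbt tests in the testing framework. A hedged sketch (model name, column name, and the sanity band are assumptions to confirm with the membership team):

```sql
-- tests/assert_active_member_count_in_range.sql (illustrative; names are placeholders)
-- A singular dbt test fails when it returns rows. This one flags the active
-- member count drifting outside an agreed sanity band, to catch
-- dashboard-vs-source mismatches like 1,658 reported vs ~1,100 actual.
select count(*) as active_members
from {{ ref('dim_member') }}
where membership_status = 'active'
having count(*) not between 900 and 1300  -- band to be agreed with membership team
```

Tests like this run in CI on every PR, so regressions surface before they reach reports.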

9. Acceptance Criteria

Phase 1 Acceptance:

  • All Remembers staging models complete with tests and documentation
  • SFMC and Salesforce CRM staging models operational
  • Monthly backup automation running successfully
  • Webhooks data migrated to proper structures

Phase 2 Acceptance:

  • Minimum 5 dimensional models delivered and tested
  • Minimum 3 fact tables delivered and tested
  • Active membership report and member engagement report operational
  • CES datasets available for event operations

Phase 3 Acceptance:

  • GitHub Actions CI/CD pipeline running on all PRs
  • Snowflake RBAC implemented with functional roles
  • Comprehensive dbt testing framework in place
  • Monitoring and alerting operational

Phase 4 Acceptance:

  • Kyle successfully onboarded and building models independently
  • Onboarding materials and recordings available
  • Code review process documented and in use

Phase 5 Acceptance:

  • All models have complete schema.yml documentation
  • Data dictionary and business glossary published
  • Operational runbooks available
  • Glean integration assessed (implementation may extend beyond Q1)

10. Communication Plan

Regular Meetings:

  • Weekly Technical Working Session (60 min): Kyle + Brainforge team (dbt development, model review)
  • Bi-weekly Strategic Alignment (30 min): Katherine Bayless + Uttam Kumaran
  • Monthly Infrastructure Review (30 min): Jay Heavner + Brainforge team (RBAC, infrastructure)
  • Ad-hoc sessions: As needed for requirements gathering, training, and issue resolution

Async Communication:

  • Dedicated Slack channel for daily questions and updates
  • GitHub repository for all code, documentation, and issues
  • Shared workspace (Google Drive or equivalent) for documentation and deliverables
  • Weekly status email summarizing progress, blockers, and upcoming milestones

Escalation Path:

  • Technical blockers escalated to Ashwini Sharma or Uttam Kumaran
  • Business requirements questions escalated to Katherine Bayless
  • Infrastructure/access issues escalated to Jay Heavner
  • Timeline risks communicated within 24 hours of identification

11. Open Questions

These questions will be addressed during Q1 execution:

  1. Polytomic Integration: Has Polytomic responded with evaluation results? If not proceeding, what is the backup ETL approach for SFMC, Merits FTP, EventPoint?
  2. Orchestration Decision: Snowflake native task orchestration vs GitHub Actions? Need to finalize for Katherine’s SFTP workflow.
  3. Historical S3 Data Priority: Which of the ~300-400 tables should be ingested first for entity resolution? What is the timeline?
  4. CES Scanner Data: Session scanner files “too big to import manually” - what is file size/format? When will data be available?
  5. BI Tool Selection: Deferred to Q2 per Katherine. Power BI available for interim use. When to revisit Sigma evaluation?
  6. Glean Integration: What are the technical requirements for Snowflake catalog integration? Priority post-Q1 based on Dec discussions.
  7. DataOps ID Scope: Is Katherine’s vendor integration proceeding? How does this affect broader entity resolution work?
  8. Citizen Data Engineers: Which additional team members (Anna P/K/R, Chris Deathloff, Quinn, JC, Tom Moschello) should be onboarded in Q1?
  9. CES Post-Event Analysis: What are the specific reporting needs for post-CES analysis (Feb-Mar)? Lead gen, foot traffic, exhibitor ROI?
  10. Salesforce CRM Timeline: When post-CES (Feb? March?) will CRM data access be available? What are integration priorities?

12. Pricing

Pricing for this SOW follows the hourly rate structure established in the Brainforge CTA Agreement dated November 12, 2025.

Estimated Effort for Q1 2026:

  • Phase 1: Data Source Expansion (100-130 hours): Staging models, pipelines, data migration, historical S3 ingestion, Katherine’s SFTP workflow
  • Phase 2: Marts Layer Development (140-170 hours): Dimensional models, fact tables, reports, entity resolution, CES analytics
  • Phase 3: Infrastructure & Operations (80-100 hours): ETL/orchestration setup, CI/CD, RBAC, testing, monitoring, FTP integration
  • Phase 4: Team Enablement (50-70 hours): Training, onboarding, Cursor demo, citizen data engineer support, documentation
  • Phase 5: Documentation & Knowledge (40-60 hours): Documentation, data dictionary, runbooks, Glean assessment
  • Total Estimated Hours: 410-530 hours for the Q1 2026 engagement

Updated Estimate Rationale:

  • Added historical S3 archive ingestion (~300-400 tables for entity resolution)
  • Added Katherine’s SFTP workflow migration (daily CES invite process)
  • Expanded CES analytics scope (session scanners, lead retrieval, exhibitor ROI)
  • Added orchestration decision and implementation (Snowflake native tasks evaluation)
  • Added Cursor training for Katherine and Jay
  • Expanded entity resolution scope (DataOps ID implementation)

Billing Structure:

  • Hours will be billed monthly based on actual time spent
  • Invoices will include detailed time logs by phase and deliverable
  • Any hours exceeding the estimated range will be discussed with Katherine Bayless before proceeding

Payment Terms:

  • Net 30 payment terms as per existing agreement
  • Monthly invoices submitted by the 5th of the following month

13. Sign-Off

By signing below, both parties acknowledge understanding and agreement with the scope, deliverables, timeline, and approach outlined in this Statement of Work.

Client (CTA):

Name: ___________________________
Title: ___________________________
Date: ___________________________
Signature: _______________________

Brainforge AI:

Name: Uttam Kumaran
Title: Managing Lead
Date: December 16, 2025
Signature: _______________________


Appendix A: Key Stakeholders

  • Katherine Bayless (Senior Director, Data Engineering): Strategic sponsor, bi-weekly alignment, executive liaison, decision maker, “P Negative-1” prioritization
  • Kyle (Data Analyst, Market Research → Data): Primary model builder, weekly working sessions, requirements gathering, R/Python skills, eager to learn dbt
  • Kai (Business Intelligence Analyst, new hire): Member engagement reports, training sessions, requirements gathering, strong governance background
  • Jay Heavner (VP of IT): 20+ years tenure, infrastructure approvals, RBAC coordination, access management, Okta/systems owner
  • Ashwini Sharma (Data Engineer, Brainforge): Primary technical lead, dbt development, infrastructure setup, orchestration implementation
  • Samuel Roberts (Full-Stack Engineer, Brainforge): Okta/Shopify discovery support, integration work
  • Uttam Kumaran (Managing Lead, Brainforge): Client POC, strategic alignment, project oversight

Additional Stakeholders (Ad-hoc):

  • Anna P, Anna K, Anna R (Membership): Requirements for active member report
  • Chris Deathloff (Market Research): Analytics needs, data literacy champion
  • Quinn, JC (Business Intelligence): Survey analysis, reporting needs
  • Tom Moschello (ExpoCAD): Show floor data, data quality

Appendix B: Success Metrics

Brainforge will track these metrics during Q1 to measure progress and success:

Data Coverage Metrics:

  • Number of staging models completed
  • Number of data sources integrated
  • Percentage of Remembers modules with staging models
  • Data freshness (time from source to staging)

Marts Development Metrics:

  • Number of dimensional models delivered
  • Number of fact tables delivered
  • Number of business reports operational
  • Model test coverage percentage

Team Enablement Metrics:

  • Number of team members onboarded
  • Number of models built by CTA team members
  • Time to first model (for new team members)
  • Code review participation rate

Operational Excellence Metrics:

  • CI/CD pipeline success rate
  • Average time for PR review and merge
  • Number of data quality issues caught by tests
  • Pipeline uptime percentage

Business Impact Metrics:

  • Number of dashboards/reports using marts data
  • Number of active Snowflake users
  • Number of questions answered by data team
  • CES reporting capabilities delivered

Appendix C: Deliverable Checklist

Phase 1: Data Source Expansion

  • All Remembers staging models complete (accounting, app, award, crm, crm_v2, exhibit, purchase, shopping, speaker)
  • SFMC staging models complete
  • Salesforce CRM staging models complete (post-CES, Feb+)
  • Historical S3 archive ingested (~300-400 tables)
  • Katherine’s SFTP workflow migrated to dbt (daily CES invite process)
  • Monthly backup automation operational
  • Webhooks data migrated
  • Merits FTP pipeline established
  • Session scanner data ingestion
  • Lead retrieval data pipeline
  • New data source pipelines (as identified)

Phase 2: Marts Layer Development

  • dim_member model (unified member view)
  • dim_organization model
  • dim_event model
  • dim_country model
  • fct_registrations model
  • fct_purchases model
  • fct_email_engagement model
  • fct_lead_scans model (exhibitor lead retrieval)
  • fct_session_attendance model (session scanner data)
  • Active membership report (6-column spreadsheet replacement - “P Negative-1”)
  • Member engagement report
  • Event performance reports
  • CES registration funnel analysis (track 29+ min registration time)
  • Exhibitor ROI analysis (lead scans + foot traffic)
  • Session popularity analysis
  • Entity resolution: DataOps ID implementation

Phase 3: Infrastructure & Operations

  • ETL platform decision and setup (Polytomic or alternative)
  • Orchestration implementation (Snowflake native tasks or GitHub Actions)
  • GitHub Actions CI/CD pipeline (testing on PRs, repo → S3)
  • Snowflake CLI integration for automated refresh
  • dbt testing framework (unique, not_null, relationships, custom tests)
  • Snowflake RBAC implementation (functional roles)
  • Warehouse configuration (ETL, Transform, BI/Reporting)
  • FTP integration for Marketing Cloud delivery
  • Monitoring and alerting
  • Grants documentation

Phase 4: Team Enablement

  • Kyle onboarding complete (dbt, Snowflake, building models independently)
  • Kai onboarding complete (Snowflake, reporting workflows)
  • Cursor training for Katherine and Jay
  • Onboarding recordings available for future team members
  • Training materials published (dbt, Snowflake, best practices)
  • Code review process documented
  • Best practices guide
  • “Citizen data engineer” support framework established

Phase 5: Documentation & Knowledge

  • All models have schema.yml
  • Data dictionary published
  • Business glossary published
  • Naming conventions documented
  • Data lineage documented
  • Operational runbooks available
  • Glean integration assessment