CTA DataOps Project Kickoff Summary

Date: November 17, 2025
Meeting Type: Project Kickoff
Attendees: Uttam Kumaran, Samuel Roberts, Katherine Bayless


Meeting Purpose

Initial kickoff meeting to understand CTA’s systems landscape, data sources, team structure, and immediate priorities for the DataOps platform build.


Key Takeaways

1. Systems Landscape - 54 Systems Identified

Katherine provided a comprehensive walkthrough of all CTA systems. Key categories:

Core Business Systems (P0):

  • Remembers (Impexium): AMS with Snowflake data share - “P Negative-1” priority
  • Salesforce Marketing Cloud: Email marketing, deliverability issues
  • Salesforce CRM: Sales tracking, in “do not disturb” mode until post-CES
  • Merits: CES registration (150K users), no APIs, FTP only

CES Technology Stack (12 systems):

  • Complex daisy-chain of vendors with no clear ownership
  • EventPoint, Map Your Show, ExpoCAD, Event Base, Pointer (lead retrieval)
  • Major pain point: entity resolution across systems using email addresses
  • Security concern: Pointer passes the email address as a URL query parameter (anyone can intercept it)

E-commerce & Forms:

  • Shopify: Digital assets + sponsorships, authentication loop issues, no clear owner
  • Formstack: 600-700 forms, Katherine has webhook → S3 working

Analytics:

  • Google Analytics/BigQuery set up by Orange Spark (good but underutilized)
  • Power BI scattered across teams, no governance
  • Glean “actively harming reputation of our data”

2. Team Structure

CTA Data Team:

  • Katherine Bayless: Senior Director, Data Engineering (primary POC)
  • Kyle: Moving from Market Research to Data team, R/Python skills
  • Kai: New BI Analyst starting soon

CTA IT:

  • Jay Heavner: VP IT, 20+ years tenure, owns Okta/systems

“Citizen Data Engineers” Identified:

  • Anna P, Anna K, Anna R (Membership)
  • Chris Deathloff (Market Research)
  • Quinn, JC (Business Intelligence)
  • Tom Moschello (ExpoCAD)

External Partners:

  • SDG Consulting: Parallel BI tool evaluation workstream

3. Pain Points & Opportunities

Immediate Pain:

  • Membership team spending full days creating Excel reports manually
  • Marketing team wants real-time email metrics (currently monthly batch)
  • Entity resolution critical - same customer across 7+ different IDs
  • CES tech stack ownership vacuum

Strategic Opportunities:

  • Predict M&A from CES floor traffic (Katherine’s idea)
  • Seniority scoring by market segment (C-suite vs IC attendance)
  • Product-level profitability analysis (avoid 37K revenue)
  • Lead retrieval + foot traffic = exhibitor ROI story

4. Technical Landscape

Current State:

  • Legacy Toad Data Point workflows (.tim templates)
  • SQL Server → migrated data to S3 (CSV and Parquet)
  • Katherine built webhook infrastructure (Formstack → S3 → Lambda)
  • Some AWS Glue jobs exist
  • Snowflake external stage created for S3

Target State:

  • Modern data stack: dbt + Snowflake + AWS
  • Automated pipelines replacing manual processes
  • Self-service analytics for “citizen data engineers”
  • Entity resolution enabling cross-system analysis

5. Priority Data Sources (Katherine’s Ranking)

P0 - Start Immediately:

  1. Remembers (Impexium) - “P Negative-1”
  2. Salesforce Marketing Cloud (SFMC)
  3. Merits (CES registration)
  4. Formstack (already working!)
  5. Historical S3 archive (for entity resolution)

P1 - Next Phase:

  • EventPoint (good APIs)
  • Salesforce CRM (post-CES)
  • Lead Retrieval data
  • Session scanner data (too big to import manually)
  • ExpoCAD (before potential replacement)

P2 - Nice to Have:

  • Event Base mobile app
  • Event Co-Pilot chatbot
  • ZoomInfo, Pitchbook, Quorum
  • Concur/Ironclad integration

6. Organizational Context

Culture:

  • Nonprofit association, very cost-conscious
  • “Working out loud” well-received here
  • Data literacy varies widely
  • Procurement process lengthy but MSA approved
  • Finance team resistant to technology changes

Political Landscape:

  • CES tech stack needs ownership
  • Shopify caught between marketing/IT/membership
  • BI team has its own infrastructure, needs incentivizing
  • Sales team “not gonna disturb them” until post-CES

Goals & Measurement:

  • Goals currently “2-page laminated to-do list”
  • Katherine wants to shift to OKR framework
  • 2026 goals just received - will drive metrics dictionary
  • Need visible public metrics for team

Decisions Made

  1. Approach: Focus on landing data and building marts before BI tool selection
  2. Starting Point: Remembers (Impexium) staging models
  3. ETL Strategy: Evaluate Polytomic vs Fivetran vs build internally
  4. Quick Win: Katherine’s SFTP workflow migration to dbt

Action Items

For Brainforge:

  • Map all 54 systems to data sources and owners (Uttam)
  • Evaluate ETL tools against P0 sources (team)
  • Begin dbt staging models for Remembers (Ashwini)
  • Set up project tracking in Asana or Linear (Uttam)
  • Create metrics glossary starting with 2026 goals (team)

For Katherine:

  • Share 2026 organizational goals document
  • Provide access to Remembers UI for QA
  • Share historical S3 archive schema/documentation
  • Add Brainforge team to relevant Slack channels
  • Get AWS and GitHub access provisioned (via Jay)

For Both:

  • Schedule regular weekly syncs
  • Plan dbt and Cursor training sessions
  • Identify first stakeholder meetings (membership team)
  • Clarify Q1 scope and deliverables

Next Steps

  1. Immediate (Week 1):

    • Complete access provisioning
    • Start Remembers staging models
    • Document systems inventory
    • Begin ETL tool evaluation

  2. Short-term (Weeks 2-4):

    • Complete Remembers staging layer
    • Ingest historical S3 data
    • Build first mart (unified member view)
    • Migrate Katherine’s SFTP workflow

  3. Medium-term (Q1 2026):

    • Entity resolution implementation
    • P0 data sources fully ingested
    • Multiple marts built
    • Citizen data engineers enabled

Meeting Notes

Systems Walkthrough Highlights

Remembers Deep Dive:

  • 9 modules: accounting, app, award, crm, crm_v2, exhibit, purchase, shopping, speaker
  • Bi-directional relationships (member/benefits) - need both entries
  • Record number is user-facing ID (not GUID)
  • Primary representative flag, start/end dates on relationships
  • Dashboards show incorrect counts (1,658 vs actual 1,100 members)
  • Okta logout bug removes admin role every time
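
The bi-directional relationship rule noted above (both entries must exist, each with start/end dates) could be checked with a small script along these lines. This is a minimal sketch, not the Remembers schema: the class and every field name (from_record, to_record, start_date, end_date, is_primary_rep) are hypothetical stand-ins, and the sample rows are invented for illustration.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class Relationship:
    # All field names are hypothetical; the real Remembers columns may differ.
    from_record: str          # user-facing record number, not the GUID
    to_record: str
    start_date: date
    end_date: Optional[date]  # None means the relationship is still open
    is_primary_rep: bool = False


def is_active(rel: Relationship, as_of: date) -> bool:
    """True if as_of falls inside the relationship's start/end window."""
    if rel.start_date > as_of:
        return False
    return rel.end_date is None or rel.end_date >= as_of


def missing_reverse_entries(rels: list[Relationship], as_of: date) -> list[Relationship]:
    """Return active relationships with no active entry in the opposite direction."""
    active = [r for r in rels if is_active(r, as_of)]
    pairs = {(r.from_record, r.to_record) for r in active}
    return [r for r in active if (r.to_record, r.from_record) not in pairs]


# Invented sample data: the third row has no matching reverse entry.
sample = [
    Relationship("1001", "2001", date(2024, 1, 1), None, True),
    Relationship("2001", "1001", date(2024, 1, 1), None),
    Relationship("1002", "2001", date(2024, 1, 1), None),
]
orphans = missing_reverse_entries(sample, date(2025, 11, 17))
```

A check like this would probably live as a dbt test once the staging models exist; the Python form is only meant to make the rule concrete.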

CES Tech Stack Complexity:

  • EventPoint: Speakers register here
  • Map Your Show: Exhibitors register here
  • Merits: Everyone else registers here, all eventually end up here
  • Registration takes 29 min 36 sec on average to complete
  • Email address is only consistent identifier (not immutable!)
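
Because email is the only shared identifier and it can change, one possible approach is to link records that share either a normalized email or a system-specific ID, so a changed email does not split one person into two. The sketch below illustrates that idea with a union-find structure. It is an assumption about how entity resolution might start, not an agreed design: the record keys ('system', 'id', 'email') and all sample values are hypothetical, and matching on shared emails alone can over-merge (shared inboxes, for example).

```python
from collections import defaultdict


def normalize_email(email: str) -> str:
    """Trim whitespace and lowercase; deliberately conservative."""
    return email.strip().lower()


class UnionFind:
    def __init__(self) -> None:
        self.parent: dict[str, str] = {}

    def find(self, x: str) -> str:
        self.parent.setdefault(x, x)
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]  # path halving
            x = self.parent[x]
        return x

    def union(self, a: str, b: str) -> None:
        ra, rb = self.find(a), self.find(b)
        if ra != rb:
            self.parent[rb] = ra


def resolve_entities(records: list[dict]) -> list[set[str]]:
    """Group 'system:id' keys that are connected through a shared email.

    Each record is a dict with hypothetical keys: 'system', 'id', 'email'.
    """
    uf = UnionFind()
    for rec in records:
        key = f"{rec['system']}:{rec['id']}"
        uf.union(key, "email:" + normalize_email(rec["email"]))
    groups: dict[str, set[str]] = defaultdict(set)
    for rec in records:
        key = f"{rec['system']}:{rec['id']}"
        groups[uf.find(key)].add(key)
    return list(groups.values())


# Invented sample: the same Merits ID appears with an old and a new email.
records = [
    {"system": "merits", "id": "M-1", "email": "Pat@Example.com "},
    {"system": "eventpoint", "id": "E-9", "email": "pat@example.com"},
    {"system": "merits", "id": "M-1", "email": "pat@newco.example"},
    {"system": "mapyourshow", "id": "X-4", "email": "pat@newco.example"},
    {"system": "merits", "id": "M-2", "email": "lee@example.com"},
]
entities = resolve_entities(records)
```

In the sample, the Merits record bridges the old and new email, so the EventPoint and Map Your Show records land in the same group even though their emails differ.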

Shopify Mystery:

  • No one knows why it was bought
  • Previous VP of marketing made the decision
  • Jay built it, marketing doesn’t maintain it
  • Ruby codebase built by an NZ vendor that outsources the work to an individual
  • Recently discovered it’s also used for CES sponsorships

Katherine’s Existing Work:

  • Daily CES invite process automation (Python/Postgres)
  • Webhook from Formstack working
  • S3 integration with Snowflake set up
  • CloudFormation templates for some infra
  • GitHub Actions for repo → S3 deployment

Memorable Quotes

Katherine on Remembers:

“If they just picked a better new name, right? It’s just hard to say ‘Remembers’ and make it sound like it makes any sense.”

Katherine on data priorities:

“P Negative-1” for Remembers - even more urgent than P0

Katherine on Shopify:

“We have a Slack channel called support-download-issues… at a certain point, the data suggests something different.”

Katherine on entity resolution:

“We have created 7 fields to track the potential exhibitor ID… someday this will just be one ID floating around between all the systems.”

Katherine on opportunities:

“We could predict mergers and acquisitions if we looked at the floor traffic at CES.”

Technical Details

AWS Setup:

  • Region: us-east-2 (Ohio)
  • S3 buckets: Data lake, ad-hoc, CTA DataOps Archive
  • Snowflake external stage configured
  • IAM roles for S3 access
  • Lambda functions for webhooks

Current Data Flow (Katherine’s SFTP Process):

Multiple sources → Download flat files → Import to Postgres → 
Run Python script → Export views → Upload to FTP → 
Marketing Cloud ingests → Populate data extensions
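
The middle of this flow (import flat files, build views, export them) can be sketched as below. This is not Katherine's script, which had not yet been shared: SQLite from the Python standard library stands in for Postgres so the example is self-contained, and the table, view, and column names are invented. The upload step is left out because the host and credentials are not in these notes.

```python
import csv
import io
import sqlite3


def import_flat_file(conn: sqlite3.Connection, table: str, csv_text: str) -> None:
    """Load CSV text with a header row into a new table (untyped columns)."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    header, data = rows[0], rows[1:]
    cols = ", ".join(f'"{c}"' for c in header)
    marks = ", ".join("?" for _ in header)
    conn.execute(f'CREATE TABLE "{table}" ({cols})')
    conn.executemany(f'INSERT INTO "{table}" VALUES ({marks})', data)


def export_view(conn: sqlite3.Connection, view: str) -> str:
    """Return the contents of a view as CSV text, header row first."""
    cur = conn.execute(f'SELECT * FROM "{view}"')
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow([d[0] for d in cur.description])
    writer.writerows(cur)
    return out.getvalue()


# Invented sample data standing in for a downloaded flat file.
conn = sqlite3.connect(":memory:")
import_flat_file(
    conn,
    "registrants",
    "email,first_name,status\n"
    "pat@example.com,Pat,invited\n"
    "lee@example.com,Lee,registered\n",
)
conn.execute(
    "CREATE VIEW invite_list AS "
    "SELECT email, first_name FROM registrants WHERE status = 'invited'"
)
payload = export_view(conn, "invite_list")  # CSV text ready for the upload step
```

Under the planned migration, the view definition is the part that would most naturally become a dbt model, with the import and upload steps handled by the chosen ETL tooling.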

Historical Archive:

  • Old SQL Server dumped to S3 (both CSV and Parquet)
  • ~300-400 tables
  • 6 databases: archive, archive2, 2018, 2019, 2020, marketing
  • Referential integrity checks completed
  • Multiple backups in place (Azure, RDS, S3) before Katherine will close Azure

Resources Shared

In Meeting:

  • CTA Systems Inventory spreadsheet (54 systems)
  • Active member report example (6-column spreadsheet)
  • Remembers UI walkthrough

To Be Shared:

  • 2026 organizational goals document
  • S3 archive schema documentation
  • Historical SQL code from old system
  • Katherine’s Python SFTP workflow code

Follow-up Meeting Schedule

  • Weekly syncs: Fridays 10:30 AM (agreed)
  • Next meeting: Nov 24, 2025 - P0 sources and KPI discussion
  • dbt Training: TBD after initial models built
  • Cursor Demo: Katherine and Jay interested

Meeting notes compiled from transcript: brainforge_cta_meeting_11_17_2025.md