CTA DataOps Project Kickoff Summary

Date: November 17, 2025
Meeting Type: Project Kickoff
Attendees: Uttam Kumaran, Samuel Roberts, Katherine Bayless

Meeting Purpose

Initial kickoff meeting to understand CTA’s systems landscape, data sources, team structure, and immediate priorities for the DataOps platform build.

Key Takeaways

1. Systems Landscape - 54 Systems Identified

Katherine provided a comprehensive walkthrough of all CTA systems. Key categories:

Core Business Systems (P0):

Remembers (Impexium): AMS with Snowflake data share - “P Negative-1” priority
Salesforce Marketing Cloud: Email marketing, deliverability issues
Salesforce CRM: Sales tracking, in “do not disturb” mode until post-CES
Merits: CES registration (150K users), no APIs, FTP only

CES Technology Stack (12 systems):

Complex daisy-chain of vendors with no clear ownership
EventPoint, Map Your Show, ExpoCAD, Event Base, Pointer (lead retrieval)
Major pain point: entity resolution across systems using email addresses
Security concern: Pointer uses email as query parameter (anyone can intercept)

E-commerce & Forms:

Shopify: Digital assets + sponsorships, authentication loop issues, no clear owner
Formstack: 600-700 forms, Katherine has webhook → S3 working

Analytics:

Google Analytics/BigQuery: set up by Orange Spark (good but underutilized)
Power BI scattered across teams, no governance
Glean “actively harming reputation of our data”

2. Team Structure

CTA Data Team:

Katherine Bayless: Senior Director, Data Engineering (primary POC)
Kyle: Moving from Market Research to Data team, R/Python skills
Kai: New BI Analyst starting soon

CTA IT:

Jay Heavner: VP IT, 20+ years tenure, owns Okta/systems

“Citizen Data Engineers” Identified:

Anna P, Anna K, Anna R (Membership)
Chris Deathloff (Market Research)
Quinn, JC (Business Intelligence)
Tom Moschello (ExpoCAD)

External Partners:

SDG Consulting: Parallel BI tool evaluation workstream

3. Pain Points & Opportunities

Immediate Pain:

Membership team spending full days creating Excel reports manually
Marketing team wants real-time email metrics (currently monthly batch)
Entity resolution critical - same customer across 7+ different IDs
CES tech stack ownership vacuum

Strategic Opportunities:

Predict M&A from CES floor traffic (Katherine’s idea)
Seniority scoring by market segment (C-suite vs IC attendance)
Product-level profitability analysis (avoid $60K programs for $37K revenue)
Lead retrieval + foot traffic = exhibitor ROI story
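
The product-level profitability point can be made concrete with the figures from the notes. This is a minimal sketch that reads the $60K as program cost and the $37K as program revenue, which is an interpretation of the notes; the function name `program_margin` is hypothetical.

```python
def program_margin(revenue: float, cost: float) -> dict:
    """Return net result and margin (as a percent of revenue) for one program."""
    net = revenue - cost
    # Margin is undefined when there is no revenue at all.
    margin_pct = round(net / revenue * 100, 1) if revenue else None
    return {"net": net, "margin_pct": margin_pct}


# Figures from the notes, assuming $60K is cost and $37K is revenue.
result = program_margin(revenue=37_000, cost=60_000)
print(result)
```

Under that reading the program loses $23K, which is the kind of result a product-level mart would need to surface per program.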

4. Technical Landscape

Current State:

Legacy Toad Data Point workflows (.tim templates)
SQL Server → migrated data to S3 (CSV and Parquet)
Katherine built webhook infrastructure (Formstack → S3 → Lambda)
Some AWS Glue jobs exist
Snowflake external stage created for S3
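
The Formstack → S3 → Lambda webhook path above could look roughly like the following. This is a sketch under stated assumptions, not Katherine's implementation: the `FormID` payload field, the key layout, and the injected `put_object` callable are all assumptions made for illustration.

```python
import hashlib
import json
from datetime import datetime, timezone


def build_object_key(form_id: str, received_at: datetime, body: bytes) -> str:
    """Build a date-partitioned S3 key; the layout is a hypothetical convention."""
    # A content hash keeps retried deliveries of the same payload idempotent.
    digest = hashlib.sha256(body).hexdigest()[:16]
    return f"formstack/form_id={form_id}/dt={received_at:%Y-%m-%d}/{digest}.json"


def handle_webhook(event: dict, put_object, now=None) -> dict:
    """Lambda-style handler. put_object(key, body) is injected so it can be tested."""
    raw = event.get("body") or ""
    try:
        payload = json.loads(raw)
    except json.JSONDecodeError:
        return {"statusCode": 400, "body": "invalid JSON"}
    if not isinstance(payload, dict):
        return {"statusCode": 400, "body": "expected a JSON object"}
    received_at = now or datetime.now(timezone.utc)
    body = raw.encode("utf-8")
    # "FormID" is an assumed field name; adjust to the actual payload.
    key = build_object_key(str(payload.get("FormID", "unknown")), received_at, body)
    put_object(key, body)
    return {"statusCode": 200, "body": key}


# Local usage with a dict standing in for the bucket.
store = {}
response = handle_webhook(
    {"body": json.dumps({"FormID": 123, "email": "a@example.com"})},
    put_object=store.__setitem__,
    now=datetime(2025, 11, 17, tzinfo=timezone.utc),
)
```

In a deployed function, `put_object` would wrap a real S3 client call; storing the raw body unchanged keeps the landing zone faithful to the source so that staging models can do the parsing.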

Target State:

Modern data stack: dbt + Snowflake + AWS
Automated pipelines replacing manual processes
Self-service analytics for “citizen data engineers”
Entity resolution enabling cross-system analysis

5. Priority Data Sources (Katherine’s Ranking)

P0 - Start Immediately:

Remembers (Impexium) - “P Negative-1”
SFMC Marketing Cloud
Merits (CES registration)
Formstack (already working!)
Historical S3 archive (for entity resolution)

P1 - Next Phase:

EventPoint (good APIs)
Salesforce CRM (post-CES)
Lead Retrieval data
Session scanner data (too big to import manually)
ExpoCAD (before potential replacement)

P2 - Nice to Have:

Event Base mobile app
Event Co-Pilot chatbot
ZoomInfo, Pitchbook, Quorum
Concur/Ironclad integration

6. Organizational Context

Culture:

Nonprofit association, very cost-conscious
“Working out loud” well-received here
Data literacy varies widely
Procurement process lengthy but MSA approved
Finance team resistant to technology changes

Political Landscape:

CES tech stack needs ownership
Shopify caught between marketing/IT/membership
BI team has its own infrastructure, needs incentivizing
Sales team “not gonna disturb them” until post-CES

Goals & Measurement:

Goals currently “2-page laminated to-do list”
Katherine wants to shift to OKR framework
2026 goals just received - will drive metrics dictionary
Need visible public metrics for team

Decisions Made

Approach: Focus on landing data and building marts before BI tool selection
Starting Point: Remembers (Impexium) staging models
ETL Strategy: Evaluate Polytomic vs Fivetran vs build internally
Quick Win: Katherine’s SFTP workflow migration to dbt

Action Items

For Brainforge:

Map all 54 systems to data sources and owners (Uttam)
Evaluate ETL tools against P0 sources (team)
Begin dbt staging models for Remembers (Ashwini)
Set up project tracking in Asana or Linear (Uttam)
Create metrics glossary starting with 2026 goals (team)

For Katherine:

Share 2026 organizational goals document
Provide access to Remembers UI for QA
Share historical S3 archive schema/documentation
Add Brainforge team to relevant Slack channels
Get AWS and GitHub access provisioned (via Jay)

For Both:

Schedule regular weekly syncs
Plan dbt and Cursor training sessions
Identify first stakeholder meetings (membership team)
Clarify Q1 scope and deliverables

Next Steps

Immediate (Week 1):

Complete access provisioning
Start Remembers staging models
Document systems inventory
Begin ETL tool evaluation

Short-term (Weeks 2-4):

Complete Remembers staging layer
Ingest historical S3 data
Build first mart (unified member view)
Migrate Katherine’s SFTP workflow

Medium-term (Q1 2026):

Entity resolution implementation
P0 data sources fully ingested
Multiple marts built
Citizen data engineers enabled

Meeting Notes

Systems Walkthrough Highlights

Remembers Deep Dive:

9 modules: accounting, app, award, crm, crm_v2, exhibit, purchase, shopping, speaker
Bi-directional relationships (member/benefits) - need both entries
Record number is user-facing ID (not GUID)
Primary representative flag, start/end dates on relationships
Dashboards show incorrect counts (1,658 vs actual 1,100 members)
Okta logout bug removes admin role every time
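
Because bi-directional relationships need both entries, a staging-layer QA check could flag rows whose reciprocal entry is missing. The sketch below assumes a simplified relationship shape; the field names and the `member_of` / `has_member` labels are hypothetical, not the actual Remembers schema.

```python
from typing import NamedTuple


class Relationship(NamedTuple):
    from_record: str  # user-facing record number, not the GUID
    to_record: str
    kind: str


# Hypothetical mapping from each relationship kind to its inverse.
INVERSE = {"member_of": "has_member", "has_member": "member_of"}


def missing_reciprocals(rows: list) -> list:
    """Return the reciprocal entries that should exist but do not."""
    seen = set(rows)
    missing = []
    for row in rows:
        inverse_kind = INVERSE.get(row.kind)
        if inverse_kind is None:
            continue  # one-directional kinds need no reciprocal entry
        expected = Relationship(row.to_record, row.from_record, inverse_kind)
        if expected not in seen:
            missing.append(expected)
    return missing


rows = [
    Relationship("1001", "2001", "member_of"),
    Relationship("2001", "1001", "has_member"),
    Relationship("1002", "2001", "member_of"),  # reciprocal entry is absent
]
gaps = missing_reciprocals(rows)
```

Here `gaps` holds the single absent entry, from record 2001 to record 1002. The same rule could later be expressed as a dbt test once the real column names are known.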

CES Tech Stack Complexity:

EventPoint: Speakers register here
Map Your Show: Exhibitors register here
Merits: Everyone else registers here, all eventually end up here
Registration takes 29 min 36 sec on average to complete
Email address is only consistent identifier (not immutable!)
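
Since email is the only identifier shared across these systems, a first-pass entity resolution could group records by a normalized email. This is a sketch only: the source IDs are made up, and lowercasing the whole address is an assumption that suits matching in practice. Because email is not immutable, the result is a candidate match key, not a durable ID.

```python
from collections import defaultdict


def normalize_email(email: str) -> str:
    """Trim whitespace and lowercase so trivially different spellings match."""
    return email.strip().lower()


def group_by_email(records: list) -> dict:
    """Group (system, source_id, email) tuples by normalized email."""
    groups = defaultdict(list)
    for system, source_id, email in records:
        if not email or "@" not in email:
            continue  # unusable emails cannot serve as a match key
        groups[normalize_email(email)].append((system, source_id))
    return dict(groups)


# Hypothetical records from the three registration systems.
records = [
    ("EventPoint", "sp-1", " Pat@Example.com "),
    ("Map Your Show", "ex-9", "pat@example.com"),
    ("Merits", "reg-42", "PAT@EXAMPLE.COM"),
    ("Merits", "reg-43", "lee@example.com"),
    ("Merits", "reg-44", ""),
]
groups = group_by_email(records)
# Keep only people who appear in more than one system.
cross_system = {
    email: ids for email, ids in groups.items()
    if len({system for system, _ in ids}) > 1
}
```

A person who changes employer or address would split into two groups under this approach, which is why the historical S3 archive matters as a source of older email-to-ID pairings.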

Shopify Mystery:

No one knows why it was bought
Previous VP of marketing made the decision
Jay built it, marketing doesn’t maintain it
Ruby codebase built by an NZ vendor that outsources the work to an individual
Recently discovered it’s also used for CES sponsorships

Katherine’s Existing Work:

Daily CES invite process automation (Python/Postgres)
Webhook from Formstack working
S3 integration with Snowflake set up
CloudFormation templates for some infra
GitHub Actions for repo → S3 deployment

Memorable Quotes

Katherine on Remembers:

“If they just picked a better new name, right? It’s just hard to say ‘Remembers’ and make it sound like it makes any sense.”

Katherine on data priorities:

“P Negative-1” for Remembers - even more urgent than P0

Katherine on Shopify:

“We have a Slack channel called support-download-issues… at a certain point, the data suggests something different.”

Katherine on entity resolution:

“We have created 7 fields to track the potential exhibitor ID… someday this will just be one ID floating around between all the systems.”

Katherine on opportunities:

“We could predict mergers and acquisitions if we looked at the floor traffic at CES.”

Technical Details

AWS Setup:

Region: us-east-2 (Ohio)
S3 buckets: Data lake, ad-hoc, CTA DataOps Archive
Snowflake external stage configured
IAM roles for S3 access
Lambda functions for webhooks

Current Data Flow (Katherine’s SFTP Process):

Multiple sources → Download flat files → Import to Postgres → 
Run Python script → Export views → Upload to FTP → 
Marketing Cloud ingests → Populate data extensions

Historical Archive:

Old SQL Server dumped to S3 (both CSV and Parquet)
~300-400 tables
6 databases: archive, archive2, 2018, 2019, 2020, marketing
Referential integrity checks completed
Multiple backups (Azure, RDS, S3) before Katherine will close Azure
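
For reference, a referential integrity check of the kind mentioned above amounts to looking for child rows whose foreign key has no matching parent row. The sketch below works on plain dictionaries; the table and column names are hypothetical and do not describe the actual archive schema or the checks Katherine ran.

```python
def orphaned_keys(child_rows: list, parent_rows: list, fk: str, pk: str) -> list:
    """Return sorted foreign-key values in child_rows with no parent row."""
    parent_keys = {row[pk] for row in parent_rows}
    return sorted(
        {
            row[fk]
            for row in child_rows
            # NULL foreign keys are treated as "no reference", not as orphans.
            if row[fk] is not None and row[fk] not in parent_keys
        }
    )


# Hypothetical tables: companies (parent) and contacts (child).
companies = [{"company_id": 1}, {"company_id": 2}]
contacts = [
    {"contact_id": 10, "company_id": 1},
    {"contact_id": 11, "company_id": 3},  # no such company
    {"contact_id": 12, "company_id": None},
]
orphans = orphaned_keys(contacts, companies, fk="company_id", pk="company_id")
```

Rerunning a check like this after the data lands in Snowflake would confirm that the CSV and Parquet copies stayed consistent with each other.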

Resources Shared

In Meeting:

CTA Systems Inventory spreadsheet (54 systems)
Active member report example (6-column spreadsheet)
Remembers UI walkthrough

To Be Shared:

2026 organizational goals document
S3 archive schema documentation
Historical SQL code from old system
Katherine’s Python SFTP workflow code

Follow-up Meeting Schedule

Weekly syncs: Fridays 10:30 AM (agreed)
Next meeting: Nov 24, 2025 - P0 sources and KPI discussion
dbt Training: TBD after initial models built
Cursor Demo: Katherine and Jay interested

Meeting notes compiled from transcript: brainforge_cta_meeting_11_17_2025.md
