CTA DataOps Weekly Sync Notes (Nov-Dec 2025)
Period Covered: November 24 - December 12, 2025
Meeting Series: Weekly syncs
Participants: Katherine Bayless, Uttam Kumaran, Ashwini Sharma
Week of November 24, 2025
Progress Updates
dbt & Snowflake Setup:
- ✅ Snowflake access established
- ✅ dbt Core project initialized
- ✅ GitHub repository and version control set up
- 🔄 Remembers staging models in progress (Ashwini)
- ✅ Identified first test case: Active member report (6-column spreadsheet)
Remembers Data Modeling:
- Ashwini working on standardizing raw data into staging
- Exploring bi-directional relationships (members/benefits)
- Katherine to provide Remembers UI access for QA
Key Discussions
Priority Data Sources Refined:
Katherine’s definitive P0 list:
- Impexium (Remembers) - “P Negative-1” - Membership team desperate for help
- Salesforce Marketing Cloud (SFMC) - High impact, APIs available
- Merits - CES registration data (FTP only, no APIs)
- Formstack - Already working (webhook → S3 → Snowflake)
- Historical S3 Archive - For entity resolution
Rationale for prioritization:
- Membership team “incredibly industrious” but stuck in Excel all day
- Quick wins build political capital
- Entity resolution requires historical data
- Focus on landing data before BI tool selection
Team Rituals & Processes:
Discussion on how to work with “citizen data engineers”:
- Kyle (moving from Market Research) - has R/Python skills
- Kai (new BI analyst) - starting soon, from buttoned-up governance background
- Anna P/K/R, Chris Deathloff, Quinn, JC - all have data literacy
Katherine’s philosophy:
“The more I show up, the less people will walk on their own… neglect is the most effective technique”
Approach:
- Brainforge to run working sessions with Kyle and others
- Enable them to build models and present findings
- Start with safe environment to fail
- Eventually they present directly to org
OKRs & Metrics:
Katherine wants to establish visible public metrics:
- Track # of manual processes eliminated (starting from 300+)
- Move org from “laminated to-do list” to outcomes-based goals
- Use as template for org-wide goal-setting change
Entity Resolution Scope:
Katherine’s approach:
- First pass: Go through all data and identify different identifiers
- Figure out how many people we can already flatten
- Then tackle ongoing ingestion basis
- Use historical S3 data (300-400 tables) to build context
S3 historical data:
- Already dumped from old SQL Server
- CSV and Parquet formats available
- Need to land in Snowflake for analysis
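A minimal sketch of the "first pass" flattening idea, assuming each source row has already been reduced to a record id plus whatever identifiers it carries. The record ids, identifier types, and the rule "any shared identifier value means same person" are illustrative assumptions, not the matching logic agreed in the sync:

```python
from collections import defaultdict

def flatten_people(records):
    """Group record ids that share any (identifier_type, value) pair.

    `records` maps a record id to a dict of identifier type -> value.
    Returns a list of sets of record ids, one set per resolved person.
    """
    parent = {rid: rid for rid in records}

    def find(x):
        # Walk up to the root, shortening the path as we go (path halving).
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[rb] = ra

    first_seen = {}  # (identifier_type, normalized value) -> first record id
    for rid, identifiers in records.items():
        for id_type, value in identifiers.items():
            if value in (None, ""):
                continue  # missing identifiers never link records
            key = (id_type, str(value).strip().lower())
            if key in first_seen:
                union(first_seen[key], rid)
            else:
                first_seen[key] = rid

    groups = defaultdict(set)
    for rid in records:
        groups[find(rid)].add(rid)
    return list(groups.values())

# Hypothetical sample: three records chain together via email and badge_id.
sample = {
    "remembers:101": {"email": "Pat@Example.com", "member_id": "M-1"},
    "merits:77": {"email": "pat@example.com", "badge_id": "B-9"},
    "sfmc:5": {"badge_id": "B-9"},
    "formstack:3": {"email": "lee@example.com"},
}
resolved = flatten_people(sample)
```

Counting the resulting groups against the raw record count gives a rough answer to "how many people can we already flatten" before any fuzzy matching is attempted.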
Week of December 5, 2025
Progress Updates
dbt Structure Established:
- Ashwini walked through 3-layer structure: staging → intermediate → marts
- Staging layer broken down by source (Remembers, eventually SFMC)
- Remembers staging organized by schema (accounting, app, award, CRM, etc.)
- Column names standardized to snake_case
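For reference, a small sketch of what the snake_case standardization could look like when generating staging column aliases. The helper name and the sample column names are hypothetical; the actual renaming in the project is done in the dbt staging models:

```python
import re

def to_snake_case(name: str) -> str:
    """Convert a raw column name (CamelCase, spaces, dashes) to snake_case."""
    # Split an acronym from a following capitalized word: "CRMCustomer" -> "CRM_Customer"
    s = re.sub(r"([A-Z]+)([A-Z][a-z])", r"\1_\2", name.strip())
    # Split a lowercase letter or digit from a following capital: "CustomerID" -> "Customer_ID"
    s = re.sub(r"([a-z0-9])([A-Z])", r"\1_\2", s)
    # Collapse any run of non-alphanumeric characters into one underscore
    s = re.sub(r"[^A-Za-z0-9]+", "_", s)
    return s.strip("_").lower()

# Hypothetical raw column names
raw_columns = ["CustomerID", "CRMCustomer", "First Name", "member-type_code"]
renamed = {col: to_snake_case(col) for col in raw_columns}
```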
ETL Tool Evaluation:
Polytomic emerging as preferred choice over Fivetran:
- Willingness to build custom connectors
- CEO (Galib) accessible via Slack
- Powers NFL, Okta, other large brands
- Pricing typically lower than Fivetran
- 4-5 Brainforge clients using successfully
Uttam to arrange intro call between Katherine and Galib.
Coverage assessment needed for:
- Salesforce Marketing Cloud (likely buildable)
- Merits (FTP only, need flat file approach)
- EventPoint (good APIs)
- Cvent (SOAP-only, challenging)
Katherine’s SFTP Workflow Deep Dive:
Current process (runs daily):
- Download flat files from 8-10 sources
- Import to Postgres
- Run Python transformation script
- Export 5 views
- Upload to Marketing Cloud FTP
- Upload to Merits vendor FTP
Goal: Migrate entire workflow to dbt + orchestration
Files currently land in Postgres, can instead go to S3 → Snowflake.
Ashwini reviewing Python script to understand transformations.
Key Decisions
BI Tool Selection: Deferred to Q2
Reasoning:
- Power BI available for interim use (SDG doing inventory)
- Focus Q1 on building marts and landing data
- Better to understand stakeholder needs before tool selection
- Kyle and Kai can put out Power BI reports as needed
- Don’t want to entrench in Power BI but need something now
Katherine’s concern:
“I don’t want people confused and think that’s where we’re going to stay… mostly thinking ahead to whenever I tell finance how much it’ll cost to buy Sigma”
Orchestration Approach: Snowflake Native (Exploring)
Options evaluated:
- Snowflake native task orchestration (preferred)
- GitHub Actions (Katherine has working setup)
- AWS Glue (expensive, poor UX)
- Dagster/Airflow (additional vendor/procurement)
Snowflake native benefits:
- No additional cost
- Python execution support (new in 2025)
- Katherine’s strategy: “Squeeze value out of tools we have”
- Avoid procurement delays
Ashwini to research Snowflake dbt execution capabilities.
Cursor & dbt Training Planned:
Katherine and Jay both want training:
- Cursor demo for productivity
- dbt walkthrough once first marts built
- Show naming conventions, folder structure, tagging
- Help them understand how to find models
Issues & Blockers
Remembers Data Questions:
Ashwini struggling to see bi-directional relationship examples in data. Katherine to:
- Share active member spreadsheet in S3 (not SharePoint!)
- Provide join key guidance
- Grant Remembers UI access for QA
Okta & Shopify Pain Points Surfacing:
Katherine mentions both as ongoing frustrations:
- Okta: “I’ve never had to enter my password as much”
- Shopify: Downloads failing after purchase
- Both need attention but not blocking main data workstream
Jay showing interest in getting help with these issues.
Bonus Topics
Email Deliverability Crisis:
Multiple issues:
- SFMC emails not passing DMARC
- Using dash-tech.org instead of ces.tech
- Panasonic's email system recently blacklisted CTA
- Survey response rate <1% (likely due to deliverability)
- Gmail/Outlook warning users about CTA emails
Katherine planning to tackle in new year as everything flows through SFMC.
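Background sketch for the DMARC item: a message passes DMARC only when SPF or DKIM passes and the authenticated domain aligns with the visible From domain. The check below illustrates relaxed versus strict alignment. Which role dash-tech.org plays in the SFMC setup was not stated in the sync, so the example pairing is an assumption, and the two-label organizational-domain rule is a simplification (real evaluators use the Public Suffix List):

```python
def _normalize(domain: str) -> str:
    return domain.strip().rstrip(".").lower()

def org_domain(domain: str) -> str:
    """Naive organizational domain: the last two DNS labels.

    Assumption for illustration only; this is wrong for suffixes
    such as co.uk, where the Public Suffix List is needed.
    """
    labels = _normalize(domain).split(".")
    return ".".join(labels[-2:])

def is_aligned(from_domain: str, auth_domain: str, strict: bool = False) -> bool:
    """True if the authenticated domain aligns with the From header domain."""
    if strict:
        # Strict alignment: the domains must match exactly.
        return _normalize(from_domain) == _normalize(auth_domain)
    # Relaxed alignment: only the organizational domains must match.
    return org_domain(from_domain) == org_domain(auth_domain)

# Hypothetical pairing: mail shown as ces.tech but authenticated as dash-tech.org
mismatch_ok = is_aligned("ces.tech", "dash-tech.org")
subdomain_ok = is_aligned("ces.tech", "mail.ces.tech")
```

If the pairing above reflects the real configuration, no amount of valid SPF or DKIM on dash-tech.org would produce a DMARC pass for ces.tech mail.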
Access Provisioning:
Katherine to arrange with Jay:
- AWS access (via Okta)
- Asana access for project tracking
- Power BI access
- Remembers UI access
Week of December 12, 2025
Progress Updates
Gantt Chart Review:
Uttam shared project timeline:
- Kickoff ✅ Complete
- Snowflake setup ✅ Complete
- dbt initialization ✅ Complete
- 🔄 Tool evaluation (extended to January)
- ⏳ Data ingestion (pending Polytomic decision)
- ⏳ Entity resolution (pending data landing)
- ⏳ Marts development (after staging complete)
Parallel pathing better than expected - multiple workstreams active.
SFTP Workflow Migration Details:
Katherine explained current automation:
- Webhook from Formstack → S3 (Lambda)
- Some processing in SQL (Glue calls)
- Final mile not solved: FTP upload to Marketing Cloud
- Hoping to use APIs instead of FTP going forward
Technical requirements:
- Need to assign DataOps IDs (Katherine’s canonical identifier)
- Bulk upload existing attendees
- Real-time sync for new registrations
- Maintain through CES (January event)
S3 Historical Data Ready:
Katherine confirmed S3 archive details:
- Old marketing SQL Server dumped to both CSV and Parquet
- Available in buckets: CTA-DataOps-Archive, Data-Lake
- 6 databases: archive, archive2, 2018, 2019, 2020, marketing
- Majority of data in DBO schema
- ~50-60 historical data sources represented
Can use for entity resolution immediately.
Key Decisions
Q1 Contract Scope:
Approach:
- Draft Q1 scope focusing on modeling and data landing
- Add Okta/Shopify discovery if ready
- Full year plan after returning from holidays
Katherine’s timeline:
“I don’t want to make more work… but knowing how long it takes to get contracts through… Q1 scope now, plan full year in January”
Focus Shift: Modeling Over BI:
Agreement to prioritize:
- Complete staging layer (all Remembers modules)
- Ingest historical S3 data
- Build marts (starting with unified member view)
- Enable “citizen data engineers”
- Defer BI tool decision until Q2
Katherine’s vision:
“My hope is just like probably most folks are going to go into like either dark or panic mode until mid January. But when they come back, I want there to be like, all sorts of beautiful stuff for them to play with in Snowflake.”
Entity Resolution Path Forward:
Two-phase approach:
- Immediate: Ingest S3 archive, profile existing identifiers
- Ongoing: Build DataOps ID as canonical identifier
Katherine already:
- Created DataOps ID concept in her entity resolution work
- Has it stored in Merits for her data stream
- Proposing it to replace email in vendor query parameters
Opportunity: If successful with one vendor, could become canonical across all CES systems.
Issues & Blockers
Okta Login Problems:
Uttam’s experience:
- 1Password autofilled but didn’t save
- AWS app won’t show up in Okta dashboard
- “User is not assigned to this application” error
Jay to provision access properly.
CES Registration Security Flaw:
Katherine discovered issue:
- Attendee match vendor uses email as query parameter
- Anyone can change the email in the query string to see another attendee’s data
- Working on hotfix: Switch to DataOps ID before CES
- Need to bulk upload IDs and sync real-time
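A hedged sketch of the hotfix shape: assign each attendee an opaque ID once, bulk upload the mapping, and put the ID rather than the email in the vendor URL. The DataOps ID format was not described in the sync, so the random token, the `dataops_id` parameter name, and the vendor URL are all hypothetical:

```python
import secrets
from urllib.parse import urlencode

def assign_dataops_ids(emails, existing=None):
    """Return a mapping of normalized email -> opaque ID.

    Existing assignments are reused so IDs stay stable across runs;
    only unseen emails get a new random token.
    """
    ids = dict(existing or {})
    for email in emails:
        key = email.strip().lower()
        if key not in ids:
            # 16 random bytes, URL-safe encoded: not derivable from the email
            ids[key] = secrets.token_urlsafe(16)
    return ids

def build_lookup_url(base_url, dataops_id):
    """Build the vendor lookup URL using the opaque ID instead of the email."""
    return f"{base_url}?{urlencode({'dataops_id': dataops_id})}"

# Bulk pass for existing attendees, then an incremental pass for a new registration
ids = assign_dataops_ids(["Pat@Example.com", "lee@example.com"])
ids = assign_dataops_ids(["new.person@example.com"], existing=ids)
url = build_lookup_url("https://vendor.example/attendee", ids["pat@example.com"])
```

An unguessable ID removes the "edit the email to see someone else" path, but it is still a bearer value in a URL; whether the vendor also checks authorization server-side is a separate question to raise with them.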
Action Items Captured
For Brainforge:
- Draft Q1 SOW
- Continue Remembers staging models
- Begin S3 historical data ingestion
- Research Snowflake orchestration options
- Prepare Okta/Shopify discovery plans
For Katherine:
- Share 2026 goals document
- Upload active member spreadsheet to S3
- Provide DDL for SFTP workflow views
- Get team access to AWS, Asana, Power BI
- Drop files in S3 for SFTP workflow testing
For Both:
- Polytomic intro call with Galib
- dbt training session (once marts built)
- Cursor demo for Katherine and Jay
- Meet citizen data engineers (Kyle, Kai, the Annas, Chris, Quinn, JC)
Emerging Themes
1. “Citizen Data Engineers” Strategy
Katherine identified multiple people with data aptitude:
- Kyle: R/Python skills, eager to learn
- Kai: BI background, strong governance experience
- Multiple Annas in membership team
- Chris Deathloff in market research
Approach:
- Brainforge runs working sessions
- Safe environment to learn and fail
- Eventually they present directly to org
- Katherine’s “calculated neglect” to force independence
2. Frugal Tech Stack Philosophy
Katherine consistently pushing for:
- Consolidate into Snowflake where possible
- Avoid new vendors (procurement delays)
- “Squeeze value out of tools we have”
- Leverage AWS Marketplace for consolidated billing
- Use Snowflake partner program (up to 49% can go to partners)
Finance team very cost-conscious - need to prove ROI before asking for budget.
3. Entity Resolution as Foundation
Recurring theme: Same customer has 7+ different IDs across systems.
Katherine’s DataOps ID concept could be breakthrough:
- Created in her entity resolution work
- Stored in the Merits registration system
- Proposing to vendors as canonical identifier
- Could solve entity resolution at the source
If successful, eliminates need for complex post-hoc matching.
4. CES Timeline Constraint
Everything affected by CES in January:
- Can’t make major changes before event
- Sales team in “do not disturb” mode
- Katherine’s SFTP workflow must keep running
- Security hotfix needed before event
- Post-CES: More access, more willingness to change
Strategy: Prepare everything in Q1, deploy post-CES.
Technical Details
dbt Project Structure (Ashwini’s Implementation)
dbt_project/
  models/
    staging/
      remembers/
        _remembers__sources.yml
        accounting/
          stg_remembers__accounting_*.sql
        app/
        award/
        crm/
          stg_remembers__crm_customer.sql
          stg_remembers__crm_individual.sql
        crm_v2/
        exhibit/
        purchase/
        shopping/
        speaker/
      sfmc/ (planned)
    intermediate/ (planned)
    marts/ (planned)
Conventions:
- Snake case for all column names
- Staging models prefixed with stg_
- Source-based organization
- Schema.yml for documentation (planned)
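A small sketch of how these conventions could be checked automatically, for example in CI. The pattern `stg_<source>__<entity>.sql` is inferred from the file names in the tree above; the regex and helper are assumptions, not part of Ashwini's implementation, and the pattern assumes source names contain no underscores:

```python
import re

# stg_<source>__<entity>.sql, all lowercase snake_case
STAGING_MODEL = re.compile(
    r"^stg_(?P<source>[a-z0-9]+)__(?P<entity>[a-z0-9]+(?:_[a-z0-9]+)*)\.sql$"
)

def check_staging_name(filename: str, source: str) -> bool:
    """True if the file follows the staging convention for the given source folder."""
    match = STAGING_MODEL.match(filename)
    return bool(match) and match.group("source") == source

def find_violations(filenames, source):
    """Return the file names under a source folder that break the convention."""
    return [name for name in filenames if not check_staging_name(name, source)]

# Hypothetical folder listing: the last two names break the convention
listing = [
    "stg_remembers__crm_customer.sql",
    "stg_remembers__crm_individual.sql",
    "stg_remembers__CrmAccount.sql",
    "remembers_crm_contact.sql",
]
violations = find_violations(listing, "remembers")
```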
Katherine’s Current AWS Setup
S3 Buckets:
- cta-dataops-archive: Old SQL Server dump
- cta-data-lake: Current data operations
- cta-dataops-ad-hoc: Collaboration files
- webhooks: Formstack → S3 → Snowflake working example
Snowflake:
- REMEMBERS schema: Data share from vendor
- WEBHOOKS database: Formstack integration demo
- External stage created for S3 access
- IAM role configured for bucket permissions
GitHub:
- CTA-dataops repo
- Python scripts for email validation pipeline
- GitHub Actions: repo → S3 on commit
- Lambda function packaging automated
Next Phase Architecture (Planned)
Sources → S3 → Snowflake (via Polytomic or flat files)
↓
dbt Core (staging → intermediate → marts)
↓
Orchestration (Snowflake tasks or GitHub Actions)
↓
Access Layer (Power BI interim, Snowflake direct, future: Sigma)
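A rough sketch of how an orchestrator (a GitHub Actions step or a scheduled Python job) might drive the dbt layers in the order shown above. dbt already resolves model dependencies on its own, so running layer by layer is only a convenience for stopping early; the `models/<layer>` paths assume the folder layout in these notes, and the intermediate and marts folders are still planned:

```python
import subprocess

LAYERS = ["staging", "intermediate", "marts"]

def dbt_commands(layers=LAYERS, project_dir="."):
    """Build one `dbt build` command per layer, selected by folder path."""
    return [
        ["dbt", "build", "--select", f"path:models/{layer}", "--project-dir", project_dir]
        for layer in layers
    ]

def run_layers(commands, runner=subprocess.run):
    """Run the commands in order; check=True stops at the first failing layer."""
    for cmd in commands:
        runner(cmd, check=True)

# Dry run with a stand-in runner that records commands instead of calling dbt
recorded = []
run_layers(dbt_commands(), runner=lambda cmd, check: recorded.append(cmd))
```

Passing `subprocess.run` (the default) executes the commands for real, which requires dbt Core to be installed and a configured profile.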
Metrics to Track
Katherine’s proposed KPIs:
- Manual processes eliminated - Starting from 300+ inherited
- Time saved per week - Quantify automation impact
- Data sources landed - Progress toward full coverage
- Models deployed - dbt staging + marts count
- “Citizens” enabled - People successfully using Snowflake
- Support tickets reduced - Auth issues, download failures
Quotes to Remember
Katherine on priorities:
“P Negative-1” - Even more urgent than P0
Katherine on neglect:
“The more I show up, the less people will walk on their own”
Katherine on Power BI:
“I don’t want people confused and think that’s where we’re going to stay… mostly thinking ahead to whenever I tell finance how much it’ll cost to buy Sigma”
Katherine on Snowflake:
“We could just throw some money at this problem… become the poster child for leveraging all Snowflake features”
Katherine on holiday timing:
“When they come back [from holidays], I want there to be like, all sorts of beautiful stuff for them to play with in Snowflake”
Katherine on mistakes:
“Perhaps I over-committed us to solving a problem that probably would have been better to just let it stay a problem… but I can’t resist helping”
Uttam on approach:
“I don’t mind if it’s… if we have the… if we truly… I don’t… the one thing I don’t want to do is do what every consultancy does, be like, like, we can do it, and we do, like, half-baked”
Next Meeting Schedule
- Weekly syncs continue: Fridays at 10:30 AM
- Next sync: December 19, 2025
- Polytomic intro: Early January (after holidays)
- dbt training: After first marts built
- Team meeting: Post-holidays with Kyle, Kai, and stakeholders
Compiled from transcripts:
- cta_weekly_11_24_2025.md
- brainforge_cta_weekly_12_5_2025.md
- brainforge_cta_weekly_12_12_2025.md