CTA DataOps Weekly Sync Notes (Nov-Dec 2025)

Period Covered: November 24 - December 12, 2025
Meeting Series: Weekly syncs
Participants: Katherine Bayless, Uttam Kumaran, Ashwini Sharma


Week of November 24, 2025

Progress Updates

dbt & Snowflake Setup:

  • ✅ Snowflake access established
  • ✅ dbt Core project initialized
  • ✅ GitHub repository and version control set up
  • 🔄 Remembers staging models in progress (Ashwini)
  • ✅ Identified first test case: Active member report (6-column spreadsheet)

Remembers Data Modeling:

  • Ashwini working on standardizing raw data into staging
  • Exploring bi-directional relationships (members/benefits)
  • Katherine to provide Remembers UI access for QA

Key Discussions

Priority Data Sources Refined:

Katherine’s definitive P0 list:

  1. Impexium (Remembers) - “P Negative-1” - Membership team desperate for help
  2. Salesforce Marketing Cloud (SFMC) - High impact, APIs available
  3. Merits - CES registration data (FTP only, no APIs)
  4. Formstack - Already working (webhook → S3 → Snowflake)
  5. Historical S3 Archive - For entity resolution

Rationale for prioritization:

  • Membership team “incredibly industrious” but stuck in Excel all day
  • Quick wins build political capital
  • Entity resolution requires historical data
  • Focus on landing data before BI tool selection

Team Rituals & Processes:

Discussion on how to work with “citizen data engineers”:

  • Kyle (moving from Market Research) - has R/Python skills
  • Kai (new BI analyst) - starting soon, from buttoned-up governance background
  • Anna P/K/R, Chris Deathloff, Quinn, JC - all have data literacy

Katherine’s philosophy:

“The more I show up, the less people will walk on their own… neglect is the most effective technique”

Approach:

  • Brainforge to run working sessions with Kyle and others
  • Enable them to build models and present findings
  • Start with safe environment to fail
  • Eventually they present directly to org

OKRs & Metrics:

Katherine wants to establish visible public metrics:

  • Track # of manual processes eliminated (starting from 300+)
  • Move org from “laminated to-do list” to outcomes-based goals
  • Use as template for org-wide goal-setting change

Entity Resolution Scope:

Katherine’s approach:

  1. First pass: Go through all data and identify different identifiers
  2. Figure out how many people we can already flatten
  3. Then tackle ongoing ingestion basis
  4. Use historical S3 data (300-400 tables) to build context
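
One way to sketch the first-pass question ("how many people can we already flatten?") is to treat each source row as a bag of identifiers and merge any rows that share one. The example below is a minimal union-find sketch under stated assumptions: the record shape, the identifier types (email, member_id, reg_id) and the sample values are hypothetical and are not CTA data.

```python
from collections import defaultdict


def flatten_records(records):
    """Group record indexes that share at least one (id_type, value) pair.

    `records` is a list of dicts such as {"email": "a@x.org", "member_id": "M1"}.
    Returns a sorted list of index lists, one list per resolved person.
    """
    parent = list(range(len(records)))

    def find(i):
        # Walk up to the root, compressing the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[max(ra, rb)] = min(ra, rb)

    first_seen = {}  # (id_type, normalized value) -> first record index
    for idx, record in enumerate(records):
        for id_type, value in record.items():
            if value is None or str(value).strip() == "":
                continue  # missing identifiers never link records
            key = (id_type, str(value).strip().lower())
            if key in first_seen:
                union(idx, first_seen[key])
            else:
                first_seen[key] = idx

    groups = defaultdict(list)
    for idx in range(len(records)):
        groups[find(idx)].append(idx)
    return sorted(groups.values())


# Hypothetical sample: three source rows, two of which share an email.
sample = [
    {"email": "pat@example.org", "member_id": "M-100"},
    {"email": "PAT@example.org", "reg_id": "R-9"},
    {"email": "lee@example.org", "member_id": "M-200"},
]
people = flatten_records(sample)
```

Exact matching on normalized values is the cheapest possible rule; fuzzy matching on names or addresses would be a later step once the identifier inventory exists.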

S3 historical data:

  • Already dumped from old SQL Server
  • CSV and Parquet formats available
  • Need to land in Snowflake for analysis

Week of December 5, 2025

Progress Updates

dbt Structure Established:

  • Ashwini walked through 3-layer structure: staging → intermediate → marts
  • Staging layer broken down by source (Remembers, eventually SFMC)
  • Remembers staging organized by schema (accounting, app, award, CRM, etc.)
  • Column names standardized to snake_case
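
As an illustration of the snake_case standardization, the sketch below shows how rename rules for raw column names might be generated before being written into staging models. The function name and the sample column names are hypothetical; the notes do not say how Ashwini produced the renames.

```python
import re


def to_snake_case(name):
    """Convert a raw column name (CamelCase, spaces, dashes) to snake_case."""
    # Split an acronym from a following capitalized word: "CRMCustomer" -> "CRM_Customer".
    name = re.sub(r"([A-Z]+)([A-Z][a-z])", r"\1_\2", name)
    # Split a lowercase letter or digit from a following capital: "memberId" -> "member_Id".
    name = re.sub(r"([a-z0-9])([A-Z])", r"\1_\2", name)
    # Collapse any run of non-alphanumeric characters into one underscore.
    name = re.sub(r"[^A-Za-z0-9]+", "_", name)
    return name.strip("_").lower()


# Hypothetical raw column names mapped to their staging aliases.
renames = {raw: to_snake_case(raw) for raw in ["CustomerID", "Member Type", "first-name"]}
```

A mapping like `renames` could then be turned into `raw AS alias` lines in each staging model.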

ETL Tool Evaluation:

Polytomic emerging as preferred choice over Fivetran:

  • Willingness to build custom connectors
  • CEO (Galib) accessible via Slack
  • Powers NFL, Okta, other large brands
  • Pricing typically lower than Fivetran
  • 4-5 Brainforge clients using successfully

Uttam to arrange intro call between Katherine and Galib.

Coverage assessment needed for:

  • Salesforce Marketing Cloud (likely buildable)
  • Merits (FTP only, need flat file approach)
  • EventPoint (good APIs)
  • Cvent (SOAP-only, challenging)

Katherine’s SFTP Workflow Deep Dive:

Current process (runs daily):

  1. Download flat files from 8-10 sources
  2. Import to Postgres
  3. Run Python transformation script
  4. Export 5 views
  5. Upload to Marketing Cloud FTP
  6. Upload to Merits vendor FTP

Goal: Migrate entire workflow to dbt + orchestration

Files currently land in Postgres; they can instead go to S3 → Snowflake.

Ashwini reviewing Python script to understand transformations.

Key Decisions

BI Tool Selection: Deferred to Q2

Reasoning:

  • Power BI available for interim use (SDG doing inventory)
  • Focus Q1 on building marts and landing data
  • Better to understand stakeholder needs before tool selection
  • Kyle and Kai can put out Power BI reports as needed
  • Don’t want to entrench in Power BI but need something now

Katherine’s concern:

“I don’t want people confused and think that’s where we’re going to stay… mostly thinking ahead to whenever I tell finance how much it’ll cost to buy Sigma”

Orchestration Approach: Snowflake Native (Exploring)

Options evaluated:

  • Snowflake native task orchestration (preferred)
  • GitHub Actions (Katherine has working setup)
  • AWS Glue (expensive, poor UX)
  • Dagster/Airflow (additional vendor/procurement)

Snowflake native benefits:

  • No additional cost
  • Python execution support (new in 2025)
  • Katherine’s strategy: “Squeeze value out of tools we have”
  • Avoid procurement delays

Ashwini to research Snowflake dbt execution capabilities.

Cursor & dbt Training Planned:

Katherine and Jay both want training:

  • Cursor demo for productivity
  • dbt walkthrough once first marts built
  • Show naming conventions, folder structure, tagging
  • Help them understand how to find models

Issues & Blockers

Remembers Data Questions:

Ashwini struggling to see bi-directional relationship examples in data. Katherine to:

  • Share active member spreadsheet in S3 (not SharePoint!)
  • Provide join key guidance
  • Grant Remembers UI access for QA

Okta & Shopify Pain Points Surfacing:

Katherine mentions both as ongoing frustrations:

  • Okta: “I’ve never had to enter my password as much”
  • Shopify: Downloads failing after purchase
  • Both need attention but not blocking main data workstream

Jay showing interest in getting help with these issues.

Bonus Topics

Email Deliverability Crisis:

Multiple issues:

  • SFMC emails not passing DMARC
  • Using dash-tech.org instead of ces.tech
  • Panasonic’s email servers recently blacklisted CTA
  • Survey response rate <1% (likely due to deliverability)
  • Gmail/Outlook warning users about CTA emails

Katherine planning to tackle in new year as everything flows through SFMC.
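
For context on the DMARC point: a message passes DMARC only if SPF or DKIM passes and the domain that authenticated is aligned with the domain in the visible From header. The sketch below illustrates that alignment check. It is a simplification: the organizational domain is taken as the last two labels, whereas real evaluators use the Public Suffix List. The notes do not say which of the two domains appears in the From header and which one authenticates, so the usage shown in the test values is an assumption.

```python
def org_domain(domain):
    """Naive organizational domain: the last two labels.

    Simplification: real DMARC evaluators use the Public Suffix List,
    so this is wrong for suffixes such as "co.uk".
    """
    labels = domain.strip().rstrip(".").lower().split(".")
    return ".".join(labels[-2:])


def is_aligned(from_domain, auth_domain, strict=False):
    """Check DMARC identifier alignment between two domains.

    from_domain: domain in the visible From: header.
    auth_domain: domain that passed SPF (envelope sender) or DKIM (d= tag).
    """
    a = from_domain.strip().rstrip(".").lower()
    b = auth_domain.strip().rstrip(".").lower()
    if strict:
        return a == b  # strict mode: exact match only
    return org_domain(a) == org_domain(b)  # relaxed mode: same organizational domain


def dmarc_passes(from_domain, spf_pass, spf_domain, dkim_pass, dkim_domain):
    """DMARC passes if SPF or DKIM both passes and aligns with the From domain."""
    spf_ok = spf_pass and is_aligned(from_domain, spf_domain)
    dkim_ok = dkim_pass and is_aligned(from_domain, dkim_domain)
    return spf_ok or dkim_ok
```

Under these assumptions, mail that shows one of the two domains in From while authenticating as the other fails DMARC even when SPF and DKIM themselves pass.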

Access Provisioning:

Katherine to arrange with Jay:

  • AWS access (via Okta)
  • Asana access for project tracking
  • Power BI access
  • Remembers UI access

Week of December 12, 2025

Progress Updates

Gantt Chart Review:

Uttam shared project timeline:

  • Kickoff ✅ Complete
  • Snowflake setup ✅ Complete
  • dbt initialization ✅ Complete
  • 🔄 Tool evaluation (extended to January)
  • ⏳ Data ingestion (pending Polytomic decision)
  • ⏳ Entity resolution (pending data landing)
  • ⏳ Marts development (after staging complete)

Parallel pathing better than expected - multiple workstreams active.

SFTP Workflow Migration Details:

Katherine explained current automation:

  • Webhook from Formstack → S3 (Lambda)
  • Some processing in SQL (Glue calls)
  • Final mile not solved: FTP upload to Marketing Cloud
  • Hoping to use APIs instead of FTP going forward

Technical requirements:

  • Need to assign DataOps IDs (Katherine’s canonical identifier)
  • Bulk upload existing attendees
  • Real-time sync for new registrations
  • Maintain through CES (January event)

S3 Historical Data Ready:

Katherine confirmed S3 archive details:

  • Old marketing SQL Server dumped to both CSV and Parquet
  • Available in buckets: cta-dataops-archive, cta-data-lake
  • 6 databases: archive, archive2, 2018, 2019, 2020, marketing
  • Majority of data in DBO schema
  • ~50-60 historical data sources represented

Can use for entity resolution immediately.

Key Decisions

Q1 Contract Scope:

Approach:

  • Draft Q1 scope focusing on modeling and data landing
  • Add Okta/Shopify discovery if ready
  • Full year plan after returning from holidays

Katherine’s timeline:

“I don’t want to make more work… but knowing how long it takes to get contracts through… Q1 scope now, plan full year in January”

Focus Shift: Modeling Over BI:

Agreement to prioritize:

  1. Complete staging layer (all Remembers modules)
  2. Ingest historical S3 data
  3. Build marts (starting with unified member view)
  4. Enable “citizen data engineers”
  5. Defer BI tool decision until Q2

Katherine’s vision:

“My hope is just like probably most folks are going to go into like either dark or panic mode until mid January. But when they come back, I want there to be like, all sorts of beautiful stuff for them to play with in Snowflake.”

Entity Resolution Path Forward:

Two-phase approach:

  1. Immediate: Ingest S3 archive, profile existing identifiers
  2. Ongoing: Build DataOps ID as canonical identifier

Katherine has already:

  • Created the DataOps ID concept in her entity resolution work
  • Started storing it in Merits for her data stream
  • Proposed it as a replacement for email in vendor query parameters

Opportunity: If successful with one vendor, could become canonical across all CES systems.

Issues & Blockers

Okta Login Problems:

Uttam’s experience:

  • 1Password autofilled but didn’t save
  • AWS app won’t show up in Okta dashboard
  • “User is not assigned to this application” error

Jay to provision access properly.

CES Registration Security Flaw:

Katherine discovered issue:

  • Attendee match vendor uses email as query parameter
  • Anyone can edit the email in the URL to view another attendee’s data
  • Working on hotfix: Switch to DataOps ID before CES
  • Need to bulk upload IDs and sync real-time
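
The hotfix idea can be sketched as follows. This assumes the DataOps ID is a random, unguessable token; the notes do not say how the IDs are actually generated. The base URL and the `attendee_id` parameter name are hypothetical. An opaque ID only closes the hole if it cannot be enumerated or derived from the email.

```python
import secrets
from urllib.parse import parse_qs, urlencode, urlparse


def assign_dataops_ids(emails, existing=None):
    """Return {normalized_email: id}, minting a random token for new emails."""
    ids = dict(existing or {})
    for email in emails:
        key = email.strip().lower()
        if key not in ids:
            # 16 random bytes as a URL-safe token; not derivable from the email.
            ids[key] = secrets.token_urlsafe(16)
    return ids


def build_vendor_url(base_url, dataops_id, param="attendee_id"):
    """Build a vendor link that carries the opaque ID instead of an email."""
    return f"{base_url}?{urlencode({param: dataops_id})}"


def lookup_attendee(url, ids, param="attendee_id"):
    """Resolve the ID in a URL back to an email; None if the ID is unknown."""
    values = parse_qs(urlparse(url).query).get(param, [])
    if len(values) != 1:
        return None
    by_id = {token: email for email, token in ids.items()}
    return by_id.get(values[0])
```

Passing the current mapping as `existing` covers both requirements above: the bulk upload is one call over all attendees, and each new registration is a call with a single email that leaves earlier IDs unchanged.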

Action Items Captured

For Brainforge:

  • Draft Q1 SOW
  • Continue Remembers staging models
  • Begin S3 historical data ingestion
  • Research Snowflake orchestration options
  • Prepare Okta/Shopify discovery plans

For Katherine:

  • Share 2026 goals document
  • Upload active member spreadsheet to S3
  • Provide DDL for SFTP workflow views
  • Get team access to AWS, Asana, Power BI
  • Drop files in S3 for SFTP workflow testing

For Both:

  • Polytomic intro call with Galib
  • dbt training session (once marts built)
  • Cursor demo for Katherine and Jay
  • Meet citizen data engineers (Kyle, Kai, the Annas, Chris, Quinn, JC)

Emerging Themes

1. “Citizen Data Engineers” Strategy

Katherine identified multiple people with data aptitude:

  • Kyle: R/Python skills, eager to learn
  • Kai: BI background, strong governance experience
  • Multiple Annas in membership team
  • Chris Deathloff in market research

Approach:

  • Brainforge runs working sessions
  • Safe environment to learn and fail
  • Eventually they present directly to org
  • Katherine’s “calculated neglect” to force independence

2. Frugal Tech Stack Philosophy

Katherine consistently pushing for:

  • Consolidate into Snowflake where possible
  • Avoid new vendors (procurement delays)
  • “Squeeze value out of tools we have”
  • Leverage AWS Marketplace for consolidated billing
  • Use Snowflake partner program (up to 49% can go to partners)

Finance team very cost-conscious - need to prove ROI before asking for budget.

3. Entity Resolution as Foundation

Recurring theme: Same customer has 7+ different IDs across systems.

Katherine’s DataOps ID concept could be breakthrough:

  • Created in her entity resolution work
  • Storing in Merits registration system
  • Proposing to vendors as canonical identifier
  • Could solve entity resolution at the source

If successful, eliminates need for complex post-hoc matching.

4. CES Timeline Constraint

Everything affected by CES in January:

  • Can’t make major changes before event
  • Sales team in “do not disturb” mode
  • Katherine’s SFTP workflow must keep running
  • Security hotfix needed before event
  • Post-CES: More access, more willingness to change

Strategy: Prepare everything in Q1, deploy post-CES.


Technical Details

dbt Project Structure (Ashwini’s Implementation)

dbt_project/
  models/
    staging/
      remembers/
        _remembers__sources.yml
        accounting/
          stg_remembers__accounting_*.sql
        app/
        award/
        crm/
          stg_remembers__crm_customer.sql
          stg_remembers__crm_individual.sql
        crm_v2/
        exhibit/
        purchase/
        shopping/
        speaker/
      sfmc/  (planned)
    intermediate/  (planned)
    marts/  (planned)

Conventions:

  • Snake case for all column names
  • Staging models prefixed with stg_
  • Source-based organization
  • schema.yml for documentation (planned)
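
The naming pattern visible in the tree above (for example crm/stg_remembers__crm_customer.sql) could be checked automatically. The sketch below is a hypothetical helper, not part of the project: it assumes staging models are named stg_<source>__<schema>_<entity> and live under models/staging/<source>/<schema>/.

```python
import re
from pathlib import PurePosixPath

# stg_<source>__<schema>_<entity>, lowercase letters, digits and underscores only.
STAGING_NAME = re.compile(
    r"^stg_(?P<source>[a-z0-9]+)__(?P<rest>[a-z0-9]+(?:_[a-z0-9]+)*)$"
)


def staging_model_path(source, schema, entity):
    """Build the path for a staging model following the tree above."""
    name = f"stg_{source}__{schema}_{entity}"
    if not STAGING_NAME.match(name):
        raise ValueError(f"not a valid staging model name: {name}")
    return PurePosixPath("models", "staging", source, schema, f"{name}.sql")


def check_staging_path(path):
    """Return a list of convention problems for one model path (empty = OK)."""
    p = PurePosixPath(path)
    problems = []
    if p.suffix != ".sql":
        problems.append("expected a .sql file")
    match = STAGING_NAME.match(p.stem)
    if not match:
        problems.append("name must look like stg_<source>__<schema>_<entity>")
        return problems
    parts = p.parts
    # Expect models/staging/<source>/<schema>/<file>.
    if len(parts) != 5 or parts[:2] != ("models", "staging"):
        problems.append("expected models/staging/<source>/<schema>/<file>")
        return problems
    if parts[2] != match["source"]:
        problems.append("source folder does not match name prefix")
    if not match["rest"].startswith(parts[3] + "_"):
        problems.append("schema folder does not match name")
    return problems
```

A check like this could run in CI over every file under models/staging/ so that new contributors get naming feedback before review.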

Katherine’s Current AWS Setup

S3 Buckets:

  • cta-dataops-archive: Old SQL Server dump
  • cta-data-lake: Current data operations
  • cta-dataops-ad-hoc: Collaboration files
  • webhooks: Formstack → S3 → Snowflake working example

Snowflake:

  • REMEMBERS schema: Data share from vendor
  • WEBHOOKS database: Formstack integration demo
  • External stage created for S3 access
  • IAM role configured for bucket permissions

GitHub:

  • CTA-dataops repo
  • Python scripts for email validation pipeline
  • GitHub Actions: repo → S3 on commit
  • Lambda function packaging automated

Next Phase Architecture (Planned)

Sources → S3 → Snowflake (via Polytomic or flat files)
         ↓
      dbt Core (staging → intermediate → marts)
         ↓
   Orchestration (Snowflake tasks or GitHub Actions)
         ↓
    Access Layer (Power BI interim, Snowflake direct, future: Sigma)
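
Whichever orchestrator is chosen, it has to respect the staging → intermediate → marts ordering. dbt derives this order itself from ref() calls, so the sketch below only illustrates the constraint. The intermediate and mart model names are hypothetical, since those layers are still planned.

```python
from graphlib import TopologicalSorter

# Hypothetical models: each key depends on the models in its set.
DEPENDENCIES = {
    "stg_remembers__crm_customer": set(),
    "stg_remembers__crm_individual": set(),
    "int_members_joined": {
        "stg_remembers__crm_customer",
        "stg_remembers__crm_individual",
    },
    "mart_active_members": {"int_members_joined"},
}


def run_batches(dependencies):
    """Return batches of models; every model's parents are in an earlier batch."""
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()  # raises graphlib.CycleError on circular dependencies
    batches = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # models whose parents are all done
        batches.append(ready)
        sorter.done(*ready)
    return batches


batches = run_batches(DEPENDENCIES)
```

Models in the same batch have no dependency on each other, so a scheduler could run them in parallel.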

Metrics to Track

Katherine’s proposed KPIs:

  1. Manual processes eliminated - Starting from 300+ inherited
  2. Time saved per week - Quantify automation impact
  3. Data sources landed - Progress toward full coverage
  4. Models deployed - dbt staging + marts count
  5. “Citizens” enabled - People successfully using Snowflake
  6. Support tickets reduced - Auth issues, download failures

Quotes to Remember

Katherine on priorities:

“P Negative-1” - Even more urgent than P0

Katherine on neglect:

“The more I show up, the less people will walk on their own”

Katherine on Power BI:

“I don’t want people confused and think that’s where we’re going to stay… mostly thinking ahead to whenever I tell finance how much it’ll cost to buy Sigma”

Katherine on Snowflake:

“We could just throw some money at this problem… become the poster child for leveraging all Snowflake features”

Katherine on holiday timing:

“When they come back [from holidays], I want there to be like, all sorts of beautiful stuff for them to play with in Snowflake”

Katherine on mistakes:

“Perhaps I over-committed us to solving a problem that probably would have been better to just let it stay a problem… but I can’t resist helping”

Uttam on approach:

“… the one thing I don’t want to do is do what every consultancy does, be like, ‘we can do it,’ and we do, like, half-baked”


Next Meeting Schedule

  • Weekly syncs continue: Fridays at 10:30 AM
  • Next sync: December 19, 2025
  • Polytomic intro: Early January (after holidays)
  • dbt training: After first marts built
  • Team meeting: Post-holidays with Kyle, Kai, and stakeholders

Compiled from transcripts:

  • cta_weekly_11_24_2025.md
  • brainforge_cta_weekly_12_5_2025.md
  • brainforge_cta_weekly_12_12_2025.md