Standard Operating Procedure (SOP)
Title: Snowflake Governance, RBAC, and DBT Setup with GitHub Actions
Version: 1.0
Date: February 3, 2026
Owner: CTA Data Team + Brainforge Data Engineering
1. Purpose
This SOP standardizes the governance structure, role-based access control (RBAC), schema organization, warehouse configuration, and CI/CD deployment workflows for Snowflake data infrastructure at CTA. It establishes patterns for DBT project organization and GitHub Actions automation to ensure consistent, maintainable, and secure data operations.
The document defines how to:
- Structure database roles and manage access permissions
- Organize schemas and warehouses
- Configure DBT projects for multiple environments
- Automate deployments using GitHub Actions
- Monitor and audit access patterns
2. Scope
Applies to:
- Snowflake RBAC configuration and user provisioning
- Schema design and ownership (RAW, STAGING, MARTS layers)
- Warehouse sizing and configuration
- DBT project structure and deployment
- GitHub Actions CI/CD pipelines for data transformations
- Access provisioning and revocation workflows
Does not apply to:
- Source system data extraction pipelines (handled separately)
- BI tool configurations beyond Snowflake access
- Ad-hoc analysis queries (covered by read-only access)
3. Definitions
- RBAC (Role-Based Access Control): Permission model where access is granted through roles rather than directly to users
- Environment: Isolated database context (DEV, PREP, PROD) for different stages of data development
- Schema: Logical grouping of database objects (tables, views) within a database
- Warehouse: Compute cluster in Snowflake that executes queries
- DBT (data build tool): Transformation framework for building data models using SQL
- Target: DBT configuration that determines which environment to deploy to
- Service Account: Non-human user for automated processes (GitHub Actions, ETL tools)
- Least Privilege: Security principle of granting minimum permissions necessary
4. Prerequisites
Before implementing this governance structure:
- Snowflake account admin access for role and warehouse creation
- GitHub repository access with permissions to configure Actions
- 1Password or equivalent for secure credential storage
- DBT Core installed locally for development (version 1.5+)
- Service account created in Snowflake for GitHub Actions
- RSA key pair generated for service account authentication
5. Responsibilities
CTA Data Team:
- Request access to appropriate roles and schemas
- Develop and test DBT models in DEV environment
- Submit pull requests following review process
- Monitor data quality and pipeline health
- Update documentation for schema changes
Brainforge Data Engineering:
- Configure and maintain RBAC role hierarchy
- Provision new users and service accounts
- Review and merge infrastructure changes
- Monitor warehouse usage and costs
- Support escalations for access or deployment issues
Shared:
- Code review for DBT model changes
- Data quality testing and validation
- Documentation of schema and business logic
6. Step-by-Step Procedure
6.1 RBAC Model and Role Hierarchy
Role Structure
Snowflake roles form a hierarchy; the tree below runs from most privileged (top) to least privileged (bottom):
ACCOUNTADMIN (top-level admin, use sparingly)
└── SYSADMIN (system administration)
└── DATA_ENGINEER (cross-environment admin)
├── PROD_WRITE (production write access)
│ └── PROD_READ (production read access)
├── PREP_WRITE (staging write access)
│ └── PREP_READ (staging read access)
└── DEV_WRITE (development write access)
└── DEV_READ (development read access)
Role Definitions
Environment-Level Roles (per environment: DEV, PREP, PROD):
- {ENV}_READ: Read-only access to all schemas in the environment
  - Granted: SELECT on all tables/views in the database
  - Use case: Analysts, BI tools, downstream consumers
- {ENV}_WRITE: Full CRUD access within the environment
  - Granted: CREATE, INSERT, UPDATE, DELETE on schemas
  - Use case: DBT service accounts, data engineers during development
Team Roles:
- ANALYST: Aggregate role with read access to PREP and PROD
  - Reduces friction for team members who need broad visibility
  - No write permissions, to prevent accidental changes
- DATA_ENGINEER: Aggregate role with write access to DEV and PREP, read access to PROD
  - Development and testing in lower environments
  - Read-only production access for troubleshooting
Service Roles:
- DBT_SERVICE: Automated transformation execution
  - WRITE access to PREP and PROD (environment-specific service accounts recommended)
  - Used by GitHub Actions for deployments
- ETL_SERVICE: Data ingestion processes
  - WRITE access to the RAW schema only
  - Kept separate from transformation roles for security
Role Grant Script
-- Create environment-specific read roles
CREATE ROLE IF NOT EXISTS DEV_READ;
CREATE ROLE IF NOT EXISTS PREP_READ;
CREATE ROLE IF NOT EXISTS PROD_READ;
-- Create environment-specific write roles
CREATE ROLE IF NOT EXISTS DEV_WRITE;
CREATE ROLE IF NOT EXISTS PREP_WRITE;
CREATE ROLE IF NOT EXISTS PROD_WRITE;
-- Create team aggregate roles
CREATE ROLE IF NOT EXISTS ANALYST;
CREATE ROLE IF NOT EXISTS DATA_ENGINEER;
-- Create service roles
CREATE ROLE IF NOT EXISTS DBT_SERVICE;
CREATE ROLE IF NOT EXISTS ETL_SERVICE;
-- Grant role hierarchy (read roles to write roles)
GRANT ROLE DEV_READ TO ROLE DEV_WRITE;
GRANT ROLE PREP_READ TO ROLE PREP_WRITE;
GRANT ROLE PROD_READ TO ROLE PROD_WRITE;
-- Grant to aggregate roles
GRANT ROLE PREP_READ TO ROLE ANALYST;
GRANT ROLE PROD_READ TO ROLE ANALYST;
GRANT ROLE DEV_WRITE TO ROLE DATA_ENGINEER;
GRANT ROLE PREP_WRITE TO ROLE DATA_ENGINEER;
GRANT ROLE PROD_READ TO ROLE DATA_ENGINEER;
-- Grant service role permissions
GRANT ROLE PREP_WRITE TO ROLE DBT_SERVICE;
GRANT ROLE PROD_WRITE TO ROLE DBT_SERVICE;
-- Grant to SYSADMIN for admin management
GRANT ROLE DATA_ENGINEER TO ROLE SYSADMIN;
GRANT ROLE DBT_SERVICE TO ROLE SYSADMIN;
GRANT ROLE ETL_SERVICE TO ROLE SYSADMIN;
6.2 Schema Strategy and Organization
Three-Layer Architecture
Following DBT best practices, organize schemas into three primary layers:
1. RAW Layer
- Purpose: Source-conformed data, minimal transformation
- Schema naming: RAW.{SOURCE_SYSTEM} (e.g., RAW.REMEMBERS, RAW.POLYTOMIC_SHOPIFY)
- Ownership: ETL_SERVICE role
- Access: READ for all team roles, WRITE for ETL_SERVICE only
- Materialization: Tables (loaded by external tools)
2. STAGING Layer
- Purpose: Cleaned, renamed, type-cast data from sources
- Schema naming: {ENV}.STAGING, or {ENV}.STG_{SOURCE} for source-specific staging
- Ownership: DBT_SERVICE role
- Access: WRITE for DBT_SERVICE, READ for team roles
- Materialization: Views or ephemeral (lightweight transformations)
3. MARTS Layer
- Purpose: Business-conformed, wide tables for analytics
- Schema naming: {ENV}.{DOMAIN} (e.g., PROD.FINANCE, PROD.MEMBERSHIP; see the macro sketch after this list)
- Ownership: DBT_SERVICE role
- Access: WRITE for DBT_SERVICE, READ for team roles
- Materialization: Tables or incremental models (for large datasets)
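Note that dbt's default schema-name logic concatenates the target schema with any custom schema (e.g., STAGING_FINANCE rather than FINANCE). To get the clean {ENV}.{DOMAIN} names above, dbt projects commonly override the generate_schema_name macro. A minimal sketch of that common override (this SOP does not prescribe it; adapt to the project):
-- macros/generate_schema_name.sql (common override pattern; a sketch, not CTA-specific)
{% macro generate_schema_name(custom_schema_name, node) -%}
    {%- set default_schema = target.schema -%}
    {%- if custom_schema_name is none -%}
        {{ default_schema }}
    {%- else -%}
        {#- use the custom schema name as-is rather than prefixing it with the target schema -#}
        {{ custom_schema_name | trim }}
    {%- endif -%}
{%- endmacro %}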
Schema-Level Grants
-- RAW layer: ETL writes, everyone reads
GRANT USAGE ON DATABASE RAW TO ROLE ANALYST;
GRANT USAGE ON SCHEMA RAW.REMEMBERS TO ROLE ANALYST;
GRANT SELECT ON ALL TABLES IN SCHEMA RAW.REMEMBERS TO ROLE ANALYST;
GRANT SELECT ON FUTURE TABLES IN SCHEMA RAW.REMEMBERS TO ROLE ANALYST;
-- PREP layer: DBT writes, team reads
GRANT USAGE ON DATABASE PREP TO ROLE ANALYST;
GRANT USAGE ON ALL SCHEMAS IN DATABASE PREP TO ROLE ANALYST;
GRANT SELECT ON ALL TABLES IN DATABASE PREP TO ROLE ANALYST;
GRANT SELECT ON FUTURE TABLES IN DATABASE PREP TO ROLE ANALYST;
-- PROD layer: DBT writes, team reads
GRANT USAGE ON DATABASE PROD TO ROLE ANALYST;
GRANT USAGE ON ALL SCHEMAS IN DATABASE PROD TO ROLE ANALYST;
GRANT SELECT ON ALL TABLES IN DATABASE PROD TO ROLE ANALYST;
GRANT SELECT ON FUTURE TABLES IN DATABASE PROD TO ROLE ANALYST;
6.3 Warehouse Strategy and Configuration
Warehouse Sizing
Create dedicated warehouses per environment and workload type:
| Warehouse Name | Size | Use Case | Auto-Suspend | Max Clusters |
|---|---|---|---|---|
| DEV_XS | X-Small | Development, testing | 5 min | 1 |
| PREP_S | Small | Staging builds, CI checks | 5 min | 1 |
| PROD_M | Medium | Production transformations | 10 min | 2 |
| PROD_L | Large | Large model builds (as needed) | 5 min | 1 |
| LOADING_XS | X-Small | Data extraction, small loads | 2 min | 1 |
Warehouse Creation Script
-- Development warehouse
CREATE WAREHOUSE IF NOT EXISTS DEV_XS
WAREHOUSE_SIZE = 'X-SMALL'
AUTO_SUSPEND = 300
AUTO_RESUME = TRUE
INITIALLY_SUSPENDED = TRUE
COMMENT = 'Development and testing warehouse';
-- Staging warehouse
CREATE WAREHOUSE IF NOT EXISTS PREP_S
WAREHOUSE_SIZE = 'SMALL'
AUTO_SUSPEND = 300
AUTO_RESUME = TRUE
INITIALLY_SUSPENDED = TRUE
COMMENT = 'Staging builds and CI pipeline';
-- Production warehouse
CREATE WAREHOUSE IF NOT EXISTS PROD_M
WAREHOUSE_SIZE = 'MEDIUM'
AUTO_SUSPEND = 600
AUTO_RESUME = TRUE
INITIALLY_SUSPENDED = TRUE
MIN_CLUSTER_COUNT = 1
MAX_CLUSTER_COUNT = 2
SCALING_POLICY = 'STANDARD'
COMMENT = 'Production transformation runs';
-- Loading warehouse
CREATE WAREHOUSE IF NOT EXISTS LOADING_XS
WAREHOUSE_SIZE = 'X-SMALL'
AUTO_SUSPEND = 120
AUTO_RESUME = TRUE
INITIALLY_SUSPENDED = TRUE
COMMENT = 'ETL ingestion processes';
Warehouse Access Grants
-- Grant warehouse usage to appropriate roles
GRANT USAGE ON WAREHOUSE DEV_XS TO ROLE DEV_WRITE;
GRANT USAGE ON WAREHOUSE PREP_S TO ROLE PREP_WRITE;
GRANT USAGE ON WAREHOUSE PROD_M TO ROLE PROD_WRITE;
GRANT USAGE ON WAREHOUSE LOADING_XS TO ROLE ETL_SERVICE;
-- Service accounts need warehouse usage
GRANT USAGE ON WAREHOUSE PREP_S TO ROLE DBT_SERVICE;
GRANT USAGE ON WAREHOUSE PROD_M TO ROLE DBT_SERVICE;
Cost Control Recommendations
- Start with smaller warehouses and scale up only when needed
- Monitor query times and spilling (bytes spilled to local/remote storage; see the check after this list)
- Use auto-suspend aggressively (2-5 minutes for infrequent workloads)
- Review warehouse usage weekly using the WAREHOUSE_METERING_HISTORY view
- Consider scheduling large jobs during off-peak hours
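A sketch of a spill check against ACCOUNT_USAGE (the 7-day window is a suggestion, not a CTA standard):
-- Queries spilling to local/remote storage in the last 7 days
-- (regular spilling suggests the warehouse is undersized for the workload)
SELECT
    warehouse_name,
    COUNT(*) AS spilling_queries,
    ROUND(SUM(bytes_spilled_to_local_storage) / POWER(1024, 3), 2) AS gb_spilled_local,
    ROUND(SUM(bytes_spilled_to_remote_storage) / POWER(1024, 3), 2) AS gb_spilled_remote
FROM SNOWFLAKE.ACCOUNT_USAGE.QUERY_HISTORY
WHERE start_time >= DATEADD('day', -7, CURRENT_DATE())
  AND (bytes_spilled_to_local_storage > 0
       OR bytes_spilled_to_remote_storage > 0)
GROUP BY 1
ORDER BY 4 DESC, 3 DESC;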
6.4 User and Service Account Provisioning
Adding a New User
- Create Snowflake user:
CREATE USER IF NOT EXISTS john_smith
PASSWORD = '<temp-password>'
DEFAULT_ROLE = ANALYST
DEFAULT_WAREHOUSE = PREP_S
MUST_CHANGE_PASSWORD = TRUE
COMMENT = 'CTA Data Analyst';
- Grant appropriate team role:
-- For analysts (read-only)
GRANT ROLE ANALYST TO USER john_smith;
-- For data engineers (dev + prep write, prod read)
GRANT ROLE DATA_ENGINEER TO USER john_smith;
- Provide onboarding information:
- Snowflake account URL
- Temporary password (via secure channel)
- Link to this SOP for context
- Snowflake CLI setup instructions
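A quick smoke test the new user can run after first login (the table name below is the one referenced in Section 7; substitute any table their role can read):
-- Confirm session defaults resolved and read access works
SELECT CURRENT_USER(), CURRENT_ROLE(), CURRENT_WAREHOUSE();
SELECT COUNT(*) FROM PROD.MEMBERSHIP.DIM_ORGANIZATIONS;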
Creating a Service Account
- Generate RSA key pair:
mkdir -p ~/.snowflake-keys
cd ~/.snowflake-keys
# Generate private key
openssl genrsa 2048 | openssl pkcs8 -topk8 -inform PEM -out dbt_service_key.p8 -nocrypt
# Generate public key
openssl rsa -in dbt_service_key.p8 -pubout -out dbt_service_key.pub
- Extract public key content:
# Remove header/footer and join lines
grep -v "BEGIN PUBLIC KEY" dbt_service_key.pub | grep -v "END PUBLIC KEY" | tr -d '\n'- Create service user with public key:
CREATE USER IF NOT EXISTS dbt_service
DEFAULT_ROLE = DBT_SERVICE
DEFAULT_WAREHOUSE = PREP_S
RSA_PUBLIC_KEY = '<public-key-content>'
COMMENT = 'DBT GitHub Actions service account';
GRANT ROLE DBT_SERVICE TO USER dbt_service;
- Store private key securely:
- Add to 1Password as secure note
- Add to GitHub repository secrets (for Actions)
- Never commit private key to version control
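To verify the key registration, compare the fingerprint Snowflake stores with the local key's fingerprint (a standard check from Snowflake's key-pair authentication docs):
-- Check the RSA_PUBLIC_KEY_FP property for the service user
DESC USER dbt_service;
-- Locally, compute the same SHA-256 fingerprint for comparison:
--   openssl rsa -pubin -in dbt_service_key.pub -outform DER | openssl dgst -sha256 -binary | openssl enc -base64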
Removing Access
When a user leaves or changes roles:
- Revoke role grants:
-- Revoke all roles
REVOKE ROLE ANALYST FROM USER john_smith;
REVOKE ROLE DATA_ENGINEER FROM USER john_smith;
- Disable or drop user:
-- Disable (preserves audit history)
ALTER USER john_smith SET DISABLED = TRUE;
-- Or drop entirely (after audit period)
DROP USER IF EXISTS john_smith;
6.5 DBT Project Structure and Configuration
Project Organization
Following DBT best practices, organize the project using a modular, layered approach:
cta-dbt/
├── dbt_project.yml
├── packages.yml
├── profiles.yml (local only, not committed)
├── models/
│ ├── staging/
│ │ ├── remembers/
│ │ │ ├── _remembers__sources.yml
│ │ │ ├── _remembers__models.yml
│ │ │ ├── stg_remembers__customers.sql
│ │ │ └── stg_remembers__organizations.sql
│ │ └── shopify/
│ │ ├── _shopify__sources.yml
│ │ ├── stg_shopify__orders.sql
│ │ └── stg_shopify__customers.sql
│ ├── intermediate/
│ │ └── membership/
│ │ ├── _int_membership__models.yml
│ │ └── int_member_engagement.sql
│ └── marts/
│ ├── membership/
│ │ ├── _membership__models.yml
│ │ ├── dim_organizations.sql
│ │ └── fct_member_engagement.sql
│ └── finance/
│ ├── _finance__models.yml
│ └── fct_payments.sql
├── macros/
│ └── generate_warehouse_name.sql
├── tests/
│ └── assert_organization_matching.sql
└── seeds/
└── organization_aliases.csv
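The tree lists macros/generate_warehouse_name.sql without showing its contents. A minimal sketch of the idea, with warehouse names taken from Section 6.3 and selection logic assumed (the real macro may differ):
-- macros/generate_warehouse_name.sql (sketch; actual logic is not defined in this SOP)
{% macro generate_warehouse_name() -%}
    {%- if target.name == 'prod' -%}
        PROD_M
    {%- elif target.name == 'prep' -%}
        PREP_S
    {%- else -%}
        DEV_XS
    {%- endif -%}
{%- endmacro %}
A model could then opt in via its config block, e.g. {{ config(snowflake_warehouse=generate_warehouse_name()) }}, assuming the project wires it up that way.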
Environment Configuration (dbt_project.yml)
name: 'cta_analytics'
version: '1.0.0'
config-version: 2
profile: 'cta_snowflake'
model-paths: ["models"]
analysis-paths: ["analyses"]
test-paths: ["tests"]
seed-paths: ["seeds"]
macro-paths: ["macros"]
snapshot-paths: ["snapshots"]
target-path: "target"
clean-targets: ["target", "dbt_packages"]
models:
cta_analytics:
+materialized: view
staging:
+materialized: view
+schema: staging
intermediate:
+materialized: ephemeral
+schema: intermediate
marts:
+materialized: table
+schema: marts
membership:
+schema: membership
finance:
+schema: finance
vars:
# Environment-specific variables
dev_db_object_expiration: 80 # Days until dev objects are dropped
# Source system variables
current_date: "CURRENT_DATE()"
Profile Configuration (profiles.yml)
Store locally in ~/.dbt/profiles.yml:
cta_snowflake:
target: dev
outputs:
dev:
type: snowflake
account: YOUR_ACCOUNT_ID
user: "{{ env_var('SNOWFLAKE_USER') }}"
authenticator: externalbrowser # SSO authentication
role: DEV_WRITE
database: DEV
warehouse: DEV_XS
schema: staging
threads: 4
prep:
type: snowflake
account: YOUR_ACCOUNT_ID
user: "{{ env_var('SNOWFLAKE_USER') }}"
authenticator: externalbrowser
role: PREP_WRITE
database: PREP
warehouse: PREP_S
schema: staging
threads: 8
prod:
type: snowflake
account: YOUR_ACCOUNT_ID
user: dbt_service
role: DBT_SERVICE
database: PROD
warehouse: PROD_M
schema: staging
threads: 8
private_key_path: "{{ env_var('SNOWFLAKE_PRIVATE_KEY_PATH') }}"
private_key_passphrase: "{{ env_var('SNOWFLAKE_PRIVATE_KEY_PASSPHRASE') }}" # omit if the key was generated with -nocrypt (Section 6.4)
Model Naming Conventions
Follow consistent naming to improve discoverability:
Staging Models:
- Format: stg_{source}__{entity}.sql
- Example: stg_remembers__customers.sql, stg_shopify__orders.sql
- Purpose: Source-conformed, cleaned, renamed columns
Intermediate Models:
- Format: int_{domain}__{description}.sql
- Example: int_membership__engagement.sql
- Purpose: Logical transformations preparing for marts
Mart Models:
- Format: {type}_{domain}__{entity}.sql
- Types: dim_ (dimensions), fct_ (facts), rpt_ (reports)
- Example: dim_organizations.sql, fct_member_engagement.sql
- Purpose: Business-conformed, wide tables for analytics
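For illustration, a staging model following these conventions might look like the sketch below (column names are hypothetical, not the actual REMEMBERS schema):
-- stg_remembers__customers.sql (illustrative columns only)
WITH source AS (
    SELECT * FROM {{ source('remembers', 'customers') }}
),
renamed AS (
    SELECT
        customer_id,
        LOWER(TRIM(email)) AS customer_email,  -- normalized for domain matching
        organization_name,
        created_at::TIMESTAMP_NTZ AS created_at
    FROM source
)
SELECT * FROM renamed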
Schema and Source Documentation
Every source and model folder should have YAML files:
# _remembers__sources.yml
version: 2
sources:
- name: remembers
description: CTA membership management system
database: RAW
schema: REMEMBERS
tables:
- name: customers
description: Customer records from REMEMBERS
columns:
- name: customer_id
description: Primary key
tests:
- not_null
- unique
# _remembers__models.yml
version: 2
models:
- name: stg_remembers__customers
description: Cleaned customer data from REMEMBERS
columns:
- name: customer_id
description: Primary key
tests:
- not_null
- unique
- name: customer_email
description: Customer email address
tests:
- not_null
6.6 GitHub Actions CI/CD Configuration
Repository Secrets Setup
Add the following secrets to GitHub repository settings:
- SNOWFLAKE_ACCOUNT: Account identifier (e.g., abc12345.us-east-1)
- SNOWFLAKE_USER: Service account username (e.g., dbt_service)
- SNOWFLAKE_PRIVATE_KEY: Private key content (base64 encoded)
- SNOWFLAKE_ROLE: Service role name (e.g., DBT_SERVICE)
- SNOWFLAKE_DATABASE_PROD: Production database name
- SNOWFLAKE_DATABASE_PREP: Staging database name
- SNOWFLAKE_WAREHOUSE_PROD: Production warehouse name
- SNOWFLAKE_WAREHOUSE_PREP: Staging warehouse name
CI/CD Workflow Configuration
Create .github/workflows/dbt-ci.yml:
name: DBT CI
on:
pull_request:
branches: [main]
paths:
- 'models/**'
- 'macros/**'
- 'tests/**'
- 'dbt_project.yml'
jobs:
dbt-test:
runs-on: ubuntu-latest
steps:
- name: Checkout code
uses: actions/checkout@v3
- name: Setup Python
uses: actions/setup-python@v4
with:
python-version: '3.11'
- name: Install DBT
run: |
pip install dbt-snowflake==1.7.0
- name: Setup DBT profile
run: |
mkdir -p ~/.dbt
cat << EOF > ~/.dbt/profiles.yml
cta_snowflake:
target: prep
outputs:
prep:
type: snowflake
account: ${{ secrets.SNOWFLAKE_ACCOUNT }}
user: ${{ secrets.SNOWFLAKE_USER }}
role: ${{ secrets.SNOWFLAKE_ROLE }}
database: ${{ secrets.SNOWFLAKE_DATABASE_PREP }}
warehouse: ${{ secrets.SNOWFLAKE_WAREHOUSE_PREP }}
schema: staging
threads: 8
private_key_path: /tmp/snowflake_key.p8
EOF
- name: Decode private key
run: |
echo "${{ secrets.SNOWFLAKE_PRIVATE_KEY }}" | base64 -d > /tmp/snowflake_key.p8
chmod 600 /tmp/snowflake_key.p8
- name: Run DBT deps
run: dbt deps
- name: Run DBT build (changed models only)
run: |
# state:modified+ compares against a previous manifest; restore a production
# manifest into ./target (e.g., from a prior run's artifact) before this step
dbt build --select state:modified+ --defer --state ./target
- name: Run DBT test
run: dbt test --select state:modified+ --defer --state ./target
Create .github/workflows/dbt-deploy.yml:
name: DBT Deploy
on:
push:
branches: [main]
paths:
- 'models/**'
- 'macros/**'
- 'tests/**'
- 'dbt_project.yml'
jobs:
dbt-prod-deploy:
runs-on: ubuntu-latest
steps:
- name: Checkout code
uses: actions/checkout@v3
- name: Setup Python
uses: actions/setup-python@v4
with:
python-version: '3.11'
- name: Install DBT
run: |
pip install dbt-snowflake==1.7.0
- name: Setup DBT profile
run: |
mkdir -p ~/.dbt
cat << EOF > ~/.dbt/profiles.yml
cta_snowflake:
target: prod
outputs:
prod:
type: snowflake
account: ${{ secrets.SNOWFLAKE_ACCOUNT }}
user: ${{ secrets.SNOWFLAKE_USER }}
role: ${{ secrets.SNOWFLAKE_ROLE }}
database: ${{ secrets.SNOWFLAKE_DATABASE_PROD }}
warehouse: ${{ secrets.SNOWFLAKE_WAREHOUSE_PROD }}
schema: staging
threads: 8
private_key_path: /tmp/snowflake_key.p8
EOF
- name: Decode private key
run: |
echo "${{ secrets.SNOWFLAKE_PRIVATE_KEY }}" | base64 -d > /tmp/snowflake_key.p8
chmod 600 /tmp/snowflake_key.p8
- name: Run DBT deps
run: dbt deps
- name: Run DBT build (full refresh)
run: dbt build --full-refresh
- name: Run DBT test
run: dbt test
- name: Generate DBT docs
run: dbt docs generate
- name: Upload docs artifact
uses: actions/upload-artifact@v4  # v3 is deprecated on GitHub-hosted runners
with:
name: dbt-docs
path: target/
Workflow Triggers and Behavior
Pull Request (CI):
- Runs on PR to main branch
- Deploys to PREP environment
- Tests only changed models (state-based selection)
- Blocks merge if tests fail
- Uses smaller PREP warehouse for cost efficiency
Main Branch Push (CD):
- Runs on merge to main
- Deploys to PROD environment
- Runs full build and test suite
- Generates documentation
- Uses production warehouse
- Notifications on failure (configure via GitHub Actions)
6.7 Monitoring and Audit
Access Review Process
Perform quarterly access reviews:
- Generate user access report:
SELECT
grantee_name AS user_name,
role AS granted_role,
created_on AS granted_on,
granted_by
FROM SNOWFLAKE.ACCOUNT_USAGE.GRANTS_TO_USERS
WHERE deleted_on IS NULL
ORDER BY grantee_name, role;
- Review with team leads:
  - Confirm each user still requires their assigned roles
  - Remove access for departed team members
  - Adjust roles for changed responsibilities
- Document review:
  - Record review date and findings
  - Track any access changes made
  - Update this SOP if role definitions change
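The role-to-role hierarchy itself can be reviewed the same way. A sketch against the standard ACCOUNT_USAGE view (the filters reflect how Snowflake records role grants):
-- Current role-to-role grants (the hierarchy from Section 6.1)
SELECT
    grantee_name AS parent_role,
    name AS granted_role,
    granted_by,
    created_on
FROM SNOWFLAKE.ACCOUNT_USAGE.GRANTS_TO_ROLES
WHERE granted_on = 'ROLE'
  AND privilege = 'USAGE'
  AND deleted_on IS NULL
ORDER BY parent_role, granted_role;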
Query and Warehouse Monitoring
Monitor usage patterns weekly:
-- Warehouse credit usage by day
SELECT
DATE_TRUNC('day', start_time) AS usage_date,
warehouse_name,
SUM(credits_used) AS total_credits
FROM SNOWFLAKE.ACCOUNT_USAGE.WAREHOUSE_METERING_HISTORY
WHERE start_time >= DATEADD('day', -7, CURRENT_DATE())
GROUP BY 1, 2
ORDER BY 1 DESC, 3 DESC;
-- Long-running queries
SELECT
query_id,
user_name,
warehouse_name,
execution_time / 1000 AS execution_seconds,
ROUND(bytes_scanned / POWER(1024, 3), 2) AS gb_scanned,
query_text
FROM SNOWFLAKE.ACCOUNT_USAGE.QUERY_HISTORY
WHERE execution_time > 60000 -- Over 1 minute
AND start_time >= DATEADD('day', -1, CURRENT_DATE())
ORDER BY execution_time DESC
LIMIT 20;
DBT Run Monitoring
Track DBT execution and model performance:
-- Failed DBT runs (check logs)
SELECT
query_id,
user_name,
start_time,
error_message,
query_text
FROM SNOWFLAKE.ACCOUNT_USAGE.QUERY_HISTORY
WHERE user_name = 'DBT_SERVICE'
AND error_code IS NOT NULL
AND start_time >= DATEADD('day', -7, CURRENT_DATE())
ORDER BY start_time DESC;
Set up alerts for:
- Failed GitHub Actions workflows (native GitHub notifications)
- Unusual warehouse credit consumption (see the resource monitor sketch after this list)
- Query timeouts or errors in production
- Schema changes in source systems
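For the credit-consumption alert, a Snowflake resource monitor can notify and enforce a ceiling. A sketch (the quota value is illustrative; creating monitors requires ACCOUNTADMIN):
-- Notify at 80% and suspend the warehouse at 100% of a monthly quota
CREATE RESOURCE MONITOR IF NOT EXISTS PROD_CREDIT_MONITOR
  WITH CREDIT_QUOTA = 100  -- placeholder; set from observed monthly usage
  FREQUENCY = MONTHLY
  START_TIMESTAMP = IMMEDIATELY
  TRIGGERS
    ON 80 PERCENT DO NOTIFY
    ON 100 PERCENT DO SUSPEND;
ALTER WAREHOUSE PROD_M SET RESOURCE_MONITOR = PROD_CREDIT_MONITOR;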
7. Quality Checks
Before considering setup complete, verify:
- ✅ All roles created and hierarchy properly configured
- ✅ Test users can connect with appropriate permissions
- ✅ Service account can authenticate using key pair
- ✅ DBT runs successfully in all three environments (dev, prep, prod)
- ✅ GitHub Actions CI workflow passes on test PR
- ✅ GitHub Actions CD workflow deploys to production on merge
- ✅ Warehouse auto-suspend is working (check after 5-10 min idle)
- ✅ Access grants follow least-privilege principle
- ✅ Documentation is generated and accessible
- ✅ Monitoring queries return expected results
Testing Checklist:
- RBAC Testing:
-- As test user with ANALYST role
USE ROLE ANALYST;
SELECT * FROM PROD.MEMBERSHIP.DIM_ORGANIZATIONS LIMIT 10; -- Should succeed
-- Try to write (should fail)
CREATE TABLE PROD.MEMBERSHIP.TEST_TABLE AS SELECT 1; -- Should fail
- DBT Local Development:
# In local environment
dbt debug                                  # Verify connection
dbt run --select stg_remembers__customers  # Run single model
dbt test --select stg_remembers__customers # Test validation
- GitHub Actions Testing:
- Create test PR with minor model change
- Verify CI workflow runs and passes
- Merge PR and verify CD workflow deploys to production
8. Escalation Path
Access Issues:
- User cannot connect: Verify role grants and warehouse permissions
- MFA/SSO issues: Check Snowflake authenticator configuration
- Service account errors: Verify key pair and RSA_PUBLIC_KEY configuration
- Contact: Brainforge Data Engineering team via Slack or email
DBT/GitHub Actions Issues:
- CI workflow failures: Check logs in GitHub Actions tab
- Deployment errors: Verify secrets configuration in GitHub
- Model build failures: Check Snowflake query history for errors
- Contact: CTA Data Team lead or Brainforge support
Performance Issues:
- Slow queries: Review warehouse size and query optimization
- High costs: Audit warehouse usage and auto-suspend settings
- Failed builds: Check for data quality issues or schema changes
- Contact: Brainforge Data Engineering for optimization review
Emergency Contacts:
- CTA Data Team Lead: [Contact info]
- Brainforge Support: [Contact info]
- Snowflake Account Admin: [Contact info]
9. Version History
- v1.0 (February 3, 2026) - Initial SOP creation
- Defined RBAC role hierarchy and provisioning workflows
- Documented schema and warehouse strategy
- Established DBT project structure and naming conventions
- Configured GitHub Actions CI/CD pipelines
- Created monitoring and audit procedures
Appendix A: Common Commands
Snowflake RBAC Management
-- Show current role and user
SELECT CURRENT_ROLE();
SELECT CURRENT_USER();
-- For your own grants, use SHOW GRANTS TO USER <username>;
-- SHOW statements do not accept function calls like CURRENT_USER()
-- Switch roles
USE ROLE DATA_ENGINEER;
-- Show all users and their roles
SHOW GRANTS TO USER john_smith;
-- Show role hierarchy
SHOW GRANTS OF ROLE DATA_ENGINEER;
DBT Commands
# Development
dbt run --select model_name # Run single model
dbt run --select +model_name # Run model and upstream
dbt run --select model_name+ # Run model and downstream
dbt test --select model_name # Test single model
# Environment targeting
dbt run --target prep # Deploy to PREP
dbt run --target prod # Deploy to PROD
# Documentation
dbt docs generate # Generate docs
dbt docs serve # Serve docs locally
# Debugging
dbt debug # Test connection
dbt compile --select model_name # Compile without running
dbt ls --select tag:tdf          # List models by tag
Warehouse Management
-- Show warehouse status
SHOW WAREHOUSES;
-- Suspend warehouse manually
ALTER WAREHOUSE DEV_XS SUSPEND;
-- Resume warehouse
ALTER WAREHOUSE DEV_XS RESUME;
-- Modify warehouse size
ALTER WAREHOUSE PROD_M SET WAREHOUSE_SIZE = 'LARGE';
-- Check warehouse usage
SELECT * FROM SNOWFLAKE.ACCOUNT_USAGE.WAREHOUSE_METERING_HISTORY
WHERE warehouse_name = 'PROD_M'
AND start_time >= DATEADD('day', -7, CURRENT_DATE());
Appendix B: Entity Resolution and Data Matching
Based on the January 26 DataOps meeting, one critical challenge is matching organizations across systems (REMEMBERS, Exhibitor/EXPO CAD, CES registration).
Current Approach:
- Use REMEMBERS customers and customer_link tables as the backbone
- REMEMBERS already maintains EXPO CAD IDs and domain mappings
- Match primarily on domain (website, email domain)
- Fall back to organization name matching where domain is unavailable
DBT Implementation Pattern:
-- Example: int_membership__organization_matching.sql
WITH remembers_orgs AS (
SELECT
customer_id,
organization_name,
primary_domain,
expo_cad_ids -- Array or delimited string
FROM {{ ref('stg_remembers__customers') }}
),
ces_registrations AS (
SELECT
registration_id,
company_name,
email_domain,
website_domain
FROM {{ ref('stg_ces__registrations') }}
)
SELECT
r.customer_id,
c.registration_id,
CASE
WHEN r.primary_domain = c.website_domain THEN 'domain_match'
WHEN r.primary_domain = c.email_domain THEN 'email_domain_match'
WHEN LOWER(r.organization_name) = LOWER(c.company_name) THEN 'name_match'
ELSE 'no_match'
END AS match_type
FROM ces_registrations c
LEFT JOIN remembers_orgs r
ON r.primary_domain = c.website_domain
OR r.primary_domain = c.email_domain
OR LOWER(r.organization_name) = LOWER(c.company_name)
Recommendations:
- Create dim_organizations as the single source of truth
- Use DBT seeds for manual match overrides
- Build broken join reports to surface unmatched records (see the sketch after this list)
- Require source system updates (not manual matching by data team)
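A sketch of such a broken-join report, built on the matching model above (model and column names follow that example and are assumptions):
-- rpt_unmatched_registrations.sql (illustrative)
-- Surfaces CES registrations with no REMEMBERS match, for source-system follow-up
SELECT
    registration_id,
    match_type
FROM {{ ref('int_membership__organization_matching') }}
WHERE match_type = 'no_match'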
Appendix C: Cost Optimization Guidelines
Warehouse Sizing:
- Start with X-Small for development
- Use Small for staging/CI
- Use Medium for production (scale to Large only if needed)
- Monitor bytes spilled to local/remote storage (indicates undersized warehouse)
Auto-Suspend Configuration:
- Development: 2-5 minutes (infrequent use)
- Staging/CI: 5 minutes (burst workloads)
- Production: 10 minutes (regular scheduled jobs)
- Never set auto-suspend > 15 minutes without justification
Query Optimization:
- Use LIMIT for development queries
- Filter on partition keys where available (e.g., date columns)
- Avoid SELECT * in production models
- Use incremental models for large tables (>100M rows); see the sketch after this list
- Materialize intermediate steps only when reused
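A minimal incremental-model sketch (the unique_key, timestamp column, and model names are illustrative):
-- Example incremental model (hypothetical names)
{{ config(
    materialized='incremental',
    unique_key='event_id'
) }}

SELECT *
FROM {{ ref('stg_source__events') }}
{% if is_incremental() %}
  -- on incremental runs, only process rows newer than what is already loaded
  WHERE updated_at > (SELECT MAX(updated_at) FROM {{ this }})
{% endif %}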
Monitoring and Alerts:
- Set budget alerts in Snowflake (Account → Cost Management)
- Review weekly cost reports
- Investigate warehouses with >50% idle time
- Optimize or remove models with consistently poor efficiency
References
- Snowflake RBAC Best Practices
- DBT Best Practices: Project Structure
- GitLab Data Team DBT Guide
- Snowflake Warehouse Sizing
- DBT Documentation
- CTA DataOps Planning Meeting (January 26, 2026)