Playbook: MotherDuck CLI — load Default export into my_db and raw_export

Last updated: 2026-04-09
Prereqs:

  • MotherDuck CLI setup — install the DuckDB CLI, set MOTHERDUCK_TOKEN in your shell, and verify that duckdb "md:" connects; this avoids repeated browser auth prompts.
  • Access to MotherDuck org, export files on local disk, DuckDB CLI with MotherDuck support.
  • SQL scripts from the Default brainforge-bi repo (see engineering pipeline note).

1. Know your paths

  • Export folder: e.g. …/12-28-2025 to 03-20-2026 (latest)/ — confirm the meetings file and all submission parts exist before starting long jobs.
  • SQL scripts: Default brainforge-bi repo, scripts/ directory (see engineering note: MotherDuck Default export pipeline).

2. Schema DDL (extended columns) — run once per environment

If raw_meetings / raw_submissions need UTM + landing columns:

duckdb "md:" -f scripts/add_utm_landing_columns_raw_export.sql

(Idempotent: uses ADD COLUMN IF NOT EXISTS.)
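The idempotent pattern that script relies on looks roughly like this sketch — the specific column names (utm_source, utm_medium, landing_page) are assumptions here; match them to the actual script in the repo:

```sql
-- Sketch of the idempotent DDL pattern; column names are illustrative.
USE raw_export;
ALTER TABLE main.raw_meetings    ADD COLUMN IF NOT EXISTS utm_source   VARCHAR;
ALTER TABLE main.raw_meetings    ADD COLUMN IF NOT EXISTS utm_medium   VARCHAR;
ALTER TABLE main.raw_meetings    ADD COLUMN IF NOT EXISTS landing_page VARCHAR;
ALTER TABLE main.raw_submissions ADD COLUMN IF NOT EXISTS utm_source   VARCHAR;
ALTER TABLE main.raw_submissions ADD COLUMN IF NOT EXISTS utm_medium   VARCHAR;
ALTER TABLE main.raw_submissions ADD COLUMN IF NOT EXISTS landing_page VARCHAR;
```

Because of IF NOT EXISTS, re-running the script against a database that already has the columns is a no-op rather than an error.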

3. Load small tables → my_db staging

Edit load_my_db_stg_*.sql if the folder path or staging schema name changed, then:

duckdb "md:" -f scripts/load_my_db_stg_20251228_20260320.sql
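For orientation, the staging-load scripts follow this general shape — a minimal sketch, assuming a CSV per table; the path and table name below are placeholders, not the real values from the script:

```sql
-- Sketch of the staging-load pattern used by load_my_db_stg_*.sql.
-- Path and table name are placeholders; edit to match the export folder.
USE my_db;
CREATE SCHEMA IF NOT EXISTS stg_20251228_20260320;
CREATE OR REPLACE TABLE stg_20251228_20260320.queues_members AS
SELECT * FROM read_csv_auto('/path/to/export/queues_members.csv', header = true);
```

CREATE OR REPLACE makes reloads of a single table safe to repeat if a path was wrong the first time.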

4. Merge small tables → raw_export

duckdb "md:" -f scripts/merge_stg_into_raw_export.sql

Note: raw_queues_members merge uses CAST(deleted_at AS TIMESTAMP) because raw_export uses TIMESTAMP for that column while staging may be TIMESTAMPTZ.
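The explicit-cast merge shape looks like this — a sketch only; the column list for raw_queues_members is an assumption, not the real script:

```sql
-- Sketch of the explicit INSERT ... SELECT with CAST pattern.
-- Column list is illustrative; use the actual raw_queues_members columns.
INSERT INTO raw_export.main.raw_queues_members (id, queue_id, member_id, deleted_at)
SELECT id, queue_id, member_id, CAST(deleted_at AS TIMESTAMP)
FROM my_db.stg_20251228_20260320.queues_members;
```

Listing columns explicitly (rather than SELECT *) is what makes the CAST possible and protects against column-order drift between staging and raw_export.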

5. Load meetings + submissions (large — hours)

Single script (full CSV columns + upsert to raw_export):

duckdb "md:" -f scripts/load_my_db_meetings_submissions.sql

Options for long runs

  • Run in tmux / screen or nohup so disconnects do not kill the job.
  • Redirect output to a log file and tail -f it; DuckDB may stay quiet until a step completes.

Ballpark duration: tens of GB typically take several hours; actual time depends heavily on upload bandwidth and row width.

Merge only (staging already loaded):

duckdb "md:" -f scripts/merge_stg_meetings_submissions_into_raw_export.sql
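One common shape for this kind of merge is delete-then-insert keyed on the primary key — a sketch under the assumption that id is the key and the staging and raw tables have matching column order; the real script may use explicit column lists instead:

```sql
-- Sketch of a delete-then-insert upsert; assumes id is the key and
-- staging/raw column order matches. Verify against the real script.
USE raw_export;
DELETE FROM main.raw_meetings
WHERE id IN (SELECT id FROM my_db.stg_20251228_20260320.meetings);
INSERT INTO main.raw_meetings
SELECT * FROM my_db.stg_20251228_20260320.meetings;
```

This makes the merge re-runnable: replaying the same staging batch replaces rows rather than duplicating them.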

6. Verify

USE my_db;
SELECT COUNT(*) FROM stg_20251228_20260320.meetings;
SELECT COUNT(*) FROM stg_20251228_20260320.submissions;
 
USE raw_export;
SELECT COUNT(*) FROM main.raw_meetings;
SELECT COUNT(*) FROM main.raw_submissions;

Compare counts to expectations from the export manifest or source system.
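The four counts can also be pulled side by side in one statement, which makes staging-vs-raw discrepancies easier to spot (both databases must be attached, which is the case when connected via duckdb "md:"):

```sql
-- Side-by-side staging vs raw_export row counts.
SELECT
  (SELECT COUNT(*) FROM my_db.stg_20251228_20260320.meetings)    AS stg_meetings,
  (SELECT COUNT(*) FROM raw_export.main.raw_meetings)            AS raw_meetings,
  (SELECT COUNT(*) FROM my_db.stg_20251228_20260320.submissions) AS stg_submissions,
  (SELECT COUNT(*) FROM raw_export.main.raw_submissions)         AS raw_submissions;
```

Note that raw counts exceeding staging counts is expected when raw_export already held rows from earlier export windows.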

7. Troubleshooting

  • Browser SSO / auth prompt on every run — no token in the shell. Fix: MotherDuck CLI setup §2 — set MOTHERDUCK_TOKEN.
  • EXCLUDE / column not found — CSV schema changed vs the script. Fix: stop using EXCLUDE for that file or refresh the exclude list; prefer full column DDL on raw_export + SELECT * once aligned.
  • Type mismatch on INSERT — inference differs from the raw_* types. Fix: use explicit INSERT (cols…) SELECT … with CAST.
  • Job “stuck” with no log output — normal for huge read_csv_auto reads. Fix: check process CPU/disk; wait for a step boundary.
  • Meetings/submissions “missing” — export incomplete. Fix: re-export; confirm the multi-part submissions and meetings files are in the folder.
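For the column-not-found and type-mismatch rows, a quick way to see where staging and raw_export disagree is to DESCRIBE both tables and diff the output by eye (table names below follow this playbook's examples):

```sql
-- Compare declared columns and types between staging and raw_export.
DESCRIBE my_db.stg_20251228_20260320.queues_members;
DESCRIBE raw_export.main.raw_queues_members;
```

Any column present in one listing but not the other, or with a different type (e.g. TIMESTAMPTZ vs TIMESTAMP on deleted_at), points at the merge statement that needs an explicit column list or CAST.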

8. After this playbook

  • Backfill old rows for new UTM columns if needed (separate project).
  • Update dbt docs / Omni model YAML if new columns should be exposed to analysts.