Default Data Infrastructure Project

Executive Summary

Last Updated: December 15, 2025
Status: Phase 1 - Foundation & Business Reporting
Dashboard: View Live Dashboard

TL;DR

We’ve proven data-driven decision-making works at Default. Our existing dashboard tracks 5,250+ customers and 1.4M+ events with sophisticated analytics in Omni. Leadership uses these insights for pricing, product strategy, and company-wide reporting.

Now we need to scale. Manual data processes, missing integrations (Salesforce, complete Hyperline data), and lack of Phoenix instrumentation prevent us from answering critical questions about customer health, churn prediction, and product ROI.

This project: 12-week investment to automate data pipelines (ETL), integrate missing sources, instrument Phoenix with Amplitude, and enable self-service analytics. Result: Real-time customer intelligence, predictive insights, and data access for 10+ stakeholders across sales, CS, and product.

Bottom line: Transform a working foundation into enterprise-grade data infrastructure that scales with our growth.

Where We Are Today

Default has proven the value of data-driven decision-making with our existing infrastructure:

What’s Working

Omni Analytics dashboard operational with comprehensive metrics (view dashboard)
MotherDuck data warehouse established and serving queries
Key business metrics tracked: customer counts (5,250+ customers), activity metrics (1.4M+ events), and geographic distribution
Sophisticated visualizations including time-series trends, stacked area charts, geographic analysis, and financial breakdowns
Demonstrated impact on pricing strategy, company-wide reporting, and product adoption insights

The Challenge

However, our current setup cannot scale to meet growing demands:

Data freshness issues: Product data from Postgres is stale due to manual sync processes - no automated ETL pipeline
Missing critical integrations: Salesforce (customer/sales) data is not yet in the warehouse, and Hyperline (subscriptions) data is only partially flowing
Phoenix instrumentation gap: Our next-generation product lacks behavioral analytics tracking in Amplitude
Limited data accessibility: Stakeholders depend on data team for insights, cannot self-serve
Infrastructure brittleness: Ad-hoc processes create reliability risks and maintenance burden

The Business Impact

Our recent Omni demo revealed urgent demand from multiple stakeholders across sales, customer success, product, and leadership for expanded reporting capabilities. Teams need answers to questions we can’t reliably answer today:

Which customer segments have the highest retention and expansion potential?
What product usage patterns predict churn vs. growth?
How do pricing changes impact adoption and revenue?
Which features in Phoenix drive value and which are ignored?
What’s the real-time health of our customer base across product usage + CRM data?

Without modernizing our data infrastructure, we’re building on a foundation that can’t support our growth.

The Opportunity

When you have a self-serve software offering, data infrastructure IS product infrastructure. The companies that win in self-serve are the ones that can:

Identify ideal customer profiles by connecting customer attributes (Salesforce), product usage (Amplitude/Postgres), and revenue outcomes (Hyperline)
Predict and prevent churn by spotting engagement drop-offs and proactively intervening
Optimize pricing and packaging using data-driven insights on feature adoption and willingness to pay
Accelerate product development by understanding which features drive retention and expansion
Empower teams to self-serve with reliable dashboards instead of waiting for data requests

The real power comes from having all of this in one place. You can see that a certain type of customer (Salesforce) adopts specific features quickly (Amplitude), uses them heavily (product usage), expands their subscription (Hyperline), and has low support costs (customer success data). That’s your ideal customer profile, and you can only see it when the data connects.
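
As a minimal sketch of what "the data connects" could look like in practice, the snippet below joins three hypothetical extracts on a shared account key. The field names (`account_id`, `industry`, `weekly_events`, `mrr`) and the sample rows are illustrative assumptions, not the actual Salesforce, Amplitude, or Hyperline schemas.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Customer360:
    account_id: str
    industry: Optional[str]       # assumed CRM attribute
    weekly_events: Optional[int]  # assumed product usage metric
    mrr: Optional[float]          # assumed billing metric


def build_customer_360(crm, usage, billing):
    """Full outer join of three dicts keyed by account_id.

    Accounts missing from a source keep None for that source's fields,
    which makes integration gaps visible instead of silently dropping rows.
    """
    all_ids = sorted(set(crm) | set(usage) | set(billing))
    return [
        Customer360(
            account_id=aid,
            industry=crm.get(aid, {}).get("industry"),
            weekly_events=usage.get(aid, {}).get("weekly_events"),
            mrr=billing.get(aid, {}).get("mrr"),
        )
        for aid in all_ids
    ]


# Hypothetical sample data for illustration only.
crm = {"a1": {"industry": "SaaS"}, "a2": {"industry": "Fintech"}}
usage = {"a1": {"weekly_events": 420}, "a3": {"weekly_events": 15}}
billing = {"a1": {"mrr": 1200.0}, "a2": {"mrr": 300.0}}

rows = build_customer_360(crm, usage, billing)
```

Only account `a1` appears in all three sources here; the `None` fields on the other rows are the kind of gap the integrations in this project are meant to close.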

To date, we’ve already demonstrated the value of data-driven decision-making:

Used usage analytics to inform pricing strategy decisions
Built company-wide reporting dashboards that leadership relies on
Identified product adoption patterns across customer segments

Now we need to scale this capability with proper infrastructure.

Proof of Concept: What We’ve Built

Our existing Omni dashboard demonstrates that we’ve already solved the hard problems:

What the Dashboard Proves

We can integrate complex data sources:

Successfully pulling product usage from Postgres
Monitoring 5,250+ customers and 1.4M+ events

We can build sophisticated analytics:

Time-series trend analysis showing growth patterns
Stacked composition charts revealing category breakdowns
Geographic intelligence (US market distribution)
Detailed drill-down capabilities with tabular data
Multi-metric KPI tracking (customer counts, activity metrics, revenue percentages)

We have stakeholder buy-in:

Leadership uses these dashboards for decision-making
Recent Omni demo generated strong demand from multiple departments
Existing metrics inform pricing and product strategy

What This Means for the Project

This isn’t a greenfield project - we’re scaling proven capabilities:

Infrastructure is operational (MotherDuck + Omni)
Team knows how to model data and build dashboards
Stakeholders understand the value and are demanding more
Now we need: Automation, integration, and self-service to unlock full potential

The foundation is solid. The vision is validated. Now we execute.

Current State: Foundation Built, Scale Needed

Infrastructure

Data Warehouse: MotherDuck (DuckDB cloud) - operational and serving queries
BI Tool: Omni Analytics - deployed with production dashboards (view live)
Product Analytics: Amplitude - initialized but not instrumented for Phoenix product
ETL/Data Movement: No established tool - relying on manual/ad-hoc processes

What’s in the Current Dashboard

Our existing Omni dashboard demonstrates sophisticated data capabilities:

High-Level Business Metrics:

Customer base tracking: 5,250+ customers monitored
Activity metrics: 170,913+ users, 1.4M+ total events
Segment analysis: 564 active segments/cohorts

Current Visualizations:

Time-series analysis: Multi-period trend charts showing growth patterns
Composition analysis: Stacked area charts revealing category breakdowns over time
Geographic intelligence: US market distribution and regional performance
Detailed breakdowns: Tabular data with drill-down capabilities
Comparative analytics: Bar charts for segment and feature comparisons
Distribution insights: Pie/donut charts showing portfolio composition
Financial tracking: Revenue metrics with percentage breakdowns (77.61%, 22.39% splits visible)

Data Currently Represented:

Product usage events and user activity (from Vanilla Postgres)
Some revenue/subscription data (tracking visible, but not totaling $58M)
User segmentation and cohort analysis
Geographic distribution (US-focused)
Temporal trends across multiple time windows

Data Sources: Current vs. Target State

Source	Current State	Target State
Product Data (Postgres)	Flowing but stale (manual sync)	Real-time ETL pipeline
Salesforce CRM	Not integrated	Daily sync of accounts, contacts, opportunities
Hyperline Subscriptions	Partial (some revenue data visible)	Complete subscription lifecycle data
Amplitude (Vanilla)	Some historical data	Maintained for historical analysis
Amplitude (Phoenix)	Not instrumented	Fully tracked with event taxonomy

Critical Gaps Blocking Scale

No ETL infrastructure: Manual processes don’t scale, create data freshness issues
No Salesforce integration: Cannot connect customer attributes to product behavior
Phoenix not instrumented: Blind to next-gen product adoption and usage patterns
Limited self-service access: Stakeholders wait for data team, slowing decisions
No data quality monitoring: Hard to trust metrics, no alerting on pipeline failures
Key business questions we cannot answer yet:
- Customer health scoring (product engagement + CRM attributes)
- Churn prediction models
- Phoenix vs. Vanilla migration tracking
- Feature-level ROI analysis
- Sales funnel conversion with product trial data

From Good Insights to Great Decisions

What We Can Answer Today (Current Dashboard)

Our existing Omni dashboard enables analysis like:

“How many customers do we have and what’s our revenue?”
→ 5,250 customers, revenue tracking visible (excluding aggregate customer-raised figures)

“What’s our geographic distribution?”
→ US market heatmap showing regional concentration

“How is our activity trending over time?”
→ Time-series charts showing growth in users (170K+) and events (1.4M+)

“What’s our revenue split by category?”
→ Percentage breakdowns showing 77.61% / 22.39% distribution visible

“Which segments are most active?”
→ 564 distinct segments tracked with comparative analytics

What We Can’t Answer Yet (But Need To)

“Which customers are at risk of churning?”
→ Need: Product engagement trends + subscription status + CRM health signals

“What’s the conversion rate from trial to paid, and which product features drive it?”
→ Need: Salesforce opportunity data + product activation events + Hyperline subscriptions

“How do customers use Phoenix vs. Vanilla, and are they migrating?”
→ Need: Phoenix instrumentation in Amplitude + migration funnel tracking

“What’s the ROI of each feature we’ve built?”
→ Need: Development cost data + feature adoption metrics + retention impact analysis

“Which customer segments have the highest LTV and why?”
→ Need: Integrated view of acquisition source + product usage + expansion/churn + support costs

“Are our recent pricing changes working?”
→ Need: Real-time subscription data + cohort analysis by pricing tier + usage patterns

The Gap: Integration + Automation

We have great visualizations of partial data. We need complete, fresh data to make confident decisions.

The project delivers:

Salesforce integration: Customer context and sales intelligence
Hyperline integration: Subscription lifecycle and revenue precision
Phoenix instrumentation: Next-gen product behavioral insights
ETL automation: Fresh data (< 4 hours vs. days/weeks)
Self-service enablement: Teams answer their own questions

Strategic Priorities

Priority 1: Scale & Enhance Business Dashboards (Weeks 1-4)

Goal: Transform existing dashboard foundation into enterprise-grade, integrated reporting

Current State: Our existing Omni dashboard tracks 5,250+ customers and 1.4M+ events with sophisticated visualizations. However, data freshness issues and missing integrations limit its utility.

What We’ll Build:

Establish ETL infrastructure (Fivetran or Airbyte) for reliable, automated data pipelines
Integrate Salesforce: Connect customer attributes, sales stages, and opportunities to product usage
Complete Hyperline integration: Full subscription lifecycle, MRR/ARR tracking, churn/expansion metrics
Modernize Postgres sync: Real-time or near-real-time product event streaming
Enhance existing dashboards with stakeholder feedback:
- Customer 360 view: Product engagement + CRM attributes + subscription status
- Revenue intelligence: Subscription trends, expansion/contraction, cohort LTV
- Sales effectiveness: Pipeline conversion tied to product trial behavior
- Product adoption: Feature usage trends, activation funnels, engagement scoring

Quick Wins from Existing Data:

Improve data freshness from days/weeks to hours (via ETL automation)
Add customer segmentation by combining product usage (existing) + Salesforce attributes (new)
Build churn risk indicators by correlating engagement drops with subscription data
Create sales handoff reports showing trial-to-paid conversion metrics

Why: Multiple stakeholders validated strong demand in the Omni demo. We have the foundation - now we need to make it reliable, complete, and actionable.

Success Metrics:

Data freshness: < 4 hours for all sources
Dashboard uptime: 99.5%+
Stakeholder adoption: 10+ active users across sales, CS, product, leadership
Self-service rate: 50%+ of data requests answered via existing dashboards
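
To make the first two targets concrete, here is a small sketch of a freshness check and an uptime budget calculation. The source names and timestamps are made-up examples; a real check would read the latest load time from whichever ETL tool is selected.

```python
from datetime import datetime, timedelta, timezone

FRESHNESS_TARGET = timedelta(hours=4)  # the "< 4 hours" target above


def stale_sources(last_loaded_at, now, target=FRESHNESS_TARGET):
    """Return the names of sources whose latest load is older than the target."""
    return sorted(name for name, ts in last_loaded_at.items() if now - ts > target)


def allowed_downtime_hours(uptime_target, period_days=30):
    """Downtime budget implied by an uptime target over a period."""
    return (1 - uptime_target) * period_days * 24


# Hypothetical load timestamps for illustration only.
now = datetime(2025, 12, 15, 12, 0, tzinfo=timezone.utc)
last_loaded_at = {
    "postgres": now - timedelta(hours=1),
    "salesforce": now - timedelta(hours=6),
    "hyperline": now - timedelta(hours=3, minutes=59),
}

print(stale_sources(last_loaded_at, now))
print(round(allowed_downtime_hours(0.995), 2))
```

In this example only `salesforce` breaches the target, and a 99.5% uptime goal leaves roughly 3.6 hours of downtime per 30-day month.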

Dependencies:

ETL tool selection and procurement (Week 1)
Salesforce and Hyperline API access and documentation
Data model alignment with stakeholders (1-hour kickoff workshop)
Monitoring and alerting setup for pipeline health

Priority 2: Instrument Phoenix for Product Analytics (Weeks 5-8)

Goal: Establish robust product analytics for our next-generation product

What:

Design and implement event tracking strategy for Phoenix
Instrument key user flows and feature interactions
Build product analytics dashboards in Amplitude
Create self-serve product insights in Omni

Why: Phoenix represents our product future. We need to understand user behavior from day one to inform product development, prioritize features, and identify friction points. Amplitude gives us best-in-class behavioral analytics capabilities.

Dependencies:

Product team alignment on key events and metrics
Engineering implementation of tracking (client-side + server-side)
Amplitude workspace configuration and governance
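
One way to approach the event tracking strategy and governance items above is to keep the taxonomy in code and validate events before they are sent. The sketch below is an assumption-heavy illustration: the event names, required properties, and the snake_case `object_action` convention are hypothetical, and it deliberately makes no calls to the Amplitude SDK.

```python
import re
from dataclasses import dataclass, field

# Assumed naming convention: lowercase snake_case, e.g. "feature_used".
EVENT_NAME = re.compile(r"^[a-z]+(_[a-z]+)+$")


@dataclass(frozen=True)
class EventSpec:
    name: str
    required_properties: frozenset = field(default_factory=frozenset)


# Hypothetical taxonomy entries; the real list comes from product team alignment.
TAXONOMY = {
    spec.name: spec
    for spec in (
        EventSpec("feature_used", frozenset({"account_id", "feature_name"})),
        EventSpec("onboarding_completed", frozenset({"account_id"})),
    )
}


def validate_event(name, properties):
    """Return a list of problems; an empty list means the event conforms."""
    problems = []
    if not EVENT_NAME.match(name):
        problems.append(f"name '{name}' does not follow the snake_case convention")
    spec = TAXONOMY.get(name)
    if spec is None:
        problems.append(f"'{name}' is not in the taxonomy")
    else:
        missing = sorted(spec.required_properties - set(properties))
        if missing:
            problems.append(f"missing properties: {', '.join(missing)}")
    return problems
```

A check like this could run in tests or at the tracking call site, so that unplanned event names do not accumulate in the workspace.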

Priority 3: Enable Team Self-Service on Omni (Weeks 9-12)

Goal: Onboard the Default team to query and explore data independently

What:

Create documentation and training materials for Omni
Onboard stakeholders across sales, CS, product, marketing
Establish data governance and access controls
Build library of common queries and templates

Why: Eliminate the data team bottleneck and empower stakeholders to answer their own questions. A self-service culture accelerates decision-making and builds data literacy.

Dependencies:

Stable data models and refresh schedules from Priority 1
Clear documentation of available data and metrics
Training sessions and office hours

What Success Looks Like

Today (Baseline)

Our current dashboard tracks:

5,250 customers with product usage data
1.4M+ events and 170K+ users monitored
Manual data refreshes (days to weeks lag)
Limited self-service (data team dependency)

6 Months (Phase 1-3 Complete)

Integrated customer view: Salesforce + Hyperline + Product data in single dashboard
< 4 hour data freshness across all sources via automated ETL
Phoenix behavioral analytics live in Amplitude with 100+ tracked events
10+ stakeholders self-serving 80%+ of reporting needs in Omni
Key dashboards operational:
- Customer Health 360 (product engagement + CRM + subscriptions)
- Revenue Intelligence (MRR/ARR trends, cohort LTV, churn/expansion)
- Sales Effectiveness (pipeline to trial to paid conversion)
- Phoenix Adoption (feature usage, activation funnels, retention cohorts)
ETL infrastructure stable with 99.5%+ uptime and automated alerting

12 Months (Advanced Capabilities)

Predictive models for churn risk and expansion scoring
Real-time alerting for at-risk customers (CS team) and hot leads (sales team)
A/B testing framework integrated with Amplitude + automated experiment tracking
Advanced analytics:
- Customer journey mapping (awareness to activation to retention)
- Feature ROI analysis (development cost vs. adoption vs. retention impact)
- Pricing optimization modeling (willingness-to-pay by segment)
- Phoenix migration tracking (Vanilla to Phoenix conversion funnels)
Data-driven culture where every major decision references specific metrics
Scaled tracking: 10,000+ customers, $100M+ revenue, 5M+ monthly events

Investment Required

Infrastructure & Tooling (Incremental Costs)

Already Operational:

MotherDuck data warehouse
Omni Analytics with working dashboards
Amplitude workspace initialized

New Investment Needed:

ETL Tool: Fivetran, Airbyte, or Polytomic (final selection in Week 1)
- Justification: Eliminates manual data processes, ensures freshness, reduces engineering maintenance burden
MotherDuck: Potential upgrade for increased query volume (~$500-1K/month incremental)
Omni Analytics: Additional user seats for self-service access
Amplitude: Phoenix instrumentation implementation effort (one-time engineering cost)

Appendix: Data Types & Use Cases

Customer Data (Salesforce)

What: Account attributes, contact information, sales stages, opportunity pipeline
Use Cases:

Customer segmentation (industry, size, plan type)
Sales funnel analysis and conversion rates
Account health scoring (product usage + CRM data)
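
As an illustration of the funnel analysis use case, the sketch below computes step-to-step conversion rates for an ordered funnel. The stage names mirror the pipeline to trial to paid flow described earlier; the counts are invented for the example.

```python
def stage_conversion(stage_counts):
    """Step-to-step conversion rates for an ordered list of (stage, count) pairs."""
    rates = []
    for (prev_name, prev_n), (name, n) in zip(stage_counts, stage_counts[1:]):
        # Guard against an empty upstream stage to avoid division by zero.
        rate = n / prev_n if prev_n else 0.0
        rates.append((f"{prev_name} -> {name}", rate))
    return rates


# Hypothetical counts for illustration only.
funnel = [("pipeline", 400), ("trial", 100), ("paid", 25)]

for step, rate in stage_conversion(funnel):
    print(f"{step}: {rate:.0%}")
```

With these sample counts, each step converts at 25%.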

Product Data (Postgres to MotherDuck)

What: User actions, feature usage, session data, technical metrics
Use Cases:

Feature adoption and engagement trends
User journey analysis (onboarding to activation to retention)
Technical performance monitoring

Subscription/Revenue Data (Hyperline)

What: Subscription plans, pricing, MRR/ARR, upgrades/downgrades, churn
Use Cases:

Revenue reporting and forecasting
Pricing optimization analysis
Expansion and contraction trends
Customer lifetime value (LTV) calculations
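
For reference, the revenue metrics above can be related with simple arithmetic. The sketch uses two common simplifications that this document does not itself specify: ARR as twelve times MRR, and LTV as margin-adjusted monthly revenue per account divided by monthly churn. The input values are placeholders.

```python
def arr_from_mrr(mrr):
    """Annualize monthly recurring revenue."""
    return mrr * 12


def simple_ltv(avg_monthly_revenue_per_account, gross_margin, monthly_churn_rate):
    """Simplified LTV: margin-adjusted monthly revenue divided by monthly churn.

    Assumes constant churn and revenue per account, which real cohorts rarely have.
    """
    if not 0 < monthly_churn_rate <= 1:
        raise ValueError("monthly_churn_rate must be in (0, 1]")
    return avg_monthly_revenue_per_account * gross_margin / monthly_churn_rate


# Placeholder inputs for illustration only.
print(arr_from_mrr(1000))
print(round(simple_ltv(200.0, 0.8, 0.02), 2))
```

Cohort-based LTV built from actual Hyperline subscription histories would replace this approximation once the integration is complete.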

Behavioral Analytics (Amplitude)

What: Event-level user interactions, funnels, retention cohorts, user paths
Use Cases:

Product engagement deep-dives
A/B test analysis
Retention and churn prediction
Feature experimentation and optimization
