Default Data Infrastructure Project
Executive Summary
Last Updated: December 15, 2025
Status: Phase 1 - Foundation & Business Reporting
Dashboard: View Live Dashboard
TL;DR
We’ve proven data-driven decision-making works at Default. Our existing dashboard tracks 5,250+ customers and 1.4M+ events with sophisticated analytics in Omni. Leadership uses these insights for pricing, product strategy, and company-wide reporting.
Now we need to scale. Manual data processes, missing integrations (Salesforce, complete Hyperline data), and lack of Phoenix instrumentation prevent us from answering critical questions about customer health, churn prediction, and product ROI.
This project: 12-week investment to automate data pipelines (ETL), integrate missing sources, instrument Phoenix with Amplitude, and enable self-service analytics. Result: Real-time customer intelligence, predictive insights, and data access for 10+ stakeholders across sales, CS, and product.
Bottom line: Transform working foundation into enterprise-grade data infrastructure that scales with our growth.
Where We Are Today
Default has proven the value of data-driven decision-making with our existing infrastructure:
What’s Working
- Omni Analytics dashboard operational with comprehensive metrics
- MotherDuck data warehouse established and serving queries
- Key business metrics tracked: customer counts (5,250+ customers), activity metrics (1.4M+ events), and geographic distribution
- Sophisticated visualizations including time-series trends, stacked area charts, geographic analysis, and financial breakdowns
- Demonstrated impact on pricing strategy, company-wide reporting, and product adoption insights
The Challenge
However, our current setup cannot scale to meet growing demands:
- Data freshness issues: Product data from Postgres is stale due to manual sync processes - no automated ETL pipeline
- Missing critical integrations: Salesforce (customer/sales) data is not yet flowing into the warehouse, and Hyperline (subscriptions) data is only partially available
- Phoenix instrumentation gap: Our next-generation product lacks behavioral analytics tracking in Amplitude
- Limited data accessibility: Stakeholders depend on data team for insights, cannot self-serve
- Infrastructure brittleness: Ad-hoc processes create reliability risks and maintenance burden
The Business Impact
Our recent Omni demo revealed urgent demand from multiple stakeholders across sales, customer success, product, and leadership for expanded reporting capabilities. Teams need answers to questions we can’t reliably answer today:
- Which customer segments have the highest retention and expansion potential?
- What product usage patterns predict churn vs. growth?
- How do pricing changes impact adoption and revenue?
- Which features in Phoenix drive value and which are ignored?
- What’s the real-time health of our customer base across product usage + CRM data?
Without modernizing our data infrastructure, we’re building on a foundation that can’t support our growth.
The Opportunity
When you have a self-serve software offering, data infrastructure IS product infrastructure. The companies that win in self-serve are the ones that can:
- Identify ideal customer profiles by connecting customer attributes (Salesforce), product usage (Amplitude/Postgres), and revenue outcomes (Hyperline)
- Predict and prevent churn by spotting engagement drop-offs and proactively intervening
- Optimize pricing and packaging using data-driven insights on feature adoption and willingness to pay
- Accelerate product development by understanding which features drive retention and expansion
- Empower teams to self-serve with reliable dashboards instead of waiting for data requests
The real power comes from having all of this in one place. You can see that a certain type of customer (Salesforce) adopts specific features quickly (Amplitude), uses them heavily (product usage), expands their subscription (Hyperline), and has low support costs (customer success data). That’s your ideal customer profile, and you can only see it when the data connects.
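As a rough illustration of what "connecting the data" means in practice, the sketch below merges per-account records from the three sources on a shared key. The field names (`account_id`, `industry`, `events`, `mrr_change`) and the 1,000-event threshold are hypothetical placeholders, not the actual Salesforce, Postgres, or Hyperline schemas; in the warehouse this would more likely be a SQL join.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Customer360:
    account_id: str
    industry: Optional[str] = None   # Salesforce attribute (hypothetical field)
    events_30d: int = 0              # product usage count (hypothetical field)
    mrr_change: float = 0.0          # Hyperline expansion/contraction (hypothetical field)


def build_customer_360(crm_rows, usage_rows, subscription_rows):
    """Merge per-account records from three sources, keyed on account_id."""
    profiles = {}

    def profile(account_id):
        # Create the profile on first sight so no source is required to come first.
        return profiles.setdefault(account_id, Customer360(account_id))

    for row in crm_rows:
        profile(row["account_id"]).industry = row["industry"]
    for row in usage_rows:
        profile(row["account_id"]).events_30d += row["events"]
    for row in subscription_rows:
        profile(row["account_id"]).mrr_change += row["mrr_change"]
    return profiles


def ideal_customer_candidates(profiles, min_events=1000):
    """Accounts that are both heavy users and expanding their subscription."""
    return sorted(
        p.account_id
        for p in profiles.values()
        if p.events_30d >= min_events and p.mrr_change > 0
    )


# Made-up sample rows for illustration only.
crm = [{"account_id": "a1", "industry": "SaaS"},
       {"account_id": "a2", "industry": "Retail"}]
usage = [{"account_id": "a1", "events": 1500},
         {"account_id": "a2", "events": 200}]
subs = [{"account_id": "a1", "mrr_change": 200.0},
        {"account_id": "a2", "mrr_change": -50.0}]

profiles = build_customer_360(crm, usage, subs)
print(ideal_customer_candidates(profiles))  # ['a1']
```

The point of the sketch is the shared key: none of the three sources alone can identify `a1` as a heavy, expanding account.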
To date, we’ve already demonstrated the value of data-driven decision-making:
- Used usage analytics to inform pricing strategy decisions
- Built company-wide reporting dashboards that leadership relies on
- Identified product adoption patterns across customer segments
Now we need to scale this capability with proper infrastructure.
Proof of Concept: What We’ve Built
Our existing Omni dashboard demonstrates that we’ve already solved the hard problems:
What the Dashboard Proves
We can integrate complex data sources:
- Successfully pulling product usage from Postgres
- Monitoring 5,250+ customers and 1.4M+ events
We can build sophisticated analytics:
- Time-series trend analysis showing growth patterns
- Stacked composition charts revealing category breakdowns
- Geographic intelligence (US market distribution)
- Detailed drill-down capabilities with tabular data
- Multi-metric KPI tracking (customer counts, activity metrics, revenue percentages)
We have stakeholder buy-in:
- Leadership uses these dashboards for decision-making
- Recent Omni demo generated strong demand from multiple departments
- Existing metrics inform pricing and product strategy
What This Means for the Project
This isn’t a greenfield project - we’re scaling proven capabilities:
- Infrastructure is operational (MotherDuck + Omni)
- Team knows how to model data and build dashboards
- Stakeholders understand the value and are demanding more
- Now we need: Automation, integration, and self-service to unlock full potential
The foundation is solid. The vision is validated. Now we execute.
Current State: Foundation Built, Scale Needed
Infrastructure
- Data Warehouse: MotherDuck (DuckDB cloud) - operational and serving queries
- BI Tool: Omni Analytics - deployed with production dashboards
- Product Analytics: Amplitude - initialized but not instrumented for Phoenix product
- ETL/Data Movement: No established tool - relying on manual/ad-hoc processes
What’s in the Current Dashboard
Our existing Omni dashboard demonstrates sophisticated data capabilities:
High-Level Business Metrics:
- Customer base tracking: 5,250+ customers monitored
- Activity metrics: 170,913+ users, 1.4M+ total events
- Segment analysis: 564 active segments/cohorts
Current Visualizations:
- Time-series analysis: Multi-period trend charts showing growth patterns
- Composition analysis: Stacked area charts revealing category breakdowns over time
- Geographic intelligence: US market distribution and regional performance
- Detailed breakdowns: Tabular data with drill-down capabilities
- Comparative analytics: Bar charts for segment and feature comparisons
- Distribution insights: Pie/donut charts showing portfolio composition
- Financial tracking: Revenue metrics with percentage breakdowns (77.61%, 22.39% splits visible)
Data Currently Represented:
- Product usage events and user activity (from Vanilla Postgres)
- Some revenue/subscription data (tracking visible, but not totaling $58M)
- User segmentation and cohort analysis
- Geographic distribution (US-focused)
- Temporal trends across multiple time windows
Data Sources: Current vs. Target State
| Source | Current State | Target State |
|---|---|---|
| Product Data (Postgres) | Flowing but stale (manual sync) | Automated ETL pipeline (< 4 hour freshness, near-real-time where feasible) |
| Salesforce CRM | Not integrated | Daily sync of accounts, contacts, opportunities |
| Hyperline Subscriptions | Partial (some revenue data visible) | Complete subscription lifecycle data |
| Amplitude (Vanilla) | Some historical data | Maintained for historical analysis |
| Amplitude (Phoenix) | Not instrumented | Fully tracked with event taxonomy |
Critical Gaps Blocking Scale
- No ETL infrastructure: Manual processes don’t scale, create data freshness issues
- No Salesforce integration: Cannot connect customer attributes to product behavior
- Phoenix not instrumented: Blind to next-gen product adoption and usage patterns
- Limited self-service access: Stakeholders wait for data team, slowing decisions
- No data quality monitoring: Hard to trust metrics, no alerting on pipeline failures
- Key analyses we cannot yet deliver:
  - Customer health scoring (product engagement + CRM attributes)
  - Churn prediction models
  - Phoenix vs. Vanilla migration tracking
  - Feature-level ROI analysis
  - Sales funnel conversion with product trial data
From Good Insights to Great Decisions
What We Can Answer Today (Current Dashboard)
Our existing Omni dashboard enables analysis like:
“How many customers do we have and what’s our revenue?”
→ 5,250 customers, revenue tracking visible (excluding aggregate customer-raised figures)
“What’s our geographic distribution?”
→ US market heatmap showing regional concentration
“How is our activity trending over time?”
→ Time-series charts showing growth in users (170K+) and events (1.4M+)
“What’s our revenue split by category?”
→ Percentage breakdowns showing 77.61% / 22.39% distribution visible
“Which segments are most active?”
→ 564 distinct segments tracked with comparative analytics
What We Can’t Answer Yet (But Need To)
“Which customers are at risk of churning?”
→ Need: Product engagement trends + subscription status + CRM health signals
“What’s the conversion rate from trial to paid, and which product features drive it?”
→ Need: Salesforce opportunity data + product activation events + Hyperline subscriptions
“How do customers use Phoenix vs. Vanilla, and are they migrating?”
→ Need: Phoenix instrumentation in Amplitude + migration funnel tracking
“What’s the ROI of each feature we’ve built?”
→ Need: Development cost data + feature adoption metrics + retention impact analysis
“Which customer segments have the highest LTV and why?”
→ Need: Integrated view of acquisition source + product usage + expansion/churn + support costs
“Are our recent pricing changes working?”
→ Need: Real-time subscription data + cohort analysis by pricing tier + usage patterns
The Gap: Integration + Automation
We have great visualizations of partial data. We need complete, fresh data to make confident decisions.
The project delivers:
- Salesforce integration: Customer context and sales intelligence
- Hyperline integration: Subscription lifecycle and revenue precision
- Phoenix instrumentation: Next-gen product behavioral insights
- ETL automation: Fresh data (< 4 hours vs. days/weeks)
- Self-service enablement: Teams answer their own questions
Strategic Priorities
Priority 1: Scale & Enhance Business Dashboards (Weeks 1-4)
Goal: Transform existing dashboard foundation into enterprise-grade, integrated reporting
Current State: Our existing Omni dashboard tracks 5,250+ customers and 1.4M+ events with sophisticated visualizations. However, data freshness issues and missing integrations limit its utility.
What We’ll Build:
- Establish ETL infrastructure (Fivetran or Airbyte) for reliable, automated data pipelines
- Integrate Salesforce: Connect customer attributes, sales stages, and opportunities to product usage
- Complete Hyperline integration: Full subscription lifecycle, MRR/ARR tracking, churn/expansion metrics
- Modernize Postgres sync: Real-time or near-real-time product event streaming
- Enhance existing dashboards with stakeholder feedback:
  - Customer 360 view: Product engagement + CRM attributes + subscription status
  - Revenue intelligence: Subscription trends, expansion/contraction, cohort LTV
  - Sales effectiveness: Pipeline conversion tied to product trial behavior
  - Product adoption: Feature usage trends, activation funnels, engagement scoring
Quick Wins from Existing Data:
- Improve data freshness from days/weeks to hours (via ETL automation)
- Add customer segmentation by combining product usage (existing) + Salesforce attributes (new)
- Build churn risk indicators by correlating engagement drops with subscription data
- Create sales handoff reports showing trial-to-paid conversion metrics
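A churn risk indicator of the kind listed above could start as a simple rule before any predictive model exists. The sketch below is one possible shape, not a validated scoring method: the 50% drop threshold and the 60-day renewal window are illustrative assumptions to be tuned against real churn outcomes.

```python
def engagement_drop(prev_events: int, curr_events: int) -> float:
    """Fractional drop in activity between two periods (0.0 if no drop)."""
    if prev_events <= 0:
        return 0.0  # no baseline activity, so no measurable drop
    return max(0.0, (prev_events - curr_events) / prev_events)


def churn_risk(prev_events, curr_events, days_to_renewal,
               drop_threshold=0.5, renewal_window_days=60):
    """Combine an engagement drop (product data) with renewal timing (subscription data)."""
    drop = engagement_drop(prev_events, curr_events)
    if drop >= drop_threshold and days_to_renewal <= renewal_window_days:
        return "high"    # usage fell sharply and renewal is close
    if drop >= drop_threshold:
        return "medium"  # usage fell sharply but renewal is far off
    return "low"


print(churn_risk(1000, 400, days_to_renewal=30))   # high
print(churn_risk(1000, 400, days_to_renewal=200))  # medium
print(churn_risk(1000, 900, days_to_renewal=30))   # low
```

A rule like this is easy to explain to CS and sales, which makes it a reasonable baseline to compare later models against.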
Why: Multiple stakeholders validated strong demand in Omni demo. We have the foundation - now we need to make it reliable, complete, and actionable.
Success Metrics:
- Data freshness: < 4 hours for all sources
- Dashboard uptime: 99.5%+
- Stakeholder adoption: 10+ active users across sales, CS, product, leadership
- Self-service rate: 50%+ of data requests answered via existing dashboards
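To make the freshness and uptime targets concrete, here is a minimal sketch of how they could be checked. The source names and timestamps are made up, and how last-load times are obtained depends on the ETL tool chosen, so that part is left out.

```python
from datetime import datetime, timedelta, timezone

FRESHNESS_TARGET = timedelta(hours=4)  # from the success metric above


def stale_sources(last_loaded_at, now, target=FRESHNESS_TARGET):
    """Return {source: lag} for every source whose last load is older than target."""
    return {
        source: now - loaded_at
        for source, loaded_at in last_loaded_at.items()
        if now - loaded_at > target
    }


def downtime_budget(uptime_target, period):
    """Maximum downtime allowed in `period` while still meeting `uptime_target`."""
    return period * (1 - uptime_target)


# Made-up load times for illustration only.
now = datetime(2025, 12, 15, 12, 0, tzinfo=timezone.utc)
last_loaded_at = {
    "postgres": datetime(2025, 12, 15, 11, 0, tzinfo=timezone.utc),
    "salesforce": datetime(2025, 12, 15, 6, 30, tzinfo=timezone.utc),
    "hyperline": datetime(2025, 12, 15, 9, 0, tzinfo=timezone.utc),
}

stale = stale_sources(last_loaded_at, now)
print(stale)  # only salesforce, with a lag of 5.5 hours

budget = downtime_budget(0.995, timedelta(days=30))
print(budget)  # roughly 3.6 hours per 30 days
```

Note that 99.5% uptime leaves only about 3.6 hours of downtime in a 30-day month, which is less than a single missed 4-hour freshness window.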
Dependencies:
- ETL tool selection and procurement (Week 1)
- Salesforce and Hyperline API access and documentation
- Data model alignment with stakeholders (1-hour kickoff workshop)
- Monitoring and alerting setup for pipeline health
Priority 2: Instrument Phoenix for Product Analytics (Weeks 5-8)
Goal: Establish robust product analytics for our next-generation product
What:
- Design and implement event tracking strategy for Phoenix
- Instrument key user flows and feature interactions
- Build product analytics dashboards in Amplitude
- Create self-serve product insights in Omni
Why: Phoenix represents our product future. We need to understand user behavior from day one to inform product development, prioritize features, and identify friction points. Amplitude gives us best-in-class behavioral analytics capabilities.
Dependencies:
- Product team alignment on key events and metrics
- Engineering implementation of tracking (client-side + server-side)
- Amplitude workspace configuration and governance
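One way to keep the Phoenix event taxonomy governed is to validate events against a written spec before they are sent. The sketch below assumes a hypothetical `object_action` snake_case naming convention and a made-up `workflow_created` event. It does not call the Amplitude SDK, and the real event list would come from the product team alignment noted above.

```python
import re
from dataclasses import dataclass, field

# Hypothetical convention: snake_case "object_action", e.g. "workflow_created".
EVENT_NAME_PATTERN = re.compile(r"^[a-z]+(_[a-z]+)+$")


@dataclass(frozen=True)
class EventSpec:
    name: str
    required_properties: frozenset = field(default_factory=frozenset)


def validate_event(spec_by_name, name, properties):
    """Return a list of problems; an empty list means the event conforms."""
    problems = []
    if not EVENT_NAME_PATTERN.match(name):
        problems.append(f"name '{name}' is not snake_case object_action")
    spec = spec_by_name.get(name)
    if spec is None:
        problems.append(f"event '{name}' is not in the taxonomy")
        return problems
    missing = sorted(spec.required_properties - set(properties))
    if missing:
        problems.append(f"missing properties: {', '.join(missing)}")
    return problems


# Made-up taxonomy entry for illustration only.
taxonomy = {
    "workflow_created": EventSpec(
        "workflow_created", frozenset({"account_id", "plan"})
    ),
}

print(validate_event(taxonomy, "workflow_created", {"account_id": "a1", "plan": "pro"}))  # []
print(validate_event(taxonomy, "workflow_created", {"account_id": "a1"}))
print(validate_event(taxonomy, "WorkflowCreated", {}))
```

Running a check like this in CI or in a thin tracking wrapper would catch naming drift early, before inconsistent events accumulate in the workspace.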
Priority 3: Enable Team Self-Service on Omni (Weeks 9-12)
Goal: Onboard Default team to independently query and explore data
What:
- Create documentation and training materials for Omni
- Onboard stakeholders across sales, CS, product, marketing
- Establish data governance and access controls
- Build library of common queries and templates
Why: Eliminate data team bottleneck and empower stakeholders to answer their own questions. Self-service culture accelerates decision-making and builds data literacy.
Dependencies:
- Stable data models and refresh schedules from Priority 1
- Clear documentation of available data and metrics
- Training sessions and office hours
What Success Looks Like
Today (Baseline)
Our current dashboard tracks:
- 5,250 customers with product usage data
- 1.4M+ events and 170K+ users monitored
- Manual data refreshes (days to weeks lag)
- Limited self-service (data team dependency)
6 Months (Priorities 1-3 Complete)
- Integrated customer view: Salesforce + Hyperline + Product data in single dashboard
- < 4 hour data freshness across all sources via automated ETL
- Phoenix behavioral analytics live in Amplitude with 100+ tracked events
- 10+ stakeholders self-serving 80%+ of reporting needs in Omni
- Key dashboards operational:
  - Customer Health 360 (product engagement + CRM + subscriptions)
  - Revenue Intelligence (MRR/ARR trends, cohort LTV, churn/expansion)
  - Sales Effectiveness (pipeline to trial to paid conversion)
  - Phoenix Adoption (feature usage, activation funnels, retention cohorts)
- ETL infrastructure stable with 99.5%+ uptime and automated alerting
12 Months (Advanced Capabilities)
- Predictive models for churn risk and expansion scoring
- Real-time alerting for at-risk customers (CS team) and hot leads (sales team)
- A/B testing framework integrated with Amplitude + automated experiment tracking
- Advanced analytics:
  - Customer journey mapping (awareness to activation to retention)
  - Feature ROI analysis (development cost vs. adoption vs. retention impact)
  - Pricing optimization modeling (willingness-to-pay by segment)
  - Phoenix migration tracking (Vanilla to Phoenix conversion funnels)
- Data-driven culture where every major decision references specific metrics
- Scaled tracking: 10,000+ customers, $100M+ revenue, 5M+ monthly events
Investment Required
Infrastructure & Tooling (Incremental Costs)
Already Operational:
- MotherDuck data warehouse
- Omni Analytics with working dashboards
- Amplitude workspace initialized
New Investment Needed:
- ETL Tool: Fivetran, Airbyte, or Polytomic (selection in Week 1)
  - Justification: Eliminates manual data processes, ensures freshness, reduces engineering maintenance burden
- MotherDuck: Potential upgrade for increased query volume (~$500-1K/month incremental)
- Omni Analytics: Additional user seats for self-service access
- Amplitude: Phoenix instrumentation implementation effort (one-time engineering cost)
Appendix: Data Types & Use Cases
Customer Data (Salesforce)
What: Account attributes, contact information, sales stages, opportunity pipeline
Use Cases:
- Customer segmentation (industry, size, plan type)
- Sales funnel analysis and conversion rates
- Account health scoring (product usage + CRM data)
Product Data (Postgres to MotherDuck)
What: User actions, feature usage, session data, technical metrics
Use Cases:
- Feature adoption and engagement trends
- User journey analysis (onboarding to activation to retention)
- Technical performance monitoring
Subscription/Revenue Data (Hyperline)
What: Subscription plans, pricing, MRR/ARR, upgrades/downgrades, churn
Use Cases:
- Revenue reporting and forecasting
- Pricing optimization analysis
- Expansion and contraction trends
- Customer lifetime value (LTV) calculations
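For reference, MRR and ARR can be derived from subscription records by normalizing each active subscription to a monthly amount. The sketch below uses a hypothetical `Subscription` record and made-up sample data. The actual Hyperline data model will differ, and discounts, usage-based charges, and currencies are ignored here.

```python
from dataclasses import dataclass


@dataclass
class Subscription:
    account_id: str
    amount: float        # price charged per billing period
    period_months: int   # 1 = monthly billing, 12 = annual billing
    active: bool = True


def mrr(subscriptions):
    """Monthly recurring revenue: active subscriptions normalized to one month."""
    return sum(s.amount / s.period_months for s in subscriptions if s.active)


def arr(subscriptions):
    """Annual recurring revenue, defined here as 12 x MRR."""
    return 12 * mrr(subscriptions)


# Made-up subscriptions for illustration only.
subscriptions = [
    Subscription("a1", 100.0, 1),                 # monthly plan
    Subscription("a2", 1200.0, 12),               # annual plan, 100/month
    Subscription("a3", 50.0, 1, active=False),    # churned, excluded
]

print(mrr(subscriptions))  # 200.0
print(arr(subscriptions))  # 2400.0
```

Agreeing on one definition like this up front avoids dashboards that report different revenue totals for the same period.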
Behavioral Analytics (Amplitude)
What: Event-level user interactions, funnels, retention cohorts, user paths
Use Cases:
- Product engagement deep-dives
- A/B test analysis
- Retention and churn prediction
- Feature experimentation and optimization