LMNT: QA/Investigation Framework for Shivani’s Questions

Date Created: 2026-02-16
Purpose: Framework for answering Shivani’s ad-hoc questions systematically


Framework Structure

For each question, follow this pattern:

  1. THINKING: What we know, what’s missing, what needs investigation
  2. SUMMARY: Answer/approach to take
  3. INVESTIGATION STEPS: Specific actions to get the answer

Question 1: “Have you been able to join any Walmart and Target data?”

THINKING

What we know:

  • Awaish confirmed: Walmart and Target POS data ARE joined in fact_sales table
  • They’re joined via products table (subject fields)
  • Walmart has ADDITIONAL data in separate table: retail_fact_walmart_only_sales (order management system, not POS)
  • Target only sends POS data (simpler structure - just daily sales and daily inventory tables)
  • Walmart sends 12 tables from Emerson, Target sends 3 tables
  • Different table structures between Walmart and Target

What’s missing:

  • Exact join logic/keys used
  • Whether geographic data is preserved in the join
  • Date alignment between sources
  • Any data quality issues from the join

What needs investigation:

  • Check fact_sales table structure in Snowflake
  • Verify join keys and logic in dbt models
  • Test query: SELECT retailer, SUM(sales) FROM fact_sales GROUP BY retailer
  • Check if there are NULLs or data quality issues

SUMMARY

Answer: Yes, Walmart and Target POS data are joined in the fact_sales table. However:

  • Walmart total revenue requires joining TWO tables: fact_sales + retail_fact_walmart_only_sales
  • Target revenue is complete in fact_sales alone
  • For apples-to-apples comparison: Use only fact_sales (POS data only)
  • For total Walmart revenue: Must join both Walmart tables

Investigation Steps:

  1. Query fact_sales to confirm both retailers present
  2. Check row counts by retailer
  3. Verify date ranges match between retailers
  4. Test joining retail_fact_walmart_only_sales to get total Walmart revenue
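Steps 1-3 can be collapsed into a single sketch query; retailer, sales, and date are assumed field names and may differ in the actual fact_sales schema:

```sql
-- Assumed schema: fact_sales(retailer, sales, date).
-- One row per retailer: confirms presence, row counts, and date coverage.
SELECT
    retailer,
    COUNT(*)   AS row_count,
    SUM(sales) AS total_sales,
    MIN(date)  AS earliest_date,
    MAX(date)  AS latest_date
FROM BROADMARTS.retail_mart.fact_sales
GROUP BY retailer
ORDER BY retailer;
```

If both retailers appear with overlapping date ranges, the join claim is confirmed; a missing retailer or mismatched ranges points straight at the follow-up questions above.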

Question 2: “What were sales across those two stores yesterday? Would you be able to find it?”

THINKING

What we know:

  • Data exists in fact_sales table
  • Awaish is working on joining Walmart tables (in progress)
  • Date field exists (need to verify exact field name and format)

What’s missing:

  • Exact date field name (sale_date, transaction_date, week_ending_date?)
  • Date format (timestamp vs date)
  • Whether “yesterday” means calendar date or business date
  • Timezone considerations
  • Whether data is loaded daily or weekly

What needs investigation:

  • Check table schema for date fields
  • Verify data freshness (when was last data loaded?)
  • Test query with date filter
  • Check if there’s a lag in data loading

SUMMARY

Answer: Yes, can find it, but need to clarify:

  • For POS data only: Query fact_sales WHERE date = yesterday
  • For total Walmart: Need to join retail_fact_walmart_only_sales (Awaish working on this)
  • Clarify: Does “yesterday” mean calendar date or business date?
  • Check: Is data loaded daily or weekly? (Walmart/Target might send weekly batches)

Investigation Steps:

  1. Check fact_sales schema for date field
  2. Query: SELECT MAX(date) FROM fact_sales to see latest data
  3. Test: SELECT retailer, SUM(sales) FROM fact_sales WHERE date = CURRENT_DATE - 1 GROUP BY retailer
  4. If no data for yesterday, check data loading schedule
  5. Document date field name and format for future queries
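The freshness check and the "yesterday" query above can be sketched as follows; the date and sales field names are assumptions until the schema is verified, and "yesterday" here means the warehouse's calendar date:

```sql
-- Freshness check: most recent loaded date per retailer.
SELECT retailer, MAX(date) AS latest_date
FROM BROADMARTS.retail_mart.fact_sales
GROUP BY retailer;

-- Yesterday's POS sales by retailer (calendar date, warehouse timezone).
SELECT retailer, SUM(sales) AS sales_yesterday
FROM BROADMARTS.retail_mart.fact_sales
WHERE date = CURRENT_DATE - 1
GROUP BY retailer;
```

If the second query returns no rows, the first query tells you whether that is a loading lag (weekly batches) or a genuine gap.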

Question 3: “What are you actually seeing in the delta between the Target and Walmart data right now?”

THINKING

What we know:

  • Different table structures (Walmart: 12 tables, Target: 3 tables)
  • Walmart sends 2 types of data (POS + order management), Target only POS
  • Different data sources (both via Emerson but different formats)
  • Awaish mentioned joining them but didn’t specify differences found

What’s missing:

  • Actual numeric differences (revenue, units, etc.)
  • Definition differences (how they define “sales”, “revenue”, “units”)
  • Date/time differences (week ending dates, recognition timing)
  • Field mapping differences
  • Data quality differences (completeness, accuracy)

What needs investigation:

  • Run comparison queries: Walmart POS vs Target POS
  • Check field definitions in documentation
  • Compare date fields (week ending vs daily)
  • Look for NULLs or missing data patterns
  • Check if there are reconciliation issues

SUMMARY

Answer: Need to investigate systematically. Key areas to check:

1. Definition Differences:

  • How each defines “sales” (gross vs net, before/after discounts)
  • Revenue recognition timing
  • Week ending date definitions (Target might use different week boundaries)

2. Structural Differences:

  • Walmart: 12 tables (more granular)
  • Target: 3 tables (simpler - daily sales + daily inventory)
  • Walmart has order management data Target doesn’t have

3. Data Quality Differences:

  • Completeness (are there gaps?)
  • Accuracy (do numbers reconcile?)
  • Timeliness (loading schedules)

Investigation Steps:

  1. Query both retailers’ POS data from fact_sales for same time period
  2. Compare: SELECT retailer, SUM(sales), COUNT(*), MIN(date), MAX(date) FROM fact_sales GROUP BY retailer
  3. Check documentation for field definitions (Awaish mentioned docs exist)
  4. Look for NULL patterns: SELECT retailer, COUNT(*) as nulls FROM fact_sales WHERE sales IS NULL GROUP BY retailer
  5. Compare week-over-week patterns to see if timing differs
  6. Document findings in a comparison matrix
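A minimal delta sketch for step 2, pivoting the two retailers side by side over a recent window so structural gaps don't masquerade as performance deltas (field names and the 4-week window are assumptions):

```sql
-- Assumed schema: fact_sales(retailer, sales, date).
-- Daily Walmart-vs-Target delta over the last four weeks.
SELECT
    date,
    SUM(IFF(retailer = 'Walmart', sales, 0)) AS walmart_sales,
    SUM(IFF(retailer = 'Target',  sales, 0)) AS target_sales,
    SUM(IFF(retailer = 'Walmart', sales, 0))
      - SUM(IFF(retailer = 'Target', sales, 0)) AS delta
FROM BROADMARTS.retail_mart.fact_sales
WHERE date >= DATEADD(week, -4, CURRENT_DATE)
GROUP BY date
ORDER BY date;
```

Days where one side is zero while the other is populated are the first candidates for loading-schedule or week-boundary differences.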

Question 4: “How is Target performing versus Walmart right now?”

THINKING

What we know:

  • Both retailers’ data exists in fact_sales
  • Can compare POS data directly
  • Need to account for Walmart’s additional order management data

What’s missing:

  • Time period definition (“right now” = this week? this month? YTD?)
  • Metric definition (revenue? units? growth rate?)
  • Context (is this expected? what’s the baseline?)
  • Geographic breakdown (if needed)

What needs investigation:

  • Define time period
  • Choose comparison metric
  • Calculate growth rates if needed
  • Check if geographic segmentation is available

SUMMARY

Answer: Can answer this, but need to clarify scope:

For POS comparison (apples-to-apples):

  • Query fact_sales filtered by date range
  • Compare: SELECT retailer, SUM(sales), COUNT(DISTINCT product), AVG(sales) FROM fact_sales WHERE date >= [start_date] GROUP BY retailer

For total Walmart (includes order management):

  • Join fact_sales + retail_fact_walmart_only_sales
  • Note: This won’t be apples-to-apples with Target

Investigation Steps:

  1. Clarify time period: “right now” = this week? this month? YTD?
  2. Query POS comparison: SELECT retailer, SUM(sales) as revenue, COUNT(DISTINCT date) as days, AVG(sales) as avg_daily FROM fact_sales WHERE date >= [period_start] GROUP BY retailer
  3. Calculate growth rates if comparing to previous period
  4. Add product/category breakdown if needed
  5. Present findings with context (which is performing better and why)
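Steps 2-3 can be combined into one period-over-period sketch; [period_start] and [prior_period_start] stay as placeholders until the time period is clarified with Shivani, and the field names are assumptions:

```sql
-- Assumed schema: fact_sales(retailer, sales, date).
-- Current vs prior period, with growth rate, per retailer.
WITH periods AS (
    SELECT
        retailer,
        SUM(IFF(date >= [period_start], sales, NULL))        AS current_sales,
        SUM(IFF(date >= [prior_period_start]
                AND date < [period_start], sales, NULL))     AS prior_sales
    FROM BROADMARTS.retail_mart.fact_sales
    WHERE date >= [prior_period_start]
    GROUP BY retailer
)
SELECT
    retailer,
    current_sales,
    prior_sales,
    ROUND(100.0 * (current_sales - prior_sales)
          / NULLIF(prior_sales, 0), 1) AS growth_pct
FROM periods;
```

Remember this is POS-only on both sides; adding retail_fact_walmart_only_sales would break the apples-to-apples comparison.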

Question 5: “What about Target in the Northeast versus Target in California?”

THINKING

What we know:

  • Target data exists in fact_sales
  • Geographic data might be in the raw tables
  • Awaish mentioned Target sends daily sales and daily inventory tables

What’s missing:

  • Whether geographic fields exist in fact_sales
  • Field names for geography (region, state, store_id, etc.)
  • How geography is structured (state level? region level? store level?)
  • Whether this data is preserved in the join/modeling

What needs investigation:

  • Check fact_sales schema for geographic fields
  • Check raw Target tables for geographic data
  • Verify if geography was included in the model
  • Test query with geographic filter

SUMMARY

Answer: Unknown - need to investigate geographic data availability.

Possible scenarios:

  1. If geographic data exists: Can answer immediately with filtered query
  2. If geographic data exists in raw but not modeled: Need to add to model or query raw tables
  3. If geographic data doesn’t exist: Need to check with Target/Emerson about data availability

Investigation Steps:

  1. Check fact_sales schema: DESCRIBE TABLE fact_sales - look for fields like region, state, store_id, geography
  2. Check raw Target tables: Look at the 3 Target tables from Emerson share
  3. Query test: SELECT region, SUM(sales) FROM fact_sales WHERE retailer = 'Target' GROUP BY region
  4. If no geographic fields found, check dbt models to see if they were dropped
  5. If not available, ask Awaish if geographic data exists in raw tables
  6. Document findings and next steps
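Steps 1 and 3 can be sketched as below; the column names checked, and the 'Northeast'/'California' values, are assumptions about how geography might be encoded (the real grain could be state- or store-level):

```sql
-- Step 1: do any geographic columns exist in fact_sales?
SELECT column_name
FROM BROADMARTS.information_schema.columns
WHERE table_schema = 'RETAIL_MART'
  AND table_name   = 'FACT_SALES'
  AND column_name IN ('REGION', 'STATE', 'STORE_ID', 'GEOGRAPHY');

-- Step 3: if a region field exists, the comparison itself is simple.
SELECT region, SUM(sales) AS total_sales
FROM BROADMARTS.retail_mart.fact_sales
WHERE retailer = 'Target'
  AND region IN ('Northeast', 'California')
GROUP BY region;
```

If the first query returns nothing, jump straight to step 4 (check whether the dbt models dropped the fields) before asking Awaish.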

Question 6: “What can I actually glean from retail data right now?”

THINKING

What we know:

  • Retail mart exists with fact_sales and retail_fact_walmart_only_sales
  • Data is modeled and available
  • Shivani is frustrated that data exists but isn’t actionable

What’s missing:

  • What specific questions CAN be answered right now
  • What’s blocked (Walmart join, geographic data, etc.)
  • What reporting exists vs what needs to be built
  • What self-service capabilities exist

What needs investigation:

  • List all available fields in retail tables
  • Test common business questions
  • Identify gaps vs requirements
  • Check what reporting/visualization exists

SUMMARY

Answer: Can answer these questions RIGHT NOW:

✅ Can Answer:

  1. Total POS sales by retailer (Walmart vs Target)
  2. Sales trends over time (daily/weekly)
  3. Product performance (if product dimension exists)
  4. Basic comparisons (Walmart POS vs Target POS)

⚠️ Partially Can Answer:

  1. Total Walmart revenue (need to join 2 tables - Awaish working on this)
  2. Category performance (if category field exists)

❌ Cannot Answer Yet:

  1. Geographic breakdowns (need to verify if data exists)
  2. Store-level performance (need to verify)
  3. Advanced analytics (cohorts, retention, etc.)

Investigation Steps:

  1. Document all available fields in retail tables
  2. Create a “What Can I Answer” matrix
  3. Test 5-10 common business questions
  4. Document gaps and blockers
  5. Create a roadmap: “Here’s what you can do today, here’s what’s coming next week”

Question 7: “Where’s that orders table?” (Access Issue)

THINKING

What we know:

  • Robert couldn’t access it when Shivani asked
  • Orders table exists: RAW.POLYTOMIC.SHOPIFY.orders
  • Access/permissions issue in Snowflake
  • Awaish said he’d create new user and send credentials

What’s missing:

  • Whether access has been granted
  • Exact table path/location
  • Whether it’s the right table (Shopify orders vs retail orders)
  • How to access it (Snowflake UI vs SQL client)

What needs investigation:

  • Verify Snowflake access
  • Confirm table location
  • Test query access
  • Document access process for future

SUMMARY

Answer:

  • Table location: RAW.POLYTOMIC.SHOPIFY.orders
  • Issue: Access/permissions (Awaish creating new user)
  • Note: This is Shopify orders (e-commerce + wholesale), NOT retail orders

Investigation Steps:

  1. Verify Snowflake access has been granted (check email from Awaish)
  2. Test access: SELECT * FROM RAW.POLYTOMIC.SHOPIFY.orders LIMIT 10
  3. If access denied, follow up with Awaish
  4. Document full table path and access method
  5. Clarify with Shivani: Does she want Shopify orders or retail orders?

Question 8: “How are Walmart and Target defining revenue differently?”

THINKING

What we know:

  • Different table structures suggest different definitions
  • Walmart sends POS + order management data
  • Target sends only POS data
  • Robert speculated: “Maybe Walmart recognizes revenue at a different time than Target”
  • Awaish mentioned they need to discuss definitions with Will (retail team)

What’s missing:

  • Actual field definitions from documentation
  • Revenue calculation logic in each source
  • Timing differences (when revenue is recognized)
  • Whether discounts/refunds are handled differently
  • Whether there are reconciliation differences

What needs investigation:

  • Check documentation (Awaish mentioned docs exist)
  • Compare revenue fields in both sources
  • Check for discount/refund fields
  • Look at date fields (sale date vs recognition date)
  • Test calculations to see if they match

SUMMARY

Answer: Need to investigate systematically. Key differences likely:

1. Data Structure:

  • Walmart: 12 tables (more granular, includes order management)
  • Target: 3 tables (simpler, POS only)

2. Revenue Recognition:

  • May differ in timing (sale date vs shipment date vs recognition date)
  • May differ in what’s included (gross vs net, discounts, refunds)

3. Data Completeness:

  • Walmart has order management data Target doesn’t have
  • This creates apples-to-oranges comparison if not handled correctly

Investigation Steps:

  1. Review documentation for both retailers (Awaish mentioned docs exist)
  2. Compare revenue fields: SELECT retailer, revenue_field, COUNT(*), SUM(revenue_field) FROM [tables] GROUP BY retailer, revenue_field
  3. Check for discount/refund fields in both
  4. Compare date fields (sale_date vs recognition_date)
  5. Test: Calculate revenue using same logic for both, compare results
  6. Document differences in a comparison table
  7. Schedule meeting with Will (retail team) to align definitions

Question 9: “Why doesn’t our wholesale revenue match finance?”

THINKING

What we know:

  • Finance pulled data from accounting system (QuickBooks/NetSuite)
  • Discrepancy is <10% (Awaish said “probably less than 10%”)
  • Wholesale team doesn’t have issues with the data
  • Amber is working on reconciliation
  • Finance uses different system than operational data

What’s missing:

  • Exact discrepancy amount
  • Root cause (timing? definitions? exclusions?)
  • What finance is including/excluding
  • Whether this is expected (accounting vs operational data)
  • Reconciliation approach

What needs investigation:

  • Get finance’s numbers and methodology
  • Compare field-by-field
  • Check timing differences
  • Identify definition differences
  • Document acceptable variance

SUMMARY

Answer: This is EXPECTED and NORMAL. Key points:

Why It Doesn’t Match:

  1. Different Systems: Finance uses QuickBooks/NetSuite (accounting), we use Shopify (operational)
  2. Different Purposes: Finance = GAAP accounting, Operations = business metrics
  3. Timing Differences: Revenue recognition timing (accrual vs cash)
  4. Definition Differences: What counts as revenue (gross vs net, refunds, etc.)
  5. Exclusions: Finance may exclude certain transactions

<10% Variance is Normal:

  • Accounting systems recognize revenue differently
  • Operational data is more real-time
  • Different cut-off dates
  • This is standard across clients

Investigation Steps:

  1. Get finance’s exact numbers and date range
  2. Compare: Our numbers vs Finance numbers side-by-side
  3. Identify differences:
    • Timing (when revenue recognized)
    • Definitions (gross vs net)
    • Exclusions (what finance excludes)
    • Cut-off dates
  4. Document acceptable variance threshold (<10% is reasonable)
  5. Create reconciliation report showing differences and explanations
  6. Set expectation: Perfect match is unrealistic, <10% is acceptable
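Step 2's side-by-side comparison could be sketched like this; both table names here are hypothetical (our wholesale fact table and a staging table holding finance's figures), and a monthly grain is assumed because it makes timing differences visible immediately:

```sql
-- Hypothetical tables: fact_wholesale_sales (ours, Shopify-derived)
-- and finance_revenue (finance's numbers, loaded for reconciliation).
SELECT
    o.month,
    o.our_revenue,
    f.finance_revenue,
    ROUND(100.0 * (o.our_revenue - f.finance_revenue)
          / NULLIF(f.finance_revenue, 0), 1) AS variance_pct
FROM (
    SELECT DATE_TRUNC('month', date) AS month,
           SUM(revenue)              AS our_revenue
    FROM BROADMARTS.wholesale_mart.fact_wholesale_sales
    GROUP BY 1
) o
JOIN finance_revenue f
  ON f.month = o.month
ORDER BY o.month;
```

A variance that swings sign month to month usually means timing (cut-off dates); a consistent one-directional gap usually means a definition or exclusion difference.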

Question 10: “How do we define net revenue?”

THINKING

What we know:

  • Awaish mentioned: “What is the definition for that, for example, net revenue? Either we had revenue minus discounts, or refunds, or what?”
  • They met with Will once but haven’t finalized definitions
  • Similar to wholesale: They built mart first, then got feedback on definitions
  • Need to align with retail team (Will) on definitions

What’s missing:

  • Actual definition from Element/retail team
  • Whether discounts are included/excluded
  • Whether refunds are included/excluded
  • Whether shipping is included
  • Whether taxes are included
  • Standard definition vs Element-specific definition

What needs investigation:

  • Check if definition exists in documentation
  • Schedule meeting with Will to align
  • Check what’s currently in the model
  • Compare to industry standards
  • Document agreed definition

SUMMARY

Answer: Definition NOT YET FINALIZED. Need to align with retail team.

Current Status:

  • Awaish met with Will once (discovery call)
  • Built base models with assumptions
  • Waiting for feedback on definitions
  • Similar pattern to wholesale: Build first, then refine

Typical Net Revenue Definition:

  • Gross Revenue
  • Minus Discounts
  • Minus Refunds/Returns
  • (May or may not include shipping/taxes)

Investigation Steps:

  1. Check current model: What definition is currently used?
  2. Review discovery call notes with Will
  3. Schedule follow-up meeting with Will to finalize:
    • Net revenue = Gross - Discounts - Refunds?
    • What about shipping?
    • What about taxes?
    • Any other exclusions?
  4. Document agreed definition
  5. Update models if needed
  6. Create definition document for reference
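Once Will confirms the definition, it could be encoded directly in the model. This sketch assumes the typical definition above and hypothetical component field names (gross_revenue, discounts, refunds):

```sql
-- One candidate definition, to be confirmed with Will.
-- Shipping and taxes deliberately excluded pending that decision.
SELECT
    order_id,
    gross_revenue
      - COALESCE(discounts, 0)
      - COALESCE(refunds, 0) AS net_revenue
FROM BROADMARTS.retail_mart.fact_sales;
```

Pinning the formula down in one place like this makes it trivial to update the models (step 5) when the definition changes.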

Question 11: “Can we just ship me all the orders?” (Process Question)

THINKING

What we know:

  • Shivani asked for raw orders data quickly
  • Uttam pushed back: “I can’t get anything in front of you until it’s perfectly formatted”
  • Tension: She wants speed, previously demanded quality
  • She’s in “sprint mode” now

What’s missing:

  • What format she wants (CSV? Excel? Snowflake view?)
  • What date range
  • Which orders (Shopify? Retail? Both?)
  • Whether she wants raw or modeled data
  • Process for handling these requests

What needs investigation:

  • Clarify requirements
  • Set process expectations
  • Define “good enough” vs “perfect”
  • Create quick export process

SUMMARY

Answer: This is a PROCESS question, not a data question. Need to align on expectations.

The Tension:

  • Shivani: Wants speed (“sprint mode”)
  • Team: Previously she demanded perfect formatting
  • Need: Balance between speed and quality

Solution Approach:

  1. Define Tiers:

    • Tier 1: Raw export (fast, unformatted)
    • Tier 2: Formatted export (slower, formatted)
    • Tier 3: Modeled/reporting (slowest, highest quality)
  2. Set Expectations:

    • Raw data = no formatting, no QA
    • Formatted data = formatted, basic QA
    • Reporting = full QA, validated
  3. Create Process:

    • Quick exports: <1 hour, raw data
    • Standard exports: <1 day, formatted
    • Reporting: Per sprint plan

Investigation Steps:

  1. Clarify what she needs:
    • Format (CSV? Excel? Snowflake view?)
    • Date range
    • Which orders (Shopify wholesale? Retail? Both?)
    • Raw or modeled?
  2. Set expectations: “I can give you raw data in 30 min, or formatted in 1 day”
  3. Create quick export process for raw data
  4. Document process for future requests
  5. Align on when perfect formatting is needed vs when speed is priority

Question 12: “What’s the next version of this going to look like?” (Retail Reporting)

THINKING

What we know:

  • Retail mart exists (fact_sales, retail_fact_walmart_only_sales)
  • Phil gave template for retail reporting
  • Amber is building models to get data
  • Awaish finishing models this week/early next week
  • Reporting is in Google Sheets currently
  • Omni BI tool demo this week

What’s missing:

  • What Phil’s template looks like
  • What reporting currently exists
  • What’s planned vs what’s ad-hoc
  • Timeline for next version
  • Whether it moves to Omni or stays in Sheets

What needs investigation:

  • Review Phil’s template
  • Check current reporting state
  • Review Gantt chart/timeline
  • Check Omni plans
  • Document roadmap

SUMMARY

Answer: Next version depends on what “this” refers to. Need to clarify.

If “this” = Retail Reporting:

  • Current: Google Sheets (Amber building)
  • Next: Models finishing this week/early next week
  • Future: Move to Omni BI tool (demo this week)

If “this” = Retail Data Models:

  • Current: fact_sales (POS data joined)
  • Next: Join Walmart tables for total revenue (Awaish working on)
  • Future: More dimensions (geography, category, etc.)

Investigation Steps:

  1. Clarify: What does “this” refer to? (Reporting? Models? Both?)
  2. Review Phil’s template (if available)
  3. Check current state: What exists now?
  4. Review Gantt chart: What’s planned?
  5. Check Omni timeline: When does reporting move there?
  6. Document roadmap: Current → Next → Future
  7. Present clear timeline to Shivani

General Investigation Framework

For Any New Question:

  1. Clarify the Question

    • What exactly is being asked?
    • What’s the business context?
    • What’s the time period?
    • What’s the scope?
  2. Check What Exists

    • What tables/models exist?
    • What fields are available?
    • What documentation exists?
    • What’s the current state?
  3. Identify Gaps

    • What’s missing?
    • What’s blocked?
    • What needs to be built?
    • What needs clarification?
  4. Investigate

    • Query the data
    • Check documentation
    • Test assumptions
    • Document findings
  5. Answer or Escalate

    • Can answer now? → Answer with data
    • Can answer partially? → Answer what you can, note gaps
    • Cannot answer? → Explain why, provide path forward
    • Need clarification? → Ask specific questions
  6. Document

    • Document the answer
    • Document the investigation process
    • Document gaps/findings
    • Update knowledge base

Key Resources to Reference

  1. Snowflake Tables:

    • BROADMARTS.wholesale_mart
    • BROADMARTS.retail_mart.fact_sales
    • BROADMARTS.retail_mart.retail_fact_walmart_only_sales
    • RAW.POLYTOMIC.SHOPIFY.orders
  2. Documentation:

    • Table documentation (Awaish mentioned docs exist)
    • dbt models (check join logic)
    • Gantt chart (timeline)
  3. People:

    • Awaish: Data engineering, table structure
    • Will: Retail team, definitions
    • Phil: Reporting requirements
    • Amber: Reporting/reconciliation
  4. Process:

    • Weekly updates
    • Gantt chart tracking
    • Sprint planning

Next Steps for Robert

  1. Before Monday Meeting:

    • Get Snowflake access (follow up with Awaish)
    • Review table schemas
    • Test basic queries
    • Review documentation
  2. During Monday Meeting:

    • Use this framework to answer questions systematically
    • Document gaps as you go
    • Set expectations on what can and cannot be answered
    • Align on process for future questions
  3. After Monday Meeting:

    • Document answers provided
    • Create action items for gaps
    • Update this framework with learnings
    • Build quick reference guide for common questions