LMNT: QA/Investigation Framework for Shivani’s Questions
Date Created: 2026-02-16
Purpose: Framework for answering Shivani’s ad-hoc questions systematically
Framework Structure
For each question, follow this pattern:
- THINKING: What we know, what’s missing, what needs investigation
- SUMMARY: Answer/approach to take
- INVESTIGATION STEPS: Specific actions to get the answer
Question 1: “Have you been able to join any Walmart and Target data?”
THINKING
What we know:
- Awaish confirmed: Walmart and Target POS data ARE joined in
fact_salestable - They’re joined via products table (subject fields)
- Walmart has ADDITIONAL data in separate table:
retail_fact_walmart_only_sales(order management system, not POS) - Target only sends POS data (simpler structure - just daily sales and daily inventory tables)
- Walmart sends 12 tables from Emerson, Target sends 3 tables
- Different table structures between Walmart and Target
What’s missing:
- Exact join logic/keys used
- Whether geographic data is preserved in the join
- Date alignment between sources
- Any data quality issues from the join
What needs investigation:
- Check
fact_salestable structure in Snowflake - Verify join keys and logic in dbt models
- Test query:
SELECT retailer, SUM(sales) FROM fact_sales GROUP BY retailer - Check if there are NULLs or data quality issues
SUMMARY
Answer: Yes, Walmart and Target POS data are joined in fact_sales table. However:
- Walmart total revenue requires joining TWO tables:
fact_sales+retail_fact_walmart_only_sales - Target revenue is complete in
fact_salesalone - For apples-to-apples comparison: Use only
fact_sales(POS data only) - For total Walmart revenue: Must join both Walmart tables
Investigation Steps:
- Query
fact_salesto confirm both retailers present - Check row counts by retailer
- Verify date ranges match between retailers
- Test the join for
retail_fact_walmart_only_salesto get total Walmart revenue
Question 2: “What were sales across those two stores yesterday? Would you be able to find it?”
THINKING
What we know:
- Data exists in
fact_salestable - Awaish is working on joining Walmart tables (in progress)
- Date field exists (need to verify exact field name and format)
What’s missing:
- Exact date field name (
sale_date,transaction_date,week_ending_date?) - Date format (timestamp vs date)
- Whether “yesterday” means calendar date or business date
- Timezone considerations
- Whether data is loaded daily or weekly
What needs investigation:
- Check table schema for date fields
- Verify data freshness (when was last data loaded?)
- Test query with date filter
- Check if there’s a lag in data loading
SUMMARY
Answer: Yes, can find it, but need to clarify:
- For POS data only: Query
fact_salesWHERE date = yesterday - For total Walmart: Need to join
retail_fact_walmart_only_sales(Awaish working on this) - Clarify: Does “yesterday” mean calendar date or business date?
- Check: Is data loaded daily or weekly? (Walmart/Target might send weekly batches)
Investigation Steps:
- Check
fact_salesschema for date field - Query:
SELECT MAX(date) FROM fact_salesto see latest data - Test:
SELECT retailer, SUM(sales) FROM fact_sales WHERE date = CURRENT_DATE - 1 GROUP BY retailer - If no data for yesterday, check data loading schedule
- Document date field name and format for future queries
Question 3: “What are you actually seeing in the delta between the Target and Walmart data right now?”
THINKING
What we know:
- Different table structures (Walmart: 12 tables, Target: 3 tables)
- Walmart sends 2 types of data (POS + order management), Target only POS
- Different data sources (both via Emerson but different formats)
- Awaish mentioned joining them but didn’t specify differences found
What’s missing:
- Actual numeric differences (revenue, units, etc.)
- Definition differences (how they define “sales”, “revenue”, “units”)
- Date/time differences (week ending dates, recognition timing)
- Field mapping differences
- Data quality differences (completeness, accuracy)
What needs investigation:
- Run comparison queries: Walmart POS vs Target POS
- Check field definitions in documentation
- Compare date fields (week ending vs daily)
- Look for NULLs or missing data patterns
- Check if there are reconciliation issues
SUMMARY
Answer: Need to investigate systematically. Key areas to check:
1. Definition Differences:
- How each defines “sales” (gross vs net, before/after discounts)
- Revenue recognition timing
- Week ending date definitions (Target might use different week boundaries)
2. Structural Differences:
- Walmart: 12 tables (more granular)
- Target: 3 tables (simpler - daily sales + daily inventory)
- Walmart has order management data Target doesn’t have
3. Data Quality Differences:
- Completeness (are there gaps?)
- Accuracy (do numbers reconcile?)
- Timeliness (loading schedules)
Investigation Steps:
- Query both retailers’ POS data from
fact_salesfor same time period - Compare:
SELECT retailer, SUM(sales), COUNT(*), MIN(date), MAX(date) FROM fact_sales GROUP BY retailer - Check documentation for field definitions (Awaish mentioned docs exist)
- Look for NULL patterns:
SELECT retailer, COUNT(*) as nulls FROM fact_sales WHERE sales IS NULL GROUP BY retailer - Compare week-over-week patterns to see if timing differs
- Document findings in a comparison matrix
Question 4: “How is Target performing versus Walmart right now?”
THINKING
What we know:
- Both retailers’ data exists in
fact_sales - Can compare POS data directly
- Need to account for Walmart’s additional order management data
What’s missing:
- Time period definition (“right now” = this week? this month? YTD?)
- Metric definition (revenue? units? growth rate?)
- Context (is this expected? what’s the baseline?)
- Geographic breakdown (if needed)
What needs investigation:
- Define time period
- Choose comparison metric
- Calculate growth rates if needed
- Check if geographic segmentation is available
SUMMARY
Answer: Can answer this, but need to clarify scope:
For POS comparison (apples-to-apples):
- Query
fact_salesfiltered by date range - Compare:
SELECT retailer, SUM(sales), COUNT(DISTINCT product), AVG(sales) FROM fact_sales WHERE date >= [start_date] GROUP BY retailer
For total Walmart (includes order management):
- Join
fact_sales+retail_fact_walmart_only_sales - Note: This won’t be apples-to-apples with Target
Investigation Steps:
- Clarify time period: “right now” = this week? this month? YTD?
- Query POS comparison:
SELECT retailer, SUM(sales) as revenue, COUNT(DISTINCT date) as days, AVG(sales) as avg_daily FROM fact_sales WHERE date >= [period_start] GROUP BY retailer - Calculate growth rates if comparing to previous period
- Add product/category breakdown if needed
- Present findings with context (which is performing better and why)
Question 5: “What about Target in the Northeast versus Target in California?”
THINKING
What we know:
- Target data exists in
fact_sales - Geographic data might be in the raw tables
- Awaish mentioned Target sends daily sales and daily inventory tables
What’s missing:
- Whether geographic fields exist in
fact_sales - Field names for geography (region, state, store_id, etc.)
- How geography is structured (state level? region level? store level?)
- Whether this data is preserved in the join/modeling
What needs investigation:
- Check
fact_salesschema for geographic fields - Check raw Target tables for geographic data
- Verify if geography was included in the model
- Test query with geographic filter
SUMMARY
Answer: Unknown - need to investigate geographic data availability.
Possible scenarios:
- If geographic data exists: Can answer immediately with filtered query
- If geographic data exists in raw but not modeled: Need to add to model or query raw tables
- If geographic data doesn’t exist: Need to check with Target/Emerson about data availability
Investigation Steps:
- Check
fact_salesschema:DESCRIBE TABLE fact_sales- look for fields likeregion,state,store_id,geography - Check raw Target tables: Look at the 3 Target tables from Emerson share
- Query test:
SELECT region, SUM(sales) FROM fact_sales WHERE retailer = 'Target' GROUP BY region - If no geographic fields found, check dbt models to see if they were dropped
- If not available, ask Awaish if geographic data exists in raw tables
- Document findings and next steps
Question 6: “What can I actually glean from retail data right now?”
THINKING
What we know:
- Retail mart exists with
fact_salesandretail_fact_walmart_only_sales - Data is modeled and available
- Shivani is frustrated that data exists but isn’t actionable
What’s missing:
- What specific questions CAN be answered right now
- What’s blocked (Walmart join, geographic data, etc.)
- What reporting exists vs what needs to be built
- What self-service capabilities exist
What needs investigation:
- List all available fields in retail tables
- Test common business questions
- Identify gaps vs requirements
- Check what reporting/visualization exists
SUMMARY
Answer: Can answer these questions RIGHT NOW:
✅ Can Answer:
- Total POS sales by retailer (Walmart vs Target)
- Sales trends over time (daily/weekly)
- Product performance (if product dimension exists)
- Basic comparisons (Walmart POS vs Target POS)
⚠️ Partially Can Answer:
- Total Walmart revenue (need to join 2 tables - Awaish working on this)
- Category performance (if category field exists)
❌ Cannot Answer Yet:
- Geographic breakdowns (need to verify if data exists)
- Store-level performance (need to verify)
- Advanced analytics (cohorts, retention, etc.)
Investigation Steps:
- Document all available fields in retail tables
- Create a “What Can I Answer” matrix
- Test 5-10 common business questions
- Document gaps and blockers
- Create a roadmap: “Here’s what you can do today, here’s what’s coming next week”
Question 7: “Where’s that orders table?” (Access Issue)
THINKING
What we know:
- Robert couldn’t access it when Shivani asked
- Orders table exists:
RAW.POLYTOMIC.SHOPIFY.orders - Access/permissions issue in Snowflake
- Awaish said he’d create new user and send credentials
What’s missing:
- Whether access has been granted
- Exact table path/location
- Whether it’s the right table (Shopify orders vs retail orders)
- How to access it (Snowflake UI vs SQL client)
What needs investigation:
- Verify Snowflake access
- Confirm table location
- Test query access
- Document access process for future
SUMMARY
Answer:
- Table location:
RAW.POLYTOMIC.SHOPIFY.orders - Issue: Access/permissions (Awaish creating new user)
- Note: This is Shopify orders (e-commerce + wholesale), NOT retail orders
Investigation Steps:
- Verify Snowflake access has been granted (check email from Awaish)
- Test access:
SELECT * FROM RAW.POLYTOMIC.SHOPIFY.orders LIMIT 10 - If access denied, follow up with Awaish
- Document full table path and access method
- Clarify with Shivani: Does she want Shopify orders or retail orders?
Question 8: “How are Walmart and Target defining revenue differently?”
THINKING
What we know:
- Different table structures suggest different definitions
- Walmart sends POS + order management data
- Target sends only POS data
- Robert speculated: “Maybe Walmart recognizes revenue at a different time than Target”
- Awaish mentioned they need to discuss definitions with Will (retail team)
What’s missing:
- Actual field definitions from documentation
- Revenue calculation logic in each source
- Timing differences (when revenue is recognized)
- Whether discounts/refunds are handled differently
- Whether there are reconciliation differences
What needs investigation:
- Check documentation (Awaish mentioned docs exist)
- Compare revenue fields in both sources
- Check for discount/refund fields
- Look at date fields (sale date vs recognition date)
- Test calculations to see if they match
SUMMARY
Answer: Need to investigate systematically. Key differences likely:
1. Data Structure:
- Walmart: 12 tables (more granular, includes order management)
- Target: 3 tables (simpler, POS only)
2. Revenue Recognition:
- May differ in timing (sale date vs shipment date vs recognition date)
- May differ in what’s included (gross vs net, discounts, refunds)
3. Data Completeness:
- Walmart has order management data Target doesn’t have
- This creates apples-to-oranges comparison if not handled correctly
Investigation Steps:
- Review documentation for both retailers (Awaish mentioned docs exist)
- Compare revenue fields:
SELECT retailer, revenue_field, COUNT(*), SUM(revenue_field) FROM [tables] GROUP BY retailer, revenue_field - Check for discount/refund fields in both
- Compare date fields (sale_date vs recognition_date)
- Test: Calculate revenue using same logic for both, compare results
- Document differences in a comparison table
- Schedule meeting with Will (retail team) to align definitions
Question 9: “Why doesn’t our wholesale revenue match finance?”
THINKING
What we know:
- Finance pulled data from accounting system (QuickBooks/NetSuite)
- Discrepancy is <10% (Awaish said “probably less than 10%”)
- Wholesale team doesn’t have issues with the data
- Amber is working on reconciliation
- Finance uses different system than operational data
What’s missing:
- Exact discrepancy amount
- Root cause (timing? definitions? exclusions?)
- What finance is including/excluding
- Whether this is expected (accounting vs operational data)
- Reconciliation approach
What needs investigation:
- Get finance’s numbers and methodology
- Compare field-by-field
- Check timing differences
- Identify definition differences
- Document acceptable variance
SUMMARY
Answer: This is EXPECTED and NORMAL. Key points:
Why It Doesn’t Match:
- Different Systems: Finance uses QuickBooks/NetSuite (accounting), we use Shopify (operational)
- Different Purposes: Finance = GAAP accounting, Operations = business metrics
- Timing Differences: Revenue recognition timing (accrual vs cash)
- Definition Differences: What counts as revenue (gross vs net, refunds, etc.)
- Exclusions: Finance may exclude certain transactions
<10% Variance is Normal:
- Accounting systems recognize revenue differently
- Operational data is more real-time
- Different cut-off dates
- This is standard across clients
Investigation Steps:
- Get finance’s exact numbers and date range
- Compare: Our numbers vs Finance numbers side-by-side
- Identify differences:
- Timing (when revenue recognized)
- Definitions (gross vs net)
- Exclusions (what finance excludes)
- Cut-off dates
- Document acceptable variance threshold (<10% is reasonable)
- Create reconciliation report showing differences and explanations
- Set expectation: Perfect match is unrealistic, <10% is acceptable
Question 10: “How do we define net revenue?”
THINKING
What we know:
- Awaish mentioned: “What is the definition for that, for example, net revenue? Either we had revenue minus discounts, or refunds, or what?”
- They met with Will once but haven’t finalized definitions
- Similar to wholesale: They built mart first, then got feedback on definitions
- Need to align with retail team (Will) on definitions
What’s missing:
- Actual definition from Element/retail team
- Whether discounts are included/excluded
- Whether refunds are included/excluded
- Whether shipping is included
- Whether taxes are included
- Standard definition vs Element-specific definition
What needs investigation:
- Check if definition exists in documentation
- Schedule meeting with Will to align
- Check what’s currently in the model
- Compare to industry standards
- Document agreed definition
SUMMARY
Answer: Definition NOT YET FINALIZED. Need to align with retail team.
Current Status:
- Awaish met with Will once (discovery call)
- Built base models with assumptions
- Waiting for feedback on definitions
- Similar pattern to wholesale: Build first, then refine
Typical Net Revenue Definition:
- Gross Revenue
- Minus Discounts
- Minus Refunds/Returns
- (May or may not include shipping/taxes)
Investigation Steps:
- Check current model: What definition is currently used?
- Review discovery call notes with Will
- Schedule follow-up meeting with Will to finalize:
- Net revenue = Gross - Discounts - Refunds?
- What about shipping?
- What about taxes?
- Any other exclusions?
- Document agreed definition
- Update models if needed
- Create definition document for reference
Question 11: “Can we just ship me all the orders?” (Process Question)
THINKING
What we know:
- Shivani asked for raw orders data quickly
- Uttam pushed back: “I can’t get anything in front of you until it’s perfectly formatted”
- Tension: She wants speed, previously demanded quality
- She’s in “sprint mode” now
What’s missing:
- What format she wants (CSV? Excel? Snowflake view?)
- What date range
- Which orders (Shopify? Retail? Both?)
- Whether she wants raw or modeled data
- Process for handling these requests
What needs investigation:
- Clarify requirements
- Set process expectations
- Define “good enough” vs “perfect”
- Create quick export process
SUMMARY
Answer: This is a PROCESS question, not a data question. Need to align on expectations.
The Tension:
- Shivani: Wants speed (“sprint mode”)
- Team: Previously she demanded perfect formatting
- Need: Balance between speed and quality
Solution Approach:
-
Define Tiers:
- Tier 1: Raw export (fast, unformatted)
- Tier 2: Formatted export (slower, formatted)
- Tier 3: Modeled/reporting (slowest, highest quality)
-
Set Expectations:
- Raw data = no formatting, no QA
- Formatted data = formatted, basic QA
- Reporting = full QA, validated
-
Create Process:
- Quick exports: <1 hour, raw data
- Standard exports: <1 day, formatted
- Reporting: Per sprint plan
Investigation Steps:
- Clarify what she needs:
- Format (CSV? Excel? Snowflake view?)
- Date range
- Which orders (Shopify wholesale? Retail? Both?)
- Raw or modeled?
- Set expectations: “I can give you raw data in 30 min, or formatted in 1 day”
- Create quick export process for raw data
- Document process for future requests
- Align on when perfect formatting is needed vs when speed is priority
Question 12: “What’s the next version of this going to look like?” (Retail Reporting)
THINKING
What we know:
- Retail mart exists (
fact_sales,retail_fact_walmart_only_sales) - Phil gave template for retail reporting
- Amber is building models to get data
- Awaish finishing models this week/early next week
- Reporting is in Google Sheets currently
- Omni BI tool demo this week
What’s missing:
- What Phil’s template looks like
- What reporting currently exists
- What’s planned vs what’s ad-hoc
- Timeline for next version
- Whether it moves to Omni or stays in Sheets
What needs investigation:
- Review Phil’s template
- Check current reporting state
- Review Gantt chart/timeline
- Check Omni plans
- Document roadmap
SUMMARY
Answer: Next version depends on what “this” refers to. Need to clarify.
If “this” = Retail Reporting:
- Current: Google Sheets (Amber building)
- Next: Models finishing this week/early next week
- Future: Move to Omni BI tool (demo this week)
If “this” = Retail Data Models:
- Current:
fact_sales(POS data joined) - Next: Join Walmart tables for total revenue (Awaish working on)
- Future: More dimensions (geography, category, etc.)
Investigation Steps:
- Clarify: What does “this” refer to? (Reporting? Models? Both?)
- Review Phil’s template (if available)
- Check current state: What exists now?
- Review Gantt chart: What’s planned?
- Check Omni timeline: When does reporting move there?
- Document roadmap: Current → Next → Future
- Present clear timeline to Shivani
General Investigation Framework
For Any New Question:
-
Clarify the Question
- What exactly is being asked?
- What’s the business context?
- What’s the time period?
- What’s the scope?
-
Check What Exists
- What tables/models exist?
- What fields are available?
- What documentation exists?
- What’s the current state?
-
Identify Gaps
- What’s missing?
- What’s blocked?
- What needs to be built?
- What needs clarification?
-
Investigate
- Query the data
- Check documentation
- Test assumptions
- Document findings
-
Answer or Escalate
- Can answer now? → Answer with data
- Can answer partially? → Answer what you can, note gaps
- Cannot answer? → Explain why, provide path forward
- Need clarification? → Ask specific questions
-
Document
- Document the answer
- Document the investigation process
- Document gaps/findings
- Update knowledge base
Key Resources to Reference
-
Snowflake Tables:
BROADMARTS.wholesale_martBROADMARTS.retail_mart.fact_salesBROADMARTS.retail_mart.retail_fact_walmart_only_salesRAW.POLYTOMIC.SHOPIFY.orders
-
Documentation:
- Table documentation (Awaish mentioned docs exist)
- dbt models (check join logic)
- Gantt chart (timeline)
-
People:
- Awaish: Data engineering, table structure
- Will: Retail team, definitions
- Phil: Reporting requirements
- Amber: Reporting/reconciliation
-
Process:
- Weekly updates
- Gantt chart tracking
- Sprint planning
Next Steps for Robert
-
Before Monday Meeting:
- Get Snowflake access (follow up with Awaish)
- Review table schemas
- Test basic queries
- Review documentation
-
During Monday Meeting:
- Use this framework to answer questions systematically
- Document gaps as you go
- Set expectations on what can/cannot answer
- Align on process for future questions
-
After Monday Meeting:
- Document answers provided
- Create action items for gaps
- Update this framework with learnings
- Build quick reference guide for common questions