LMNT: QA/Investigation Framework for Shivani’s Questions

Date Created: 2026-02-16
Purpose: Framework for answering Shivani’s ad-hoc questions systematically

Framework Structure

For each question, follow this pattern:

THINKING: What we know, what’s missing, what needs investigation
SUMMARY: Answer/approach to take
INVESTIGATION STEPS: Specific actions to get the answer

Question 1: “Have you been able to join any Walmart and Target data?”

THINKING

What we know:

Awaish confirmed: Walmart and Target POS data ARE joined in fact_sales table
They’re joined via products table (subject fields)
Walmart has ADDITIONAL data in separate table: retail_fact_walmart_only_sales (order management system, not POS)
Target only sends POS data (simpler structure - just daily sales and daily inventory tables)
Walmart sends 12 tables from Emerson, Target sends 3 tables
Different table structures between Walmart and Target

What’s missing:

Exact join logic/keys used
Whether geographic data is preserved in the join
Date alignment between sources
Any data quality issues from the join

What needs investigation:

Check fact_sales table structure in Snowflake
Verify join keys and logic in dbt models
Test query: SELECT retailer, SUM(sales) FROM fact_sales GROUP BY retailer
Check if there are NULLs or data quality issues

SUMMARY

Answer: Yes, Walmart and Target POS data are joined in fact_sales table. However:

Walmart total revenue requires joining TWO tables: fact_sales + retail_fact_walmart_only_sales
Target revenue is complete in fact_sales alone
For apples-to-apples comparison: Use only fact_sales (POS data only)
For total Walmart revenue: Must join both Walmart tables

Investigation Steps:

Query fact_sales to confirm both retailers present
Check row counts by retailer
Verify date ranges match between retailers
Test the join for retail_fact_walmart_only_sales to get total Walmart revenue

Question 2: “What were sales across those two stores yesterday? Would you be able to find it?”

THINKING

What we know:

Data exists in fact_sales table
Awaish is working on joining Walmart tables (in progress)
Date field exists (need to verify exact field name and format)

What’s missing:

Exact date field name (sale_date, transaction_date, week_ending_date?)
Date format (timestamp vs date)
Whether “yesterday” means calendar date or business date
Timezone considerations
Whether data is loaded daily or weekly

What needs investigation:

Check table schema for date fields
Verify data freshness (when was last data loaded?)
Test query with date filter
Check if there’s a lag in data loading

SUMMARY

Answer: Yes, can find it, but need to clarify:

For POS data only: Query fact_sales WHERE date = yesterday
For total Walmart: Need to join retail_fact_walmart_only_sales (Awaish working on this)
Clarify: Does “yesterday” mean calendar date or business date?
Check: Is data loaded daily or weekly? (Walmart/Target might send weekly batches)

Investigation Steps:

Check fact_sales schema for date field
Query: SELECT MAX(date) FROM fact_sales to see latest data
Test: SELECT retailer, SUM(sales) FROM fact_sales WHERE date = CURRENT_DATE - 1 GROUP BY retailer
If no data for yesterday, check data loading schedule
Document date field name and format for future queries

Question 3: “What are you actually seeing in the delta between the Target and Walmart data right now?”

THINKING

What we know:

Different table structures (Walmart: 12 tables, Target: 3 tables)
Walmart sends 2 types of data (POS + order management), Target only POS
Different data sources (both via Emerson but different formats)
Awaish mentioned joining them but didn’t specify differences found

What’s missing:

Actual numeric differences (revenue, units, etc.)
Definition differences (how they define “sales”, “revenue”, “units”)
Date/time differences (week ending dates, recognition timing)
Field mapping differences
Data quality differences (completeness, accuracy)

What needs investigation:

Run comparison queries: Walmart POS vs Target POS
Check field definitions in documentation
Compare date fields (week ending vs daily)
Look for NULLs or missing data patterns
Check if there are reconciliation issues

SUMMARY

Answer: Need to investigate systematically. Key areas to check:

1. Definition Differences:

How each defines “sales” (gross vs net, before/after discounts)
Revenue recognition timing
Week ending date definitions (Target might use different week boundaries)

2. Structural Differences:

Walmart: 12 tables (more granular)
Target: 3 tables (simpler - daily sales + daily inventory)
Walmart has order management data Target doesn’t have

3. Data Quality Differences:

Completeness (are there gaps?)
Accuracy (do numbers reconcile?)
Timeliness (loading schedules)

Investigation Steps:

Query both retailers’ POS data from fact_sales for same time period
Compare: SELECT retailer, SUM(sales), COUNT(*), MIN(date), MAX(date) FROM fact_sales GROUP BY retailer
Check documentation for field definitions (Awaish mentioned docs exist)
Look for NULL patterns: SELECT retailer, COUNT(*) as nulls FROM fact_sales WHERE sales IS NULL GROUP BY retailer
Compare week-over-week patterns to see if timing differs
Document findings in a comparison matrix

Question 4: “How is Target performing versus Walmart right now?”

THINKING

What we know:

Both retailers’ data exists in fact_sales
Can compare POS data directly
Need to account for Walmart’s additional order management data

What’s missing:

Time period definition (“right now” = this week? this month? YTD?)
Metric definition (revenue? units? growth rate?)
Context (is this expected? what’s the baseline?)
Geographic breakdown (if needed)

What needs investigation:

Define time period
Choose comparison metric
Calculate growth rates if needed
Check if geographic segmentation is available

SUMMARY

Answer: Can answer this, but need to clarify scope:

For POS comparison (apples-to-apples):

Query fact_sales filtered by date range
Compare: SELECT retailer, SUM(sales), COUNT(DISTINCT product), AVG(sales) FROM fact_sales WHERE date >= [start_date] GROUP BY retailer

For total Walmart (includes order management):

Join fact_sales + retail_fact_walmart_only_sales
Note: This won’t be apples-to-apples with Target

Investigation Steps:

Clarify time period: “right now” = this week? this month? YTD?
Query POS comparison: SELECT retailer, SUM(sales) as revenue, COUNT(DISTINCT date) as days, AVG(sales) as avg_daily FROM fact_sales WHERE date >= [period_start] GROUP BY retailer
Calculate growth rates if comparing to previous period
Add product/category breakdown if needed
Present findings with context (which is performing better and why)

Question 5: “What about Target in the Northeast versus Target in California?”

THINKING

What we know:

Target data exists in fact_sales
Geographic data might be in the raw tables
Awaish mentioned Target sends daily sales and daily inventory tables

What’s missing:

Whether geographic fields exist in fact_sales
Field names for geography (region, state, store_id, etc.)
How geography is structured (state level? region level? store level?)
Whether this data is preserved in the join/modeling

What needs investigation:

Check fact_sales schema for geographic fields
Check raw Target tables for geographic data
Verify if geography was included in the model
Test query with geographic filter

SUMMARY

Answer: Unknown - need to investigate geographic data availability.

Possible scenarios:

If geographic data exists: Can answer immediately with filtered query
If geographic data exists in raw but not modeled: Need to add to model or query raw tables
If geographic data doesn’t exist: Need to check with Target/Emerson about data availability

Investigation Steps:

Check fact_sales schema: DESCRIBE TABLE fact_sales - look for fields like region, state, store_id, geography
Check raw Target tables: Look at the 3 Target tables from Emerson share
Query test: SELECT region, SUM(sales) FROM fact_sales WHERE retailer = 'Target' GROUP BY region
If no geographic fields found, check dbt models to see if they were dropped
If not available, ask Awaish if geographic data exists in raw tables
Document findings and next steps

Question 6: “What can I actually glean from retail data right now?”

THINKING

What we know:

Retail mart exists with fact_sales and retail_fact_walmart_only_sales
Data is modeled and available
Shivani is frustrated that data exists but isn’t actionable

What’s missing:

What specific questions CAN be answered right now
What’s blocked (Walmart join, geographic data, etc.)
What reporting exists vs what needs to be built
What self-service capabilities exist

What needs investigation:

List all available fields in retail tables
Test common business questions
Identify gaps vs requirements
Check what reporting/visualization exists

SUMMARY

Answer: Can answer these questions RIGHT NOW:

✅ Can Answer:

Total POS sales by retailer (Walmart vs Target)
Sales trends over time (daily/weekly)
Product performance (if product dimension exists)
Basic comparisons (Walmart POS vs Target POS)

⚠️ Partially Can Answer:

Total Walmart revenue (need to join 2 tables - Awaish working on this)
Category performance (if category field exists)

❌ Cannot Answer Yet:

Geographic breakdowns (need to verify if data exists)
Store-level performance (need to verify)
Advanced analytics (cohorts, retention, etc.)

Investigation Steps:

Document all available fields in retail tables
Create a “What Can I Answer” matrix
Test 5-10 common business questions
Document gaps and blockers
Create a roadmap: “Here’s what you can do today, here’s what’s coming next week”

Question 7: “Where’s that orders table?” (Access Issue)

THINKING

What we know:

Robert couldn’t access it when Shivani asked
Orders table exists: RAW.POLYTOMIC.SHOPIFY.orders
Access/permissions issue in Snowflake
Awaish said he’d create new user and send credentials

What’s missing:

Whether access has been granted
Exact table path/location
Whether it’s the right table (Shopify orders vs retail orders)
How to access it (Snowflake UI vs SQL client)

What needs investigation:

Verify Snowflake access
Confirm table location
Test query access
Document access process for future

SUMMARY

Answer:

Table location: RAW.POLYTOMIC.SHOPIFY.orders
Issue: Access/permissions (Awaish creating new user)
Note: This is Shopify orders (e-commerce + wholesale), NOT retail orders

Investigation Steps:

Verify Snowflake access has been granted (check email from Awaish)
Test access: SELECT * FROM RAW.POLYTOMIC.SHOPIFY.orders LIMIT 10
If access denied, follow up with Awaish
Document full table path and access method
Clarify with Shivani: Does she want Shopify orders or retail orders?

Question 8: “How are Walmart and Target defining revenue differently?”

THINKING

What we know:

Different table structures suggest different definitions
Walmart sends POS + order management data
Target sends only POS data
Robert speculated: “Maybe Walmart recognizes revenue at a different time than Target”
Awaish mentioned they need to discuss definitions with Will (retail team)

What’s missing:

Actual field definitions from documentation
Revenue calculation logic in each source
Timing differences (when revenue is recognized)
Whether discounts/refunds are handled differently
Whether there are reconciliation differences

What needs investigation:

Check documentation (Awaish mentioned docs exist)
Compare revenue fields in both sources
Check for discount/refund fields
Look at date fields (sale date vs recognition date)
Test calculations to see if they match

SUMMARY

Answer: Need to investigate systematically. Key differences likely:

1. Data Structure:

Walmart: 12 tables (more granular, includes order management)
Target: 3 tables (simpler, POS only)

2. Revenue Recognition:

May differ in timing (sale date vs shipment date vs recognition date)
May differ in what’s included (gross vs net, discounts, refunds)

3. Data Completeness:

Walmart has order management data Target doesn’t have
This creates apples-to-oranges comparison if not handled correctly

Investigation Steps:

Review documentation for both retailers (Awaish mentioned docs exist)
Compare revenue fields: SELECT retailer, revenue_field, COUNT(*), SUM(revenue_field) FROM [tables] GROUP BY retailer, revenue_field
Check for discount/refund fields in both
Compare date fields (sale_date vs recognition_date)
Test: Calculate revenue using same logic for both, compare results
Document differences in a comparison table
Schedule meeting with Will (retail team) to align definitions

Question 9: “Why doesn’t our wholesale revenue match finance?”

THINKING

What we know:

Finance pulled data from accounting system (QuickBooks/NetSuite)
Discrepancy is <10% (Awaish said “probably less than 10%”)
Wholesale team doesn’t have issues with the data
Amber is working on reconciliation
Finance uses different system than operational data

What’s missing:

Exact discrepancy amount
Root cause (timing? definitions? exclusions?)
What finance is including/excluding
Whether this is expected (accounting vs operational data)
Reconciliation approach

What needs investigation:

Get finance’s numbers and methodology
Compare field-by-field
Check timing differences
Identify definition differences
Document acceptable variance

SUMMARY

Answer: This is EXPECTED and NORMAL. Key points:

Why It Doesn’t Match:

Different Systems: Finance uses QuickBooks/NetSuite (accounting), we use Shopify (operational)
Different Purposes: Finance = GAAP accounting, Operations = business metrics
Timing Differences: Revenue recognition timing (accrual vs cash)
Definition Differences: What counts as revenue (gross vs net, refunds, etc.)
Exclusions: Finance may exclude certain transactions

<10% Variance is Normal:

Accounting systems recognize revenue differently
Operational data is more real-time
Different cut-off dates
This is standard across clients

Investigation Steps:

Get finance’s exact numbers and date range
Compare: Our numbers vs Finance numbers side-by-side
Identify differences:
- Timing (when revenue recognized)
- Definitions (gross vs net)
- Exclusions (what finance excludes)
- Cut-off dates
Document acceptable variance threshold (<10% is reasonable)
Create reconciliation report showing differences and explanations
Set expectation: Perfect match is unrealistic, <10% is acceptable

Question 10: “How do we define net revenue?”

THINKING

What we know:

Awaish mentioned: “What is the definition for that, for example, net revenue? Either we had revenue minus discounts, or refunds, or what?”
They met with Will once but haven’t finalized definitions
Similar to wholesale: They built mart first, then got feedback on definitions
Need to align with retail team (Will) on definitions

What’s missing:

Actual definition from Element/retail team
Whether discounts are included/excluded
Whether refunds are included/excluded
Whether shipping is included
Whether taxes are included
Standard definition vs Element-specific definition

What needs investigation:

Check if definition exists in documentation
Schedule meeting with Will to align
Check what’s currently in the model
Compare to industry standards
Document agreed definition

SUMMARY

Answer: Definition NOT YET FINALIZED. Need to align with retail team.

Current Status:

Awaish met with Will once (discovery call)
Built base models with assumptions
Waiting for feedback on definitions
Similar pattern to wholesale: Build first, then refine

Typical Net Revenue Definition:

Gross Revenue
Minus Discounts
Minus Refunds/Returns
(May or may not include shipping/taxes)

Investigation Steps:

Check current model: What definition is currently used?
Review discovery call notes with Will
Schedule follow-up meeting with Will to finalize:
- Net revenue = Gross - Discounts - Refunds?
- What about shipping?
- What about taxes?
- Any other exclusions?
Document agreed definition
Update models if needed
Create definition document for reference

Question 11: “Can we just ship me all the orders?” (Process Question)

THINKING

What we know:

Shivani asked for raw orders data quickly
Uttam pushed back: “I can’t get anything in front of you until it’s perfectly formatted”
Tension: She wants speed, previously demanded quality
She’s in “sprint mode” now

What’s missing:

What format she wants (CSV? Excel? Snowflake view?)
What date range
Which orders (Shopify? Retail? Both?)
Whether she wants raw or modeled data
Process for handling these requests

What needs investigation:

Clarify requirements
Set process expectations
Define “good enough” vs “perfect”
Create quick export process

SUMMARY

Answer: This is a PROCESS question, not a data question. Need to align on expectations.

The Tension:

Shivani: Wants speed (“sprint mode”)
Team: Previously she demanded perfect formatting
Need: Balance between speed and quality

Solution Approach:

Define Tiers:
- Tier 1: Raw export (fast, unformatted)
- Tier 2: Formatted export (slower, formatted)
- Tier 3: Modeled/reporting (slowest, highest quality)
Set Expectations:
- Raw data = no formatting, no QA
- Formatted data = formatted, basic QA
- Reporting = full QA, validated
Create Process:
- Quick exports: <1 hour, raw data
- Standard exports: <1 day, formatted
- Reporting: Per sprint plan

Investigation Steps:

Clarify what she needs:
- Format (CSV? Excel? Snowflake view?)
- Date range
- Which orders (Shopify wholesale? Retail? Both?)
- Raw or modeled?
Set expectations: “I can give you raw data in 30 min, or formatted in 1 day”
Create quick export process for raw data
Document process for future requests
Align on when perfect formatting is needed vs when speed is priority

Question 12: “What’s the next version of this going to look like?” (Retail Reporting)

THINKING

What we know:

Retail mart exists (fact_sales, retail_fact_walmart_only_sales)
Phil gave template for retail reporting
Amber is building models to get data
Awaish finishing models this week/early next week
Reporting is in Google Sheets currently
Omni BI tool demo this week

What’s missing:

What Phil’s template looks like
What reporting currently exists
What’s planned vs what’s ad-hoc
Timeline for next version
Whether it moves to Omni or stays in Sheets

What needs investigation:

Review Phil’s template
Check current reporting state
Review Gantt chart/timeline
Check Omni plans
Document roadmap

SUMMARY

Answer: Next version depends on what “this” refers to. Need to clarify.

If “this” = Retail Reporting:

Current: Google Sheets (Amber building)
Next: Models finishing this week/early next week
Future: Move to Omni BI tool (demo this week)

If “this” = Retail Data Models:

Current: fact_sales (POS data joined)
Next: Join Walmart tables for total revenue (Awaish working on)
Future: More dimensions (geography, category, etc.)

Investigation Steps:

Clarify: What does “this” refer to? (Reporting? Models? Both?)
Review Phil’s template (if available)
Check current state: What exists now?
Review Gantt chart: What’s planned?
Check Omni timeline: When does reporting move there?
Document roadmap: Current → Next → Future
Present clear timeline to Shivani

General Investigation Framework

For Any New Question:

Clarify the Question
- What exactly is being asked?
- What’s the business context?
- What’s the time period?
- What’s the scope?
Check What Exists
- What tables/models exist?
- What fields are available?
- What documentation exists?
- What’s the current state?
Identify Gaps
- What’s missing?
- What’s blocked?
- What needs to be built?
- What needs clarification?
Investigate
- Query the data
- Check documentation
- Test assumptions
- Document findings
Answer or Escalate
- Can answer now? → Answer with data
- Can answer partially? → Answer what you can, note gaps
- Cannot answer? → Explain why, provide path forward
- Need clarification? → Ask specific questions
Document
- Document the answer
- Document the investigation process
- Document gaps/findings
- Update knowledge base

Key Resources to Reference

Snowflake Tables:
- BROADMARTS.wholesale_mart
- BROADMARTS.retail_mart.fact_sales
- BROADMARTS.retail_mart.retail_fact_walmart_only_sales
- RAW.POLYTOMIC.SHOPIFY.orders
Documentation:
- Table documentation (Awaish mentioned docs exist)
- dbt models (check join logic)
- Gantt chart (timeline)
People:
- Awaish: Data engineering, table structure
- Will: Retail team, definitions
- Phil: Reporting requirements
- Amber: Reporting/reconciliation
Process:
- Weekly updates
- Gantt chart tracking
- Sprint planning

Next Steps for Robert

Before Monday Meeting:
- Get Snowflake access (follow up with Awaish)
- Review table schemas
- Test basic queries
- Review documentation
During Monday Meeting:
- Use this framework to answer questions systematically
- Document gaps as you go
- Set expectations on what can/cannot answer
- Align on process for future questions
After Monday Meeting:
- Document answers provided
- Create action items for gaps
- Update this framework with learnings
- Build quick reference guide for common questions

Brainforge Knowledge

Explorer

LMNT_QA_Investigation_Framework

LMNT: QA/Investigation Framework for Shivani’s Questions

Framework Structure

Question 1: “Have you been able to join any Walmart and Target data?”

THINKING

SUMMARY

Question 2: “What were sales across those two stores yesterday? Would you be able to find it?”

THINKING

SUMMARY

Question 3: “What are you actually seeing in the delta between the Target and Walmart data right now?”

THINKING

SUMMARY

Question 4: “How is Target performing versus Walmart right now?”

THINKING

SUMMARY

Question 5: “What about Target in the Northeast versus Target in California?”

THINKING

SUMMARY

Question 6: “What can I actually glean from retail data right now?”

THINKING

SUMMARY

Question 7: “Where’s that orders table?” (Access Issue)

THINKING

SUMMARY

Question 8: “How are Walmart and Target defining revenue differently?”

THINKING

SUMMARY

Question 9: “Why doesn’t our wholesale revenue match finance?”

THINKING

SUMMARY

Question 10: “How do we define net revenue?”

THINKING

SUMMARY

Question 11: “Can we just ship me all the orders?” (Process Question)

THINKING

SUMMARY

Question 12: “What’s the next version of this going to look like?” (Retail Reporting)

THINKING

SUMMARY

General Investigation Framework

For Any New Question:

Key Resources to Reference

Next Steps for Robert

Graph View

Table of Contents