Amazon Raw Models Missing Data Report
Scope: dbt_project/models/raw/amazon/
Files reviewed:
raw_amazon__orders.sqlraw_amazon__order_items.sqlraw_amazon__financial_events.sqlraw_amazon__fulfillment_orders.sqlraw_amazon__catalog_items.sqlraw_amazon__listings_items.sqlraw_amazon__marketplace_participations.sql
Goal: highlight key business data that is not present in the current raw Amazon models.
Executive Summary
The Amazon raw layer is strong on identifiers and marketplace metadata, but it is missing several fields needed for revenue, customer, and fulfillment analysis.
Highest-priority gaps
-
No order total in
raw_amazon__orders- The orders model has order identity and status fields, but no explicit order amount / grand total.
- Missing related financial fields also include tax, shipping, discount, and currency at the order header level.
-
No line revenue in
raw_amazon__order_items- The order items model has quantity and SKU details, but no unit price, line subtotal, line discount, tax, or line revenue.
- That prevents item-level revenue reporting and makes order-level reconciliation harder.
-
No dedicated customer model in the Amazon raw layer
- The only customer-like fields in
raw_amazon__ordersareBUYER_EMAILandBUYER_NAME. - There is no customer table or richer customer profile data such as customer ID, shipping address, phone, or lifecycle metadata.
- The only customer-like fields in
-
Financial events are not normalized
raw_amazon__financial_eventsexposes event lists, but not exploded event rows with amount, currency, event date, or linked order/item identifiers.- This makes refunds, fees, and settlement analysis difficult.
Model-by-Model Review
raw_amazon__orders
What it has:
- Order identifiers and status fields
- Buyer email/name
- Shipment timing and channel metadata
Missing or incomplete data:
- Order total / grand total
- Tax amount
- Shipping amount
- Discount amount
- Currency code
- Customer ID
- Shipping address / billing address
- Phone number or richer customer profile fields
Impact:
- Cannot calculate revenue directly from the order header.
- Cannot build a complete customer dimension from this table alone.
raw_amazon__order_items
What it has:
- Item identifiers
- SKU / ASIN
- Quantity ordered and shipped
- Item attributes like condition, gift wrap, promotion IDs
Missing or incomplete data:
- Unit price
- Line subtotal
- Line revenue
- Line discount amount
- Line tax amount
- Shipping allocation at line level
- Currency code
Impact:
- Cannot produce item-level revenue or margin analysis.
- Cannot reconcile order total to item-level totals.
raw_amazon__financial_events
What it has:
- Event-type groupings for refunds, fees, shipments, reimbursements, and settlement-related activity
Missing or incomplete data:
- Flattened transaction rows
- Event amount
- Currency code
- Event timestamp
- Order ID / order item ID linkage
- Fee type and settlement detail in a normalized structure
Impact:
- Financial reconciliation and net revenue analysis are constrained.
- The table is difficult to join cleanly to orders and order items.
raw_amazon__fulfillment_orders
What it has:
- Fulfillment order identifiers and status metadata
- Fulfillment policy and shipping speed category
Missing or incomplete data:
- Shipment tracking number
- Carrier / service details
- Shipment cost
- Package-level line item revenue impact
- Delivery confirmation detail
Impact:
- Good for status tracking, but weak for logistics cost and delivery analytics.
raw_amazon__catalog_items
What it has:
- Product identifiers and catalog metadata
Missing or incomplete data:
- Customer data
- Sales transaction data
- Inventory quantity / stock on hand
- Pricing history
Impact:
- Useful as a product reference layer, but not enough for demand or revenue analysis by itself.
raw_amazon__listings_items
What it has:
- Listing metadata, offer structure, availability, and issues
Missing or incomplete data:
- Customer data
- Order data
- Line revenue
- Historical price and discount snapshots
Impact:
- Useful for catalog/listing operations, but not enough for sales reporting.
raw_amazon__marketplace_participations
What it has:
- Marketplace identity, locale, and participation flags
Missing or incomplete data:
- Business transaction data
- Customer data
- Order and item measures
Impact:
- This is a lookup table only, not a transactional source.
Cross-Cutting Missing Data
The Amazon raw layer is missing these broader entities or measures:
- Customer dimension: no dedicated customer model or customer ID
- Order financial summary: no order total, tax, shipping, discount, or currency at the order header level
- Order item financials: no unit price, line revenue, or discount allocation
- Normalized financial ledger: financial events are nested lists instead of row-level facts
- Logistics detail: no tracking, carrier, or shipping cost fields in fulfillment data
Recommended Follow-Up
If these fields exist upstream in Amazon SP-API or Polytomic, add them to the raw layer first.
Recommended additions:
ORDER_TOTAL,ITEM_PRICE,LINE_TOTAL,LINE_REVENUE,TAX_AMOUNT,SHIPPING_AMOUNT,DISCOUNT_AMOUNT,CURRENCY_CODECUSTOMER_ID,CUSTOMER_EMAIL,SHIPPING_ADDRESS,BILLING_ADDRESS,PHONEEVENT_AMOUNT,EVENT_DATE,EVENT_TYPE,CURRENCY_CODE,ORDER_ID,ORDER_ITEM_IDTRACKING_NUMBER,CARRIER_NAME,SHIPPING_COST
If Amazon does not expose these directly, create intermediate models that derive what can be inferred from the available source data.
Interim Implementation Update
The shipment-event portion of the financial events feed is now normalized in dbt_project/models/intermediate/amazon/int_amazon__financial_shipment_events.sql.
It flattens shipment_event_list into row-level shipment item charge records while preserving the raw nested payloads for traceability.