Meeting Title: Order Pipeline Delay Discussion Date: 2025-07-17 Meeting participants: Robert Tseng, Awaish Kumar
WEBVTT
1 00:00:21.440 ⇒ 00:00:22.350 Awaish Kumar: Hello!
2 00:00:23.560 ⇒ 00:00:24.509 Robert Tseng: Hey?
3 00:00:26.340 ⇒ 00:00:28.529 Robert Tseng: How’s how’s my audio?
4 00:00:29.920 ⇒ 00:00:32.150 Awaish Kumar: Yeah, it is. It is okay. It is fine.
5 00:00:32.590 ⇒ 00:00:41.629 Robert Tseng: Okay, yeah. Let me know if there’s any issues. I’m like sitting outside. I’m like, gonna get on a flight in a couple of hours. Going to Denver.
6 00:00:46.140 ⇒ 00:00:51.270 Awaish Kumar: Okay, no worries. So like I saw your recent message. Is it about
7 00:00:51.550 ⇒ 00:00:54.440 Awaish Kumar: delaying the pipeline? What that means.
8 00:00:54.950 ⇒ 00:00:59.770 Robert Tseng: Yeah, so like, let’s say an event.
9 00:01:02.270 ⇒ 00:01:07.340 Robert Tseng: Let’s say, oh, I wish I could just draw this out. I’m much better by recording.
10 00:01:08.100 ⇒ 00:01:13.016 Robert Tseng: Okay, is there a whiteboard feature?
11 00:01:19.030 ⇒ 00:01:20.020 Robert Tseng: thank you.
12 00:01:21.830 ⇒ 00:01:22.630 Robert Tseng: White.
13 00:01:28.650 ⇒ 00:01:33.600 Robert Tseng: Okay, great. So let’s say, order shipped or order completed.
14 00:01:34.680 ⇒ 00:01:41.690 Robert Tseng: happens here and then order shipped happens here.
15 00:01:42.200 ⇒ 00:01:58.620 Robert Tseng: And then yeah. Then we, you know it goes into our modeling process gets into, maybe like in
16 00:01:59.350 ⇒ 00:02:03.030 Robert Tseng: hit meds, and then it also gets into
17 00:02:05.040 ⇒ 00:02:07.750 Robert Tseng: I guess this new model that you would describe.
18 00:02:08.449 ⇒ 00:02:12.440 Robert Tseng: And then this is what we end up sending into.
19 00:02:19.080 ⇒ 00:02:23.760 Robert Tseng: I guess this would go through Segment.
20 00:02:24.000 ⇒ 00:02:29.780 Robert Tseng: and then it would get into Meta.
21 00:02:30.000 ⇒ 00:02:31.030 Robert Tseng: So
22 00:02:35.050 ⇒ 00:02:36.300 Robert Tseng: I’m good
23 00:02:46.730 ⇒ 00:03:03.479 Robert Tseng: scripts, then shipments, to model, Segment, the reverse ETL.
24 00:03:07.450 ⇒ 00:03:10.610 Robert Tseng: then it gets into Meta.
25 00:03:13.310 ⇒ 00:03:17.739 Robert Tseng: Okay, so yeah, I don’t really know like
26 00:03:18.050 ⇒ 00:03:24.839 Robert Tseng: this is quite a few number of steps. So I’m sure there’s going to be delay in each of these different
27 00:03:26.223 ⇒ 00:03:31.990 Robert Tseng: kind of steps right? And I just, I’m worried that like,
28 00:03:36.830 ⇒ 00:03:40.930 Robert Tseng: yeah, let’s say the total delay is like 12 h. Then
29 00:03:41.190 ⇒ 00:03:48.060 Robert Tseng: we lose another 12 h in our conversion window. Right? So I just I don’t really know
30 00:03:48.590 ⇒ 00:03:51.000 Robert Tseng: how quickly all of this happens.
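A rough sketch of the concern raised above: every hop between the source event and Meta adds latency, and the total comes out of the conversion window. The stage names loosely follow the whiteboard description. The per-stage delays and the 7-day window are hypothetical placeholders chosen to add up to the 12-hour example, not measured values.

```python
from datetime import timedelta

# Hypothetical per-stage delays; none of these were measured in the meeting.
STAGE_DELAYS = {
    "webhook_to_warehouse": timedelta(hours=5),
    "dbt_modeling": timedelta(hours=1),
    "reverse_etl_sync": timedelta(hours=5),
    "segment_to_meta": timedelta(hours=1),
}


def total_delay(stage_delays):
    """Sum the delay contributed by every stage of the pipeline."""
    return sum(stage_delays.values(), timedelta())


def remaining_window(window, stage_delays):
    """Time left in the conversion window after subtracting pipeline delay."""
    return max(window - total_delay(stage_delays), timedelta())


if __name__ == "__main__":
    window = timedelta(days=7)  # assumed attribution window
    print("total delay:", total_delay(STAGE_DELAYS))
    print("window left:", remaining_window(window, STAGE_DELAYS))
```

Replacing the placeholder values with measured ones would show which stage dominates the delay.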
31 00:03:53.730 ⇒ 00:04:02.890 Awaish Kumar: Okay, so so like, we need to calculate like when we are getting order shipped event.
32 00:04:03.504 ⇒ 00:04:08.789 Awaish Kumar: So like order completed event. Like, if the data which is coming in.
33 00:04:09.040 ⇒ 00:04:14.130 Awaish Kumar: like, for example, an order completed on 17th of July
34 00:04:14.260 ⇒ 00:04:19.079 Awaish Kumar: or 16th of July, and it came into our data warehouse on the 17th.
35 00:04:19.200 ⇒ 00:04:27.710 Awaish Kumar: That’s really the problem. But otherwise like, if it is coming on the same day, and
36 00:04:28.370 ⇒ 00:04:31.740 Awaish Kumar: it should be processed in, call it
37 00:04:32.170 ⇒ 00:04:35.560 Awaish Kumar: in an hour, because we run our ETL pipeline.
38 00:04:35.930 ⇒ 00:04:40.440 Awaish Kumar: Our dbt runs are running like every hour now, I think.
39 00:04:41.190 ⇒ 00:04:49.560 Awaish Kumar: so. Every new data which is coming in is getting processed. It goes to the Tim shipment, and then it will go to New Model as well.
40 00:04:51.130 ⇒ 00:04:58.100 Robert Tseng: Yeah, okay, so yeah, you’re saying, is the main thing. Yeah.
41 00:04:58.630 ⇒ 00:05:07.809 Awaish Kumar: Yeah. So the the events which are coming in, order completed and order shipped, from Segment. That’s like, if there’s no delay there like what
42 00:05:08.241 ⇒ 00:05:14.280 Awaish Kumar: what like like we need to. If we need to find out a delay, then that should be there, because
43 00:05:14.390 ⇒ 00:05:23.060 Awaish Kumar: if an order placed like an an event which came into BigQuery after 5 hours. That’s really the
44 00:05:23.370 ⇒ 00:05:24.470 Awaish Kumar: the delay.
45 00:05:26.070 ⇒ 00:05:26.400 Robert Tseng: I see.
46 00:05:26.400 ⇒ 00:05:30.240 Awaish Kumar: After that we only have 1 hour of delay at max.
47 00:05:30.400 ⇒ 00:05:36.859 Awaish Kumar: Like when the data loads into the raw data we will be running our dbt job after an hour.
48 00:05:37.090 ⇒ 00:05:38.900 Awaish Kumar: Right? So it’s.
49 00:05:39.370 ⇒ 00:05:43.310 Robert Tseng: Are you able to check that in in BigQuery like?
50 00:05:44.358 ⇒ 00:05:52.729 Robert Tseng: Obviously, we have the webhook event timestamp. But like, when it actually lands in BigQuery like, I want to know, like what the delay is on this.
51 00:05:53.570 ⇒ 00:05:55.929 Awaish Kumar: Okay, I I can check that.
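One way the check discussed above could be sketched: for each row, subtract the event timestamp from the time the row landed in the warehouse, then summarize the gaps. The column names `timestamp` and `loaded_at` are assumptions about what the Segment-populated table exposes, and the sample rows are made up; substitute whichever column actually records load time.

```python
from datetime import datetime, timezone
from statistics import median


def landing_delays_hours(rows, event_col="timestamp", loaded_col="loaded_at"):
    """Return the hours between event time and warehouse load time per row."""
    delays = []
    for row in rows:
        event_ts = row.get(event_col)
        loaded_ts = row.get(loaded_col)
        if event_ts is None or loaded_ts is None:
            continue  # skip rows missing either timestamp
        delays.append((loaded_ts - event_ts).total_seconds() / 3600)
    return delays


def summarize(delays):
    """Return count, median, and max of the delays, or None if there are none."""
    if not delays:
        return None
    return {"count": len(delays), "median_h": median(delays), "max_h": max(delays)}


if __name__ == "__main__":
    # Made-up sample rows, for illustration only.
    sample = [
        {"timestamp": datetime(2025, 7, 17, 9, 0, tzinfo=timezone.utc),
         "loaded_at": datetime(2025, 7, 17, 9, 30, tzinfo=timezone.utc)},
        {"timestamp": datetime(2025, 7, 17, 9, 0, tzinfo=timezone.utc),
         "loaded_at": datetime(2025, 7, 17, 14, 0, tzinfo=timezone.utc)},
    ]
    print(summarize(landing_delays_hours(sample)))
```

This only measures the gap between the two recorded timestamps; it says nothing about delay that happens before either one is written.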
52 00:06:03.100 ⇒ 00:06:06.079 Robert Tseng: Okay. While you’re pulling that up, I’ll also pull something up.
53 00:06:09.180 ⇒ 00:06:12.700 Robert Tseng: I’m just gonna show you how it actually looks in the segment.
54 00:06:13.530 ⇒ 00:06:14.320 Robert Tseng: Well.
55 00:07:07.320 ⇒ 00:07:09.250 Robert Tseng: oh, export this thing! Huh!
56 00:07:22.610 ⇒ 00:07:26.760 Robert Tseng: mean this? Is not that helpful nice shipment.
57 00:07:34.650 ⇒ 00:07:35.959 Awaish Kumar: Yeah, like.
58 00:07:36.824 ⇒ 00:07:48.720 Awaish Kumar: like, I just checked few rows in BigQuery. It is like event timestamp which is coming in for each data. And that’s basically that’s the
59 00:07:48.880 ⇒ 00:07:52.549 Awaish Kumar: the timestamp which we are using to determine the order date.
60 00:07:54.290 ⇒ 00:07:56.519 Awaish Kumar: So I don’t see any.
61 00:07:57.130 ⇒ 00:07:59.313 Awaish Kumar: There are a few fields
62 00:08:01.152 ⇒ 00:08:08.747 Awaish Kumar: like. It’s called event timestamp, original timestamp and timestamp in the order completed
63 00:08:10.068 ⇒ 00:08:14.539 Awaish Kumar: table, which is being populated by the segment.
64 00:08:14.930 ⇒ 00:08:23.539 Awaish Kumar: And so. But I don’t see any difference between these 3 columns. So what
65 00:08:24.200 ⇒ 00:08:33.230 Awaish Kumar: what I’m was trying to say is that because, like when events comes in, maybe the event timestamp is
66 00:08:33.480 ⇒ 00:08:39.909 Awaish Kumar: is the timestamp column which is being populated when the data got inserted into BigQuery.
67 00:08:40.400 ⇒ 00:08:41.760 Awaish Kumar: Okay?
68 00:08:42.390 ⇒ 00:08:45.579 Awaish Kumar: But like, I’m not sure like
69 00:08:45.760 ⇒ 00:08:49.949 Awaish Kumar: for ex like if if there’s a delay on the Bask side, for example.
70 00:08:50.450 ⇒ 00:08:52.600 Awaish Kumar: So it like it even happened.
71 00:08:52.750 ⇒ 00:08:57.489 Awaish Kumar: And when it got triggered like an order got placed
72 00:08:57.600 ⇒ 00:09:00.609 Awaish Kumar: on the 17th July at 9 Am.
73 00:09:01.000 ⇒ 00:09:06.929 Awaish Kumar: And it was in the queue, for example, on the Bask, in the Bask system, right
74 00:09:07.940 ⇒ 00:09:09.520 Awaish Kumar: where the webhooks are being
75 00:09:09.800 ⇒ 00:09:16.939 Awaish Kumar: run and it sent after like after an hour it hit the Segment API. So.
76 00:09:18.090 ⇒ 00:09:21.450 Awaish Kumar: But like, that’s that’s not possible for us to determine
77 00:09:22.230 ⇒ 00:09:33.310 Awaish Kumar: like if there is delay on that part. But when it lands into BigQuery I see a maximum of like an hour of delay until it lands to the new model.
78 00:09:34.630 ⇒ 00:09:35.490 Robert Tseng: I see.
79 00:09:35.620 ⇒ 00:09:39.510 Robert Tseng: Okay? So the risk that we would run into is,
80 00:09:40.000 ⇒ 00:09:53.679 Robert Tseng: yeah. The timestamps that we get for the events that’s like, when the event took place. That’s not when the webhook actually gets comes in. I knew that. So we don’t really know, like the delay between the actual event, and when the webhook comes into the warehouse.
81 00:09:54.550 ⇒ 00:10:09.630 Awaish Kumar: Yeah, like, there is, for example, like, like, if if I build a like real time streaming pipeline. So data when it happened, I would ask, okay, let’s write it to somewhere in Kafka, then put into BigQuery. But for
82 00:10:09.750 ⇒ 00:10:12.179 Awaish Kumar: like, for example, my architecture is
83 00:10:12.280 ⇒ 00:10:21.640 Awaish Kumar: is is slow or like whatever. Or I have a lot of load that that queuing, and from that that queue can like take longer
84 00:10:21.820 ⇒ 00:10:25.350 Awaish Kumar: to process the data. All the events which are coming in
85 00:10:25.450 ⇒ 00:10:35.039 Awaish Kumar: so that that can be delay on that side. But because it’s that architecture is not in our systems, we can’t determine, like if there is any delay or not?
86 00:10:37.180 ⇒ 00:10:39.551 Robert Tseng: Okay, we do know there is delay.
87 00:10:42.480 ⇒ 00:10:46.940 Robert Tseng: yeah, I think what we need to do then is
88 00:10:50.630 ⇒ 00:10:55.750 Robert Tseng: like the only way that I can think of that we can validate. This is just to get an export of the
89 00:10:55.870 ⇒ 00:10:56.700 Robert Tseng: you know
90 00:10:57.010 ⇒ 00:11:03.259 Robert Tseng: of like completed orders out of the UI like you can just export that because it shows up in their system.
91 00:11:03.360 ⇒ 00:11:05.580 Robert Tseng: And then we just like, see
92 00:11:05.690 ⇒ 00:11:09.919 Robert Tseng: how many we haven’t that like, we just, we just run a query.
93 00:11:10.250 ⇒ 00:11:13.699 Robert Tseng: So let’s say we, we export order completed.
94 00:11:14.790 ⇒ 00:11:29.840 Robert Tseng: I don’t know if we should use today. Only we do like the past 7 days. Yeah, like past 7 days, the UI, we export that list. And then we go. And we look at our webhook data to see how many of those orders actually came through. Then we would get a sense of how long the delay is.
95 00:11:31.810 ⇒ 00:11:34.230 Awaish Kumar: Okay? So you mean the count of orders
96 00:11:38.430 ⇒ 00:11:43.630 Awaish Kumar: like, like, we can compare like in our data warehouse. Last 7 days, we have
97 00:11:43.830 ⇒ 00:11:49.249 Awaish Kumar: 10, like 10,000 orders, but Bask says there are like 12,000,
98 00:11:49.490 ⇒ 00:11:50.760 Awaish Kumar: so like 2,000 orders.
99 00:11:50.760 ⇒ 00:11:51.200 Robert Tseng: That already.
100 00:11:51.200 ⇒ 00:11:51.839 Awaish Kumar: Thank you.
101 00:11:52.740 ⇒ 00:11:54.720 Robert Tseng: I’m just saying that, yeah, yeah.
102 00:11:55.590 ⇒ 00:11:56.800 Awaish Kumar: Sit here.
103 00:11:57.170 ⇒ 00:12:02.240 Robert Tseng: Okay, let me just like, take a look at that first. I’ll look at the UI, you run the query.
104 00:12:04.940 ⇒ 00:12:09.469 Robert Tseng: I’m just gonna do yeah, July 10th and the 17th.
105 00:12:09.670 ⇒ 00:12:10.930 Robert Tseng: Then I’m just gonna.
106 00:12:11.700 ⇒ 00:12:13.110 Awaish Kumar: Okay, excellent.
107 00:12:13.110 ⇒ 00:12:15.939 Awaish Kumar: Okay, so we are, including the July 10 as well. Right?
108 00:12:16.250 ⇒ 00:12:22.450 Robert Tseng: Yeah, I mean, I’m seeing like, 7,000.
109 00:13:44.280 ⇒ 00:13:46.930 Awaish Kumar: I get 7,135.
110 00:13:49.030 ⇒ 00:13:52.949 Robert Tseng: Okay, you get 7,135.
111 00:13:53.660 ⇒ 00:13:54.260 Awaish Kumar: Yep.
112 00:13:55.392 ⇒ 00:14:05.020 Robert Tseng: But I get 7,023. So somehow what we have is more, which is fine.
113 00:14:05.820 ⇒ 00:14:11.539 Robert Tseng: I wonder what’s what’s missing. So
114 00:14:13.570 ⇒ 00:14:20.220 Robert Tseng: maybe I’ll just go and get the order. Ids order numbers.
115 00:14:20.970 ⇒ 00:14:22.799 Awaish Kumar: We, we have duplicate. Let me.
116 00:14:34.740 ⇒ 00:14:39.040 Robert Tseng: I’ll give you a list, and you could just like running seed order.
117 00:14:43.960 ⇒ 00:14:49.410 Robert Tseng: Sorry I’m like not. I don’t have my like desktop. So like running these types of operations takes so long
118 00:14:54.650 ⇒ 00:14:57.120 Robert Tseng: need to upgrade my laptop so slow.
119 00:15:04.330 ⇒ 00:15:08.190 Robert Tseng: Yeah, so write this as a list.
120 00:17:30.330 ⇒ 00:17:32.526 Robert Tseng: Okay, alright. So I just sent you
121 00:17:34.640 ⇒ 00:17:42.093 Robert Tseng: in list of the orders that I have. We could just like we could just look at
122 00:17:43.820 ⇒ 00:17:45.890 Robert Tseng: Yeah, daily discrepancy.
123 00:17:46.420 ⇒ 00:17:48.949 Robert Tseng: Then we can. Then I’ll be able to tell, like.
124 00:17:49.430 ⇒ 00:17:52.540 Robert Tseng: Hey, like, 90% of those are because
125 00:17:52.950 ⇒ 00:17:57.670 Robert Tseng: are are from like the past day as opposed to like, you know. 7 days ago.
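The daily discrepancy check described above could be sketched roughly as follows: take the order IDs exported from the UI, compare them with the IDs seen in the warehouse, and group the mismatches by order date. The function names and the shape of the inputs (ID and date pairs from the export) are assumptions for illustration; duplicates on the warehouse side are collapsed before comparing.

```python
from collections import Counter
from datetime import date


def missing_by_day(ui_orders, warehouse_order_ids):
    """Count UI-exported orders absent from the warehouse, grouped by order date.

    ui_orders: iterable of (order_id, order_date) pairs from the UI export.
    warehouse_order_ids: iterable of order IDs found in the warehouse;
    converting to a set collapses duplicate rows.
    """
    seen = set(warehouse_order_ids)
    missing = Counter()
    for order_id, order_date in ui_orders:
        if order_id not in seen:
            missing[order_date] += 1
    return dict(sorted(missing.items()))


def extra_in_warehouse(ui_orders, warehouse_order_ids):
    """Return warehouse order IDs that do not appear in the UI export."""
    ui_ids = {order_id for order_id, _ in ui_orders}
    return sorted(set(warehouse_order_ids) - ui_ids)


if __name__ == "__main__":
    # Made-up IDs, for illustration only.
    ui = [("A1", date(2025, 7, 16)), ("A2", date(2025, 7, 17)), ("A3", date(2025, 7, 17))]
    warehouse = ["A1", "A1", "B9"]
    print(missing_by_day(ui, warehouse))
    print(extra_in_warehouse(ui, warehouse))
```

If most of the missing orders fall on the most recent day, that points to pipeline delay rather than lost events.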
126 00:18:10.160 ⇒ 00:18:16.680 Robert Tseng: Okay, while you’re running that I’m just gonna keep thinking. So assuming that that’s fine, I mean, the margin of error is fine. But 7.
127 00:18:17.600 ⇒ 00:18:18.490 Robert Tseng: That’s
128 00:18:24.790 ⇒ 00:18:28.648 Robert Tseng: yeah. I mean, it’s like a 2% difference. I’m I’m not really concerned about that.
129 00:18:30.210 ⇒ 00:18:32.409 Robert Tseng: So I still think it’s good enough.
130 00:18:32.850 ⇒ 00:18:36.539 Robert Tseng: But yeah, we’ll probably will have to improve it later on.
131 00:18:37.600 ⇒ 00:18:38.045 Awaish Kumar: Okay.
132 00:18:39.970 ⇒ 00:18:44.680 Robert Tseng: But I am curious, just like if it’s all from the past day or something.
133 00:18:44.680 ⇒ 00:18:47.199 Awaish Kumar: So like you have shared me the sub sub.
134 00:18:47.450 ⇒ 00:18:51.179 Awaish Kumar: Just a few IDs, right? Not all.
135 00:18:53.040 ⇒ 00:18:53.640 Robert Tseng: Oh, really.
136 00:18:55.190 ⇒ 00:19:03.129 Awaish Kumar: You have shared some fewest few order numbers. I’m like, like what
137 00:19:03.740 ⇒ 00:19:06.789 Awaish Kumar: you just copied some random ones, or they are more like.
138 00:19:07.960 ⇒ 00:19:17.779 Robert Tseng: I I tried to just run it through GPT to reformat like the spreadsheet that I have. But I guess it didn’t really do it kind of just truncated it.
139 00:19:28.000 ⇒ 00:19:30.400 Robert Tseng: alright, I’m just gonna send
140 00:19:36.930 ⇒ 00:19:41.299 Robert Tseng: to run this so quickly if I just open my python open. This is not running.
141 00:19:45.660 ⇒ 00:19:48.936 Robert Tseng: Okay, no, it’s all good. I’ll I can run the analysis. So
142 00:19:50.700 ⇒ 00:20:03.210 Robert Tseng: yeah, this will take me like a minute. But okay, I don’t want to get stuck on this. So let’s just assume that 2% is not a big difference. We can. We can move forward. Anyway, I can just confirm. I can. I can run. I can get the confirmation afterwards.
143 00:20:06.000 ⇒ 00:20:12.610 Robert Tseng: okay, so that’s order completed and then order shipped is not the same event as order completed right? But what I I did show that
144 00:20:13.204 ⇒ 00:20:16.840 Robert Tseng: you know, 92 to 97% of
145 00:20:17.370 ⇒ 00:20:20.929 Robert Tseng: orders that are completed end up getting shipped.
146 00:20:21.260 ⇒ 00:20:27.490 Robert Tseng: Get a ship status within within 7 days. So there’s probably another like, let’s just say 5% drop off in 7 days.
147 00:20:27.870 ⇒ 00:20:31.090 Robert Tseng: So we have a 2% drop off, and then another 5% drop off.
148 00:20:31.210 ⇒ 00:20:35.179 Robert Tseng: You know, we’re still, you know, hovering around like, you know, 7%.
149 00:20:35.670 ⇒ 00:20:42.900 Robert Tseng: We might lose 7% of of orders. Within that attribution window. I think that’s fine. I feel like this is, this is good enough to ship.
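The arithmetic behind the estimate above can be checked with a short sketch. The counts 7,135 and 7,023 and the 2% and 5% rates come from the discussion; treating the two drop-offs as independent, and measuring the count difference against the smaller count, are assumptions made here for illustration.

```python
def percent_difference(a, b):
    """Absolute difference between two counts as a percentage of the smaller one."""
    return abs(a - b) / min(a, b) * 100


def combined_loss(*loss_rates):
    """Combine independent loss rates: 1 minus the product of survival rates."""
    surviving = 1.0
    for rate in loss_rates:
        surviving *= 1.0 - rate
    return 1.0 - surviving


if __name__ == "__main__":
    print(f"count difference: {percent_difference(7135, 7023):.1f}%")
    print(f"combined loss: {combined_loss(0.02, 0.05) * 100:.1f}%")
```

Under these assumptions the count difference is about 1.6%, and the combined loss is about 6.9%, slightly below the 7% obtained by simply adding the two rates.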
150 00:20:45.090 ⇒ 00:20:49.870 Awaish Kumar: Yeah. So the the query I shared, it has the order date, ship date
151 00:20:50.160 ⇒ 00:20:55.160 Awaish Kumar: sent to pharmacy date like, and also the flag is sent to pharmacy.
152 00:20:55.550 ⇒ 00:21:00.120 Robert Tseng: Okay, perfect. So I’ll share my screen. And I’ll just show you quickly.
153 00:21:02.500 ⇒ 00:21:10.409 Robert Tseng: so yeah, assuming that you get that set up what we have here is we have a list of sources. If you could hook this up because I’ll be probably on the plane.
154 00:21:11.566 ⇒ 00:21:17.360 Robert Tseng: There’s an Eden warehouse source. Here we’ve been adding model
155 00:21:17.831 ⇒ 00:21:22.219 Robert Tseng: and then we can either use the SQL editor, which I don’t know if I would recommend, then that
156 00:21:22.760 ⇒ 00:21:25.639 Robert Tseng: maybe we have to use the dbt model here.
157 00:21:27.350 ⇒ 00:21:29.820 Robert Tseng: Not sure if it’s set up this meeting model or not.
158 00:21:29.820 ⇒ 00:21:32.189 Awaish Kumar: I can just copy paste the query.
159 00:21:32.650 ⇒ 00:21:38.625 Robert Tseng: Okay, perfectly fine. Yeah. You do that. You get the model in, and then you can create it.
160 00:21:39.830 ⇒ 00:21:40.240 Awaish Kumar: Okay.
161 00:21:40.240 ⇒ 00:21:41.340 Robert Tseng: Yeah, so my system.
162 00:21:41.340 ⇒ 00:21:42.580 Awaish Kumar: What exactly are you doing?
163 00:21:42.690 ⇒ 00:21:46.210 Awaish Kumar: What exactly do you need like the query I wrote.
164 00:21:46.952 ⇒ 00:21:49.549 Awaish Kumar: It’s just 3 4 columns like.
165 00:21:49.770 ⇒ 00:21:55.320 Awaish Kumar: what exact columns do you need or like? Can just I reference the query you showed in stand up.
166 00:21:56.856 ⇒ 00:22:03.609 Robert Tseng: Yeah. So what? Exactly I will need, I guess I.
167 00:22:03.890 ⇒ 00:22:06.479 Awaish Kumar: Do you need everything from order completed.
168 00:22:10.400 ⇒ 00:22:17.919 Robert Tseng: I think it would be good to have everything there, because I can just map it out or, okay, fine. You’re right. Maybe I should re go from there.
169 00:22:18.581 ⇒ 00:22:31.080 Robert Tseng: Yeah, actually, if you mind. Just yeah. I mean, I can put everything in there, because once I created, then I have to like, actually define the mapping. If there’s anything that needs to be excluded, I can just go back and edit the query and take it out. So I think that’s fine.
170 00:22:32.720 ⇒ 00:22:36.720 Awaish Kumar: Okay, so, like.
171 00:22:36.720 ⇒ 00:22:37.320 Robert Tseng: Cool, cool.
172 00:22:37.320 ⇒ 00:22:37.820 Awaish Kumar: Have a good day.
173 00:22:38.233 ⇒ 00:22:39.060 Robert Tseng: To help.
174 00:22:39.890 ⇒ 00:22:44.920 Awaish Kumar: Okay, I will get everything from order completed and and everything from order shipped.
175 00:22:45.140 ⇒ 00:22:49.129 Awaish Kumar: And then you can like, select whatever columns you need.
176 00:22:49.270 ⇒ 00:22:51.770 Robert Tseng: Yeah, and the yeah and the dial machine. Yep.
177 00:22:52.340 ⇒ 00:23:00.349 Robert Tseng: okay, cool. So I’m assuming you’ll send that to me shortly, and then I’ll run this through, and I’ll set up the mapping, and hopefully I can fire a test event.
178 00:23:00.700 ⇒ 00:23:06.650 Robert Tseng: You know, the next 30 min before, and if it works, then I’ll kind of
179 00:23:07.010 ⇒ 00:23:08.190 Robert Tseng: then I think we can.
180 00:23:08.360 ⇒ 00:23:12.589 Robert Tseng: Oh, yeah, then we’ll then we’ll try to figure out how to push it to the next step.
181 00:23:12.590 ⇒ 00:23:15.169 Awaish Kumar: Should I build this in the segment like?
182 00:23:15.670 ⇒ 00:23:17.389 Awaish Kumar: Should I define this model right.
183 00:23:18.493 ⇒ 00:23:23.370 Robert Tseng: Yeah, this model’s in Segment. I mean, yeah, you can.
184 00:23:25.510 ⇒ 00:23:33.789 Awaish Kumar: Like I’m I’m just want to be clear, like the query I’ll write, then I will create this model, and then I will inform you right? Is that what’s needed? Okay.
185 00:23:33.790 ⇒ 00:23:35.460 Robert Tseng: Yeah, that’s it. Okay.
186 00:23:36.910 ⇒ 00:23:38.130 Awaish Kumar: Perfect. Thank you.
187 00:23:38.130 ⇒ 00:23:39.620 Robert Tseng: Thanks, but.