Meeting Title: Order Pipeline Delay Discussion
Date: 2025-07-17
Meeting participants: Robert Tseng, Awaish Kumar


WEBVTT

1 00:00:21.440 00:00:22.350 Awaish Kumar: Hello!

2 00:00:23.560 00:00:24.509 Robert Tseng: Hey?

3 00:00:26.340 00:00:28.529 Robert Tseng: How’s how’s my audio?

4 00:00:29.920 00:00:32.150 Awaish Kumar: Yeah, it is. It is okay. It is fine.

5 00:00:32.590 00:00:41.629 Robert Tseng: Okay, yeah. Let me know if there’s any issues. I’m like sitting outside. I’m like, gonna get on a flight in a couple of hours. Going to Denver.

6 00:00:46.140 00:00:51.270 Awaish Kumar: Okay, no worries. So like I saw your recent message. Is it about

7 00:00:51.550 00:00:54.440 Awaish Kumar: delaying the pipeline? What that means.

8 00:00:54.950 00:00:59.770 Robert Tseng: Yeah, so like, let’s say an event.

9 00:01:02.270 00:01:07.340 Robert Tseng: Let’s say, oh, I wish I could just draw this out. I’m much better by recording.

10 00:01:08.100 00:01:13.016 Robert Tseng: Okay, is there a whiteboard feature?

11 00:01:19.030 00:01:20.020 Robert Tseng: thank you.

12 00:01:21.830 00:01:22.630 Robert Tseng: White.

13 00:01:28.650 00:01:33.600 Robert Tseng: Okay, great. So let’s say, order shipped or order completed.

14 00:01:34.680 00:01:41.690 Robert Tseng: happens here and then order shipped happens here.

15 00:01:42.200 00:01:58.620 Robert Tseng: And then, yeah, then, you know, it goes into our modeling process, gets into, maybe like, in

16 00:01:59.350 00:02:03.030 Robert Tseng: hit meds, and then it also gets into

17 00:02:05.040 00:02:07.750 Robert Tseng: I guess this new model that you would describe.

18 00:02:08.449 00:02:12.440 Robert Tseng: And then this is what we end up sending into.

19 00:02:19.080 00:02:23.760 Robert Tseng: I guess this would go through Segment,

20 00:02:24.000 00:02:29.780 Robert Tseng: and then it would get into Meta.

21 00:02:30.000 00:02:31.030 Robert Tseng: So

22 00:02:35.050 00:02:36.300 Robert Tseng: I’m good

23 00:02:46.730 00:03:03.479 Robert Tseng: scripts, then shipments, to the model, Segment, the reverse ETL.

24 00:03:07.450 00:03:10.610 Robert Tseng: then it gets into Meta.

25 00:03:13.310 00:03:17.739 Robert Tseng: Okay, so yeah, I don’t really know like

26 00:03:18.050 00:03:24.839 Robert Tseng: this is quite a few steps. So I’m sure there’s going to be delay in each of these different

27 00:03:26.223 00:03:31.990 Robert Tseng: kind of steps right? And I just, I’m worried that like,

28 00:03:36.830 00:03:40.930 Robert Tseng: yeah, let’s say the total delay is like 12 h. Then

29 00:03:41.190 00:03:48.060 Robert Tseng: we lose another 12 h in our conversion window. Right? So I just I don’t really know

30 00:03:48.590 00:03:51.000 Robert Tseng: how quickly all of this happens.
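
The concern above is simple arithmetic: every stage adds some delay, and the total comes straight out of the conversion window. A minimal sketch, assuming a hypothetical 7-day window and placeholder per-stage delays that add up to the 12 h used as an example in the call (none of these numbers were measured in the meeting):

```python
from datetime import timedelta

# Hypothetical per-stage delays; placeholders only, not measurements.
STAGE_DELAYS = {
    "webhook_to_warehouse": timedelta(hours=5),
    "dbt_modeling": timedelta(hours=1),
    "reverse_etl_sync": timedelta(hours=6),
}


def remaining_window(window: timedelta, delays: dict) -> timedelta:
    """Return how much of the conversion window is left after all stage delays."""
    total_delay = sum(delays.values(), timedelta())
    # Never report a negative remainder.
    return max(window - total_delay, timedelta())


if __name__ == "__main__":
    # Assumed 7-day window; the real window length was not stated at this point.
    print(remaining_window(timedelta(days=7), STAGE_DELAYS))
```

With these placeholder values the stages total 12 h, so 12 h of the assumed window is lost before the event reaches Meta.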

31 00:03:53.730 00:04:02.890 Awaish Kumar: Okay, so so like, we need to calculate like when we are getting order shipped event.

32 00:04:03.504 00:04:08.789 Awaish Kumar: So like order completed event. Like, if the data which is coming in.

33 00:04:09.040 00:04:14.130 Awaish Kumar: like, for example, an order completed on 17th of July

34 00:04:14.260 00:04:19.079 Awaish Kumar: or 16th of July, and it came into our data warehouse on the 17th.

35 00:04:19.200 00:04:27.710 Awaish Kumar: That’s really the problem. But otherwise like, if it is coming on the same day, and

36 00:04:28.370 00:04:31.740 Awaish Kumar: it should be processed in, call it

37 00:04:32.170 00:04:35.560 Awaish Kumar: in an hour, because we run our ETL pipeline.

38 00:04:35.930 00:04:40.440 Awaish Kumar: Our dbt runs are running like every hour now, I think.

39 00:04:41.190 00:04:49.560 Awaish Kumar: so. Every new data which is coming in is getting processed. It goes to the Tim shipment, and then it will go to New Model as well.

40 00:04:51.130 00:04:58.100 Robert Tseng: Yeah, okay, so yeah, you’re saying, is the main thing. Yeah.

41 00:04:58.630 00:05:07.809 Awaish Kumar: Yeah. So the events which are coming in, order completed and order shipped, from Segment. That’s like, if there’s no delay there, like what

42 00:05:08.241 00:05:14.280 Awaish Kumar: what like like we need to. If we need to find out a delay, then that should be there, because

43 00:05:14.390 00:05:23.060 Awaish Kumar: if an order placed, like an event, which came into BigQuery after 5 h. That’s really the

44 00:05:23.370 00:05:24.470 Awaish Kumar: the delay.

45 00:05:26.070 00:05:26.400 Robert Tseng: I see.

46 00:05:26.400 00:05:30.240 Awaish Kumar: After that we only have 1 h of delay at max.

47 00:05:30.400 00:05:36.859 Awaish Kumar: Like when the data loads into the raw data we will be running our dbt job after an hour.

48 00:05:37.090 00:05:38.900 Awaish Kumar: Right? So it’s.
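
The "one hour at most" bound follows from the hourly schedule: a row that lands in the raw table waits only until the next run starts. A minimal sketch, assuming the hourly dbt job fires at the top of each hour (the exact schedule was not stated in the call) and ignoring the run's own duration:

```python
from datetime import datetime, timedelta


def wait_until_next_hourly_run(landed_at: datetime) -> timedelta:
    """Time between a row landing in the raw table and the next top-of-hour run.

    Assumes runs start exactly on the hour; a row that lands exactly on the
    hour is treated as having missed that run and waits the full hour.
    """
    top_of_hour = landed_at.replace(minute=0, second=0, microsecond=0)
    next_run = top_of_hour + timedelta(hours=1)
    return next_run - landed_at


if __name__ == "__main__":
    print(wait_until_next_hourly_run(datetime(2025, 7, 17, 9, 15)))
```

Under that assumption the wait is always greater than zero and never more than one hour, which matches the bound described above.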

49 00:05:39.370 00:05:43.310 Robert Tseng: Are you able to check that in BigQuery, like?

50 00:05:44.358 00:05:52.729 Robert Tseng: Obviously, we have the webhook event timestamp. But like, when it actually lands in BigQuery, like, I want to know, like, what the delay is on this.

51 00:05:53.570 00:05:55.929 Awaish Kumar: Okay, I I can check that.

52 00:06:03.100 00:06:06.079 Robert Tseng: Okay. While you’re pulling that up, I’ll also pull something up.

53 00:06:09.180 00:06:12.700 Robert Tseng: I’m just gonna show you how it actually looks in the segment.

54 00:06:13.530 00:06:14.320 Robert Tseng: Well.

55 00:07:07.320 00:07:09.250 Robert Tseng: oh, export this thing! Huh!

56 00:07:22.610 00:07:26.760 Robert Tseng: mean this? Is not that helpful nice shipment.

57 00:07:34.650 00:07:35.959 Awaish Kumar: Yeah, like.

58 00:07:36.824 00:07:48.720 Awaish Kumar: like, I just checked a few rows in BigQuery. There is like an event timestamp which is coming in for each row. And that’s basically that’s the

59 00:07:48.880 00:07:52.549 Awaish Kumar: the timestamp which we are using to determine the order date.

60 00:07:54.290 00:07:56.519 Awaish Kumar: So I don’t see any.

61 00:07:57.130 00:07:59.313 Awaish Kumar: There are a few fields

62 00:08:01.152 00:08:08.747 Awaish Kumar: like, they’re called event timestamp, original timestamp and timestamp, in the order completed

63 00:08:10.068 00:08:14.539 Awaish Kumar: table, which is being populated by Segment.

64 00:08:14.930 00:08:23.539 Awaish Kumar: And so. But I don’t see any difference between these 3 columns. So what

65 00:08:24.200 00:08:33.230 Awaish Kumar: what I was trying to say is that because, like when events come in, maybe the event timestamp is

66 00:08:33.480 00:08:39.909 Awaish Kumar: is the timestamp column which is being populated when the data got inserted into BigQuery.

67 00:08:40.400 00:08:41.760 Awaish Kumar: Okay?

68 00:08:42.390 00:08:45.579 Awaish Kumar: But like, I’m not sure like

69 00:08:45.760 00:08:49.949 Awaish Kumar: for example, like, if there’s a delay on the Bask side, for example.

70 00:08:50.450 00:08:52.600 Awaish Kumar: So, like, the event happened.

71 00:08:52.750 00:08:57.489 Awaish Kumar: And when it got triggered like an order got placed

72 00:08:57.600 00:09:00.609 Awaish Kumar: on the 17th July at 9 Am.

73 00:09:01.000 00:09:06.929 Awaish Kumar: And it was in the queue, for example, in the Bask system, right

74 00:09:07.940 00:09:09.520 Awaish Kumar: where the webhooks are being

75 00:09:09.800 00:09:16.939 Awaish Kumar: run, and it was sent, like, after an hour it hit the Segment API. So.

76 00:09:18.090 00:09:21.450 Awaish Kumar: But like, that’s that’s not possible for us to determine

77 00:09:22.230 00:09:33.310 Awaish Kumar: like if there is delay on that part. But when it lands in BigQuery I see a maximum of like an hour of delay until it lands in the new model.

78 00:09:34.630 00:09:35.490 Robert Tseng: I see.

79 00:09:35.620 00:09:39.510 Robert Tseng: Okay? So the risk that we would run into is,

80 00:09:40.000 00:09:53.679 Robert Tseng: yeah. The timestamps that we get for the events, that’s like, when the event took place. That’s not when the webhook actually comes in. I knew that. So we don’t really know, like, the delay between the actual event and when the webhook comes into the warehouse.

81 00:09:54.550 00:10:09.630 Awaish Kumar: Yeah, like, there is, for example, like, if I build a, like, real-time streaming pipeline. So when the data happens, I would say, okay, let’s write it somewhere in Kafka, then put it into BigQuery. But for

82 00:10:09.750 00:10:12.179 Awaish Kumar: like, for example, my architecture is

83 00:10:12.280 00:10:21.640 Awaish Kumar: is is slow or like whatever. Or I have a lot of load that that queuing, and from that that queue can like take longer

84 00:10:21.820 00:10:25.350 Awaish Kumar: to process the data. All the events which are coming in

85 00:10:25.450 00:10:35.039 Awaish Kumar: so there can be a delay on that side. But because that architecture is not in our systems, we can’t determine, like, if there is any delay or not.

86 00:10:37.180 00:10:39.551 Robert Tseng: Okay, we do know there is delay.

87 00:10:42.480 00:10:46.940 Robert Tseng: yeah, I think what we need to do then is

88 00:10:50.630 00:10:55.750 Robert Tseng: like the only way that I can think of that we can validate. This is just to get an export of the

89 00:10:55.870 00:10:56.700 Robert Tseng: you know

90 00:10:57.010 00:11:03.259 Robert Tseng: of like completed orders out of the UI, like you can just export that because it shows up in their system.

91 00:11:03.360 00:11:05.580 Robert Tseng: And then we just like, see

92 00:11:05.690 00:11:09.919 Robert Tseng: how many we have, like, we just, we just run a query.

93 00:11:10.250 00:11:13.699 Robert Tseng: So let’s say we, we export order completed.

94 00:11:14.790 00:11:29.840 Robert Tseng: I don’t know if we should use today only. We do like the past 7 days. Yeah, like past 7 days, the UI, we export that list. And then we go and we look at our webhook data to see how many of those orders actually came through. Then we would get a sense of how long the delay is.
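
The check described above is a set comparison between the order IDs in the UI export and the order IDs that arrived in the warehouse. A minimal sketch, assuming both sides have already been loaded into plain Python lists of order IDs (the function name and the sample IDs are hypothetical):

```python
def compare_order_ids(ui_export_ids, warehouse_ids):
    """Compare order IDs from the UI export against those in the warehouse."""
    ui = set(ui_export_ids)
    warehouse = set(warehouse_ids)
    return {
        # In the UI export but never arrived via webhook.
        "missing_from_warehouse": sorted(ui - warehouse),
        # In the warehouse but not in the export (for example, test events).
        "only_in_warehouse": sorted(warehouse - ui),
        "matched": len(ui & warehouse),
    }


if __name__ == "__main__":
    # Made-up IDs for illustration only.
    result = compare_order_ids(["1001", "1002", "1003"], ["1002", "1003", "1004"])
    print(result)
```

Comparing IDs rather than only counts shows which specific orders are missing, which a raw count difference cannot.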

95 00:11:31.810 00:11:34.230 Awaish Kumar: Okay? So you mean the count of orders

96 00:11:38.430 00:11:43.630 Awaish Kumar: like, like, we can compare like in our data warehouse. Last 7 days, we have

97 00:11:43.830 00:11:49.249 Awaish Kumar: 10, like 10,000 orders, but Bask says there are like 12,000,

98 00:11:49.490 00:11:50.760 Awaish Kumar: so like 2,000 orders.

99 00:11:50.760 00:11:51.200 Robert Tseng: That already.

100 00:11:51.200 00:11:51.839 Awaish Kumar: Thank you.

101 00:11:52.740 00:11:54.720 Robert Tseng: I’m just saying that, yeah, yeah.

102 00:11:55.590 00:11:56.800 Awaish Kumar: Sit here.

103 00:11:57.170 00:12:02.240 Robert Tseng: Okay, let me just like, take a look at that 1st I’ll look at the Ui, you. You run the query.

104 00:12:04.940 00:12:09.469 Robert Tseng: I’m just gonna do, yeah, July 10th and the 17th.

105 00:12:09.670 00:12:10.930 Robert Tseng: Then I’m just gonna.

106 00:12:11.700 00:12:13.110 Awaish Kumar: Okay, excellent.

107 00:12:13.110 00:12:15.939 Awaish Kumar: Okay, so we are, including the July 10 as well. Right?

108 00:12:16.250 00:12:22.450 Robert Tseng: Yeah, I mean, I’m seeing like, 7,000.

109 00:13:44.280 00:13:46.930 Awaish Kumar: I get 7,135.

110 00:13:49.030 00:13:52.949 Robert Tseng: Okay, you get 7,135.

111 00:13:53.660 00:13:54.260 Awaish Kumar: Yep.

112 00:13:55.392 00:14:05.020 Robert Tseng: But I get 7,023. So somehow what we have is more, which is fine.

113 00:14:05.820 00:14:11.539 Robert Tseng: I wonder what’s what’s missing. So

114 00:14:13.570 00:14:20.220 Robert Tseng: maybe I’ll just go and get the order. Ids order numbers.

115 00:14:20.970 00:14:22.799 Awaish Kumar: We, we have duplicate. Let me.

116 00:14:34.740 00:14:39.040 Robert Tseng: I’ll give you a list, and you could just like running seed order.

117 00:14:43.960 00:14:49.410 Robert Tseng: Sorry I’m like not. I don’t have my like desktop. So like running these types of operations takes so long

118 00:14:54.650 00:14:57.120 Robert Tseng: need to upgrade my laptop so slow.

119 00:15:04.330 00:15:08.190 Robert Tseng: Yeah, so write this as a list.

120 00:17:30.330 00:17:32.526 Robert Tseng: Okay, alright. So I just sent you

121 00:17:34.640 00:17:42.093 Robert Tseng: a list of the orders that I have. We could just like we could just look at

122 00:17:43.820 00:17:45.890 Robert Tseng: Yeah, daily discrepancy.

123 00:17:46.420 00:17:48.949 Robert Tseng: Then we can. Then I’ll be able to tell, like.

124 00:17:49.430 00:17:52.540 Robert Tseng: Hey, like, 90% of those are because

125 00:17:52.950 00:17:57.670 Robert Tseng: are are from like the past day as opposed to like, you know. 7 days ago.
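
The daily discrepancy idea can be sketched as a per-day count of exported orders that are absent from the warehouse, plus the share that falls on the most recent day. This is only an illustration; the input shape (pairs of order ID and order date) and the function names are assumptions:

```python
from collections import Counter
from datetime import date


def missing_by_day(ui_orders, warehouse_ids):
    """Count exported orders missing from the warehouse, grouped by order date.

    ui_orders is an iterable of (order_id, order_date) pairs.
    """
    warehouse = set(warehouse_ids)
    counts = Counter(day for order_id, day in ui_orders if order_id not in warehouse)
    return dict(sorted(counts.items()))


def share_from_latest_day(missing):
    """Fraction of all missing orders that belong to the most recent day."""
    if not missing:
        return 0.0
    latest = max(missing)
    return missing[latest] / sum(missing.values())


if __name__ == "__main__":
    # Made-up orders for illustration only.
    orders = [
        ("A", date(2025, 7, 10)),
        ("B", date(2025, 7, 16)),
        ("C", date(2025, 7, 17)),
        ("D", date(2025, 7, 17)),
    ]
    missing = missing_by_day(orders, {"A"})
    print(missing, share_from_latest_day(missing))
```

If most of the missing orders sit on the latest day, the gap is likely arrival delay rather than permanently dropped events.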

126 00:18:10.160 00:18:16.680 Robert Tseng: Okay, while you’re running that I’m just gonna keep thinking. So assuming that that’s fine, I mean, the margin of error is fine. But 7.

127 00:18:17.600 00:18:18.490 Robert Tseng: That’s

128 00:18:24.790 00:18:28.648 Robert Tseng: yeah. I mean, it’s like a 2% difference. I’m I’m not really concerned about that.
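
For reference, the two counts quoted in the call (7,135 in the warehouse, 7,023 in the UI) differ by 112 orders, which is roughly 1.6% of the UI count, consistent with the "like 2%" estimate. A small sketch of that arithmetic, with a helper for the duplicate rows mentioned earlier (function names are hypothetical):

```python
def percent_difference(warehouse_count: int, ui_count: int) -> float:
    """Difference between the two counts as a percentage of the UI count."""
    return (warehouse_count - ui_count) / ui_count * 100


def duplicate_count(order_ids) -> int:
    """Number of rows that repeat an order ID already seen."""
    ids = list(order_ids)
    return len(ids) - len(set(ids))


if __name__ == "__main__":
    # Counts as quoted in the meeting.
    print(round(percent_difference(7135, 7023), 2))
    # Made-up IDs for illustration only.
    print(duplicate_count(["1001", "1002", "1002", "1003"]))
```

If duplicates account for part of the surplus, deduplicating the warehouse side before comparing would narrow the gap further.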

129 00:18:30.210 00:18:32.409 Robert Tseng: So I still think it’s good enough.

130 00:18:32.850 00:18:36.539 Robert Tseng: But yeah, we’ll probably will have to improve it later on.

131 00:18:37.600 00:18:38.045 Awaish Kumar: Okay.

132 00:18:39.970 00:18:44.680 Robert Tseng: But I am curious, just like if it’s all from the past day or something.

133 00:18:44.680 00:18:47.199 Awaish Kumar: So like you have shared with me a subset,

134 00:18:47.450 00:18:51.179 Awaish Kumar: just a few IDs, right? Not all.

135 00:18:53.040 00:18:53.640 Robert Tseng: Oh, really.

136 00:18:55.190 00:19:03.129 Awaish Kumar: You have shared only a few order numbers. I’m like, like what

137 00:19:03.740 00:19:06.789 Awaish Kumar: you just copied some random ones, or they are more like.

138 00:19:07.960 00:19:17.779 Robert Tseng: I tried to just run it through GPT to reformat, like, the spreadsheet that I have. But I guess it didn’t really do it, it kind of just truncated it.

139 00:19:28.000 00:19:30.400 Robert Tseng: alright, I’m just gonna send

140 00:19:36.930 00:19:41.299 Robert Tseng: to run this so quickly if I just open my python open. This is not running.

141 00:19:45.660 00:19:48.936 Robert Tseng: Okay, no, it’s all good. I’ll I can run the analysis. So

142 00:19:50.700 00:20:03.210 Robert Tseng: yeah, this will take me like a minute. But okay, I don’t want to get stuck on this. So let’s just assume that 2% is not a big difference. We can. We can move forward. Anyway, I can just confirm. I can. I can run. I can get the confirmation afterwards.

143 00:20:06.000 00:20:12.610 Robert Tseng: okay, so that’s order completed and then order shipped is not the same event as order completed right? But what I I did show that

144 00:20:13.204 00:20:16.840 Robert Tseng: you know, 92 to 97% of

145 00:20:17.370 00:20:20.929 Robert Tseng: orders that are completed end up getting shipped.

146 00:20:21.260 00:20:27.490 Robert Tseng: Get a ship status within within 7 days. So there’s probably another like, let’s just say 5% drop off in 7 days.

147 00:20:27.870 00:20:31.090 Robert Tseng: So we have a 2% drop off, and then another 5% drop off.

148 00:20:31.210 00:20:35.179 Robert Tseng: You know, we’re still, you know, hovering around like, you know, 7%.

149 00:20:35.670 00:20:42.900 Robert Tseng: We might lose 7% of of orders. Within that attribution window. I think that’s fine. I feel like this is, this is good enough to ship.
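
The "around 7%" figure adds the two drop-offs together. Treating them as independent stage losses and compounding them gives nearly the same number. A minimal sketch using the percentages quoted in the call (independence between the two losses is an assumption):

```python
def combined_loss(*stage_losses: float) -> float:
    """Overall fraction lost when each stage independently loses a fraction."""
    kept = 1.0
    for loss in stage_losses:
        kept *= 1.0 - loss
    return 1.0 - kept


if __name__ == "__main__":
    # 2% webhook discrepancy, then 5% of completed orders not shipped in 7 days.
    print(f"{combined_loss(0.02, 0.05):.1%}")
```

Compounding gives about 6.9%, so simply adding 2% and 5% is a close approximation at these magnitudes.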

150 00:20:45.090 00:20:49.870 Awaish Kumar: Yeah. So the query I shared, it has the order date, ship date,

151 00:20:50.160 00:20:55.160 Awaish Kumar: sent-to-pharmacy date, like, and also the flag “is sent to pharmacy”.

152 00:20:55.550 00:21:00.120 Robert Tseng: Okay, perfect. So I’ll share my screen. And I’ll just show you quickly.

153 00:21:02.500 00:21:10.409 Robert Tseng: so yeah, assuming that you get that set up what we have here is we have a list of sources. If you could hook this up because I’ll be probably on the plane.

154 00:21:11.566 00:21:17.360 Robert Tseng: There’s an Eden warehouse source. Here we’ve been adding model

155 00:21:17.831 00:21:22.219 Robert Tseng: and then we can either use the SQL editor, which I don’t know if I would recommend, then that

156 00:21:22.760 00:21:25.639 Robert Tseng: maybe we have to use the dbt model here.

157 00:21:27.350 00:21:29.820 Robert Tseng: Not sure if it’s set up this meeting model or not.

158 00:21:29.820 00:21:32.189 Awaish Kumar: I can just copy paste the query.

159 00:21:32.650 00:21:38.625 Robert Tseng: Okay, perfectly fine. Yeah. You do that. You get the model in, and then you can create it.

160 00:21:39.830 00:21:40.240 Awaish Kumar: Okay.

161 00:21:40.240 00:21:41.340 Robert Tseng: Yeah, so my system.

162 00:21:41.340 00:21:42.580 Awaish Kumar: What exactly are you doing?

163 00:21:42.690 00:21:46.210 Awaish Kumar: What exactly do you need like the query I wrote.

164 00:21:46.952 00:21:49.549 Awaish Kumar: It’s just 3 4 columns like.

165 00:21:49.770 00:21:55.320 Awaish Kumar: what exact columns do you need, or like, can I just reference the query you showed in standup?

166 00:21:56.856 00:22:03.609 Robert Tseng: Yeah. So what? Exactly I will need, I guess I.

167 00:22:03.890 00:22:06.479 Awaish Kumar: Do you need everything from order completed.

168 00:22:10.400 00:22:17.919 Robert Tseng: I think it would be good to have everything there, because I can just map it out or, okay, fine. You’re right. Maybe I should re go from there.

169 00:22:18.581 00:22:31.080 Robert Tseng: Yeah, actually, if you don’t mind. Just, yeah, I mean, I can put everything in there, because once I create it, then I have to, like, actually define the mapping. If there’s anything that needs to be excluded, I can just go back and edit the query and take it out. So I think that’s fine.

170 00:22:32.720 00:22:36.720 Awaish Kumar: Okay, so, like.

171 00:22:36.720 00:22:37.320 Robert Tseng: Cool, cool.

172 00:22:37.320 00:22:37.820 Awaish Kumar: Have a good day.

173 00:22:38.233 00:22:39.060 Robert Tseng: To help.

174 00:22:39.890 00:22:44.920 Awaish Kumar: Okay, I will get everything from order completed and and everything from order shipped.

175 00:22:45.140 00:22:49.129 Awaish Kumar: And then you can like, select whatever columns you need.

176 00:22:49.270 00:22:51.770 Robert Tseng: Yeah, and the yeah and the dial machine. Yep.

177 00:22:52.340 00:23:00.349 Robert Tseng: okay, cool. So I’m assuming you’ll send that to me shortly, and then I’ll run this through, and I’ll set up the mapping, and hopefully I can fire a test event.

178 00:23:00.700 00:23:06.650 Robert Tseng: You know, the next 30 min before, and if it works, then I’ll kind of

179 00:23:07.010 00:23:08.190 Robert Tseng: then I think we can.

180 00:23:08.360 00:23:12.589 Robert Tseng: Oh, yeah, then we’ll then we’ll try to figure out how to push it to the next step.

181 00:23:12.590 00:23:15.169 Awaish Kumar: Should I build this in Segment, like?

182 00:23:15.670 00:23:17.389 Awaish Kumar: Should I define this model, right?

183 00:23:18.493 00:23:23.370 Robert Tseng: Yeah, this model’s in Segment. I mean, yeah, you can.

184 00:23:25.510 00:23:33.789 Awaish Kumar: Like, I just want to be clear: I will write the query, then I will create this model, and then I will inform you, right? Is that what’s needed? Okay.

185 00:23:33.790 00:23:35.460 Robert Tseng: Yeah, that’s it. Okay.

186 00:23:36.910 00:23:38.130 Awaish Kumar: Perfect. Thank you.

187 00:23:38.130 00:23:39.620 Robert Tseng: Thanks, bye.