Meeting Title: Metaplane-Data Observability Date: 2025-06-17 Meeting participants: Uttam Kumaran, Demilade Agboola, Awaish Kumar, Luke Daque
WEBVTT
1 00:01:56.230 ⇒ 00:01:57.610 Demilade Agboola: Hold on!
2 00:01:57.610 ⇒ 00:01:58.790 Uttam Kumaran: Hey! How are you?
3 00:01:59.570 ⇒ 00:02:00.989 Demilade Agboola: I’m pretty good. How are you?
4 00:02:01.390 ⇒ 00:02:02.150 Uttam Kumaran: Good.
5 00:02:42.320 ⇒ 00:02:43.520 Uttam Kumaran: Let me ping Awaish.
6 00:02:50.740 ⇒ 00:03:06.040 Demilade Agboola: Something I told Robert is, going forward, ideally, I think we should have, especially when people reach out to us for requests, that we’re not going to respond to any, like, we’re not going to make dashboard changes in under 24 hours.
7 00:03:06.500 ⇒ 00:03:11.489 Demilade Agboola: We can, like, we can give them CSVs. We can respond to things, but like
8 00:03:12.120 ⇒ 00:03:18.890 Demilade Agboola: changing our entire dashboard, I don’t think, like, especially when the models are linked
9 00:03:19.690 ⇒ 00:03:22.520 Demilade Agboola: to other dashboards. I don’t think it’s the best practice.
10 00:03:23.320 ⇒ 00:03:23.890 Demilade Agboola: I agree.
11 00:03:23.890 ⇒ 00:03:25.629 Uttam Kumaran: I agree. I saw that. And I was like.
12 00:03:26.280 ⇒ 00:03:28.870 Uttam Kumaran: why did you? Why did we agree to this.
13 00:03:31.130 ⇒ 00:03:48.180 Demilade Agboola: Yeah. So cause like, usually, how I’m used to building out models is each model or as much as possible, we build models for the dashboard. So like, if there’s a retention dashboard, there’ll be a retention model or a couple of models for the retention dashboard. So if anything happens
14 00:03:48.310 ⇒ 00:03:54.670 Demilade Agboola: in that flow. It’s localized like, if anything goes wrong, it’s just a retention dashboard that’s affected.
15 00:03:55.140 ⇒ 00:03:55.980 Uttam Kumaran: Yeah.
16 00:03:57.070 ⇒ 00:04:02.949 Demilade Agboola: Unless it’s like coming upstream, which is another issue on its own. But like this.
17 00:04:03.060 ⇒ 00:04:10.169 Demilade Agboola: because this model, product sales summary by transaction, is using multiple models
18 00:04:10.570 ⇒ 00:04:18.939 Demilade Agboola: that, like, kind of cascaded across a number of dashboards, to which people are like, hey, why is this inflated? Why is that inflated?
19 00:04:20.495 ⇒ 00:04:21.149 Demilade Agboola: Yeah.
20 00:04:24.190 ⇒ 00:04:28.040 Uttam Kumaran: Yeah, I I also agree, like I don’t think we should.
21 00:04:28.950 ⇒ 00:04:36.130 Uttam Kumaran: I don’t think we should do things within 24 hours at all.
22 00:04:36.790 ⇒ 00:04:45.679 Uttam Kumaran: I don’t think that’s I mean, I also think that’s like such a blanket statement without having any triage. You should just say we’re gonna do it as we can do it
23 00:04:46.200 ⇒ 00:04:50.049 Uttam Kumaran: as fast as we can get it done. Why, even why make any promises.
24 00:04:50.400 ⇒ 00:04:56.370 Demilade Agboola: Yeah. So Cutter was like, I need this dashboard. I you know they have like their end of week reports and stuff. And he was like
25 00:04:56.810 ⇒ 00:05:04.440 Demilade Agboola: pushing for the dash through. And that’s that’s what I was saying like, if it was a query like, Hey, can you query these numbers for us
26 00:05:05.520 ⇒ 00:05:15.940 Demilade Agboola: done, no problem. Hey, here’s your, over the last, what, 10 days, like, these are the first-time product users and first-time product revenue. That’s not a problem.
27 00:05:16.190 ⇒ 00:05:24.370 Demilade Agboola: But when you are like pushing for something on your dashboard that requires remodeling.
28 00:05:24.670 ⇒ 00:05:30.979 Demilade Agboola: That requires QA. That requires, like, it starts to, like, it’s not.
29 00:05:30.980 ⇒ 00:05:38.020 Uttam Kumaran: Process. Yeah, I mean, yeah, I agree, like, I don’t.
30 00:05:38.280 ⇒ 00:05:39.130 Uttam Kumaran: Yeah.
31 00:05:39.360 ⇒ 00:05:44.180 Uttam Kumaran: I mean this, that could be an outcome of this meeting. We could talk about that. So maybe.
32 00:05:44.520 ⇒ 00:05:45.950 Demilade Agboola: I don’t think so. Good. Yeah.
33 00:05:47.270 ⇒ 00:05:52.699 Uttam Kumaran: Okay, so at least the 3 of us here. So let me get started. I have a brief agenda.
34 00:05:56.400 ⇒ 00:05:59.250 Uttam Kumaran: But I will just share kind of probably my wholes.
35 00:06:01.910 ⇒ 00:06:05.660 Uttam Kumaran: This here let me pull up
36 00:06:19.030 ⇒ 00:06:23.740 Uttam Kumaran: Awaish, should we remove that old data platform Notion?
37 00:06:26.940 ⇒ 00:06:27.605 Awaish Kumar: Hi!
38 00:06:28.810 ⇒ 00:06:32.649 Awaish Kumar: I never seen the old one. I just created a new one.
39 00:06:34.150 ⇒ 00:06:39.689 Uttam Kumaran: Well, no, there was one where we listed, like, here are the priorities for observability, here are the priorities for documentation.
40 00:06:39.890 ⇒ 00:06:41.699 Awaish Kumar: Oh, that page. Okay.
41 00:06:42.220 ⇒ 00:06:43.700 Uttam Kumaran: Yeah, I just forgot what it was called.
42 00:06:43.700 ⇒ 00:06:47.270 Awaish Kumar: No, we we didn’t remove it, but I don’t know where to.
43 00:06:48.010 ⇒ 00:06:48.820 Awaish Kumar: Fine.
44 00:06:51.860 ⇒ 00:06:55.534 Uttam Kumaran: Okay, so I’m gonna share this doc.
45 00:06:57.880 ⇒ 00:07:02.179 Uttam Kumaran: So if everyone, if y’all want to follow along with me,
46 00:07:08.020 ⇒ 00:07:10.120 Uttam Kumaran: let me send it here in zoom.
47 00:07:10.570 ⇒ 00:07:12.530 Uttam Kumaran: And then I’m gonna share.
48 00:07:13.360 ⇒ 00:07:15.880 Uttam Kumaran: I’m gonna share my screen here.
49 00:07:16.670 ⇒ 00:07:20.359 Uttam Kumaran: And we can walk through this. So
50 00:07:22.660 ⇒ 00:07:31.889 Uttam Kumaran: yeah, basically, I guess first, maybe I wanted to just get, like, I have kind of a sense of what’s going on. But maybe someone wants to give me just the 2 min
51 00:07:32.340 ⇒ 00:07:37.059 Uttam Kumaran: on the Eden issue. I know you guys have been talking about it probably for a while, but
52 00:07:37.410 ⇒ 00:07:40.920 Uttam Kumaran: if you just wanna so just I know I’m not missing anything
53 00:07:41.730 ⇒ 00:07:45.150 Uttam Kumaran: I assume a PR just caused like a
54 00:07:46.010 ⇒ 00:07:51.320 Uttam Kumaran: messed up join, or something, and it caused, like, a classic fan-out, which
55 00:07:51.940 ⇒ 00:07:56.470 Uttam Kumaran: like just duplicated some metric, is that like the rough.
56 00:07:57.020 ⇒ 00:07:59.499 Demilade Agboola: Yeah, that’s the rough summary. Yeah.
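A minimal sketch of the kind of join fan-out described above, using made-up rows. The table names, column names, and values are assumptions for illustration, not the actual models from the PR.

```python
# Hypothetical data: one row per order, and an ad-spend table that has
# more than one row for the same join key (date).
orders = [
    {"order_id": 1, "date": "2025-06-13", "revenue": 100.0},
    {"order_id": 2, "date": "2025-06-13", "revenue": 50.0},
]
ad_spend = [
    {"date": "2025-06-13", "channel": "search", "spend": 30.0},
    {"date": "2025-06-13", "channel": "social", "spend": 20.0},
]


def join_on_date(left, right):
    """Naive inner join on `date`: every matching pair produces a row."""
    return [{**l, **r} for l in left for r in right if l["date"] == r["date"]]


joined = join_on_date(orders, ad_spend)

# Sums computed on the source tables versus on the joined result.
true_revenue = sum(o["revenue"] for o in orders)
joined_revenue = sum(row["revenue"] for row in joined)
true_spend = sum(a["spend"] for a in ad_spend)
joined_spend = sum(row["spend"] for row in joined)

print(len(orders), len(joined))      # row count grows after the join
print(true_revenue, joined_revenue)  # revenue is counted once per spend row
print(true_spend, joined_spend)      # spend is counted once per order
```

Because the join key is not unique on either side, both metrics are inflated once they are summed over the joined rows.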
57 00:07:59.500 ⇒ 00:08:05.930 Uttam Kumaran: Okay. So I assume, like, right now, are there still problems.
58 00:08:06.950 ⇒ 00:08:19.519 Demilade Agboola: Oh, no, the number. So the problems we have are, like, based off the previous calculation that existed. So we’re trying to rework it. But in terms of, like, that PR issue, no, it’s fine.
59 00:08:19.520 ⇒ 00:08:20.350 Uttam Kumaran: Okay, cool.
60 00:08:20.770 ⇒ 00:08:23.719 Uttam Kumaran: And then my second question is, did we?
61 00:08:24.440 ⇒ 00:08:29.120 Uttam Kumaran: I kind of actually want to go through the PR and
62 00:08:29.340 ⇒ 00:08:34.590 Uttam Kumaran: the Metaplane Slack and Metaplane to see if we saw anything.
63 00:08:35.799 ⇒ 00:08:41.471 Uttam Kumaran: So I know, Demilade, you had that
64 00:08:42.120 ⇒ 00:08:46.320 Uttam Kumaran: that PR that you shared. Maybe I could just go find it, and we can just see
65 00:08:47.010 ⇒ 00:08:48.980 Uttam Kumaran: like which one it was.
66 00:08:51.560 ⇒ 00:08:55.590 Awaish Kumar: So I just want to mention that
67 00:08:55.890 ⇒ 00:09:04.479 Awaish Kumar: Metaplane does not work on PRs, like, if a table
68 00:09:05.690 ⇒ 00:09:09.120 Awaish Kumar: was updated and it was merged like
69 00:09:10.020 ⇒ 00:09:16.550 Awaish Kumar: next day. We might go like after after the merge, when when the data
70 00:09:16.850 ⇒ 00:09:22.379 Awaish Kumar: refreshes in the production, then we are going to see if there’s any incident.
71 00:09:22.820 ⇒ 00:09:23.300 Awaish Kumar: So.
72 00:09:23.300 ⇒ 00:09:29.539 Uttam Kumaran: Okay. But I think even for now I just want to go see what it surfaced. I hear you that, like, it’s
73 00:09:29.660 ⇒ 00:09:32.269 Uttam Kumaran: it’s it’s running on what’s in prod. And it’s
74 00:09:32.420 ⇒ 00:09:36.530 Uttam Kumaran: it’s not. But I want to kind of note that down. So everybody’s aware, because I wasn’t aware of that.
75 00:09:37.370 ⇒ 00:09:41.990 Awaish Kumar: So let’s just go through. At least I’ll have screenshots of like what it is. So.
76 00:09:42.730 ⇒ 00:09:46.830 Uttam Kumaran: That might point me to which PR it was.
77 00:09:48.380 ⇒ 00:09:53.420 Demilade Agboola: 1, 2, 3.
78 00:09:56.800 ⇒ 00:09:57.360 Uttam Kumaran: Cool.
79 00:09:59.870 ⇒ 00:10:03.799 Uttam Kumaran: So looking at the so I’m just going to.
80 00:10:05.830 ⇒ 00:10:16.839 Demilade Agboola: Another thing. I also just be like, obviously beyond data. Observability is, I’ve asked Annie, like, if over the course of this of this week she can create like dashboards
81 00:10:16.980 ⇒ 00:10:19.560 Demilade Agboola: off staging. Yeah. So
82 00:10:20.380 ⇒ 00:10:29.389 Demilade Agboola: any merge we can kind of just do an eye check and just kind of see, hey? This is what the staging values are, and that’s what the prod values are like. That makes sense. That doesn’t make any sense.
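The staging-versus-prod eye check described above could also be done programmatically. This is a sketch only: the metric names, the values, and the 5% tolerance are assumptions, and in practice the two dictionaries would be filled from queries against each environment.

```python
def compare_metrics(staging, prod, tolerance=0.05):
    """Return metrics whose staging value differs from prod by more than
    `tolerance`, expressed as a relative difference."""
    flagged = {}
    for name, prod_value in prod.items():
        if name not in staging:
            continue  # nothing to compare against
        staging_value = staging[name]
        if prod_value == 0:
            # Avoid dividing by zero; any non-zero staging value is a change.
            if staging_value != 0:
                flagged[name] = float("inf")
            continue
        relative_diff = abs(staging_value - prod_value) / abs(prod_value)
        if relative_diff > tolerance:
            flagged[name] = relative_diff
    return flagged


# Hypothetical values: ad spend is far higher in staging, order count is not.
prod_values = {"ad_spend": 100.0, "order_count": 100}
staging_values = {"ad_spend": 900.0, "order_count": 101}

flagged = compare_metrics(staging_values, prod_values)
print(flagged)
```

A check like this could run before a merge, which is the gap Awaish points out: the production monitors only see a change after the data refreshes in prod.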
83 00:10:32.020 ⇒ 00:10:34.969 Uttam Kumaran: So yeah, let’s, I’m gonna just look at this. So it looks like
84 00:10:35.770 ⇒ 00:10:39.080 Uttam Kumaran: it said, there are some downstream impacts.
85 00:10:39.860 ⇒ 00:10:47.509 Uttam Kumaran: But then it doesn’t look like it impacted any. There was no.
86 00:10:50.020 ⇒ 00:10:51.974 Uttam Kumaran: I mean, basically, there were no
87 00:10:53.000 ⇒ 00:11:00.040 Uttam Kumaran: flags that were tripped. So let’s go look in Metaplane to kind of confirm what Awaish is saying
88 00:11:07.780 ⇒ 00:11:12.510 Uttam Kumaran: so ideally, it should have found it, in which in which table.
89 00:11:16.132 ⇒ 00:11:17.430 Demilade Agboola: Product sales summary.
90 00:11:18.430 ⇒ 00:11:19.850 Uttam Kumaran: By transaction.
91 00:11:21.810 ⇒ 00:11:22.740 Awaish Kumar: Yeah, so.
92 00:11:23.610 ⇒ 00:11:36.050 Uttam Kumaran: So ideally, it should have found that just like almost probably every metric was inflated.
93 00:11:36.780 ⇒ 00:11:54.989 Demilade Agboola: Yeah, like, so we had increased ad spend. Because, for instance, there were multiple. That’s actually one of the things that could have pointed it out over the weekend. The ad spend was ridiculously high. And that’s because, obviously, it fanned out; it was adding multiple things.
94 00:11:55.330 ⇒ 00:12:01.120 Uttam Kumaran: So we should have seen that row count was actually inflated. But it actually doesn’t look like we have
95 00:12:01.860 ⇒ 00:12:04.159 Uttam Kumaran: a test on the sums.
96 00:12:08.560 ⇒ 00:12:11.339 Uttam Kumaran: I’m just gonna just wanna save this. So that cause we’ll.
97 00:12:11.340 ⇒ 00:12:15.699 Awaish Kumar: We. Is this like the what I’m trying to say is that you? You?
98 00:12:15.700 ⇒ 00:12:16.420 Awaish Kumar: No, no, no.
99 00:12:16.780 ⇒ 00:12:22.710 Uttam Kumaran: I I hear you. I hear you. But that doesn’t mean it’s right like
100 00:12:22.990 ⇒ 00:12:29.489 Uttam Kumaran: I get that. It’s looking at what’s in production. And so it didn’t merge the change and then identify what its impact was.
101 00:12:29.690 ⇒ 00:12:36.429 Uttam Kumaran: But then it should have, or like, we need some way to do that, right, because I didn’t know that, and I don’t think Demilade knew that.
102 00:12:37.090 ⇒ 00:12:43.060 Uttam Kumaran: like, I think, I assume that these tests were running on the the changed values.
103 00:12:43.060 ⇒ 00:12:44.210 Awaish Kumar: No. No. Yeah, yeah.
104 00:12:44.410 ⇒ 00:12:48.250 Uttam Kumaran: So that’s what I just want to go through and like, get it. Get an audit of like.
105 00:12:48.410 ⇒ 00:12:57.010 Uttam Kumaran: Okay, what was alerted? And then what do we expect? Because we should try to recreate this PR and see whether it flags, right?
106 00:12:57.270 ⇒ 00:13:02.979 Uttam Kumaran: So I kind of want to look at. Okay, the impact. This is the impact report.
107 00:13:03.820 ⇒ 00:13:06.280 Uttam Kumaran: This is basically, when was this merged.
108 00:13:07.210 ⇒ 00:13:07.665 Demilade Agboola: Friday.
109 00:13:09.960 ⇒ 00:13:11.310 Awaish Kumar: Which is part of the problem.
110 00:13:11.310 ⇒ 00:13:15.490 Awaish Kumar: It should have flagged on, maybe Saturday or late Friday.
111 00:13:16.010 ⇒ 00:13:16.610 Demilade Agboola: Yeah.
112 00:13:17.210 ⇒ 00:13:20.620 Awaish Kumar: Because in production the rows, the numbers inflate.
113 00:13:21.720 ⇒ 00:13:24.659 Uttam Kumaran: So Friday was the 13th.
114 00:13:27.872 ⇒ 00:13:36.520 Uttam Kumaran: Yeah, I don’t know. It still doesn’t seem like there was any flagging of anything.
115 00:13:39.050 ⇒ 00:13:42.309 Uttam Kumaran: Okay? Well, we can look at that. So yeah, I mean, basically.
116 00:13:42.310 ⇒ 00:13:43.480 Awaish Kumar: Table monitors.
117 00:13:44.370 ⇒ 00:13:50.610 Uttam Kumaran: Yeah, let me just write this down. So there were no impacts. The row counts.
118 00:13:51.210 ⇒ 00:13:58.500 Uttam Kumaran: We also have no sum tests on metrics.
119 00:14:04.170 ⇒ 00:14:06.230 Uttam Kumaran: So if I go to monitoring.
120 00:14:07.640 ⇒ 00:14:10.060 Uttam Kumaran: And then I go to.
121 00:14:11.300 ⇒ 00:14:12.939 Uttam Kumaran: I just click on a table.
122 00:14:16.360 ⇒ 00:14:19.249 Uttam Kumaran: So we’re looking at product sales summary.
123 00:14:27.530 ⇒ 00:14:30.450 Uttam Kumaran: Okay? I mean, it looks like there is some.
124 00:14:33.400 ⇒ 00:14:34.810 Uttam Kumaran: There’s something.
125 00:14:38.370 ⇒ 00:14:44.200 Uttam Kumaran: Oh, but I mean I don’t know. This looks like unrelated.
126 00:14:47.150 ⇒ 00:14:48.519 Demilade Agboola: What’s what’s that drop.
127 00:14:49.220 ⇒ 00:14:50.740 Uttam Kumaran: This is just a PR running.
128 00:14:51.270 ⇒ 00:14:54.069 Uttam Kumaran: I mean, like, this is just a model running. Basically,
129 00:14:54.360 ⇒ 00:14:57.890 Uttam Kumaran: you can see it’s because the test is running at 11:02.
130 00:14:59.590 ⇒ 00:15:00.760 Uttam Kumaran: I feel like this.
131 00:15:03.000 ⇒ 00:15:05.680 Uttam Kumaran: This is most likely when the models are running again.
132 00:15:06.800 ⇒ 00:15:11.090 Awaish Kumar: Okay, so we yeah, the the thing is probably
133 00:15:13.050 ⇒ 00:15:18.120 Awaish Kumar: in the product sales summary, the rows are not changing that much. It’s because
134 00:15:18.300 ⇒ 00:15:22.939 Awaish Kumar: it is an aggregated table. So if the orders are duplicated,
135 00:15:23.060 ⇒ 00:15:28.443 Awaish Kumar: we are just counting South so low, count and products send somebody, but by transaction does not.
136 00:15:28.760 ⇒ 00:15:31.419 Uttam Kumaran: But what’s what’s before? What’s before that? Then?
137 00:15:34.150 ⇒ 00:15:40.199 Uttam Kumaran: Right like, if if okay, I get it, that product sales summary is the aggregate. But then what is it aggregating over.
138 00:15:40.990 ⇒ 00:15:46.100 Awaish Kumar: Yeah. Like order counts. Sum of order counts should inflate, right? You download it.
139 00:15:47.290 ⇒ 00:15:47.970 Demilade Agboola: Yeah. The sum.
140 00:15:47.970 ⇒ 00:15:48.540 Uttam Kumaran: I see.
141 00:15:48.540 ⇒ 00:15:49.150 Demilade Agboola: That’s true.
142 00:15:49.550 ⇒ 00:15:55.179 Demilade Agboola: Sum of ad spend should inflate
143 00:15:57.760 ⇒ 00:16:04.910 Uttam Kumaran: But I guess this is where like we, because probably didn’t have a monitor on the core metrics.
144 00:16:06.490 ⇒ 00:16:07.860 Uttam Kumaran: it may not have caught.
145 00:16:08.790 ⇒ 00:16:12.750 Uttam Kumaran: Because, okay, that makes sense that the row count wasn’t affected.
146 00:16:14.620 ⇒ 00:16:19.840 Demilade Agboola: I mean, I I believe it was, but just not like a huge
147 00:16:20.640 ⇒ 00:16:24.170 Demilade Agboola: like. It wasn’t gonna be a huge thing.
148 00:16:26.800 ⇒ 00:16:29.220 Uttam Kumaran: Okay, okay, I hear you.
149 00:16:29.920 ⇒ 00:16:33.459 Uttam Kumaran: So that’s something also is like, we don’t have a monitor on.
150 00:16:34.140 ⇒ 00:16:37.719 Uttam Kumaran: We just need a monitor on the core some of the core metrics as well.
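To illustrate why a row-count monitor alone can miss this on an aggregated table, here is a sketch that checks sums of core metrics alongside row count. The column names, the snapshots, and the 1.5x ratio threshold are assumptions; this is not how Metaplane implements its monitors.

```python
def check_snapshot(previous, current, metric_columns, max_ratio=1.5):
    """Compare row count and metric sums between two snapshots of a table
    and return the names of the checks that exceeded `max_ratio`."""
    alerts = []
    if previous and len(current) / len(previous) > max_ratio:
        alerts.append("row_count")
    for column in metric_columns:
        previous_sum = sum(row[column] for row in previous)
        current_sum = sum(row[column] for row in current)
        if previous_sum > 0 and current_sum / previous_sum > max_ratio:
            alerts.append(f"sum({column})")
    return alerts


# Hypothetical aggregated table: still one row per product after the bad
# merge, but each aggregate has tripled because upstream rows were duplicated.
before = [
    {"product": "a", "order_count": 10, "ad_spend": 100.0},
    {"product": "b", "order_count": 5, "ad_spend": 50.0},
]
after = [
    {"product": "a", "order_count": 30, "ad_spend": 300.0},
    {"product": "b", "order_count": 15, "ad_spend": 150.0},
]

alerts = check_snapshot(before, after, ["order_count", "ad_spend"])
print(alerts)
```

In this example the row count is unchanged, so only the sum checks fire.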
151 00:16:39.550 ⇒ 00:16:42.990 Uttam Kumaran: but we don’t have monitors on.
152 00:16:48.700 ⇒ 00:16:51.189 Awaish Kumar: Also like this is not a table.
153 00:16:52.740 ⇒ 00:16:53.579 Uttam Kumaran: What is it?
154 00:16:54.950 ⇒ 00:16:56.940 Awaish Kumar: Product sales summary by transaction.
155 00:16:56.940 ⇒ 00:17:00.470 Uttam Kumaran: Oh, okay, but same thing, right? We we don’t be sure.
156 00:17:00.740 ⇒ 00:17:01.010 Awaish Kumar: Please.
157 00:17:01.010 ⇒ 00:17:01.380 Awaish Kumar: Okay.
158 00:17:01.380 ⇒ 00:17:01.960 Awaish Kumar: Yes.
159 00:17:08.940 ⇒ 00:17:10.139 Uttam Kumaran: Alright, here it is.
160 00:17:14.220 ⇒ 00:17:18.719 Uttam Kumaran: Okay. So it looks like there was an issue
161 00:17:20.230 ⇒ 00:17:23.770 Uttam Kumaran: 6/13 at 4 PM. Seems about right.
162 00:17:24.390 ⇒ 00:17:27.520 Uttam Kumaran: So let’s see, if did we get anything in slack.
163 00:17:35.640 ⇒ 00:17:36.600 Awaish Kumar: Both, in.
164 00:17:37.300 ⇒ 00:17:38.550 Uttam Kumaran: Oh, 4 Pm. Sorry.
165 00:17:44.530 ⇒ 00:17:45.320 Awaish Kumar: It is.
166 00:17:46.860 ⇒ 00:17:47.580 Uttam Kumaran: Where.
167 00:17:48.750 ⇒ 00:17:52.979 Awaish Kumar: It was something related to product, said something here, just scroll up a little bit.
168 00:18:03.800 ⇒ 00:18:08.480 Uttam Kumaran: June 15th, June 15th, June 15th, June 12th.
169 00:18:09.060 ⇒ 00:18:10.130 Uttam Kumaran: Yeah, nothing.
170 00:18:10.130 ⇒ 00:18:11.290 Awaish Kumar: 13.
171 00:18:11.950 ⇒ 00:18:15.669 Uttam Kumaran: No, no, no! But these are. These are just schema. Change alerts.
172 00:18:15.670 ⇒ 00:18:16.870 Awaish Kumar: Okay. Okay.
173 00:18:17.420 ⇒ 00:18:21.779 Uttam Kumaran: Yeah, these are just schema change alerts. So this didn’t even go anywhere.
174 00:18:22.280 ⇒ 00:18:25.939 Demilade Agboola: Yeah, unless, like, Ryan got an email.
175 00:18:26.510 ⇒ 00:18:30.820 Demilade Agboola: But I think he said, like, he thought, like, maybe that’s what.
176 00:18:30.820 ⇒ 00:18:33.370 Uttam Kumaran: Yeah. But dude, what the fuck is like that’s useless.
177 00:18:34.210 ⇒ 00:18:39.969 Uttam Kumaran: It’s like, Okay, alright. Well, so it worked.
178 00:18:41.150 ⇒ 00:18:42.210 Uttam Kumaran: So I don’t wanna lie.
179 00:18:43.090 ⇒ 00:18:46.469 Uttam Kumaran: It’s like it works, but nothing we we don’t even know.
180 00:18:47.500 ⇒ 00:18:49.899 Uttam Kumaran: Alright. At least that’s a little bit helpful.
181 00:18:51.130 ⇒ 00:18:51.580 Awaish Kumar: So, yeah.
182 00:18:52.690 ⇒ 00:18:53.630 Demilade Agboola: So they’re all hot spots.
183 00:18:53.630 ⇒ 00:18:59.820 Awaish Kumar: And other notifications. Also, like we have set up some default values, I think.
184 00:19:00.430 ⇒ 00:19:05.259 Awaish Kumar: No, I know. So we’re one is we’ll turn off. I think we’ll turn off all these. Second is.
185 00:19:06.450 ⇒ 00:19:12.220 Uttam Kumaran: Okay. So I mean, look, we have. So we have 2 monitors. Here. We have row, count, and freshness. So the row count
186 00:19:12.440 ⇒ 00:19:13.469 Uttam Kumaran: got caught.
187 00:19:14.430 ⇒ 00:19:20.010 Uttam Kumaran: It’s just the fact that we had. No, this wasn’t going anywhere.
188 00:19:21.060 ⇒ 00:19:22.899 Demilade Agboola: I think, yeah, I think.
189 00:19:23.760 ⇒ 00:19:24.570 Uttam Kumaran: Okay.
190 00:19:26.570 ⇒ 00:19:27.779 Demilade Agboola: So I think, for this.
191 00:19:27.780 ⇒ 00:19:32.659 Awaish Kumar: That’s that’s my point, like, is it sent? Is it connected to send the.
192 00:19:32.660 ⇒ 00:19:38.899 Uttam Kumaran: No, no, but I guess, like we we can, we? No, but we can figure that all out. Let’s just go. Let’s go a little bit methodical, but
193 00:19:39.420 ⇒ 00:19:41.850 Uttam Kumaran: at least we know that. Okay.
194 00:19:42.040 ⇒ 00:19:46.720 Uttam Kumaran: in an ideal state, we did have some monitors. However, I think one thing is clear.
195 00:19:47.190 ⇒ 00:19:52.920 Uttam Kumaran: there’s a couple of if I was to just jot down a couple of like the current state
196 00:19:53.320 ⇒ 00:19:54.490 Uttam Kumaran: right now.
197 00:19:56.010 ⇒ 00:19:57.150 Uttam Kumaran: We have.
198 00:19:58.080 ⇒ 00:20:00.670 Uttam Kumaran: I’m just gonna note this down somewhere.
199 00:20:06.840 ⇒ 00:20:09.819 Uttam Kumaran: So we don’t have monitors on the core metrics for the core tables.
200 00:20:09.980 ⇒ 00:20:15.690 Uttam Kumaran: We have monitors on sort of like unneeded tables.
201 00:20:16.470 ⇒ 00:20:20.999 Uttam Kumaran: I mean, they’re gonna they’re gonna bleed us dry with the pricing. If we wanted as many tables.
202 00:20:21.120 ⇒ 00:20:23.500 Uttam Kumaran: we have monitors on on like.
203 00:20:23.720 ⇒ 00:20:25.750 Uttam Kumaran: I would say low priority, like
204 00:20:26.900 ⇒ 00:20:32.349 Uttam Kumaran: on P2 and P3 tables. How about that? Right? Like, not P1 or P0 tables.
205 00:20:34.310 ⇒ 00:20:41.449 Awaish Kumar: Yeah. But these are all, like, initially we just added all the mart tables. So these are all mart tables. We have to now
206 00:20:41.770 ⇒ 00:20:44.499 Awaish Kumar: filter out the critical mart tables.
207 00:20:44.840 ⇒ 00:20:45.570 Uttam Kumaran: Yes.
208 00:20:45.930 ⇒ 00:20:53.330 Uttam Kumaran: So I’m just gonna actually literally go and copy this, because this is literally what we should have
209 00:20:54.600 ⇒ 00:20:57.950 Uttam Kumaran: should have hit that channel and been like yo
210 00:20:58.450 ⇒ 00:21:03.780 Uttam Kumaran: like out look! It would have been at 6 14 that
211 00:21:07.253 ⇒ 00:21:10.389 Uttam Kumaran: like, what is this? 5 Pm. You would have known
212 00:21:11.910 ⇒ 00:21:13.599 Uttam Kumaran: would have given us a little bit of time.
213 00:21:16.430 ⇒ 00:21:21.030 Demilade Agboola: We went from 12 K. Rows to like almost a hundred K. Rows.
214 00:21:21.030 ⇒ 00:21:27.550 Uttam Kumaran: If you look at this, the reason it didn’t alert is because it’s been high forever.
215 00:21:28.970 ⇒ 00:21:31.629 Demilade Agboola: Yeah, I think it’s maybe it’s too sensitive.
216 00:21:31.980 ⇒ 00:21:35.865 Uttam Kumaran: It’s not sensitive, but also, I think, like we haven’t
217 00:21:36.850 ⇒ 00:21:37.389 Demilade Agboola: Thank you.
218 00:21:37.390 ⇒ 00:21:37.970 Uttam Kumaran: Weird
219 00:21:38.410 ⇒ 00:21:50.879 Uttam Kumaran: what it should be. But this is the thing is like we should only focus on like the 5 core tables to start. So okay, so we have a couple of things. One is like, I, maybe the slack monitor is working. I don’t know. It’s just that
220 00:21:51.020 ⇒ 00:21:52.389 Uttam Kumaran: the Monitor.
221 00:21:53.100 ⇒ 00:21:53.929 Demilade Agboola: Yeah, but like.
222 00:21:53.930 ⇒ 00:21:56.759 Uttam Kumaran: Or this table was already like firing.
223 00:21:58.770 ⇒ 00:22:08.409 Demilade Agboola: Yeah, so far, I think it’s configured for a certain number of rows, which is not, like, being maintained. So it’s still expecting 7,002 58 to 7,000.
224 00:22:08.410 ⇒ 00:22:09.040 Uttam Kumaran: Yes.
225 00:22:09.679 ⇒ 00:22:15.409 Demilade Agboola: Like, obviously, with each day that passes. It’s gonna keep increasing bit by bit.
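One possible reading of "it's been high forever" is that the monitor was already in a failing state against a stale expected range, so the jump produced no new notification. The sketch below assumes notifications fire only on a pass-to-fail transition; that behavior, the ranges, and the row counts are all illustrative assumptions, not confirmed Metaplane behavior.

```python
def notifications(row_counts, low, high):
    """Return the indexes at which a monitor with a fixed expected range
    would notify, assuming it only notifies on a pass-to-fail transition."""
    notified = []
    failing = False
    for index, count in enumerate(row_counts):
        now_failing = not (low <= count <= high)
        if now_failing and not failing:
            notified.append(index)
        failing = now_failing
    return notified


# Hypothetical daily row counts: steady around 12K, then a jump to 100K.
history = [12_000, 12_050, 12_100, 12_150, 100_000]

# Stale range (around 7K): failing from day 0, so the jump adds nothing new.
stale = notifications(history, low=6_500, high=7_500)

# Range refreshed to match current volume: only the jump notifies.
refreshed = notifications(history, low=11_000, high=13_000)

print(stale, refreshed)
```

Under these assumptions, keeping the expected range current is what makes the one alert that matters stand out.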
226 00:22:17.730 ⇒ 00:22:19.049 Demilade Agboola: you know. But I think we can.
227 00:22:19.050 ⇒ 00:22:24.420 Uttam Kumaran: We can change. We can change that through here, like, I gotta read how to do this. But like, yeah, change that through here.
228 00:22:25.540 ⇒ 00:22:31.280 Uttam Kumaran: Okay. So either way, alright, I think we’re we understand that there’s like a
229 00:22:31.610 ⇒ 00:22:38.650 Uttam Kumaran: kind of probably too much being tracked right now, I kind of want to get to like a couple of probably more like specific questions.
230 00:22:38.880 ⇒ 00:22:49.290 Uttam Kumaran: One is like, who is gonna like own this software like, who’s gonna become like
231 00:22:50.010 ⇒ 00:22:53.339 Uttam Kumaran: the expert, or at least like someone who’s gonna own this.
232 00:22:53.910 ⇒ 00:22:59.639 Uttam Kumaran: I I don’t think this is also this is the same person that is, on every team
233 00:22:59.840 ⇒ 00:23:03.310 Uttam Kumaran: sort of like playing Goalie like that. We’ll talk about that next.
234 00:23:03.770 ⇒ 00:23:04.560 Uttam Kumaran: But
235 00:23:06.207 ⇒ 00:23:10.300 Uttam Kumaran: it kind of has to be like. Probably one of the 3 of us at least short term
236 00:23:12.040 ⇒ 00:23:17.599 Uttam Kumaran: to be the facilitator for making sure this happens
237 00:23:21.480 ⇒ 00:23:24.189 Uttam Kumaran: in the short term, I can do it.
238 00:23:24.360 ⇒ 00:23:29.729 Uttam Kumaran: but then I would probably just rely on like doing. It means like, I’m just gonna learn.
239 00:23:30.210 ⇒ 00:23:38.420 Uttam Kumaran: go learn everything about this. But then I will have to lean on individual engineers and the team to make sure that this is implemented per client.
240 00:23:40.460 ⇒ 00:23:41.080 Awaish Kumar: Yeah.
241 00:23:41.590 ⇒ 00:23:45.049 Demilade Agboola: I mean, to be fair, I could do it. But obviously I’ll need to learn Metaplane. Is there
242 00:23:45.930 ⇒ 00:23:48.280 Demilade Agboola: a course or anything? Or is there, like, a book?
243 00:23:49.340 ⇒ 00:23:50.413 Uttam Kumaran: There is
244 00:23:50.950 ⇒ 00:23:52.030 Demilade Agboola: Or something.
245 00:23:56.460 ⇒ 00:24:02.430 Uttam Kumaran: There is a bunch of yeah, I think there is like a resource library.
246 00:24:05.710 ⇒ 00:24:11.390 Uttam Kumaran: But, like, how about this? How about like, maybe I just try to get it up and going.
247 00:24:12.500 ⇒ 00:24:17.500 Uttam Kumaran: and then again, like, I don’t know. I think it’s it’s up to the 3 of us, I mean.
248 00:24:18.470 ⇒ 00:24:21.910 Uttam Kumaran: I don’t know if I can, if I can hold on to this long term.
249 00:24:22.390 ⇒ 00:24:28.750 Uttam Kumaran: I don’t. I think I can hold on short term, actually, like, longer term later, maybe. But I would
250 00:24:29.480 ⇒ 00:24:34.920 Uttam Kumaran: love for this to go. Yeah, either. Maybe it’s you, Demilade, if you’re open to it.
251 00:24:35.680 ⇒ 00:24:39.709 Demilade Agboola: Yeah, sure, I would just like again, once we have, like.
252 00:24:39.850 ⇒ 00:24:42.499 Demilade Agboola: I could like, look for resources on.
253 00:24:42.500 ⇒ 00:24:44.799 Demilade Agboola: Yeah, it’s effectively.
254 00:24:45.020 ⇒ 00:24:49.350 Uttam Kumaran: So why don’t I take ownership of it like in the next, like just the next few weeks.
255 00:24:49.450 ⇒ 00:24:55.180 Uttam Kumaran: to just like bust through whatever problems we have, and then
256 00:24:55.340 ⇒ 00:24:57.439 Uttam Kumaran: but we can. We’ll still collaborate.
257 00:24:57.640 ⇒ 00:25:04.239 Uttam Kumaran: But I have that. I have some time to do this, and then I will slowly transition this over to you for ownership.
258 00:25:04.510 ⇒ 00:25:09.029 Uttam Kumaran: But again, ownership doesn’t mean you’re in charge of isolating every alert
259 00:25:09.320 ⇒ 00:25:11.520 Uttam Kumaran: like, for every client ownership means
260 00:25:11.640 ⇒ 00:25:15.319 Uttam Kumaran: you’re there to empower every team to get onto this platform.
261 00:25:16.930 ⇒ 00:25:17.380 Demilade Agboola: Okay.
262 00:25:17.380 ⇒ 00:25:18.000 Uttam Kumaran: That makes sense.
263 00:25:18.530 ⇒ 00:25:19.320 Demilade Agboola: Yeah, sure.
264 00:25:19.320 ⇒ 00:25:20.939 Uttam Kumaran: You’re not being signed up to like
265 00:25:21.780 ⇒ 00:25:24.610 Uttam Kumaran: be on pager duty forever. It’s like.
266 00:25:29.150 ⇒ 00:25:30.970 Demilade Agboola: Yeah, that’s fine. That’s that makes sense.
267 00:25:31.120 ⇒ 00:25:34.640 Uttam Kumaran: Okay that way. I can get this up and running like this week.
268 00:25:35.230 ⇒ 00:25:36.940 Uttam Kumaran: and then we can work on it
269 00:25:37.190 ⇒ 00:25:40.900 Uttam Kumaran: second for the Metaplane alerts. Channel.
270 00:25:43.175 ⇒ 00:25:50.100 Uttam Kumaran: So yeah, I I think that we should
271 00:25:51.770 ⇒ 00:25:55.289 Uttam Kumaran: probably not have any schema change alerts at all.
272 00:26:01.860 ⇒ 00:26:04.919 Uttam Kumaran: And we only have like failure alerts.
273 00:26:10.500 ⇒ 00:26:15.299 Uttam Kumaran: I mean, I guess we we could have resolution alerts. But I feel like even that’s kind of noisy.
274 00:26:20.250 ⇒ 00:26:24.840 Uttam Kumaran: because failure indicates that someone should take someone should triage and own it.
275 00:26:25.060 ⇒ 00:26:31.730 Uttam Kumaran: And so at that point I feel like someone will comment, saying, This is this is done right.
276 00:26:31.970 ⇒ 00:26:37.289 Demilade Agboola: Yeah, I think, yeah, the focus should just be on like failure alerts.
277 00:26:37.550 ⇒ 00:26:42.929 Demilade Agboola: So that once any message comes in, it’s high priority.
278 00:26:43.060 ⇒ 00:26:46.209 Demilade Agboola: It’s not just like one of those things to like. Oh.
279 00:26:46.740 ⇒ 00:26:51.460 Demilade Agboola: it can! Because even if to be fair, even if that like failure, alert, had come in amongst the.
280 00:26:51.460 ⇒ 00:26:52.749 Uttam Kumaran: You may not have seen it.
281 00:26:53.200 ⇒ 00:26:53.860 Demilade Agboola: Okay.
282 00:26:54.480 ⇒ 00:26:55.140 Demilade Agboola: Yeah.
283 00:26:56.320 ⇒ 00:27:01.679 Uttam Kumaran: So I do think it’s like we shouldn’t have any tests without a severity of high.
284 00:27:02.950 ⇒ 00:27:08.380 Uttam Kumaran: and the goal of every alert is that it needs to be resolved as soon as it hits the channel.
285 00:27:08.830 ⇒ 00:27:12.910 Uttam Kumaran: Right? So there’s no, like, it’s only P0 alerts
286 00:27:14.780 ⇒ 00:27:19.499 Uttam Kumaran: which will force us to make sure that all of our monitors like work right
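The routing rule agreed above (failure alerts only, everything treated as high severity, no schema-change or resolution noise) can be sketched as a simple filter. The `Alert` fields and the kind labels are hypothetical and do not come from Metaplane's API.

```python
from dataclasses import dataclass


@dataclass
class Alert:
    kind: str      # hypothetical labels: "failure", "schema_change", "resolution"
    severity: str  # hypothetical labels: "high", "medium", "low"
    table: str


def should_post(alert: Alert) -> bool:
    """Post to the alerts channel only for high-severity failures."""
    return alert.kind == "failure" and alert.severity == "high"


incoming = [
    Alert("schema_change", "low", "some_mart_table"),
    Alert("failure", "high", "product_sales_summary_by_transaction"),
    Alert("resolution", "high", "product_sales_summary_by_transaction"),
]

posted = [alert for alert in incoming if should_post(alert)]
print([alert.table for alert in posted])
```

With this filter, anything that reaches the channel is something a person is expected to triage right away.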
287 00:27:21.300 ⇒ 00:27:23.059 Uttam Kumaran: like, how do we feel about that?
288 00:27:25.360 ⇒ 00:27:25.850 Awaish Kumar: Yup.
289 00:27:36.874 ⇒ 00:27:39.299 Uttam Kumaran: Then the next thing is like.
290 00:27:48.780 ⇒ 00:27:56.029 Uttam Kumaran: Yeah, I’m not really super interested in, like, anomaly detection at this point, like, I only want to focus on the P0s.
291 00:27:56.150 ⇒ 00:27:57.809 Uttam Kumaran: I think the other thing is like.
292 00:28:00.640 ⇒ 00:28:05.710 Uttam Kumaran: okay, let’s just let’s just talk about that. So okay, that’s fine. I think we made a decision on these.
293 00:28:10.530 ⇒ 00:28:16.820 Uttam Kumaran: the next piece is like, let’s talk about this goalie proposal
294 00:28:18.510 ⇒ 00:28:20.440 Uttam Kumaran: and, like, what you guys think, yeah.
295 00:28:27.330 ⇒ 00:28:35.460 Uttam Kumaran: like, what do you guys think about having someone on call per sprint per client?
296 00:28:38.850 ⇒ 00:28:45.869 Awaish Kumar: Yes, but like for most of the clients, we have only one data, analytics and data.
297 00:28:45.870 ⇒ 00:28:49.010 Uttam Kumaran: But the goalie is not who resolves it.
298 00:28:49.170 ⇒ 00:28:50.870 Uttam Kumaran: Just the triage person.
299 00:28:57.310 ⇒ 00:29:00.749 Uttam Kumaran: Because this is how we can start developing, basically, runbooks
300 00:29:03.210 ⇒ 00:29:07.219 Uttam Kumaran: I don’t know if everyone’s familiar with runbooks. But a runbook is just like
301 00:29:07.420 ⇒ 00:29:09.230 Uttam Kumaran: when you get this type of alert.
302 00:29:09.450 ⇒ 00:29:11.230 Uttam Kumaran: These are the steps to walk through.
303 00:29:20.130 ⇒ 00:29:20.840 Demilade Agboola: I mean.
304 00:29:20.840 ⇒ 00:29:28.190 Awaish Kumar: You suggesting that like, if for us, a client, if we have analytics, engineer goalie can be anyone
305 00:29:28.620 ⇒ 00:29:31.370 Awaish Kumar: else which is not the part of the team.
306 00:29:32.340 ⇒ 00:29:37.040 Uttam Kumaran: No, no, no, I’m saying, for every for every I would say, large client.
307 00:29:37.400 ⇒ 00:29:40.560 Uttam Kumaran: there is one goalie per sprint.
308 00:29:41.180 ⇒ 00:29:46.850 Uttam Kumaran: I don’t know. If, like, we we wanting. Yeah. And that’s gotta be. That’s just someone on the engineering team
309 00:29:48.240 ⇒ 00:29:50.749 Uttam Kumaran: like someone on that client’s engineering team.
310 00:29:50.980 ⇒ 00:29:59.409 Uttam Kumaran: So on Eden, it would be between you, Annie, and Demilade; on Urban Stems it would be between Kyle and Demilade and me,
311 00:29:59.580 ⇒ 00:30:04.009 Uttam Kumaran: on ABC. It’d be couple with one other thing
312 00:30:04.170 ⇒ 00:30:11.379 Uttam Kumaran: So, and then, of course, we have to just build a timeline so there’s no overlap, so you’re not, like, on call on 2 clients.
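The "one goalie per sprint per client, with no overlap" rule can be sketched as a small scheduling function. This is only an illustration under assumptions: the function name `build_rotation` is hypothetical, the rosters in the usage example are sample data loosely based on the names mentioned above, and a simple round-robin is one possible way to rotate, not a decided process.

```python
from typing import Dict, List


def build_rotation(rosters: Dict[str, List[str]], num_sprints: int) -> List[Dict[str, str]]:
    """Assign one goalie per client per sprint, never the same person twice in a sprint."""
    schedule: List[Dict[str, str]] = []
    for sprint in range(num_sprints):
        taken = set()  # people already on call for another client this sprint
        assignment: Dict[str, str] = {}
        for client, team in rosters.items():
            # Start from a different team member each sprint so the duty rotates.
            for offset in range(len(team)):
                candidate = team[(sprint + offset) % len(team)]
                if candidate not in taken:
                    assignment[client] = candidate
                    taken.add(candidate)
                    break
            else:
                raise ValueError(f"no free goalie for {client} in sprint {sprint}")
        schedule.append(assignment)
    return schedule


if __name__ == "__main__":
    # Sample rosters for illustration only.
    sample_rosters = {
        "Eden": ["Awaish", "Annie", "Demilade"],
        "Urban Stems": ["Kyle", "Demilade", "Uttam"],
    }
    for number, goalies in enumerate(build_rotation(sample_rosters, 3), start=1):
        print(f"Sprint {number}: {goalies}")
```

Because someone who sits on two client teams is skipped if they are already on call elsewhere that sprint, the overlap check happens at scheduling time rather than being left to whoever notices the clash.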
313 00:30:11.690 ⇒ 00:30:13.729 Uttam Kumaran: to give you the reasoning one.
314 00:30:13.940 ⇒ 00:30:16.079 Uttam Kumaran: I don’t want everybody to be like
315 00:30:16.760 ⇒ 00:30:21.459 Uttam Kumaran: staring at this channel and figuring it out. One person should be there who can be like
316 00:30:21.680 ⇒ 00:30:27.800 Uttam Kumaran: my job is, if I get an alert, to figure out if the alert is real, and
317 00:30:27.900 ⇒ 00:30:29.639 Uttam Kumaran: if it is real,
318 00:30:29.770 ⇒ 00:30:41.250 Uttam Kumaran: like, where did it come from, and then share that with the team, and if they can fix it, they can go ahead. Otherwise it’s like, Hey, here’s the summary of what’s happening, right? So on Friday,
319 00:30:41.390 ⇒ 00:30:45.530 Uttam Kumaran: had we got this alert, it would have been like, Hey, I just triaged this.
320 00:30:45.820 ⇒ 00:30:48.919 Uttam Kumaran: Looks like we just had a
321 00:30:49.290 ⇒ 00:30:53.380 Uttam Kumaran: like, 9x increase in rows, and the monitor went off.
322 00:30:54.130 ⇒ 00:30:58.050 Uttam Kumaran: I went through a couple of steps of investigation.
323 00:30:58.220 ⇒ 00:31:01.149 Uttam Kumaran: The only thing I noticed is this PR got merged.
324 00:31:02.620 ⇒ 00:31:03.910 Uttam Kumaran: That’s the message.
325 00:31:04.460 ⇒ 00:31:06.570 Uttam Kumaran: Here’s like a recommendation right.
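The triage message described above (what fired, whether it is real, what was found, and a recommendation) could be kept consistent with a small template. This is a minimal sketch, assuming a hypothetical `TriageSummary` structure; the field names and wording are illustrative and not an agreed format.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class TriageSummary:
    """What the goalie shares with the team after looking at an alert."""
    alert: str
    is_real: bool
    findings: List[str] = field(default_factory=list)
    recommendation: str = ""

    def to_message(self) -> str:
        """Render the summary as a short plain-text message for the team channel."""
        status = "real" if self.is_real else "a false alarm"
        lines = [f"Triaged: {self.alert} (looks {status})."]
        lines += [f"- {finding}" for finding in self.findings]
        if self.recommendation:
            lines.append(f"Recommendation: {self.recommendation}")
        return "\n".join(lines)


if __name__ == "__main__":
    summary = TriageSummary(
        alert="9x increase in rows",
        is_real=True,
        findings=["A PR was merged shortly before the monitor went off."],
        recommendation="Review the merged PR.",
    )
    print(summary.to_message())
```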
326 00:31:08.490 ⇒ 00:31:14.370 Luke Daque: Yeah, I think I agree with that. That’s what I also, what do you call this,
327 00:31:14.780 ⇒ 00:31:21.490 Luke Daque: that’s what I also mentioned before, where it’s like some kind of rotation within the team,
328 00:31:22.930 ⇒ 00:31:27.610 Luke Daque: within, yeah, within the client. But yeah, I don’t know. What do you guys think?
329 00:31:32.640 ⇒ 00:31:39.660 Demilade Agboola: I mean, my question is, like, what periods are we looking at for, like,
330 00:31:39.920 ⇒ 00:31:41.819 Demilade Agboola: that? Are we looking at, like,
331 00:31:42.090 ⇒ 00:31:46.949 Demilade Agboola: throughout the night, over the weekends? Like, what’s the...
332 00:31:47.130 ⇒ 00:31:52.869 Demilade Agboola: And also the expected time of response when you’re on call during non-work hours.
333 00:31:53.970 ⇒ 00:31:57.669 Uttam Kumaran: Yeah, I think. That’s a really good question. I mean.
334 00:32:00.880 ⇒ 00:32:04.120 Uttam Kumaran: yeah, I don’t know. I feel like during the weekend.
335 00:32:11.040 ⇒ 00:32:18.149 Uttam Kumaran: I mean, during the weekend, I feel like 24 h seems fair. During work hours, I feel like it has to be within the same day, right? Like,
336 00:32:19.120 ⇒ 00:32:20.790 Uttam Kumaran: I don’t know. What do you guys think
337 00:32:22.860 ⇒ 00:32:24.980 Uttam Kumaran: I mean it’s same day triage.
338 00:32:27.780 ⇒ 00:32:37.950 Luke Daque: Yeah, I don’t think we have to, like, respond as soon as the alert goes off, as long as we are able to respond within the day, or, like, fix it within the day,
339 00:32:38.550 ⇒ 00:32:39.500 Luke Daque: or like.
340 00:32:40.740 ⇒ 00:32:47.609 Uttam Kumaran: I think a 24 h triage SLA is fair: Hey, this alert happened. This error occurred. We’re on it.
341 00:32:49.380 ⇒ 00:32:50.750 Luke Daque: Yeah, that makes sense.
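The 24 h triage SLA proposed here is simple enough to express as a deadline check. The sketch below is an assumption about how it could be tracked; the function names are hypothetical and the timestamps in the usage example are sample values, not real incident times.

```python
from datetime import datetime, timedelta

# The triage window discussed in the meeting: acknowledge and triage within 24 hours.
TRIAGE_SLA = timedelta(hours=24)


def triage_deadline(alert_time: datetime) -> datetime:
    """Latest time by which the goalie should have triaged the alert."""
    return alert_time + TRIAGE_SLA


def within_sla(alert_time: datetime, triaged_at: datetime) -> bool:
    """True if triage happened after the alert and no later than the deadline."""
    return alert_time <= triaged_at <= triage_deadline(alert_time)


if __name__ == "__main__":
    # Sample timestamps for illustration only.
    fired = datetime(2025, 6, 13, 18, 0)
    print(triage_deadline(fired))
    print(within_sla(fired, datetime(2025, 6, 14, 9, 30)))
```

Note that the SLA covers triage only: the check says nothing about when the underlying problem is fixed.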
342 00:32:52.440 ⇒ 00:32:53.940 Awaish Kumar: For the weekdays.
343 00:32:57.420 ⇒ 00:32:58.870 Uttam Kumaran: Weekdays.
344 00:33:04.040 ⇒ 00:33:07.820 Uttam Kumaran: I kind of feel like it should just still be 24 h.
345 00:33:10.010 ⇒ 00:33:14.580 Uttam Kumaran: like, I just don’t want to sign us up for something we can’t do. I don’t know yet.
346 00:33:18.430 ⇒ 00:33:20.360 Luke Daque: Yeah, if it’s only triage, like
347 00:33:22.300 ⇒ 00:33:26.190 Luke Daque: 24 h seems to be very fair, fair enough.
348 00:33:27.500 ⇒ 00:33:28.200 Awaish Kumar: No no worries.
349 00:33:28.200 ⇒ 00:33:29.489 Luke Daque: We don’t have to fix it.
350 00:33:29.490 ⇒ 00:33:30.890 Awaish Kumar: It’s okay for.
351 00:33:30.890 ⇒ 00:33:31.790 Luke Daque: We’re out.
352 00:33:32.360 ⇒ 00:33:36.359 Awaish Kumar: For weekdays. For the weekends, are we setting this
353 00:33:37.160 ⇒ 00:33:40.289 Awaish Kumar: expectation with the clients that we are going to work on?
354 00:33:40.610 ⇒ 00:33:41.130 Awaish Kumar: Drink.
355 00:33:41.130 ⇒ 00:33:44.509 Uttam Kumaran: No, but but this is where it’s like.
356 00:33:46.070 ⇒ 00:33:53.020 Uttam Kumaran: I mean, it’s a good question like, I wonder if the person on call for that sprint needs to just be aware that like, Hey.
357 00:33:53.460 ⇒ 00:33:59.000 Uttam Kumaran: if something happens during the weekend, your job is to triage it.
358 00:33:59.120 ⇒ 00:34:01.630 Uttam Kumaran: Spend an hour in triage, and then that’s it.
359 00:34:02.700 ⇒ 00:34:08.480 Uttam Kumaran: I don’t know. I mean, I’m it’s kind of a discussion between us. We haven’t said anything to clients on this topic.
360 00:34:09.320 ⇒ 00:34:11.470 Uttam Kumaran: I mean to give you. To be frank.
361 00:34:11.760 ⇒ 00:34:14.890 Uttam Kumaran: because we haven’t said anything. I think we work weekends
362 00:34:15.360 ⇒ 00:34:21.969 Uttam Kumaran: like, I would like us to not work weekends, and instead be like, Hey, we have an SLA on these during the weekend.
363 00:34:39.219 ⇒ 00:34:43.399 Uttam Kumaran: I mean, we don’t have to put an SLA for now, but I can keep it as an open item.
364 00:34:46.219 ⇒ 00:34:51.079 Demilade Agboola: Yeah, I mean, obviously, I think the issue will just be like over committing, because.
365 00:34:51.309 ⇒ 00:34:56.929 Demilade Agboola: like weekends can be dynamic. You could travel. You could be on a plane. You know. Things can.
366 00:34:56.929 ⇒ 00:34:57.309 Uttam Kumaran: Yeah.
367 00:34:58.270 ⇒ 00:35:03.330 Demilade Agboola: Might be really difficult to triage within 24 h.
368 00:35:03.330 ⇒ 00:35:13.909 Uttam Kumaran: The reason. The reason you know, to set the goalie is that that person should indicate. Hey, I’m like not going to be on. I’m not going to be around or I’m going. I’m leaving off. I’m leaving office.
369 00:35:14.650 ⇒ 00:35:21.860 Uttam Kumaran: But also this is where, like you have 24 h to triage, and you can also ping someone else and say, Hey, I can’t triage right now, can you? Can you help me out?
370 00:35:23.060 ⇒ 00:35:26.839 Uttam Kumaran: So I think there’s options? I mean we should include, like what all those are.
371 00:35:27.320 ⇒ 00:35:32.060 Uttam Kumaran: I think it is fair for us to say, if there is an outage,
372 00:35:32.370 ⇒ 00:35:42.539 Uttam Kumaran: we’ll at least have triaged the problem within 24 h, because I’m telling you, the last backstop is, I’ll come on and I’ll do it, I’ll help, right?
373 00:35:42.900 ⇒ 00:35:46.280 Uttam Kumaran: So I feel like 24 h is
374 00:35:46.760 ⇒ 00:35:53.040 Uttam Kumaran: fair, and there are opportunities for the goalie, even if they’re not able to, to find assistance.
375 00:35:59.500 ⇒ 00:36:02.290 Luke Daque: Yeah, I think that’s fair. I I don’t know. You guys.
376 00:36:05.350 ⇒ 00:36:10.130 Uttam Kumaran: I think it’s fair because we’re not signing up to fix the problem in the weekend. We’re just signing up to triage it.
377 00:36:12.550 ⇒ 00:36:16.170 Demilade Agboola: Yeah, sure, I think, especially if the alerts come in.
378 00:36:17.298 ⇒ 00:36:25.060 Demilade Agboola: It will probably be, like, either a change in source data or a PR.
379 00:36:25.810 ⇒ 00:36:26.270 Luke Daque: Yeah.
380 00:36:26.270 ⇒ 00:36:31.339 Uttam Kumaran: Yeah, like, this channel should not be... Like, I have this channel muted.
381 00:36:31.580 ⇒ 00:36:40.480 Uttam Kumaran: like this channel should be very inactive, you know, should not be popping off.
382 00:36:42.220 ⇒ 00:36:44.589 Luke Daque: Yeah. And this is where we also
383 00:36:45.510 ⇒ 00:36:52.900 Luke Daque: need to, like, if this channel is popping off, then that means there’s probably a PR that’s, like,
384 00:36:55.170 ⇒ 00:37:02.227 Luke Daque: doing the damage, right? Like increasing the rows or whatever. And that would also
385 00:37:04.160 ⇒ 00:37:12.480 Luke Daque: basically signal to us that maybe the settings for the alerts need to change, or something like that.
386 00:37:13.290 ⇒ 00:37:14.110 Luke Daque: Right.
387 00:37:14.570 ⇒ 00:37:15.220 Uttam Kumaran: Yeah.
388 00:37:16.470 ⇒ 00:37:17.460 Luke Daque: Yeah.
389 00:37:18.030 ⇒ 00:37:18.870 Awaish Kumar: From the.
390 00:37:27.160 ⇒ 00:37:27.770 Uttam Kumaran: Okay.
391 00:37:29.680 ⇒ 00:37:43.160 Uttam Kumaran: Okay. And then my only other point is that we’ll start to have just fewer monitors. So we can sort of document this in the data platform sheet as well. Then we talked about sort of a rotating on call. This would just be a ticket in the sprint.
392 00:37:47.990 ⇒ 00:37:48.640 Uttam Kumaran: Like.
393 00:37:48.640 ⇒ 00:37:56.820 Luke Daque: I think it also makes sense to, I don’t know, like, not have PRs on Fridays,
394 00:37:57.030 ⇒ 00:37:58.510 Luke Daque: because that way
395 00:37:59.510 ⇒ 00:38:05.860 Luke Daque: we don’t get the errors on weekends. But yeah, it also depends on, like, how crucial the PRs need to be as well. But.
396 00:38:06.040 ⇒ 00:38:12.529 Uttam Kumaran: No, I mean, this is a good... that’s a conversation on the Eden team on, like, why they pressured us to do that,
397 00:38:12.960 ⇒ 00:38:16.100 Uttam Kumaran: and why. And like, again, I think
398 00:38:16.320 ⇒ 00:38:19.430 Uttam Kumaran: I don’t blame you, Demilade, because we all
399 00:38:19.660 ⇒ 00:38:23.340 Uttam Kumaran: do this, but it’s like it needs to be a conversation with
400 00:38:24.250 ⇒ 00:38:31.219 Uttam Kumaran: I mean with Robert on, like, Hey, the client pressured us on Friday, how could we have prevented this from happening?
401 00:38:31.320 ⇒ 00:38:33.960 Uttam Kumaran: I mean, of course, we’re gonna implement this system.
402 00:38:34.180 ⇒ 00:38:35.970 Uttam Kumaran: but I also agree in that, like
403 00:38:36.140 ⇒ 00:38:40.860 Uttam Kumaran: the reason why I decided to move a lot of our client stuff, even to Thursdays now
404 00:38:41.060 ⇒ 00:38:43.909 Uttam Kumaran: is so that we don’t push code on Friday.
405 00:38:45.070 ⇒ 00:38:46.979 Uttam Kumaran: just cause it’s like a nightmare.
406 00:38:47.300 ⇒ 00:38:52.940 Uttam Kumaran: But, I don’t know. You know, sometimes it’s just the way it works. That’s a good negotiation
407 00:38:53.390 ⇒ 00:38:56.100 Uttam Kumaran: that needs to be had with the PMs, I feel like.
408 00:38:57.380 ⇒ 00:38:59.007 Demilade Agboola: Oh, yeah, definitely
409 00:39:00.100 ⇒ 00:39:05.500 Demilade Agboola: and like, I did, cause, like, Robert was in the loop on the request.
410 00:39:07.880 ⇒ 00:39:10.530 Demilade Agboola: And I think it was also one of those things where it was.
411 00:39:11.140 ⇒ 00:39:13.109 Demilade Agboola: Oh, this should be an easy fix.
412 00:39:13.110 ⇒ 00:39:14.160 Uttam Kumaran: Yeah, yeah.
413 00:39:14.400 ⇒ 00:39:22.390 Demilade Agboola: And here’s the thing, like, I literally was working on this at, like, 2 AM on, like, the next day, like 2 AM on Saturday, my time,
414 00:39:22.780 ⇒ 00:39:27.750 Demilade Agboola: and I had a call with Annie, and part of the reason why I didn’t push the dashboard
415 00:39:27.920 ⇒ 00:39:32.879 Demilade Agboola: to the Eden team was because the numbers looked weird.
416 00:39:33.230 ⇒ 00:39:39.079 Demilade Agboola: like I I was looking at it. And I’m like these numbers don’t make sense, and I’m not going to push the dashboard
417 00:39:39.440 ⇒ 00:39:42.490 Demilade Agboola: when the numbers don’t make sense. So I sent
418 00:39:44.230 ⇒ 00:39:47.999 Demilade Agboola: I sent the raw numbers to cutter directly in my
419 00:39:48.860 ⇒ 00:39:53.170 Demilade Agboola: numbers, and these are the customer numbers that you’re requesting.
420 00:39:53.865 ⇒ 00:40:00.190 Demilade Agboola: Obviously, because, like, the tables are linked to all the dashboards,
421 00:40:00.560 ⇒ 00:40:05.600 Demilade Agboola: the numbers that were weird to me spread across all the dashboards, and that’s what the team.
422 00:40:05.870 ⇒ 00:40:11.610 Demilade Agboola: And so, just being able to, like, like I told Uttam before you guys joined the call,
423 00:40:12.070 ⇒ 00:40:20.329 Demilade Agboola: anything that involves dashboards, like just trying to change the numbers on a dashboard... Because actually, what Qatar wanted was a redefinition of everything.
424 00:40:20.510 ⇒ 00:40:23.409 Demilade Agboola: So we’re going to change the definition of what a customer meant.
425 00:40:23.830 ⇒ 00:40:24.859 Demilade Agboola: The revenue officer.
426 00:40:24.860 ⇒ 00:40:25.950 Uttam Kumaran: Oh, okay.
427 00:40:26.320 ⇒ 00:40:29.129 Demilade Agboola: That would affect the nROAS.
428 00:40:29.950 ⇒ 00:40:31.870 Demilade Agboola: And ROAS, it will affect the
429 00:40:33.036 ⇒ 00:40:39.460 Demilade Agboola: AOV. Like, it’s gonna affect everything. Basically, it’s an entirely new dashboard in that context, right?
430 00:40:39.660 ⇒ 00:40:40.470 Demilade Agboola: So
431 00:40:42.020 ⇒ 00:40:47.390 Demilade Agboola: those sorts of requests are not requests that, I think, we should try and handle in like one.
432 00:40:47.810 ⇒ 00:40:55.349 Demilade Agboola: There should be a proper QA process. Now, if you need the numbers, like, you just need us to query the numbers that we have, to give you
433 00:40:55.720 ⇒ 00:41:08.869 Demilade Agboola: a CSV of just that, that can be done. Like, that’s not a hard thing to query. But when you need to change everything in terms of, like, the modeling as well as then the dashboard itself, I don’t think we should try and do that in 24 h.
434 00:41:08.870 ⇒ 00:41:11.670 Uttam Kumaran: I agree. I also still think, for those,
435 00:41:12.300 ⇒ 00:41:17.130 Uttam Kumaran: the approval should go through the PM. I mean, I’m enforcing this on Urban Stems
436 00:41:17.270 ⇒ 00:41:22.140 Uttam Kumaran: and my clients, in that, like, I don’t want folks to go direct to our team.
437 00:41:22.440 ⇒ 00:41:24.710 Uttam Kumaran: It either needs to be in a channel or an email.
438 00:41:25.160 ⇒ 00:41:30.290 Uttam Kumaran: because one of us will see a little bit around the corner and be like, Oh, there’s this other problem, right? Like.
439 00:41:30.920 ⇒ 00:41:36.689 Uttam Kumaran: that’s the thing where it’s like, yeah, I, I know, like consultants always want to add process. But like
440 00:41:37.060 ⇒ 00:41:40.230 Uttam Kumaran: partly, this is what I used to do in companies as well, where like
441 00:41:40.410 ⇒ 00:41:49.550 Uttam Kumaran: do, or whatever will be like. Hey, can you do this? And it’s like dude. I don’t think you understand the ramifications of what you’re asking, so I want to do this in a public forum.
442 00:41:50.150 ⇒ 00:41:52.160 Uttam Kumaran: so that, like, I don’t get fried.
443 00:41:53.730 ⇒ 00:41:55.589 Uttam Kumaran: You know. So
444 00:41:57.360 ⇒ 00:42:05.139 Uttam Kumaran: it’s things we can talk about like. I mean, this is just as we’re getting bigger clients. This is what happens like in in bigger data teams. But
445 00:42:06.750 ⇒ 00:42:15.230 Uttam Kumaran: Okay, so I think we made decisions on the goalie work. We’ve made decisions on
446 00:42:16.490 ⇒ 00:42:21.010 Uttam Kumaran: the plan. I think what what sort of I’m thinking of is like. By end of this week
447 00:42:21.270 ⇒ 00:42:25.730 Uttam Kumaran: my goals are one. I’m gonna get this whole thing set up for Eden.
448 00:42:26.090 ⇒ 00:42:34.640 Uttam Kumaran: I think, Demilade, I would like to nominate you as the first goalie, first inaugural goalie of the company, for Eden.
449 00:42:34.810 ⇒ 00:42:37.349 Uttam Kumaran: You can work with me on just being like.
450 00:42:37.570 ⇒ 00:42:41.620 Uttam Kumaran: do these alerts suffice? That’s probably it, for now, and just like
451 00:42:42.160 ⇒ 00:42:46.580 Uttam Kumaran: we can work on this channel a little bit. I’m gonna go through all of Metaplane,
452 00:42:46.770 ⇒ 00:42:50.590 Uttam Kumaran: learn how all the alerting and monitoring and everything works
453 00:42:52.330 ⇒ 00:42:56.069 Uttam Kumaran: and then make sure this is set up by the end of the week for the core
454 00:42:56.410 ⇒ 00:42:58.000 Uttam Kumaran: models for Eden.
455 00:43:00.480 ⇒ 00:43:06.240 Uttam Kumaran: The setup is not only going to be on, like, the entire model, but it’s going to be on the specific rows that matter.
456 00:43:08.460 ⇒ 00:43:10.380 Uttam Kumaran: And I hope to have the Slack
457 00:43:10.740 ⇒ 00:43:12.999 Uttam Kumaran: start to go also this week.
458 00:43:14.790 ⇒ 00:43:17.100 Uttam Kumaran: One thing that we still don’t
459 00:43:17.300 ⇒ 00:43:19.100 Uttam Kumaran: have is sort of this like.
460 00:43:19.995 ⇒ 00:43:25.469 Uttam Kumaran: I don’t think we have a PR impact plan.
461 00:43:26.450 ⇒ 00:43:30.370 Uttam Kumaran: So I’m just gonna note that at the bottom of this here, which is like.
462 00:43:33.550 ⇒ 00:43:34.210 Luke Daque: Check me!
463 00:43:34.210 ⇒ 00:43:35.410 Uttam Kumaran: PR impact.
464 00:43:36.170 ⇒ 00:43:36.930 Uttam Kumaran: Yeah.
465 00:43:36.930 ⇒ 00:43:38.089 Luke Daque: Did we have a CI?
466 00:43:38.090 ⇒ 00:43:44.090 Awaish Kumar: So I actually, like, looked at data diff tools,
467 00:43:44.920 ⇒ 00:43:50.220 Awaish Kumar: the kind which gives us this PR impact, like staging versus production
468 00:43:50.380 ⇒ 00:43:52.660 Awaish Kumar: changes, but that got deprecated.
469 00:43:52.900 ⇒ 00:43:53.300 Uttam Kumaran: Yeah.
470 00:43:53.300 ⇒ 00:43:58.919 Awaish Kumar: Now, the only data diff thing is happening in the data mesh.
471 00:44:00.100 ⇒ 00:44:05.610 Awaish Kumar: So, like J. Damesh, have built some kind of a similar.
472 00:44:05.610 ⇒ 00:44:07.400 Uttam Kumaran: Do data diff in Metaplane?
473 00:44:08.240 ⇒ 00:44:12.910 Awaish Kumar: Yeah, like, not right now. We have discussed that with the Metaplane team.
474 00:44:13.170 ⇒ 00:44:16.230 Uttam Kumaran: And they said, like, Yeah, we can’t do that. I know.
475 00:44:18.060 ⇒ 00:44:22.120 Uttam Kumaran: Okay, I mean, it’s fine, like, we may not be able to do PR impact.
476 00:44:22.871 ⇒ 00:44:32.009 Uttam Kumaran: Actually, I mean, this would be my question, though: like, if a PR hits staging, then if we have monitors on staging, we should have caught this, right?
477 00:44:38.300 ⇒ 00:44:39.880 Awaish Kumar: Oh yes!
478 00:44:40.660 ⇒ 00:44:42.470 Uttam Kumaran: So that’s a potential option.
479 00:44:43.090 ⇒ 00:44:44.080 Luke Daque: What do you mean?
480 00:44:44.310 ⇒ 00:44:44.800 Luke Daque: But I.
481 00:44:44.800 ⇒ 00:44:47.969 Awaish Kumar: Like, if we have monitors only on staging.
482 00:44:49.400 ⇒ 00:44:53.849 Awaish Kumar: But whenever a PR is created, it creates,
483 00:44:53.970 ⇒ 00:44:56.469 Awaish Kumar: like, it updates the staging models,
484 00:44:58.420 ⇒ 00:45:02.440 Awaish Kumar: and hence, like, Metaplane can detect the changes.
485 00:45:03.250 ⇒ 00:45:04.830 Luke Daque: Yeah, yeah.
486 00:45:05.410 ⇒ 00:45:12.230 Uttam Kumaran: But those alerts won’t go here. I want those alerts to just be on the PR. I mean, I think this is a lower priority.
487 00:45:12.620 ⇒ 00:45:15.200 Uttam Kumaran: It’s a high priority. But like not the highest.
488 00:45:15.550 ⇒ 00:45:23.420 Uttam Kumaran: I think initially, I’m gonna just aim to get stuff on production. And then we can talk about, like, PR impact, data diff related stuff.
489 00:45:24.170 ⇒ 00:45:31.689 Uttam Kumaran: I think the only other to-do is, like, longer term, and I think, Demilade, you can probably own this as well, is, like, these runbooks:
490 00:45:32.720 ⇒ 00:45:38.730 Uttam Kumaran: like, what are the common data problems that we face? What are the typical steps to go to triage?
491 00:45:39.210 ⇒ 00:45:42.770 Uttam Kumaran: That’ll help everybody just be like I went through all the steps and we’re good.
492 00:45:43.972 ⇒ 00:45:49.520 Uttam Kumaran: This is also like we were gonna hit this at some point in our journey. So
493 00:45:50.020 ⇒ 00:45:51.610 Uttam Kumaran: it’s good that we’re doing this now.
494 00:45:54.620 ⇒ 00:46:08.580 Uttam Kumaran: And then, on Monday I’m talking to Metaplane, Monday or Tuesday, so I’ll sign and make sure we can get everything tracked. Can you guys tell me about what sort of monitoring options we have for, like,
495 00:46:10.380 ⇒ 00:46:15.750 Uttam Kumaran: for, like, GCP? Like, what do we want to monitor in GCP versus here?
496 00:46:17.030 ⇒ 00:46:20.049 Uttam Kumaran: Or what is, oh, this is like monitoring
497 00:46:21.550 ⇒ 00:46:22.469 Awaish Kumar: This is monitoring.
498 00:46:22.470 ⇒ 00:46:23.350 Uttam Kumaran: Jobs.
499 00:46:23.780 ⇒ 00:46:24.560 Awaish Kumar: Yes.
500 00:46:25.310 ⇒ 00:46:28.070 Uttam Kumaran: Okay, so that’s not, like, super high priority.
501 00:46:30.540 ⇒ 00:46:31.170 Uttam Kumaran: Right?
502 00:46:31.170 ⇒ 00:46:34.519 Awaish Kumar: Yeah, like, we added this, like,
503 00:46:34.650 ⇒ 00:46:41.719 Awaish Kumar: expecting that it is going to, like, have PR impact.
504 00:46:41.930 ⇒ 00:46:46.130 Awaish Kumar: But that’s not what is happening. So it does have a PR impact,
505 00:46:46.250 ⇒ 00:46:47.970 Awaish Kumar: but it is not
506 00:46:48.580 ⇒ 00:47:04.629 Awaish Kumar: looking at the staging database. So what it is going to do is, like, if we have a SQL change, like a model change, a calculation change, like in the revenue, like, we have a formula which is different in production, and in the PR we change it,
507 00:47:04.880 ⇒ 00:47:10.539 Awaish Kumar: then we are going to have an impact, like, after generating that formula,
508 00:47:10.800 ⇒ 00:47:11.300 Uttam Kumaran: Okay.
509 00:47:11.300 ⇒ 00:47:16.160 Awaish Kumar: How the change looks like. But yeah, it is still going to be in production.
510 00:47:17.050 ⇒ 00:47:21.269 Uttam Kumaran: And then what are the Tableau-related monitors that we can do?
511 00:47:26.760 ⇒ 00:47:31.440 Awaish Kumar: And it’s also like, I think it’s a freshness monitor.
512 00:47:33.170 ⇒ 00:47:34.819 Uttam Kumaran: Freshness, and row, count.
513 00:47:37.030 ⇒ 00:47:39.090 Awaish Kumar: And we can add custom ones. But, like
514 00:47:39.680 ⇒ 00:47:42.049 Awaish Kumar: we need only the freshness ones.
515 00:47:49.660 ⇒ 00:47:50.470 Uttam Kumaran: And
516 00:47:56.300 ⇒ 00:48:03.069 Uttam Kumaran: Oh, you can’t even, I don’t think you can do those.
517 00:48:04.670 ⇒ 00:48:10.280 Uttam Kumaran: Looks like the only monitor I can do is views.
518 00:48:13.180 ⇒ 00:48:16.689 Demilade Agboola: Yeah, everyone we look or to, I think the only one total like views, too.
519 00:48:16.690 ⇒ 00:48:18.700 Uttam Kumaran: Okay, so that’s been useless.
520 00:48:20.680 ⇒ 00:48:25.260 Uttam Kumaran: But you’re right, Demilade, in that we do need staging dashboards
521 00:48:26.070 ⇒ 00:48:28.419 Uttam Kumaran: that are on our staging data
522 00:48:29.490 ⇒ 00:48:31.469 Uttam Kumaran: so that one I can talk about as well.
523 00:48:33.410 ⇒ 00:48:40.970 Uttam Kumaran: Okay, cool. So I’m probably gonna just ditch these. I’m probably gonna ditch the Tableau and the dbt one.
524 00:48:41.550 ⇒ 00:48:46.790 Uttam Kumaran: I don’t think, like, we have much in terms of job duration problems.
525 00:48:47.500 ⇒ 00:48:49.620 Uttam Kumaran: I care less about this.
526 00:48:50.190 ⇒ 00:48:55.440 Uttam Kumaran: I wanna make sure that we can track Eden, and then we can track Urban Stems on here.
527 00:48:56.732 ⇒ 00:49:00.685 Uttam Kumaran: Okay, cool. So I feel like we’re pretty clear.
528 00:49:01.840 ⇒ 00:49:06.820 Uttam Kumaran: So I’ll do a little bit of a deep dive to understand, like, what are all the different monitors we can set.
529 00:49:07.000 ⇒ 00:49:16.600 Uttam Kumaran: And then also, like, how do we set stuff where, for example, a transaction table, it’s always going to be increasing. So how do we set up the monitors for those different scenarios:
530 00:49:16.780 ⇒ 00:49:23.120 Uttam Kumaran: a table with growth, a table that’s supposed to be fixed, things like that.
531 00:49:23.440 ⇒ 00:49:28.390 Uttam Kumaran: And then this week or next week will just be sort of like tuning, I think.
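The two scenarios mentioned here, a table that always grows and a table that is supposed to stay fixed, call for different row-count rules. The sketch below is a generic illustration of that distinction and does not use Metaplane's API; the function name `row_count_alert`, the `kind` labels, and the `max_growth` threshold of 2.0 are all assumptions chosen for the example.

```python
def row_count_alert(previous: int, current: int, kind: str, max_growth: float = 2.0) -> bool:
    """Return True if the change in row count should raise an alert.

    kind="fixed":   any change in row count is suspicious.
    kind="growing": rows may increase, but shrinking or growing by more than
                    max_growth times the previous count is suspicious.
    """
    if kind == "fixed":
        return current != previous
    if kind == "growing":
        return current < previous or current > previous * max_growth
    raise ValueError(f"unknown table kind: {kind}")


if __name__ == "__main__":
    # A transaction table that jumps to 9x its previous size trips the alert.
    print(row_count_alert(1000, 9000, "growing"))
    # Ordinary growth on the same table does not.
    print(row_count_alert(1000, 1100, "growing"))
```

The tuning work mentioned next would then amount to picking the table kind and threshold for each monitored model.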
532 00:49:29.128 ⇒ 00:49:36.020 Uttam Kumaran: And then the last thing, maybe, to confirm, Demilade, is like, can I, are these, like,
533 00:49:36.170 ⇒ 00:49:42.420 Uttam Kumaran: can I just go ahead and start moving through these? Or would you suggest any other ones to focus on.
534 00:49:43.000 ⇒ 00:49:46.400 Uttam Kumaran: or if any here are not really worth focusing on.
535 00:49:48.324 ⇒ 00:49:55.960 Demilade Agboola: I don’t. I don’t know. I’ll I’ll have to look through. I think the best place to ask would be Annie, because Annie literally uses most of these tables for.
536 00:49:55.960 ⇒ 00:49:56.480 Uttam Kumaran: Okay.
537 00:49:56.870 ⇒ 00:49:57.929 Demilade Agboola: Dashboards, so she knows.
538 00:49:57.930 ⇒ 00:49:59.170 Uttam Kumaran: I’ll ask her.
539 00:50:00.910 ⇒ 00:50:04.199 Demilade Agboola: Yeah, the top 3, I know. Yes, the first 3 are definitely
540 00:50:04.710 ⇒ 00:50:11.370 Demilade Agboola: very important. Product sale summary, fact transactions, and order summary are, like, probably the most important.
541 00:50:18.490 ⇒ 00:50:19.830 Uttam Kumaran: Okay, cool.
542 00:50:21.010 ⇒ 00:50:29.750 Uttam Kumaran: Alright guys, that’s all I had. So I think, yeah, I’ll send some updates to everybody right now, and then also hopefully, Demilade and Awaish,
543 00:50:30.242 ⇒ 00:50:32.780 Uttam Kumaran: you can also share this with the Eden team,
544 00:50:32.970 ⇒ 00:50:38.380 Uttam Kumaran: the Eden clients, once we get this up and running, so you can share kind of how we’re learning from this.
545 00:50:41.930 ⇒ 00:50:42.470 Demilade Agboola: Okay.
546 00:50:43.750 ⇒ 00:50:45.269 Uttam Kumaran: Thank you. Guys. Talk to you soon.
547 00:50:45.990 ⇒ 00:50:46.630 Luke Daque: Bye-bye.