Meeting Title: BF Interview: Demilade <> Ashwini Date: 2025-11-14 Meeting participants: Ashwini Sharma, Demilade Agboola
WEBVTT
1 00:02:43.450 ⇒ 00:02:44.570 Ashwini Sharma: Hello.
2 00:02:45.780 ⇒ 00:02:50.319 Demilade Agboola: Hi, Ashwini. My name… did I get that properly? Just so we’re on the same page?
3 00:02:51.000 ⇒ 00:02:52.549 Ashwini Sharma: Oh, yeah, yeah, yeah, I can see you.
4 00:02:52.980 ⇒ 00:02:56.400 Demilade Agboola: Yeah, can you… what… sorry, can you help me with the pronunciation of.
5 00:02:56.400 ⇒ 00:02:58.220 Ashwini Sharma: Oh, yeah, my name’s Ashwini.
6 00:02:58.430 ⇒ 00:03:04.410 Demilade Agboola: Ashwini, okay, that’s good. Hi, my name is Demilade. I work on the Brainforge team.
7 00:03:05.280 ⇒ 00:03:08.859 Demilade Agboola: I’m sure you’ve had conversations with Awash and Utam.
8 00:03:09.110 ⇒ 00:03:10.330 Ashwini Sharma: Yes, I did.
9 00:03:10.330 ⇒ 00:03:15.299 Demilade Agboola: Yes, that’s great. So that means, at this point, I don’t necessarily need to introduce Brainforge and what we do.
10 00:03:15.300 ⇒ 00:03:17.239 Ashwini Sharma: That’s fine, yeah, you don’t have to.
11 00:03:17.240 ⇒ 00:03:25.020 Demilade Agboola: Alright then, that’s all good then. So I guess in that case, can I get to know you, and just, like, your work experience?
12 00:03:25.020 ⇒ 00:03:35.210 Ashwini Sharma: Sure, yeah, so currently I’m working as a senior data architect at a company called BSI Financial Services. It was previously called Accentra Solutions.
13 00:03:35.490 ⇒ 00:03:45.510 Ashwini Sharma: what I do over here is basically take care of the entire data platform, which means, like, after talking to the business, identify the data sources that we need to ingest from.
14 00:03:45.570 ⇒ 00:03:55.850 Ashwini Sharma: Identify the kind of transformations that we’d need to do on the raw data that is ingested, and then create data marts, which will be, again, exposed to the business.
15 00:03:56.020 ⇒ 00:04:15.479 Ashwini Sharma: to help them take decisions that matter to them, right? So, yeah, I mean, everything is happening on Databricks currently, so the ingestion scripts are written in PySpark, the data transformation happens on dbt Core, and data visualization happens on Sigma Computing, right?
16 00:04:16.220 ⇒ 00:04:17.220 Demilade Agboola: Okay.
17 00:04:17.410 ⇒ 00:04:22.500 Demilade Agboola: So, you do everything right from ingestion all the way to, like, building the dashboard?
18 00:04:22.500 ⇒ 00:04:23.899 Ashwini Sharma: That’s right, yes.
19 00:04:23.930 ⇒ 00:04:25.730 Demilade Agboola: That’s a lot. That’s a lot.
20 00:04:26.140 ⇒ 00:04:41.949 Demilade Agboola: Yeah, I have a team, so… Okay, so when you say the… because you said you’re a data architect now, so are you… do you actually do the building of it, or do you just come up with the flow?
21 00:04:41.950 ⇒ 00:04:46.259 Ashwini Sharma: No, no, the implementation also, so I’m a hands-on person.
22 00:04:46.630 ⇒ 00:05:05.119 Demilade Agboola: Oh, that’s pretty good, that’s pretty good. And when you say… so since you have, like, experience doing everything, what would you… what would you say your favorite part of working in data is? Like, would you say it’s the ingestion, setting up custom ingestion, setting up transformations, or would you say, like, the visualization is probably your.
23 00:05:05.400 ⇒ 00:05:20.500 Ashwini Sharma: So, every area has its own challenges, but I think the most interesting part is in the transformation, right? Where you look at the data and then determine how to transform it to, you know, help the business take proper decisions.
24 00:05:20.620 ⇒ 00:05:26.909 Ashwini Sharma: That’s, that’s the core, you know, area where I make the most impact.
25 00:05:27.460 ⇒ 00:05:32.029 Demilade Agboola: Okay, okay. And what tools do you use for your transformation?
26 00:05:32.330 ⇒ 00:05:33.730 Ashwini Sharma: It’s dbt Core.
27 00:05:33.730 ⇒ 00:05:35.369 Demilade Agboola: It’s dbt Core, great.
28 00:05:35.370 ⇒ 00:05:35.910 Ashwini Sharma: Ultra.
29 00:05:36.210 ⇒ 00:05:41.360 Demilade Agboola: That’s pretty cool. Also, when it comes to things like
30 00:05:41.810 ⇒ 00:05:44.560 Demilade Agboola: So beyond just, like, the building of the…
31 00:05:45.000 ⇒ 00:05:53.010 Demilade Agboola: models. So jobs, scheduling jobs, how do you go about that? How do you decide the priorities? What…
32 00:05:53.420 ⇒ 00:06:00.760 Demilade Agboola: Like, yeah, just that entire, like, job sequencing and job scheduling flow, how do you think of it? Like, how do you architect it?
33 00:06:00.990 ⇒ 00:06:10.650 Ashwini Sharma: So, normally, like, the way I like to do it is I like to decouple things, right? So, when I say decoupling, it means, the ingestion works independent of the transformation.
34 00:06:11.150 ⇒ 00:06:31.069 Ashwini Sharma: Yeah. Ingestion continuously works to ingest data. It might be continuous, or it might… generally, it’s not continuous. It’s in micro-batches, right? So, some of the ingestion happens every half an hour, some of the ingestion happens once a day, some of the ingestion happens once a month, right? We have these three kind of ingestions, and
35 00:06:31.070 ⇒ 00:06:43.660 Ashwini Sharma: Transformation is mainly, you know, happening independent of, the ingestion. So generally, the transformation that I do, is either once a day, or for certain cases, it becomes once an hour.
36 00:06:44.350 ⇒ 00:06:45.270 Demilade Agboola: Noted.
37 00:06:46.340 ⇒ 00:06:50.879 Ashwini Sharma: And they are independent of each other, so I don’t like to couple them together.
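The decoupling described in this answer can be sketched in a few lines of Python. This is only an illustration: the `Job` dataclass, the `due_jobs` helper and the job names are hypothetical, and the monthly cadence is approximated as 30 days. Only the cadences themselves (half-hourly, daily and monthly ingestion; hourly or daily transformation) come from the conversation. The point is that each job is evaluated on its own clock, and no transformation schedule refers to an ingestion job.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Job:
    name: str          # hypothetical job name
    kind: str          # "ingestion" or "transformation"
    every: timedelta   # cadence of this job alone
    last_run: datetime


def due_jobs(jobs, now):
    """Return the names of jobs whose own interval has elapsed.

    Each job is checked independently; nothing here makes a
    transformation wait on an ingestion run, or the reverse.
    """
    return [j.name for j in jobs if now - j.last_run >= j.every]


start = datetime(2025, 1, 1, 0, 0)
jobs = [
    Job("ingest_half_hourly", "ingestion", timedelta(minutes=30), start),
    Job("ingest_daily", "ingestion", timedelta(days=1), start),
    Job("ingest_monthly", "ingestion", timedelta(days=30), start),  # approximation
    Job("transform_hourly", "transformation", timedelta(hours=1), start),
    Job("transform_daily", "transformation", timedelta(days=1), start),
]

print(due_jobs(jobs, start + timedelta(minutes=45)))
print(due_jobs(jobs, start + timedelta(hours=1)))
```

In practice a scheduler or orchestrator would own these intervals; the sketch only shows that the two kinds of jobs share no trigger.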
38 00:06:51.260 ⇒ 00:06:53.510 Demilade Agboola: Okay, that’s fair.
39 00:06:53.910 ⇒ 00:07:04.319 Demilade Agboola: Can you walk me through a… just, like, a general use case you’ve worked on, and what challenges you faced while you were, like, creating that architecture?
40 00:07:07.900 ⇒ 00:07:17.330 Demilade Agboola: Yeah, basically, because I’m thinking of, like, the dbt space, and, like, that kind of thing. What issues have you walked into? What issues did you…
41 00:07:17.780 ⇒ 00:07:18.900 Demilade Agboola: Face, and how…
42 00:07:19.100 ⇒ 00:07:19.850 Demilade Agboola: [inaudible]
43 00:07:20.130 ⇒ 00:07:37.100 Ashwini Sharma: Right, so there was this particular requirement that came from, like, I work for a mortgage servicing company, right? And one of the departments in the mortgage servicing is the collections department, where they try to, you know, ensure that the customers are paying their EMIs on time, right?
44 00:07:37.390 ⇒ 00:07:54.660 Ashwini Sharma: So, and, one of the KPIs that they wanted to analyze is, you know, how well our agents are doing their collection purpose, collection work, right? And the way it happens is, like, we have a customer care department, right, which reaches out to borrowers, and then they talk to them.
45 00:07:54.830 ⇒ 00:08:07.220 Ashwini Sharma: In case the borrowers are not able to make the payment on time, right, we suggest them alternatives on, you know, how they can, you know, how they can plan so that they are able to make the payments on time, right?
46 00:08:07.250 ⇒ 00:08:20.970 Ashwini Sharma: And, that is what, the kind of things that we wanted to analyze, the department wanted to analyze. And I started looking into it. The main call center data source comes from a software called Five9, right?
47 00:08:20.980 ⇒ 00:08:28.759 Ashwini Sharma: So that’s a call center application that we have been using, and it produces, you know, reports, daily reports of call logs.
48 00:08:28.940 ⇒ 00:08:33.919 Ashwini Sharma: And these call logs are, you know, in the form of an Excel file, and then…
49 00:08:34.020 ⇒ 00:08:36.230 Ashwini Sharma: I started ingesting this Excel file.
50 00:08:36.640 ⇒ 00:08:50.380 Ashwini Sharma: And, when I started looking into that data, it was really messed up, because, you know, sometimes even the primary keys were not accurate, right? And data was really, you know, in a bad condition, so I had to do a lot of data cleaning.
51 00:08:50.380 ⇒ 00:09:01.660 Ashwini Sharma: On top of that, and sometimes, you know, that single data set was not enough. We had to look into some other datasets to actually identify who was the agent that was talking to them.
52 00:09:01.820 ⇒ 00:09:17.059 Ashwini Sharma: to the customer, right? It is at the call level, right, those kind of records. And what happens is, like, for example, if there is a call between a customer and an agent, and the agent forwards that call to multiple other agents, right, which happens normally if you are talking to a call center, right?
53 00:09:17.060 ⇒ 00:09:30.200 Ashwini Sharma: In those cases, the granularity at the call level does not indicate who was the guy to whom the customer talked to, right? It goes into a different data set, so I had to bring that, another data set, marry them together, and then figure out
54 00:09:30.950 ⇒ 00:09:47.140 Ashwini Sharma: who was the agent that was, you know, actually talking to the customer, right? And then, once that was done, some of the data that was missing in that had to come from the primary source of data. So the primary source of data that we have is called LoanServ data, which contains all the attributes of loans.
55 00:09:47.140 ⇒ 00:10:03.109 Ashwini Sharma: That comes out in the form of COBOL files, right? So, the software that we use, it’s from a company called Sagent; it has a LoanServ software, and then it dumps the data periodically, every day, in the form of COBOL files. It’s a big mainframe application.
56 00:10:03.280 ⇒ 00:10:19.240 Ashwini Sharma: And so I started ingesting data from COBOL files into Databricks, right? And then used that as a primary source of data for the various attributes that are missing on the call center data, brought them together, and then started analyzing, like, once they are in the raw layer, right?
57 00:10:19.240 ⇒ 00:10:31.119 Ashwini Sharma: And then I followed the, you know, normal dbt transformation, in the medallion architecture, cleaned the data, created staging tables, created data marts, facts, and dimensions, and then
58 00:10:31.130 ⇒ 00:10:48.390 Ashwini Sharma: You know, showed them the analysis, right? But it also helped with other kinds of analysis, which we were not expecting at that point of time, which was mainly, like, you know, at what points are the call center agents most active, right? When are they overloaded? When are they underloaded, right?
59 00:10:48.390 ⇒ 00:11:05.710 Ashwini Sharma: By how much percent they are overloaded. When they are overloaded, what is the average time that the customer has to wait for in order to get to the call, right? Those kind of things were there. And in the meantime, what we did was, while this entire thing was going on, you know, the other application development team
60 00:11:05.710 ⇒ 00:11:22.660 Ashwini Sharma: we decided to, you know, deploy a chatbot, right? Now, chatbot is basically the first interface to which the customers start to interact with. And they will ask questions to the chatbot, and chatbot will use the data that is there in Databricks plus in VectorDB, and then return that data, right?
61 00:11:22.660 ⇒ 00:11:39.450 Ashwini Sharma: To the customers. And sometimes that kind of, you know, interaction answers some of the questions that the customers may have. Like, for example, what is the payment amount, next payment amount that I have to make, right? A simple question like that. Or, when do I have to make the next payment? Things like that, right?
62 00:11:40.340 ⇒ 00:11:41.360 Ashwini Sharma: So…
63 00:11:41.360 ⇒ 00:12:00.460 Ashwini Sharma: I started bringing in that data, so the chat logs with the chatbot, right? And then used that data, married it over to Five9 data to do some other kind of analysis, which is like, once we deployed the chatbot, right, what was the reduction in the number of calls that we are getting in the customer care, right?
64 00:12:00.460 ⇒ 00:12:08.159 Ashwini Sharma: And, using that, in fact, like, the management was able to reduce the headcount in the customer care department.
65 00:12:08.160 ⇒ 00:12:15.479 Ashwini Sharma: As well as, you know, the overall wait times that the customers were having, that got reduced.
66 00:12:15.620 ⇒ 00:12:23.560 Ashwini Sharma: Yeah, so basically, yeah, that’s the kind of impact that, that was created using this, overall exercise.
67 00:12:24.440 ⇒ 00:12:42.769 Demilade Agboola: You know, one of the things I really like about, like, how you ended that is that you’re also able to, like, tie in the fact that it does business impact, because sometimes people do really cool stuff, and really, like, there’s no business impact, and, like, you know, because we work in, like, consulting, we need business impact, business impact as.
68 00:12:42.770 ⇒ 00:12:44.329 Ashwini Sharma: Yes, exactly, right.
69 00:12:44.710 ⇒ 00:12:45.250 Demilade Agboola: Definitely.
70 00:12:45.250 ⇒ 00:12:52.040 Ashwini Sharma: The only thing that matters, right? I mean, you do stuff in Excel, and it makes the impact?
71 00:12:52.340 ⇒ 00:13:00.299 Ashwini Sharma: Yeah, that will have better visibility than doing Airflow and all hi-fi transformations and no impact, right?
72 00:13:00.660 ⇒ 00:13:02.930 Demilade Agboola: Exactly, I agree with, definitely agree with that.
73 00:13:03.130 ⇒ 00:13:17.219 Demilade Agboola: So I’m just gonna walk through a couple of scenarios that I, like, we’ve experienced, and I would like to hear your thoughts on how you would handle it, if, you know, or, like, even if it’s just, like, where would your mind go, and how would you try and tackle that problem?
74 00:13:17.220 ⇒ 00:13:17.940 Ashwini Sharma: Right?
75 00:13:18.720 ⇒ 00:13:21.650 Demilade Agboola: The first thing is going to be…
76 00:13:22.030 ⇒ 00:13:24.240 Demilade Agboola: So let’s start technical, slightly technical.
77 00:13:24.360 ⇒ 00:13:29.050 Ashwini Sharma: Let me take a pen and a paper so that I can write down some of the points that you…
78 00:13:29.180 ⇒ 00:13:31.570 Ashwini Sharma: Highlight, yeah. Yeah, go, go ahead, go ahead.
79 00:13:31.790 ⇒ 00:13:35.279 Demilade Agboola: This is slightly technical, nothing too crazy.
80 00:13:35.420 ⇒ 00:13:41.500 Demilade Agboola: dbt jobs are running, and they seem to be taking a very long time.
81 00:13:41.500 ⇒ 00:13:42.320 Ashwini Sharma: Okay.
82 00:13:42.910 ⇒ 00:13:46.660 Demilade Agboola: How would you go about trying to reduce the time?
83 00:13:46.830 ⇒ 00:13:47.990 Demilade Agboola: of execution.
84 00:13:49.330 ⇒ 00:13:54.549 Demilade Agboola: Again, this is not specific, so it’s just, like, how does your mind work in that scenario?
85 00:13:54.550 ⇒ 00:13:56.780 Ashwini Sharma: Right, yeah, so if this is,
86 00:13:56.870 ⇒ 00:14:12.130 Ashwini Sharma: I’d like to look into the dbt queries that have been written, right? Now, there are two aspects to this, right? One is it is taking a long time because the query is really bad, right? It’s not optimal, it has not been…
87 00:14:12.170 ⇒ 00:14:19.950 Ashwini Sharma: well thought through, right? The joins are really bad, you’re looking at datasets… you’re pulling datasets that
88 00:14:20.220 ⇒ 00:14:37.539 Ashwini Sharma: need not be surfaced up, right? It could happen, like, generally, like, new developers, when they write queries, they will select everything, and then do… start doing joins, right? So the movement of data that that’s happening because of the query, that could cause a delay in,
89 00:14:37.590 ⇒ 00:14:48.800 Ashwini Sharma: In the overall… I mean, that could explain why dbt is taking a lot of time, right? That is one thing. So, basically, this… I want to put it into the category of bad queries, right?
90 00:14:49.270 ⇒ 00:14:49.660 Demilade Agboola: Okay.
91 00:14:50.650 ⇒ 00:15:00.569 Ashwini Sharma: The other part is, I would say, bad modeling, right? So when you are doing the dbt modeling, right, you might be doing the same transformation again and again.
92 00:15:00.610 ⇒ 00:15:11.499 Ashwini Sharma: In multiple use cases, right? So this is another thing that I would like to look into, and ensure that we follow the principle of don’t repeat yourself.
93 00:15:11.570 ⇒ 00:15:26.530 Ashwini Sharma: In the entire dbt model, right? If there is one transformation that has been created, then make it generic enough so that it can be used across multiple other transformations. That’s the second aspect, right? So I’ll put it into the category of bad modeling.
94 00:15:28.660 ⇒ 00:15:32.880 Ashwini Sharma: Right? Now, the third aspect is, how is your data organized in the…
95 00:15:32.950 ⇒ 00:15:49.229 Ashwini Sharma: In the raw layer, right? In the source layer, from where dbt pulls it up. Now, it could be that, you know, the volume of data is huge, you have not followed a good partitioning strategy on the source layer, right? And by following a better partitioning strategy.
96 00:15:49.230 ⇒ 00:15:56.650 Ashwini Sharma: your queries could, you know, perform a bit faster, right? So let’s, let’s, put this into the category of,
97 00:15:56.810 ⇒ 00:16:00.529 Ashwini Sharma: You know, source, data organization at source, right?
98 00:16:04.550 ⇒ 00:16:05.340 Ashwini Sharma: Yeah.
99 00:16:05.460 ⇒ 00:16:15.149 Ashwini Sharma: So, I think these three categories I would like to explore into in order to figure out why the dbt jobs are taking a lot more time than what is expected.
100 00:16:15.810 ⇒ 00:16:21.530 Demilade Agboola: Yeah, those are very, like, those are actually very good answers. I know I have literally worked on
101 00:16:22.270 ⇒ 00:16:31.819 Demilade Agboola: projects with Brainforge where jobs were taking 30 minutes to run, and after optimization, we got it down to, like, 5 minutes, or even, like, 5 minutes.
102 00:16:32.920 ⇒ 00:16:39.050 Demilade Agboola: Yeah, and again, like, things you said are things I literally had to implement, you know.
103 00:16:39.300 ⇒ 00:16:44.989 Demilade Agboola: The distribution on keys, are we joining on the right keys? Are we… do we need to, like, make the distribution on keys
104 00:16:45.220 ⇒ 00:16:47.720 Demilade Agboola: like, the right keys that we join on further.
105 00:16:47.720 ⇒ 00:16:48.700 Ashwini Sharma: Right, yeah.
106 00:16:49.060 ⇒ 00:16:54.600 Demilade Agboola: So, like, ensure that instead of making it a table, for instance, you can make it incremental.
107 00:16:54.600 ⇒ 00:16:59.960 Ashwini Sharma: Right, yeah, that’s another which I forgot, right? Yeah, incremental updates to table, yeah, yeah.
108 00:17:00.110 ⇒ 00:17:08.910 Demilade Agboola: Exactly. Things like that, like, how do we ensure that we’re not just using the same thing over and over, instead of select star, which is what they like to do.
109 00:17:08.910 ⇒ 00:17:10.069 Ashwini Sharma: Right, yeah.
110 00:17:10.280 ⇒ 00:17:21.569 Demilade Agboola: just 3 columns that we need, so we don’t have the entire. And just… this is just very random. One of the clients we consulted for, they had a table that had 402
111 00:17:21.849 ⇒ 00:17:22.790 Demilade Agboola: Columns.
112 00:17:23.210 ⇒ 00:17:24.710 Ashwini Sharma: Oh, man, yeah.
113 00:17:24.710 ⇒ 00:17:27.390 Demilade Agboola: So if you’re doing a select star for that sort of table, you’re doing.
114 00:17:27.390 ⇒ 00:17:28.890 Ashwini Sharma: Oh, yeah, yeah.
115 00:17:29.050 ⇒ 00:17:41.249 Demilade Agboola: you’re really pushing the queries, so… Right. And like I said, like, you’re… you’re… I like to… I like where your mind was going. It was looking at the right things and trying to optimize the right.
116 00:17:41.250 ⇒ 00:17:41.800 Ashwini Sharma: [inaudible]
117 00:17:41.810 ⇒ 00:17:43.500 Demilade Agboola: The right things.
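Two of the optimizations raised in this exchange, selecting only the columns that are needed and refreshing a table incrementally instead of rebuilding it, can be illustrated in plain Python. This is a sketch of the idea only, not how dbt implements incremental models; the row layout, the column names and the `incremental_refresh` helper are hypothetical.

```python
def incremental_refresh(target, source, columns, ts_col="updated_at"):
    """Append only the source rows newer than what the target already holds,
    keeping just the columns the downstream model needs.

    `ts_col` must be one of `columns`, since the high-water mark is read
    back from the target on the next run.
    """
    # High-water mark: newest timestamp already present in the target.
    high_water = max((row[ts_col] for row in target), default=None)
    new_rows = [
        {c: row[c] for c in columns}  # column pruning instead of "select *"
        for row in source
        if high_water is None or row[ts_col] > high_water
    ]
    target.extend(new_rows)
    return len(new_rows)


source = [
    {"id": 1, "amount": 10, "updated_at": 1, "notes": "a", "extra": "x"},
    {"id": 2, "amount": 20, "updated_at": 2, "notes": "b", "extra": "y"},
]
target = []
cols = ["id", "amount", "updated_at"]

first = incremental_refresh(target, source, cols)   # first run loads everything
source.append({"id": 3, "amount": 30, "updated_at": 3, "notes": "c", "extra": "z"})
second = incremental_refresh(target, source, cols)  # later run loads only the new row
print(first, second, len(target))
```

The second call touches one row instead of three, which is where the runtime saving comes from on large tables; the wide `notes` and `extra` columns never leave the source.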
118 00:17:43.720 ⇒ 00:17:48.630 Demilade Agboola: Another question, or another scenario I have is,
119 00:17:49.450 ⇒ 00:17:51.860 Demilade Agboola: How… so this is not technical, this is more like…
120 00:17:51.990 ⇒ 00:18:06.699 Demilade Agboola: interpersonal, how you solve things. When you’re working through problems, how do you prioritize what needs to be done? Like, because again, we’re consultants, we work across multiple clients.
121 00:18:06.940 ⇒ 00:18:14.430 Demilade Agboola: how does your mind prioritize work? How do you say, this is number one, number two, number three, I need to get done?
122 00:18:16.870 ⇒ 00:18:28.300 Ashwini Sharma: Yeah, so, like, normally, I mean, whenever we are doing any kind of work, right, I think of the MVP that we can deliver as soon as possible, right? And,
123 00:18:28.300 ⇒ 00:18:39.050 Ashwini Sharma: You know, I don’t like to, you know, plan it in a waterfall model, and then do one thing after the other, but I would like to do all the things in parallel, and
124 00:18:39.070 ⇒ 00:18:47.290 Ashwini Sharma: And ensure that, you know, something can be delivered as soon as possible, right? And that’s my working strategy, right?
125 00:18:47.480 ⇒ 00:18:54.300 Ashwini Sharma: In terms of prioritization, I’ll have to break it down into individual work items and try to see, you know, what
126 00:18:54.590 ⇒ 00:19:03.300 Ashwini Sharma: what, you know, what component, creation or development is going to impact us the most, right? Or, like, you know.
127 00:19:03.420 ⇒ 00:19:07.650 Ashwini Sharma: Find the component which is going to be dependent
128 00:19:07.760 ⇒ 00:19:18.429 Ashwini Sharma: which is going to… I mean, the other components are going to be dependent on that, right? So, if that is something, then I would like to address that first.
129 00:19:19.430 ⇒ 00:19:26.870 Ashwini Sharma: Yeah, I mean, if you can give me a specific example, then maybe I can walk through it, but I’m not really able to think through right now.
130 00:19:27.100 ⇒ 00:19:33.609 Demilade Agboola: Yeah, so, let’s say, for instance, and this, this literally happens to me a lot of the time.
131 00:19:33.610 ⇒ 00:19:34.160 Ashwini Sharma: Yup.
132 00:19:35.110 ⇒ 00:19:39.310 Demilade Agboola: I, personally, right now, I am on… 3 projects?
133 00:19:40.300 ⇒ 00:19:43.259 Demilade Agboola: Right, so as you can tell, it’s a…
134 00:19:43.260 ⇒ 00:19:45.240 Ashwini Sharma: That’s a lot, yeah, that’s a lot going on, right?
135 00:19:45.560 ⇒ 00:19:52.160 Demilade Agboola: Obviously, Something might break in one client. Right. Jobs are not running.
136 00:19:52.750 ⇒ 00:19:58.190 Demilade Agboola: Another client is saying, hey, we need this to be modeled, right?
137 00:19:58.430 ⇒ 00:20:03.780 Demilade Agboola: Another client is… Also saying they need this to be modeled.
138 00:20:06.260 ⇒ 00:20:09.149 Demilade Agboola: How do you go about determining
139 00:20:11.110 ⇒ 00:20:14.269 Demilade Agboola: And there’s no, like, right or wrong answer, I just really just want to…
140 00:20:14.660 ⇒ 00:20:24.730 Demilade Agboola: how you think through these things? How do you go about determining, hey, today, this is what I need to do today, this I can push to tomorrow.
141 00:20:25.070 ⇒ 00:20:25.630 Demilade Agboola: Right?
142 00:20:25.630 ⇒ 00:20:26.110 Ashwini Sharma: Yeah.
143 00:20:26.110 ⇒ 00:20:37.420 Demilade Agboola: I’m trying to say, like, how do you go about that process? Like, I think I can push this tomorrow. And when you decide what you want to push to tomorrow, how do you then ensure that
144 00:20:37.750 ⇒ 00:20:41.100 Demilade Agboola: Everyone is satisfied at the end of the day.
145 00:20:41.100 ⇒ 00:21:00.560 Ashwini Sharma: Right? Yeah, it will all be impact-driven, right? For example, somebody’s job is not running, it means they are not getting fresh data, it means they are not able to make the decisions that they have to make on a… maybe on a day-to-day basis, or on a weekly basis, whatever is the cadence over there, right? So that takes a higher priority, right? If somebody’s job is not running.
146 00:21:00.560 ⇒ 00:21:12.229 Ashwini Sharma: fix it immediately. That’s the highest priority for me. Versus, like, if the other option was, you know, to create modeling for somebody else, right? That takes the second priority.
147 00:21:12.230 ⇒ 00:21:29.169 Ashwini Sharma: But even, in that also, like, you’ll have to… I think it’s on a use case-to-use case basis, right? If the job that is failing for some customer is a low-priority job, right? And, you know, they look at that data once a month, or maybe, you know, once in two weeks, then
148 00:21:29.170 ⇒ 00:21:38.420 Ashwini Sharma: probably the modeling would take more priority. I would say it’s on a use case-to-use case basis, based on the client, as well as based on
149 00:21:38.430 ⇒ 00:21:46.719 Ashwini Sharma: You know, if it is a very high-paying client, then obviously you have to, you know, look into their issue, even if it is a trivial one, right?
150 00:21:46.970 ⇒ 00:21:47.980 Demilade Agboola: Oh.
151 00:21:47.980 ⇒ 00:21:50.920 Ashwini Sharma: Yeah, in the end, it’s all about impact.
152 00:21:51.050 ⇒ 00:21:53.109 Ashwini Sharma: You know, you know, not…
153 00:21:53.230 ⇒ 00:22:01.279 Ashwini Sharma: not doing a certain thing, what is the impact of that versus not doing this thing? What is the impact, right? So, minimize
154 00:22:02.770 ⇒ 00:22:07.809 Ashwini Sharma: minimize the impact of not doing something, right? Let’s… let’s put it this way.
155 00:22:08.370 ⇒ 00:22:15.179 Demilade Agboola: That is fair, that is fair. Another thing that I was thinking about is…
156 00:22:15.890 ⇒ 00:22:24.439 Demilade Agboola: Just out of curiosity, what hours would you be, like, looking to work? Like, would you be able to work U.S. hours? How, like, time zone?
157 00:22:24.440 ⇒ 00:22:26.100 Ashwini Sharma: I’m open to anything.
158 00:22:26.360 ⇒ 00:22:28.490 Demilade Agboola: Okay, alright,
159 00:22:28.620 ⇒ 00:22:32.220 Demilade Agboola: And you’re… sorry, if you don’t mind, I didn’t ask at the beginning, where are you based?
160 00:22:32.220 ⇒ 00:22:33.579 Ashwini Sharma: I’m based out of India.
161 00:22:33.850 ⇒ 00:22:39.210 Demilade Agboola: Based out of India, okay, so you’ll be… I know, like, Awish is in Pakistan, I don’t know if you… do you know that?
162 00:22:39.470 ⇒ 00:22:42.180 Ashwini Sharma: Oh, okay, I thought he was in EST, sorry.
163 00:22:42.390 ⇒ 00:22:52.070 Ashwini Sharma: Yeah, he’s actually in Pakistan, so… Oh, okay. No, I would have given a different time for him in that case, right? It is quite late for him. Where are you based out of? You are in EST.
164 00:22:52.640 ⇒ 00:23:00.090 Demilade Agboola: Oh, no, I’m actually in Malta, so I’m… do you know, so Malta is really close to Italy. It’s a… Italy.
165 00:23:00.090 ⇒ 00:23:00.900 Ashwini Sharma: Okay.
166 00:23:01.240 ⇒ 00:23:03.270 Demilade Agboola: And I’m not really close with Italy, so…
167 00:23:03.270 ⇒ 00:23:08.509 Ashwini Sharma: Okay, I would have given an earlier time in that case, sorry for that.
168 00:23:08.510 ⇒ 00:23:20.439 Demilade Agboola: No, no, no, it’s fine. Like, we… so, part of the reason why I’m asking is because we tend to work, like, US hours. I mean, obviously, if things are happening where we need to, like, block off time, or, like, hold the available
169 00:23:20.840 ⇒ 00:23:23.180 Demilade Agboola: After, like… EST?
170 00:23:23.330 ⇒ 00:23:24.939 Demilade Agboola: Like, that’s fine.
171 00:23:24.940 ⇒ 00:23:27.899 Ashwini Sharma: And that makes sense, yeah. The clients are all in the US, right?
172 00:23:28.620 ⇒ 00:23:29.720 Demilade Agboola: Yeah, so… [inaudible]
173 00:23:29.720 ⇒ 00:23:30.300 Ashwini Sharma: So…
174 00:23:30.300 ⇒ 00:23:31.390 Demilade Agboola: Every time support.
175 00:23:32.260 ⇒ 00:23:33.869 Ashwini Sharma: I’m open to US hours.
176 00:23:34.120 ⇒ 00:23:40.949 Demilade Agboola: Okay, alright, that’s fine. Also, I also wanted to say or ask,
177 00:23:42.100 ⇒ 00:23:46.649 Demilade Agboola: Shoot, that was a question that was in my mind. But effectively, right, I think…
178 00:23:47.350 ⇒ 00:23:54.899 Demilade Agboola: I like the way you work. I like the way you’re able to, like, think through problems and try and solve problems.
179 00:23:55.590 ⇒ 00:24:02.420 Demilade Agboola: I find that that’s usually the difference between someone who solves problems and can operate at a high level versus someone who
180 00:24:02.650 ⇒ 00:24:07.069 Demilade Agboola: Like, needs guidance, because it’s just, literally, how can you think through the problem?
181 00:24:07.270 ⇒ 00:24:07.750 Demilade Agboola: Right.
182 00:24:07.990 ⇒ 00:24:12.910 Demilade Agboola: the rest of it is implementation, like, how do you…
183 00:24:13.620 ⇒ 00:24:17.300 Demilade Agboola: Another thing I would… I was also going to ask, yes, I finally remembered the question.
184 00:24:17.300 ⇒ 00:24:18.610 Ashwini Sharma: In terms of…
185 00:24:21.650 ⇒ 00:24:23.510 Demilade Agboola: observability.
186 00:24:23.510 ⇒ 00:24:24.120 Ashwini Sharma: Yes.
187 00:24:24.310 ⇒ 00:24:29.730 Demilade Agboola: and ensuring data accuracy, how do you go about it? How do you ensure that
188 00:24:30.020 ⇒ 00:24:35.079 Demilade Agboola: You are the person who detects that there’s a problem before the stakeholders.
189 00:24:35.080 ⇒ 00:24:47.839 Ashwini Sharma: Yeah, so there is a bunch of dbt tests that we have written in the data transformation, as well as in data ingestion pipelines, which detects the kind of, you know, the data quality that is being transformed or that is being ingested, right?
190 00:24:47.840 ⇒ 00:24:59.310 Ashwini Sharma: In terms of data ingestion, I generally have… I do not look into the data in the data ingestion pipelines, but I look at the metadata, right? Like, for example, did every table refresh?
191 00:24:59.360 ⇒ 00:25:15.300 Ashwini Sharma: on its, scheduled, update time, right? What was the row count? Was there any difference, significant difference in the row count ingested in each of the tables, right? And if the difference is beyond 6 times of the standard deviation of the number of rows that we have been getting for
192 00:25:15.500 ⇒ 00:25:20.780 Ashwini Sharma: specific table, that’s… that’s when I send out alerts, saying that, you know, we need to look into,
193 00:25:20.940 ⇒ 00:25:38.599 Ashwini Sharma: look into this table, this ingestion pipeline, it didn’t work as it is supposed to work. Something happened, right? At that point, we don’t know what happened, but definitely something has happened. The row count we got is very low. In terms of dbt tests, we check, you know, if certain records which are not supposed to be null
194 00:25:38.680 ⇒ 00:25:52.229 Ashwini Sharma: are null or not, right? If there are duplicate records or not, duplicate primary keys, right? If there are column values that are, you know, well outside the range that they should be in.
195 00:25:52.230 ⇒ 00:26:04.510 Ashwini Sharma: Right? These kind of things that I look into. Also, look into things like referential integrity, right? So, for example, if there is a certain record that is referencing another table, a primary key in another table.
196 00:26:04.520 ⇒ 00:26:07.339 Ashwini Sharma: Is it there? Does it exist? Or…
197 00:26:07.960 ⇒ 00:26:10.570 Ashwini Sharma: Because in warehouse, like, we’ll have a lot of records that
198 00:26:10.680 ⇒ 00:26:21.009 Ashwini Sharma: that have such kind of things, right, issues. So these are the, you know, data quality issues, or, what I say, like, observability things that I have worked on.
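The four kinds of checks listed in this answer (not null, uniqueness of primary keys, accepted range, referential integrity) can be sketched in plain Python as follows. This mirrors the intent of such tests rather than dbt's own syntax, and every table, column and function name below is hypothetical.

```python
def not_null_failures(rows, column):
    """Rows where a required column is missing or None."""
    return [r for r in rows if r.get(column) is None]


def duplicate_keys(rows, key):
    """Key values that appear more than once."""
    seen, dupes = set(), set()
    for r in rows:
        k = r[key]
        if k in seen:
            dupes.add(k)
        seen.add(k)
    return dupes


def out_of_range(rows, column, low, high):
    """Rows whose value falls outside [low, high]; None values are skipped
    because the not-null check reports them separately."""
    return [
        r for r in rows
        if r.get(column) is not None and not (low <= r[column] <= high)
    ]


def orphaned_references(child_rows, fk, parent_rows, pk):
    """Foreign-key values in the child table with no matching parent key."""
    parent_keys = {p[pk] for p in parent_rows}
    return {c[fk] for c in child_rows if c[fk] not in parent_keys}


# Hypothetical sample data with one failure of each kind.
agents = [{"agent_id": 1}, {"agent_id": 2}]
calls = [
    {"call_id": "c1", "agent_id": 1, "duration_sec": 120},
    {"call_id": "c2", "agent_id": 2, "duration_sec": None},
    {"call_id": "c2", "agent_id": 9, "duration_sec": -5},
]

print(len(not_null_failures(calls, "duration_sec")))
print(duplicate_keys(calls, "call_id"))
print(len(out_of_range(calls, "duration_sec", 0, 86400)))
print(orphaned_references(calls, "agent_id", agents, "agent_id"))
```

Each function returns the offending rows or keys rather than a boolean, so an alert can say what failed and not only that something failed.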
199 00:26:21.120 ⇒ 00:26:30.670 Ashwini Sharma: In my projects. I’ve not used tools like, you know, Metaplane or Datadog, in terms of observing the data pipelines, but,
200 00:26:30.730 ⇒ 00:26:48.490 Ashwini Sharma: you know, like, the current company that I’m working at is not… not a big company, it’s a small one, it cannot afford to buy these tools, right? And so the way I do it is, you know, create some custom metadata, which I analyze myself. I run scripts to analyze, and then send out alerts based on
201 00:26:48.600 ⇒ 00:26:50.600 Ashwini Sharma: You know, what has happened with the data.
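The metadata check described in this answer, flagging a table when its ingested row count deviates by more than six standard deviations from what has been arriving, can be sketched as below. The 6x threshold comes from the conversation; measuring the deviation from the historical mean, the handling of short or flat histories, the function name and the sample counts are all assumptions.

```python
from statistics import mean, pstdev


def row_count_is_anomalous(history, latest, k=6.0):
    """True when `latest` is more than k standard deviations away from
    the mean of `history` (past row counts for a single table)."""
    if len(history) < 2:
        return False            # not enough history to judge
    mu = mean(history)
    sigma = pstdev(history)     # population standard deviation of past counts
    if sigma == 0:
        return latest != mu     # flat history: any change is flagged
    return abs(latest - mu) > k * sigma


history = [1000, 1020, 980, 1010, 990]
print(row_count_is_anomalous(history, 1005))  # small wobble
print(row_count_is_anomalous(history, 200))   # most rows missing
```

The check says only that something happened to the table, not what; as in the answer above, the alert is the prompt to go and inspect the ingestion pipeline.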
202 00:26:50.940 ⇒ 00:26:53.579 Demilade Agboola: But those are the points that I look into.
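The checks described in this answer (not-null, duplicate primary keys, value ranges, referential integrity, and an unexpectedly low row count) can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the speaker's actual scripts or dbt configuration; every function name, the sample rows, and the `min_ratio` threshold are hypothetical.

```python
from typing import Any, Dict, List, Set

Row = Dict[str, Any]


def null_violations(rows: List[Row], column: str) -> List[int]:
    """Return indexes of rows where a required column is missing or None."""
    return [i for i, row in enumerate(rows) if row.get(column) is None]


def duplicate_keys(rows: List[Row], key: str) -> Set[Any]:
    """Return primary-key values that appear more than once."""
    seen: Set[Any] = set()
    dupes: Set[Any] = set()
    for row in rows:
        value = row[key]
        if value in seen:
            dupes.add(value)
        seen.add(value)
    return dupes


def out_of_range(rows: List[Row], column: str, low: float, high: float) -> List[int]:
    """Return indexes of rows whose non-null value falls outside [low, high]."""
    return [
        i for i, row in enumerate(rows)
        if row.get(column) is not None and not (low <= row[column] <= high)
    ]


def orphaned_keys(child: List[Row], fk: str, parent: List[Row], pk: str) -> Set[Any]:
    """Return foreign-key values in child that have no matching parent key."""
    parent_keys = {row[pk] for row in parent}
    return {row[fk] for row in child if row[fk] not in parent_keys}


def row_count_too_low(current: int, expected: int, min_ratio: float = 0.5) -> bool:
    """Flag a load whose row count is below min_ratio of the expected count."""
    return current < expected * min_ratio


# Hypothetical sample data: one null amount, one duplicate order_id,
# one negative amount, and one customer_id with no parent row.
orders = [
    {"order_id": 1, "customer_id": 10, "amount": 25.0},
    {"order_id": 2, "customer_id": 11, "amount": None},
    {"order_id": 2, "customer_id": 99, "amount": -5.0},
]
customers = [{"customer_id": 10}, {"customer_id": 11}]
```

Each function returns the offending rows or keys rather than a bare pass/fail, so an alert can say which records to look at.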
203 00:26:54.160 ⇒ 00:27:00.750 Demilade Agboola: Okay, I think my next question is, how do you ensure That whatever data you are…
204 00:27:02.700 ⇒ 00:27:05.079 Demilade Agboola: So not all tables are the same, right?
205 00:27:05.080 ⇒ 00:27:05.830 Ashwini Sharma: Yes,
206 00:27:06.000 ⇒ 00:27:06.750 Ashwini Sharma: Right.
207 00:27:06.750 ⇒ 00:27:09.909 Demilade Agboola: So, an error in a fact orders table…
208 00:27:10.780 ⇒ 00:27:18.560 Demilade Agboola: is not… is way higher than an error in, say, like, a dim customer table. Now, obviously, you would want to solve both of them.
209 00:27:18.560 ⇒ 00:27:19.440 Ashwini Sharma: Right. Right.
210 00:27:19.440 ⇒ 00:27:39.009 Demilade Agboola: But an increased row count in fact orders is way scarier than an increased order. Like, I mean, when you join them, you could also, like, have a join that spreads it out, but, like, ideally, you would want to ensure that there are certain tables that have higher priority. So how do you manage that? How do you ensure that, like, if…
211 00:27:39.010 ⇒ 00:27:44.920 Demilade Agboola: everything is burning. How do you ensure that the ones that are the most important are the ones that you can see?
212 00:27:46.210 ⇒ 00:27:55.239 Ashwini Sharma: So, yeah, I mean, like, you know, when I look at the number of errors that I’ve got for different tables, I already know which are of higher importance.
213 00:27:55.280 ⇒ 00:28:08.060 Ashwini Sharma: And then I can take action on it, in the order of priority, right? But if your question is, like, for a client, when you are not aware which is higher priority, is that correct? Am I understanding it correct?
214 00:28:08.510 ⇒ 00:28:14.710 Ashwini Sharma: I mean, usually once we get on clients, we get to know what is high priority, we figure it out, you know?
215 00:28:14.710 ⇒ 00:28:23.510 Demilade Agboola: But my question is, when you are coming in and setting up observability, and you’re just like, okay, so we need to be sure that
216 00:28:23.740 ⇒ 00:28:28.889 Demilade Agboola: the CEO is now reaching out to us and asking us questions about why is this data bad.
217 00:28:28.890 ⇒ 00:28:32.140 Ashwini Sharma: Or, you know, the CFO, or…
218 00:28:32.190 ⇒ 00:28:41.680 Demilade Agboola: whoever, like, high-important C-suite stakeholders are not reaching out to us and saying, that looks bad, dashboard is broken, the numbers don’t make sense.
219 00:28:42.220 ⇒ 00:28:47.490 Demilade Agboola: Or this number seemed inflated, or these numbers seem really low. Like, how do you ensure that, like.
220 00:28:49.790 ⇒ 00:28:53.130 Demilade Agboola: You have your eyes on the right tables, and you’re not just, like.
221 00:28:53.290 ⇒ 00:28:56.929 Demilade Agboola: losing track of that. I guess that’s… that’s basically my question.
222 00:28:57.350 ⇒ 00:29:16.829 Ashwini Sharma: So, yeah, normally that is done via data lineage, I would say, right? Like, for example, if there is a set of executive dashboards which are monitored by CEO and CFO, right? I have a lineage document that I’ve created on how the data is, you know, data on the dashboard comes through, comes,
223 00:29:16.880 ⇒ 00:29:27.799 Ashwini Sharma: through a set of transformations and a set of tables, right from the source till the end, right? So that’s where I start investigating, right? That becomes a higher priority, monitoring.
224 00:29:27.800 ⇒ 00:29:38.529 Ashwini Sharma: If somebody says that, you know, this… in this dashboard data is not correct, I immediately know which are the tables that I have to look into now. So, that is sort of, like, how…
225 00:29:38.940 ⇒ 00:29:45.630 Ashwini Sharma: how I work, I mean, somebody else could do it in a different way, but yeah…
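The lineage-based prioritization described in this answer can be sketched as a walk upstream from the executive dashboards, treating every table on that path as high priority. This is a minimal illustration, assuming lineage is available as a simple mapping; the `LINEAGE` contents, table names, and function names are hypothetical and do not come from the speaker's lineage document.

```python
from typing import Dict, List, Set

# Hypothetical lineage: each node maps to the tables it reads from.
LINEAGE: Dict[str, List[str]] = {
    "exec_revenue_dashboard": ["fact_orders"],
    "fact_orders": ["stg_orders", "dim_customer"],
    "stg_orders": ["raw_orders"],
    "dim_customer": ["raw_customers"],
    "marketing_traffic_dashboard": ["fact_web_traffic"],
    "fact_web_traffic": ["raw_web_events"],
}


def upstream_tables(node: str, lineage: Dict[str, List[str]]) -> Set[str]:
    """Collect every table that feeds the given node, directly or indirectly."""
    found: Set[str] = set()
    stack = list(lineage.get(node, []))
    while stack:
        table = stack.pop()
        if table not in found:
            found.add(table)
            stack.extend(lineage.get(table, []))
    return found


def prioritize(
    failing: List[str],
    critical_dashboards: List[str],
    lineage: Dict[str, List[str]],
) -> List[str]:
    """Order failing tables so those feeding critical dashboards come first."""
    critical: Set[str] = set()
    for dashboard in critical_dashboards:
        critical |= upstream_tables(dashboard, lineage)
    # False sorts before True, so critical tables lead; ties break by name.
    return sorted(failing, key=lambda table: (table not in critical, table))
```

For example, calling `prioritize(["raw_web_events", "raw_orders", "dim_customer"], ["exec_revenue_dashboard"], LINEAGE)` places the two tables that feed the executive dashboard ahead of the web traffic table.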
226 00:29:46.070 ⇒ 00:29:51.669 Demilade Agboola: Yeah, that’s, that’s… Just a couple of things that, you know.
227 00:29:52.630 ⇒ 00:29:57.560 Demilade Agboola: could help, like, with monitoring is sometimes… you know dbt has, like, exposures?
228 00:29:58.530 ⇒ 00:30:07.600 Demilade Agboola: So you can actually expose the table, so you can actually see the lineage all the way to the table, so that also helps as well, in trying to do some of these things.
229 00:30:07.840 ⇒ 00:30:15.379 Demilade Agboola: But, yeah, ultimately, yeah, I do agree, that’s one of the tricky parts, is… because we do monitor some things.
230 00:30:15.380 ⇒ 00:30:16.000 Ashwini Sharma: Right.
231 00:30:16.000 ⇒ 00:30:22.600 Demilade Agboola: The hard part is… some things are just not important. Like, you can see an error.
232 00:30:22.960 ⇒ 00:30:23.630 Ashwini Sharma: Yeah.
233 00:30:23.630 ⇒ 00:30:29.539 Demilade Agboola: Or, like, website traffic data, but you know marketing isn’t using the dashboards. You know, like, we have more.
234 00:30:29.540 ⇒ 00:30:30.410 Ashwini Sharma: Alright.
235 00:30:30.810 ⇒ 00:30:34.100 Demilade Agboola: It’s not important, it’s an error, yes, but, like.
236 00:30:34.310 ⇒ 00:30:41.340 Demilade Agboola: We can deal that… we can push that further down the priority list, but we see some errors for, like, you know.
237 00:30:42.200 ⇒ 00:30:44.440 Demilade Agboola: The fact orders, or, you know…
238 00:30:44.440 ⇒ 00:30:45.130 Ashwini Sharma: Yep.
239 00:30:45.130 ⇒ 00:30:49.299 Demilade Agboola: You immediately know, like, no, this needs to be resolved immediately.
240 00:30:49.600 ⇒ 00:30:50.350 Ashwini Sharma: Yeah, true.
241 00:30:50.580 ⇒ 00:30:53.759 Demilade Agboola: So it’s just kind of, like, understanding, like, how…
242 00:30:54.270 ⇒ 00:30:59.610 Demilade Agboola: That can be resolved, and, like, which ones take the highest priorities versus the other one.
243 00:31:02.150 ⇒ 00:31:07.500 Demilade Agboola: But yeah, do you have any questions about, like, how we work? Because time is almost up, I just…
244 00:31:07.500 ⇒ 00:31:25.740 Ashwini Sharma: Yeah, I got the answers from Avais yesterday when I was interacting with him, so I got an understanding, yeah, you have a lot of context switching going on, working with multiple clients, that’s a challenge, right? And, yeah, I get the overall workflow that you are working with.
245 00:31:26.140 ⇒ 00:31:40.889 Demilade Agboola: Okay, no, that’s great, because I just wanted to be able to, you know, help you feel comfortable with the idea of what we do. But yes, there’s definitely a lot of, like, context switching. Like I told you, I’m on, like, 3 clients.
246 00:31:40.890 ⇒ 00:31:41.240 Ashwini Sharma: Right.
247 00:31:41.720 ⇒ 00:31:46.950 Demilade Agboola: Sometimes they ramp up or slow down, depending on what’s going on. If it’s a good week.
248 00:31:47.240 ⇒ 00:31:48.480 Demilade Agboola: Something else.
249 00:31:48.480 ⇒ 00:31:55.739 Ashwini Sharma: Yeah, so how do you maintain what is going on with each client? Like, do you write it down in a copy notebook, and then, you know,
250 00:31:56.130 ⇒ 00:32:02.489 Ashwini Sharma: Or what tools do you use to, you know, keep a track of what’s going on with each client, right?
251 00:32:03.140 ⇒ 00:32:08.630 Demilade Agboola: So usually, so what helps is that we have project managers, so that helps. Okay.
252 00:32:08.790 ⇒ 00:32:17.249 Demilade Agboola: The project managers are usually the ones responsible for, like, allocation, keeping track of clients’ needs.
253 00:32:17.980 ⇒ 00:32:22.730 Demilade Agboola: So, what… what tends to happen is if there’s… if there are people, like.
254 00:32:22.750 ⇒ 00:32:38.460 Demilade Agboola: if they’re high-priority things, they might say, hey, this week, you might need to, like, focus really a lot on this, because we need this for this client. So some weeks, which is why I said the whole thing about, like, you might be on 3 clients, but ultimately, some weeks are slower with some clients, because…
255 00:32:38.850 ⇒ 00:32:56.929 Demilade Agboola: What might be heavy on that week is they need a lot of visualization and a lot of analysis done. So that week, we might not really do a lot of modeling, so it might just be… you’re… this… don’t focus heavy on this client unless there is an urgent request that we might need to jump in on.
256 00:32:57.110 ⇒ 00:32:57.840 Ashwini Sharma: Right.
257 00:32:58.860 ⇒ 00:33:05.439 Demilade Agboola: But for these other clients, you might need to be focusing on those two. So you have project managers that help with…
258 00:33:05.440 ⇒ 00:33:06.919 Ashwini Sharma: Okay, cool, cool, yeah.
259 00:33:07.580 ⇒ 00:33:11.059 Demilade Agboola: And also, part of why I also asked that question of, like.
260 00:33:11.230 ⇒ 00:33:18.960 Demilade Agboola: how do you handle and prioritize this? Because I also wanted you to be able to see if, like, you can interact with people, because that’s a very important part of
261 00:33:19.220 ⇒ 00:33:32.420 Demilade Agboola: what we do. So we need to be able to interact with the project managers and say, hey, I’m feeling overwhelmed on this project, or there’s a lot going on on this project. I won’t be able to handle this project right now. And then the project manager has to be able to figure out
262 00:33:32.630 ⇒ 00:33:36.260 Demilade Agboola: Whether they can reshuffle what is assigned to you to someone else.
263 00:33:36.330 ⇒ 00:33:37.250 Ashwini Sharma: Yep.
264 00:33:37.250 ⇒ 00:33:41.420 Demilade Agboola: Communicate to the client that, hey, this won’t come in this week, this might come in next week.
265 00:33:41.550 ⇒ 00:33:44.539 Ashwini Sharma: Right, yeah, yeah. Communication is the key, definitely.
266 00:33:44.540 ⇒ 00:33:50.940 Demilade Agboola: Yeah, communication is key. So, as best as possible, it’s very helpful to not work in silos. It doesn’t really affect…
267 00:33:51.810 ⇒ 00:33:53.070 Ashwini Sharma: Right, right, true.
268 00:33:53.540 ⇒ 00:33:58.759 Demilade Agboola: Okay. Cool. Alright, it was fun talking to you. Feel free to reach out if you have any other questions.
269 00:33:58.760 ⇒ 00:33:59.610 Ashwini Sharma: Sure.
270 00:33:59.940 ⇒ 00:34:00.549 Demilade Agboola: Alright, thanks.
271 00:34:00.550 ⇒ 00:34:03.050 Ashwini Sharma: Same here, same here. Nice talking to you, Demilade.
272 00:34:03.450 ⇒ 00:34:04.970 Demilade Agboola: Alright, take care.
273 00:34:04.970 ⇒ 00:34:10.100 Ashwini Sharma: All right, yeah, you too have a nice day, nice afternoon, yes. Nice evening, sorry.
274 00:34:10.100 ⇒ 00:34:12.559 Demilade Agboola: It’s like 6:16.
275 00:34:12.560 ⇒ 00:34:14.960 Ashwini Sharma: Okay, alright.
276 00:34:14.969 ⇒ 00:34:15.539 Demilade Agboola: Alright.
277 00:34:15.540 ⇒ 00:34:17.100 Ashwini Sharma: Thank you. Bye.