Meeting Title: Brainforge Interview w- Awaish Date: 2026-04-15 Meeting participants: Adetoro Adedeji, Awaish Kumar
WEBVTT
1 00:04:02.250 ⇒ 00:04:03.080 Awaish Kumar: Hello?
2 00:04:06.430 ⇒ 00:04:07.699 Adetoro Adedeji: Hello, good day.
3 00:04:08.530 ⇒ 00:04:09.760 Awaish Kumar: Hi, how you doing?
4 00:04:10.610 ⇒ 00:04:12.010 Adetoro Adedeji: Good, thank you.
5 00:04:12.230 ⇒ 00:04:13.720 Adetoro Adedeji: Trust you’re good as well.
6 00:04:15.610 ⇒ 00:04:17.929 Awaish Kumar: Okay, so where are you located?
7 00:04:19.240 ⇒ 00:04:21.290 Adetoro Adedeji: I’m located in Lagos, Nigeria.
8 00:04:23.110 ⇒ 00:04:23.900 Awaish Kumar: Okay.
9 00:04:24.150 ⇒ 00:04:28.050 Awaish Kumar: Okay, so what time is it in Nigeria right now?
10 00:04:29.070 ⇒ 00:04:31.280 Adetoro Adedeji: It’s 7:02 p.m.
11 00:04:32.600 ⇒ 00:04:34.230 Adetoro Adedeji: 7pm in the evening.
12 00:04:36.570 ⇒ 00:04:37.340 Awaish Kumar: Okay.
13 00:04:38.410 ⇒ 00:04:41.809 Awaish Kumar: 7, okay, that’s… That’s fine, right? It’s…
14 00:04:43.950 ⇒ 00:04:46.080 Awaish Kumar: It’s 11 for me right now.
15 00:04:46.340 ⇒ 00:04:47.790 Awaish Kumar: Yeah.
16 00:04:48.460 ⇒ 00:04:49.430 Awaish Kumar: So…
17 00:04:49.640 ⇒ 00:04:58.310 Awaish Kumar: Yeah, I just wanted to brief you about this session. In this session, we are just going to talk about, like, a little bit more about your background.
18 00:04:58.460 ⇒ 00:05:00.449 Awaish Kumar: And what you have been doing so far?
19 00:05:00.620 ⇒ 00:05:08.350 Awaish Kumar: the projects you have worked on, and I’m here to answer any questions you have regarding Brainforge.
20 00:05:10.130 ⇒ 00:05:13.130 Awaish Kumar: Okay, let’s get started with your introduction.
21 00:05:14.780 ⇒ 00:05:15.370 Adetoro Adedeji: Okay.
22 00:05:15.620 ⇒ 00:05:17.970 Adetoro Adedeji: Thank you very much. Thanks for having me here.
23 00:05:18.080 ⇒ 00:05:21.169 Adetoro Adedeji: So, my name is Esther Adetoro Adedeji.
24 00:05:21.630 ⇒ 00:05:26.959 Adetoro Adedeji: I’m a data professional with about 6 years of experience
25 00:05:27.460 ⇒ 00:05:40.079 Adetoro Adedeji: in analytics consulting and big data analytics. I am a Certified Senior Big Data Analyst with the Data Science Council of America.
26 00:05:40.470 ⇒ 00:05:43.679 Adetoro Adedeji: I have strong analytical background.
27 00:05:43.960 ⇒ 00:05:52.200 Adetoro Adedeji: And I have keen interest in data engineering, machine learning, and that has been my drive so far throughout my career.
28 00:05:52.470 ⇒ 00:06:10.559 Adetoro Adedeji: Because of consulting, I’ve been able to, you know, work across several data environments, going from telecommunication, to banking to the fintech space. Currently, I work in the fintech space as a senior big data analyst.
29 00:06:10.910 ⇒ 00:06:24.039 Adetoro Adedeji: And generally, I’m passionate about helping organizations to build and design scalable data systems that deliver clear and, you know, scalable insights, that help
30 00:06:24.170 ⇒ 00:06:43.909 Adetoro Adedeji: departments be able to make data-driven decisions, and also even help senior stakeholders and management be able to make decisions smarter and faster, thereby also improving their revenue. And overall, I help organizations move their data maturity level forward.
31 00:06:45.050 ⇒ 00:06:51.680 Awaish Kumar: Okay, so how do you… what does the current development process look like for you?
32 00:06:52.100 ⇒ 00:06:56.129 Awaish Kumar: Audio… Basically, work, how you develop…
33 00:06:57.540 ⇒ 00:06:58.290 Adetoro Adedeji: Okay.
34 00:06:58.520 ⇒ 00:07:03.210 Adetoro Adedeji: Currently, well, I think development is,
35 00:07:03.890 ⇒ 00:07:09.310 Adetoro Adedeji: Depending… dependent on the business problem at hand, right?
36 00:07:09.430 ⇒ 00:07:10.730 Adetoro Adedeji: And…
37 00:07:10.850 ⇒ 00:07:26.169 Adetoro Adedeji: Currently, as a senior data analyst in my current company, I own data product… project… products end-to-end, right? So, say, from the process, from the point of, you know.
38 00:07:26.850 ⇒ 00:07:30.550 Adetoro Adedeji: ingesting the data from… the source…
39 00:07:30.960 ⇒ 00:07:33.869 Adetoro Adedeji: In this case, say, a real-time system.
40 00:07:34.150 ⇒ 00:07:45.890 Adetoro Adedeji: And then collecting it into the staging environment, or the first landing area, for example. I build pipelines that help, you know, carry those data to
41 00:07:46.040 ⇒ 00:08:01.979 Adetoro Adedeji: data warehouses, where I then model it into maybe data marts, depending on the business problem I’m trying to solve, or the context, or the department I’m building for. And I currently use tools like, because for all these processes I’ve mentioned.
42 00:08:02.110 ⇒ 00:08:15.449 Adetoro Adedeji: Most analytics tools are majorly dependent on SQL, Python, so, I mean, there are several tools that then leverage these languages, you know, to run the processes, but that’s how development has been for me so far.
43 00:08:15.620 ⇒ 00:08:18.939 Awaish Kumar: Like, I mean, what do you use as an IDE?
44 00:08:19.710 ⇒ 00:08:20.899 Awaish Kumar: For your development.
45 00:08:22.580 ⇒ 00:08:27.379 Adetoro Adedeji: Okay, currently CSQL, MSSQL.
46 00:08:27.550 ⇒ 00:08:32.520 Adetoro Adedeji: is the major IDE. And I also use Visual Studio Code.
47 00:08:33.330 ⇒ 00:08:35.290 Adetoro Adedeji: For the ingestion pipeline.
48 00:08:35.709 ⇒ 00:08:39.159 Awaish Kumar: Have you been using AI to help with development?
49 00:08:40.530 ⇒ 00:08:41.409 Adetoro Adedeji: Yes.
50 00:08:41.780 ⇒ 00:08:56.599 Adetoro Adedeji: Yes, I use AI daily in both my, you know, work and in my daily life. Majorly, what I use AI for… I personally believe that AI is only as smart as, you know, the data you provide it.
51 00:08:56.660 ⇒ 00:09:10.929 Adetoro Adedeji: I believe that AI is… it’s only as smart as the prompts you give to it, right? So, what I use AI for daily is around, say, I want to, maybe review,
52 00:09:11.020 ⇒ 00:09:26.090 Adetoro Adedeji: a script, I want to, say, carry some… carry out some debugging, I want to review some documentation, or even draft technical specifications. Those are things I typically use AI.
53 00:09:26.200 ⇒ 00:09:27.919 Awaish Kumar: Excuse me.
54 00:09:28.350 ⇒ 00:09:30.809 Awaish Kumar: write code, basically. That is my question.
55 00:09:31.450 ⇒ 00:09:32.709 Adetoro Adedeji: Sorry, what’d you just say?
56 00:09:33.550 ⇒ 00:09:35.839 Awaish Kumar: Do you use AI to write code?
57 00:09:36.660 ⇒ 00:09:38.879 Adetoro Adedeji: Do I use AI to write code?
58 00:09:39.110 ⇒ 00:09:54.799 Adetoro Adedeji: Well, once the business context is clear, and I’m able to communicate what I want, I use it to develop my code, maybe not necessarily write it end to end. AI does not know the business context, so I provide the… I provide this,
59 00:09:55.230 ⇒ 00:09:58.599 Adetoro Adedeji: the templates, so I use AI to just develop
60 00:09:58.790 ⇒ 00:10:10.619 Adetoro Adedeji: develop my scripts, and then, of course, make it faster and better. If there’s an assistant that can help do that, why spend a lot of time on something that, you know, I can leverage AI on? Of course, I leverage AI.
61 00:10:10.800 ⇒ 00:10:11.490 Awaish Kumar: Okay.
62 00:10:11.680 ⇒ 00:10:16.519 Awaish Kumar: So, as an analytics engineer, Have you used,
63 00:10:17.580 ⇒ 00:10:20.650 Awaish Kumar: Now, what have you been using for data transformation?
64 00:10:22.070 ⇒ 00:10:23.140 Adetoro Adedeji: Okay.
65 00:10:23.240 ⇒ 00:10:35.869 Adetoro Adedeji: For data transformation… I’ve used, tools like SSIS, I’ve used tools like different agencies, I’ve used dbt.
66 00:10:36.060 ⇒ 00:10:48.760 Adetoro Adedeji: at some point, especially when I was building a data product that helps to track data lineage and also more like data cataloging. So, at that point, I was able to test run some things with dbt.
67 00:10:48.940 ⇒ 00:11:04.780 Adetoro Adedeji: But currently on my… my current stack in my, you know, that I use there today now, dbt is not part of my current stack, but I have used dbt before to, you know, experiment with data cataloging and data lineage work.
68 00:11:04.910 ⇒ 00:11:13.089 Adetoro Adedeji: And I know about dbt. I’m also, strengthening my skills on dbt, because I know that it is key
69 00:11:13.500 ⇒ 00:11:15.320 Adetoro Adedeji: And that’s exchange in there.
70 00:11:16.050 ⇒ 00:11:23.000 Awaish Kumar: Okay, so what is, Dimensional modeling, and…
71 00:11:24.170 ⇒ 00:11:28.990 Awaish Kumar: And how you would use it in your… Day-to-day work.
72 00:11:30.590 ⇒ 00:11:35.069 Adetoro Adedeji: Okay, thank you very much for that question. There is no…
73 00:11:35.450 ⇒ 00:11:46.649 Adetoro Adedeji: Dimensional modeling is the bedrock of any transformation that would eventually become scalable and efficient, right? Especially for downstreams like dashboarding, reporting.
74 00:11:46.820 ⇒ 00:11:51.779 Adetoro Adedeji: Right? And for dimensional modeling, basically the concept is
75 00:11:51.920 ⇒ 00:12:11.480 Adetoro Adedeji: I have this, big set of data, right? How do I ensure that the downstream, the dashboard in this case, or reporting to, say, Power BI, Looker, as the case may be, how do I ensure that they can… those tools can consume the data in a format that
76 00:12:11.650 ⇒ 00:12:19.690 Adetoro Adedeji: is usable, especially in relation to the business requirement or context. So, when it comes to modeling,
77 00:12:19.830 ⇒ 00:12:26.820 Adetoro Adedeji: there are different approaches, depending on the business problem. Is it star schema? Is it snowflake? Of course.
78 00:12:27.470 ⇒ 00:12:38.609 Adetoro Adedeji: I have to normalize my data, right? There are different… I don’t know what exactly you want me to talk about around it, but those are more like the major approach I use around.
79 00:12:38.610 ⇒ 00:12:43.309 Awaish Kumar: So, star schema, snowflake schema, so when you would…
80 00:12:43.750 ⇒ 00:12:51.900 Awaish Kumar: Can you give me real-world examples of when you should use star schema and when to use snowflake schema?
81 00:12:52.880 ⇒ 00:12:59.609 Adetoro Adedeji: Okay, I make use of star schema when my focus is the efficiency aspect.
82 00:12:59.930 ⇒ 00:13:04.370 Adetoro Adedeji: Because star schema is normalized, right?
83 00:13:04.510 ⇒ 00:13:12.730 Adetoro Adedeji: And it’s… the dimensions are joined directly to the fact table, so there are fewer joins.
84 00:13:12.880 ⇒ 00:13:17.510 Adetoro Adedeji: And if there are no… if the dimensions do not have hierarchy.
85 00:13:17.680 ⇒ 00:13:23.859 Adetoro Adedeji: Star schema is the best, but… and also when the dimensions are slow-changing, right?
86 00:13:23.860 ⇒ 00:13:39.029 Adetoro Adedeji: But in case when maybe the volume of data is very large, and the dimensions also contain some hierarchy, then snowflake would be the best, because you don’t necessarily have to join individual hierarchy to the fact table.
87 00:13:39.500 ⇒ 00:13:52.760 Adetoro Adedeji: a dimension can then belong to another dimension, which in turn is then linked to the fact table. So, for different cases like that, when you have maybe dimensions that are fast-changing, you know, that’s to ensure that you don’t get to change
88 00:13:52.790 ⇒ 00:13:59.679 Adetoro Adedeji: To reduce redundancy, basically, then we will use snowflake, so it depends on what the business focus is per time.
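
The star versus snowflake trade-off discussed in this exchange can be sketched in a few lines. This is a minimal illustration, not either speaker's actual model: the tables (fact_sales, dim_product_star, dim_product_snow, dim_category) and all values are hypothetical, and SQLite stands in for whatever warehouse would really be used. In the usual convention, the star dimension keeps the category hierarchy inline in one wide table, while the snowflake variant splits it into its own table at the cost of one more join.

```python
import sqlite3

# Hypothetical example data; every table and column name here is illustrative.
conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Star: the product dimension carries its category attribute inline.
CREATE TABLE dim_product_star (
    product_id INTEGER PRIMARY KEY, product_name TEXT, category_name TEXT);

-- Snowflake: the category hierarchy lives in its own dimension table.
CREATE TABLE dim_category (category_id INTEGER PRIMARY KEY, category_name TEXT);
CREATE TABLE dim_product_snow (
    product_id INTEGER PRIMARY KEY, product_name TEXT,
    category_id INTEGER REFERENCES dim_category(category_id));

CREATE TABLE fact_sales (sale_id INTEGER PRIMARY KEY, product_id INTEGER, amount REAL);

INSERT INTO dim_category VALUES (1, 'Airtime'), (2, 'Data');
INSERT INTO dim_product_snow VALUES
    (10, 'Top-up 500', 1), (11, 'Bundle 1GB', 2), (12, 'Bundle 5GB', 2);
INSERT INTO dim_product_star VALUES
    (10, 'Top-up 500', 'Airtime'), (11, 'Bundle 1GB', 'Data'), (12, 'Bundle 5GB', 'Data');
INSERT INTO fact_sales VALUES (1, 10, 500.0), (2, 11, 1000.0), (3, 12, 3500.0), (4, 11, 1000.0);
""")

# Star schema: one join from the fact table to the dimension.
star_sql = """
SELECT p.category_name, SUM(f.amount)
FROM fact_sales f
JOIN dim_product_star p ON p.product_id = f.product_id
GROUP BY p.category_name
ORDER BY p.category_name
"""

# Snowflake schema: two joins, fact -> product -> category.
snowflake_sql = """
SELECT c.category_name, SUM(f.amount)
FROM fact_sales f
JOIN dim_product_snow p ON p.product_id = f.product_id
JOIN dim_category c ON c.category_id = p.category_id
GROUP BY c.category_name
ORDER BY c.category_name
"""

star_rows = conn.execute(star_sql).fetchall()
snowflake_rows = conn.execute(snowflake_sql).fetchall()
print(star_rows)
print(snowflake_rows)
```

Both queries return the same totals per category; the difference is where the category name is stored and how many joins the reporting query needs to reach it.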
89 00:14:00.160 ⇒ 00:14:07.859 Awaish Kumar: Okay, and so, for example, if you have a table, Which is really big.
90 00:14:08.260 ⇒ 00:14:14.670 Awaish Kumar: And, that’s why… the queries, downstream queries are…
91 00:14:14.850 ⇒ 00:14:17.510 Awaish Kumar: Are… are really slow because of that.
92 00:14:17.990 ⇒ 00:14:23.600 Awaish Kumar: So… How would you optimize the query performance?
93 00:14:25.200 ⇒ 00:14:29.120 Adetoro Adedeji: if I have a… did you say if I have a query that is large, right?
94 00:14:29.570 ⇒ 00:14:39.529 Awaish Kumar: We have a table with, like, which has maybe hundreds of millions of rows, a really big table.
95 00:14:39.900 ⇒ 00:14:48.070 Awaish Kumar: And we have downstream query that is using that table, but it is really very slow because the data volume is big.
96 00:14:48.270 ⇒ 00:14:53.189 Awaish Kumar: And, it is taking more than, you can say, like, 5 minutes to execute.
97 00:14:54.080 ⇒ 00:15:01.259 Awaish Kumar: So… and that’s really slow, so I want it to be optimized and be under… under a minute.
98 00:15:01.420 ⇒ 00:15:04.070 Awaish Kumar: So, how would you optimize your query?
99 00:15:05.300 ⇒ 00:15:13.290 Adetoro Adedeji: Okay. So, first of all, since there’s already a downstream query, it means that there are some specific columns that are being called.
100 00:15:13.410 ⇒ 00:15:23.040 Adetoro Adedeji: So, I would introduce indexing to help make my query reading faster, right? So…
101 00:15:23.040 ⇒ 00:15:28.749 Awaish Kumar: Yeah, let’s say, if you introduce indexing, then how do you decide for columns?
102 00:15:29.750 ⇒ 00:15:31.629 Awaish Kumar: Like, which column should have indexing?
103 00:15:32.560 ⇒ 00:15:43.130 Adetoro Adedeji: Yes, that’s why I said. Luckily, we already have a downstream query, so I would check the downstream query to know the columns, you know, that are being called there.
104 00:15:43.390 ⇒ 00:15:44.710 Adetoro Adedeji: So…
105 00:15:45.010 ⇒ 00:15:46.470 Awaish Kumar: My question is.
106 00:15:46.810 ⇒ 00:15:58.899 Awaish Kumar: If you have a downstream query, and you know the fields that are being used, will you qualify all the fields that are being used and apply indexes, or what will be your strategy for that?
107 00:15:59.740 ⇒ 00:16:08.020 Adetoro Adedeji: Okay, mostly the fields that are in the filtering, that are involved in filtering, right? So I’ll focus on, you know, my recommendation.
108 00:16:08.020 ⇒ 00:16:21.160 Adetoro Adedeji: what are those key columns that, you know, are consistently, you know, being filtered? Those are the ones that will be important to index, in the sense that indexing works on, you know, filtering-based processes like that.
109 00:16:21.170 ⇒ 00:16:23.860 Adetoro Adedeji: So, that’s what I would do.
110 00:16:24.070 ⇒ 00:16:33.830 Adetoro Adedeji: Aside from indexing, I would also ensure that… I would also look at my downstream query to ensure that I’m not calling unnecessary columns.
111 00:16:34.110 ⇒ 00:16:35.120 Adetoro Adedeji: Right.
112 00:16:35.240 ⇒ 00:16:40.530 Adetoro Adedeji: I would streamline it to only what’s the requirements, you know.
113 00:16:40.750 ⇒ 00:16:47.190 Adetoro Adedeji: would need at that time. I would also ensure that
114 00:16:47.430 ⇒ 00:16:57.780 Adetoro Adedeji: I mean, I’m not, maybe, using some conditions or some functions, like IFNOT, NOT, IFNOL,
115 00:16:57.780 ⇒ 00:17:14.040 Adetoro Adedeji: I would use if exists to replace that, because I know that that is more efficient. It just depends on what the query contains. There are several ways to optimize different queries, so depending on what the query currently contains, right, will determine
116 00:17:14.569 ⇒ 00:17:16.799 Adetoro Adedeji: how I approach the optimization.
117 00:17:17.770 ⇒ 00:17:18.510 Awaish Kumar: Okay.
118 00:17:18.510 ⇒ 00:17:24.969 Adetoro Adedeji: And of course, there could also be the, option of partitioning the big table itself.
119 00:17:25.380 ⇒ 00:17:30.770 Adetoro Adedeji: If the main table is too voluminous, then it’s not even scalable.
120 00:17:31.120 ⇒ 00:17:34.350 Adetoro Adedeji: You might consider the option of partitioning as well.
121 00:17:35.780 ⇒ 00:17:44.490 Awaish Kumar: Okay, let’s… yeah, you mentioned two different strategies here. One is indexing, and another one is partitioning. So…
122 00:17:45.420 ⇒ 00:17:48.149 Awaish Kumar: What is the difference between these two, and…
123 00:17:48.970 ⇒ 00:17:51.580 Awaish Kumar: When to use which one, or, like…
124 00:17:53.730 ⇒ 00:18:08.089 Adetoro Adedeji: Okay. So, again, it depends on the business use case, right? I have worked with several datasets where the data is partitioned monthly. I’ve even worked with the data set that is partitioned daily because of the volume and the need.
125 00:18:08.310 ⇒ 00:18:12.649 Adetoro Adedeji: Because a lot of downstream, query needs to…
126 00:18:12.710 ⇒ 00:18:30.140 Adetoro Adedeji: touch base on that, table every day. So it is easier to call just a single-day partition, rather than, you know, monthly. But if I know that my downstream use case is maybe mostly quarterly, weekly, or monthly, then there won’t be a need for maybe even partitioning
127 00:18:30.160 ⇒ 00:18:42.230 Adetoro Adedeji: in… so that minute, we don’t calling it daily. So, it depends on the business need, and then for… you mentioned indexing, that’s one route. I use indexing, right?
128 00:18:43.140 ⇒ 00:18:45.530 Adetoro Adedeji: I would use indexing on…
129 00:18:45.780 ⇒ 00:18:52.189 Adetoro Adedeji: Frequently filtered columns to ensure that querying is faster. Reading from that table is faster.
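
As a rough illustration of indexing a frequently filtered column, the sketch below compares the query plan before and after adding an index. It assumes SQLite as a stand-in engine; the transactions table, the txn_date column, and the index name are hypothetical and not from the interview.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE transactions (id INTEGER PRIMARY KEY, txn_date TEXT, amount REAL)"
)
# Hypothetical data: 1000 rows spread over 28 days.
conn.executemany(
    "INSERT INTO transactions (txn_date, amount) VALUES (?, ?)",
    [(f"2026-01-{(i % 28) + 1:02d}", float(i)) for i in range(1000)],
)

# The downstream query filters on txn_date, so that is the candidate column.
query = "SELECT SUM(amount) FROM transactions WHERE txn_date = '2026-01-15'"


def plan(sql):
    """Return the human-readable detail column of EXPLAIN QUERY PLAN."""
    return " | ".join(row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql))


plan_before = plan(query)  # expected to describe a full table scan
conn.execute("CREATE INDEX idx_transactions_txn_date ON transactions (txn_date)")
plan_after = plan(query)   # expected to describe a search using the new index

# The exact wording of the plan text varies between SQLite versions.
print(plan_before)
print(plan_after)
```

The same idea carries over to other engines, though the command for inspecting a plan and the index syntax differ by database.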
130 00:18:52.190 ⇒ 00:18:56.960 Awaish Kumar: Yeah, for example, if a table has a date column, And there is…
131 00:18:57.350 ⇒ 00:19:02.320 Awaish Kumar: Happening fil… filtering happening on that column. A lot.
132 00:19:02.450 ⇒ 00:19:10.199 Awaish Kumar: So, what strategy you would apply for that date column? Would you apply indexes, or would you apply partitioning?
133 00:19:11.830 ⇒ 00:19:14.909 Adetoro Adedeji: That would mostly be partitioning, yes.
134 00:19:15.150 ⇒ 00:19:32.449 Adetoro Adedeji: That’s why I mentioned that, depending on how often, if it’s the daily bit, then it is better to just, you know, direct the query to each of the partitions. Instead of querying the entire data set, you direct the query to maybe today’s filter, yesterday’s filter, as the case may be.
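
The point about directing a query at a single daily partition can be sketched without any database at all. The following is a toy model only: the rows, dates, and the dictionary used as a partition store are all hypothetical, and real warehouses implement partition pruning internally rather than in application code.

```python
from collections import defaultdict
from datetime import date, timedelta

# Hypothetical rows of (event_date, amount), spread over 10 days.
start = date(2026, 4, 1)
rows = [(start + timedelta(days=i % 10), float(i)) for i in range(1000)]

# "Daily partitioning": rows are grouped by the value of the date column.
partitions = defaultdict(list)
for event_date, amount in rows:
    partitions[event_date].append(amount)

target = date(2026, 4, 3)

# Without partitioning, a filter on the date still inspects every row.
full_scan_rows_read = len(rows)
full_scan_total = sum(amount for event_date, amount in rows if event_date == target)

# With daily partitions, only the target day's partition is read.
pruned_rows_read = len(partitions[target])
pruned_total = sum(partitions[target])

print(full_scan_rows_read, pruned_rows_read)
print(full_scan_total == pruned_total)
```

The answer is identical either way; what changes is how much data has to be read to produce it, which is why the partition grain is usually chosen to match how downstream queries filter.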
135 00:19:33.910 ⇒ 00:19:38.740 Awaish Kumar: Okay, I’m gonna… Wadi… I’ll… and…
136 00:19:38.850 ⇒ 00:19:41.620 Awaish Kumar: Which data warehouses you have experience with?
137 00:19:43.710 ⇒ 00:19:46.550 Adetoro Adedeji: Used BigQuery.
138 00:19:46.930 ⇒ 00:19:56.879 Adetoro Adedeji: on the project when I was with, say, a consulting firm, so just a one-time project, some years ago, commonly, I use,
139 00:19:57.230 ⇒ 00:19:59.520 Adetoro Adedeji: I use SPM7.
140 00:20:00.490 ⇒ 00:20:03.770 Adetoro Adedeji: Okay, so it’ll do it to our system.
141 00:20:04.370 ⇒ 00:20:08.969 Adetoro Adedeji: Yeah, majorly. I know about Redshift.
142 00:20:09.520 ⇒ 00:20:11.650 Adetoro Adedeji: I’ve not actually used it before.
143 00:20:11.650 ⇒ 00:20:14.120 Awaish Kumar: Have you… do you have experience working with BigQuery?
144 00:20:15.420 ⇒ 00:20:19.819 Adetoro Adedeji: Yes, I said I’ve used it one time on the project quite a while, but…
145 00:20:20.000 ⇒ 00:20:26.069 Awaish Kumar: Yeah, like, maybe it’s a recent experience, you know about it, or it’s something that you just worked in the past?
146 00:20:26.900 ⇒ 00:20:30.770 Adetoro Adedeji: I’ve worked on it in the past, is what I said while I was in consulting. BigQuery.
147 00:20:31.740 ⇒ 00:20:34.389 Adetoro Adedeji: It’s a cloud-based system, yes.
148 00:20:34.910 ⇒ 00:20:38.070 Awaish Kumar: Okay, so let’s talk about BigQuery. How…
149 00:20:41.190 ⇒ 00:20:48.330 Awaish Kumar: the… optimization works in BigQuery, so… Same table.
150 00:20:49.040 ⇒ 00:20:56.840 Awaish Kumar: Like, what optimization techniques there are in BigQuery that you apply… that you can apply on any given table?
151 00:21:00.170 ⇒ 00:21:08.009 Adetoro Adedeji: Okay, I will not remember in depth regarding the core concept. It’s been about 4 years since I worked on it, honestly.
152 00:21:08.010 ⇒ 00:21:11.589 Awaish Kumar: I was trying to clarify, like, if it is a recent experience, or it’s…
153 00:21:11.590 ⇒ 00:21:17.180 Adetoro Adedeji: Oh yeah, no, like I said, I was in… I was in consulting about 4 years ago, so that was when I had the experience.
154 00:21:17.640 ⇒ 00:21:18.260 Awaish Kumar: Okay.
155 00:21:19.660 ⇒ 00:21:23.940 Adetoro Adedeji: But I’m sure it’s not something that would be so hard to, you know, get the hang of.
156 00:21:25.390 ⇒ 00:21:33.160 Awaish Kumar: Okay, no worries. But, yeah, do you have an understanding of how data warehouses work?
157 00:21:35.010 ⇒ 00:21:36.440 Adetoro Adedeji: Yes, I do.
158 00:21:36.740 ⇒ 00:21:37.460 Awaish Kumar: Okay.
159 00:21:38.210 ⇒ 00:21:50.730 Awaish Kumar: like, like, the… all the clients that we have, normally we are using warehouses, like, very…
160 00:21:50.940 ⇒ 00:21:55.120 Awaish Kumar: Rarely will anyone be using, any…
161 00:21:55.630 ⇒ 00:21:58.830 Awaish Kumar: database, like SQL Server or Postgres.
162 00:21:59.060 ⇒ 00:22:06.590 Awaish Kumar: It’s possible they might be using that for transactional system, but we will bring that into some warehouse, and then…
163 00:22:06.970 ⇒ 00:22:09.260 Awaish Kumar: Do all the transformations.
164 00:22:09.410 ⇒ 00:22:10.790 Awaish Kumar: In the warehouse.
165 00:22:11.850 ⇒ 00:22:18.550 Awaish Kumar: Okay, so do you have any experience with any of the ingestion tools?
166 00:22:20.330 ⇒ 00:22:22.890 Awaish Kumar: You mentioned that you’ve worked on ingestion of the data.
167 00:22:23.690 ⇒ 00:22:24.500 Adetoro Adedeji: Yes.
168 00:22:24.850 ⇒ 00:22:27.659 Awaish Kumar: Yeah, like, do you have experience with any of the ingestion tools?
169 00:22:28.950 ⇒ 00:22:33.890 Adetoro Adedeji: Okay, like, in my day-to-day, currently.
170 00:22:34.070 ⇒ 00:22:43.210 Adetoro Adedeji: We’re just using… sometimes SSIS, we’re just sometimes using… Spark,
171 00:22:44.010 ⇒ 00:22:46.310 Adetoro Adedeji: Using Airflow for orchestration.
172 00:22:47.000 ⇒ 00:22:50.609 Adetoro Adedeji: Yeah, mostly that’s what I usually get to do.
173 00:22:51.170 ⇒ 00:22:52.770 Awaish Kumar: Okay, you mentioned Airflow?
174 00:22:53.460 ⇒ 00:22:54.190 Adetoro Adedeji: Yes.
175 00:22:54.610 ⇒ 00:22:57.300 Awaish Kumar: Okay, so… Is it something, like.
176 00:22:58.650 ⇒ 00:23:04.819 Awaish Kumar: Is part of your team, like, somebody’s using… somebody’s, like…
177 00:23:05.160 ⇒ 00:23:11.349 Awaish Kumar: Supporting you on that part, or is it… you are the one owning the, like, the Airflow itself?
178 00:23:12.300 ⇒ 00:23:25.329 Adetoro Adedeji: So, my team consists of analysts, engineers, and data scientists. When I am to work on a data product, I like to engineer it myself, just so that I can take full ownership of the entire pipeline.
179 00:23:25.430 ⇒ 00:23:26.970 Adetoro Adedeji: So, I…
180 00:23:27.160 ⇒ 00:23:38.759 Adetoro Adedeji: take ownership of my ingestion process as well, and after building the PySpark scripts, after, you know, running it, running the tests, validations, and all, I host it on Airflow myself.
181 00:23:39.400 ⇒ 00:23:45.479 Adetoro Adedeji: Okay. Well, it’s not that there are no engineers that, you know, do that as well, but I can do that, and I do that for my projects.
182 00:23:45.980 ⇒ 00:23:46.630 Awaish Kumar: Okay.
183 00:23:47.170 ⇒ 00:23:54.060 Awaish Kumar: Okay, so do you use open source Airflow, or do you use a managed… Version of it?
184 00:23:54.960 ⇒ 00:23:56.260 Adetoro Adedeji: We use open source.
185 00:23:59.910 ⇒ 00:24:05.470 Awaish Kumar: Okay, so do you know… Like, what are the different components of Airflow?
186 00:24:08.180 ⇒ 00:24:10.270 Adetoro Adedeji: In terms of different components, how do you mean?
187 00:24:12.100 ⇒ 00:24:19.360 Awaish Kumar: like… Airflow has different components that, put together, makes it a tool. Like, you have something,
188 00:24:20.180 ⇒ 00:24:25.199 Awaish Kumar: UI to see there is something running in the background, so there are different…
189 00:24:26.120 ⇒ 00:24:30.029 Awaish Kumar: Components, and that is part of its core architecture.
190 00:24:30.470 ⇒ 00:24:33.479 Awaish Kumar: So, like, do you know what are… what are the…
191 00:24:35.350 ⇒ 00:24:50.549 Adetoro Adedeji: Okay, maybe the lingo, I may not know, but I can walk you through the process, right? I write my PySpark scripts, and maybe run it on a Linux terminal, so at least maybe test for a day or two, and validate that my
192 00:24:50.550 ⇒ 00:25:01.909 Adetoro Adedeji: the output is what I expected. I validate it with the source, and that everything I wanted is what I’m getting. The output, basically, is consistent throughout. And then for…
193 00:25:02.220 ⇒ 00:25:07.690 Adetoro Adedeji: Instead of maybe, like, hosting it as maybe a cron job, for example.
194 00:25:07.920 ⇒ 00:25:11.779 Adetoro Adedeji: I use Airflow to maybe schedule it.
195 00:25:12.250 ⇒ 00:25:32.049 Adetoro Adedeji: And because of it, a PySpark script, for example, would contain maybe different steps, different functions, so there is an order of running that Airflow now is able to, you know, execute in those steps.
196 00:25:32.210 ⇒ 00:25:46.420 Adetoro Adedeji: So, basically, I know there’s a UI where you monitor, you can even see the step that broke. It can… and Airflow is self-healing, so when it, you know, fails, it can rerun by itself. I know that UI, you check the logs.
197 00:25:46.590 ⇒ 00:25:53.840 Adetoro Adedeji: You can, basically, it’s quite, it’s quite,
198 00:25:54.050 ⇒ 00:26:04.530 Adetoro Adedeji: what was it called? Detailed, because it can show you the logs, it can show you the steps that failed, you can run a particular step if it fails. I may not know those terminologies, right? But…
199 00:26:04.640 ⇒ 00:26:09.890 Adetoro Adedeji: I… I know the use, I know the know-how, I know how to also need that.
200 00:26:10.040 ⇒ 00:26:15.499 Adetoro Adedeji: then maybe people… I may not know what the terminology, it’s not my core, per se.
201 00:26:15.930 ⇒ 00:26:16.590 Adetoro Adedeji: I believe…
202 00:26:16.590 ⇒ 00:26:17.000 Awaish Kumar: Okay.
203 00:26:17.000 ⇒ 00:26:21.580 Adetoro Adedeji: It’s because of interest, basically, that I’m having to, you know, do it.
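
The ordered steps and automatic reruns described in the Airflow answer above can be sketched with the standard library alone. This is deliberately not Airflow's API: the step names, the run_step and run_pipeline helpers, and the simulated failure are all hypothetical, and in Airflow itself automatic reruns generally depend on retries being configured for a task rather than happening unconditionally.

```python
from graphlib import TopologicalSorter

# Hypothetical pipeline: each step maps to the set of steps it depends on.
dependencies = {
    "extract": set(),
    "transform": {"extract"},
    "validate": {"transform"},
    "load": {"validate"},
}

attempts = {}


def run_step(name):
    """Pretend to run a step; 'transform' fails once to simulate a transient error."""
    attempts[name] = attempts.get(name, 0) + 1
    if name == "transform" and attempts[name] == 1:
        raise RuntimeError("transient failure")


def run_pipeline(deps, max_retries=2):
    """Run steps in dependency order, retrying a failed step up to max_retries times."""
    executed = []
    for step in TopologicalSorter(deps).static_order():
        for attempt in range(max_retries + 1):
            try:
                run_step(step)
                executed.append(step)
                break
            except RuntimeError:
                if attempt == max_retries:
                    raise  # give up once the retry budget is exhausted
    return executed


order = run_pipeline(dependencies)
print(order)
print(attempts)
```

A scheduler adds much more than this (scheduling, logging, a UI for inspecting failed steps), but dependency ordering plus bounded retries is the core behavior the answer describes.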
204 00:26:22.500 ⇒ 00:26:30.190 Awaish Kumar: Okay, yeah, I think I’m okay, I’m good with my questions, so I will leave some time for you to ask any question.
205 00:26:31.560 ⇒ 00:26:41.290 Adetoro Adedeji: All right, thank you very much. It was exciting to know about Brainforge. Okay, I think the question I would like to ask is,
206 00:26:41.400 ⇒ 00:26:44.509 Adetoro Adedeji: I think I’m aware there is a data engineering team.
207 00:26:44.660 ⇒ 00:26:55.470 Adetoro Adedeji: And there is, maybe they work together, though, yeah, there is analytics engineering. Are there data scientists, or maybe even data analysts that would also work.
208 00:26:55.610 ⇒ 00:27:03.989 Adetoro Adedeji: Alongside those people, or is it the analytics engineer that gets to, you know, do the downstream view of the dashboard, as the case may be?
209 00:27:05.250 ⇒ 00:27:14.969 Adetoro Adedeji: Or do we even offer that kind of service, or are our service mainly streamlined to maybe the engineering, transformation, modeling part of things, and maybe AI?
210 00:27:16.010 ⇒ 00:27:23.269 Awaish Kumar: So we… In the… in the data, data is… Divided into 3 different categories.
211 00:27:23.580 ⇒ 00:27:29.879 Awaish Kumar: In the data team, like data engineering, and then we have data analytics engineers, and then we have,
212 00:27:30.810 ⇒ 00:27:34.789 Awaish Kumar: team of… We’ll call it a strategy team.
213 00:27:35.290 ⇒ 00:27:41.499 Awaish Kumar: But… Like, that is the one that is… that is… Team of data analysts.
214 00:27:42.330 ⇒ 00:27:49.620 Awaish Kumar: Right, so… but we call it a strategy team, because they… they are responsible for building dashboards.
215 00:27:49.790 ⇒ 00:27:54.229 Awaish Kumar: They are… but they are also responsible for, like, figuring out insights.
216 00:27:56.220 ⇒ 00:28:00.289 Awaish Kumar: Do research on data, go deep dive in the data.
217 00:28:00.420 ⇒ 00:28:03.540 Awaish Kumar: Come up with some findings, effects, and…
218 00:28:03.680 ⇒ 00:28:09.610 Awaish Kumar: And that can help, like, drive a strategy for any Company.
219 00:28:10.760 ⇒ 00:28:11.590 Awaish Kumar: So…
220 00:28:13.110 ⇒ 00:28:19.189 Adetoro Adedeji: Okay, so who gets to, like, yeah. Oh, okay, sorry, I didn’t know you were still speaking, sorry.
221 00:28:20.590 ⇒ 00:28:23.640 Awaish Kumar: Yeah, I was just mentioning that’s, like, how our…
222 00:28:24.360 ⇒ 00:28:26.370 Awaish Kumar: I’ll be stuck to it.
223 00:28:27.150 ⇒ 00:28:35.560 Adetoro Adedeji: Okay, so what team particularly gets to, like, do the reporting, or is that not part of the services that,
224 00:28:35.750 ⇒ 00:28:38.439 Adetoro Adedeji: the Brainforge team offers to our clients.
225 00:28:39.760 ⇒ 00:28:46.470 Awaish Kumar: Yeah, we have, like, data engineering services, like, like, AI services.
226 00:28:46.840 ⇒ 00:28:53.630 Awaish Kumar: Then you have, like, like, like, building of data warehouse, like, establish, establishing the data foundation.
227 00:28:53.800 ⇒ 00:28:59.040 Awaish Kumar: for a company that doesn’t have it, right? Then that involves, like, that can involve
228 00:28:59.210 ⇒ 00:29:02.020 Awaish Kumar: support from multiple teams, so we have data
229 00:29:02.220 ⇒ 00:29:12.399 Awaish Kumar: engineers that can help you build the infra, build the… get the data ingested, then we will have some time from data analytics engineers, they… they’re going to come in and build the
230 00:29:12.640 ⇒ 00:29:20.870 Awaish Kumar: warehouse and the marts, and then finally we’ll hand it over to analysts that basically derive the strategy.
231 00:29:21.830 ⇒ 00:29:24.420 Adetoro Adedeji: Oh, okay, so there are analysts within it, too.
232 00:29:25.240 ⇒ 00:29:31.870 Awaish Kumar: But, as I mentioned, there are data analysts, but we call them as a part of a strategy team, because
233 00:29:33.050 ⇒ 00:29:35.499 Awaish Kumar: Coding, but they also do the analysis.
234 00:29:35.500 ⇒ 00:29:36.830 Adetoro Adedeji: Oh, yeah, one sec.
235 00:29:38.240 ⇒ 00:29:38.720 Awaish Kumar: you do it.
236 00:29:38.720 ⇒ 00:29:45.099 Adetoro Adedeji: provides insights and findings from the data that’s important.
237 00:29:45.480 ⇒ 00:29:46.270 Adetoro Adedeji: Good evening.
238 00:29:46.460 ⇒ 00:29:51.789 Adetoro Adedeji: Okay, I think, my other question would be…
239 00:29:52.700 ⇒ 00:30:05.809 Adetoro Adedeji: what are, like, the expectations you… the company has, you know, for anybody that will be filling the analytics engineering role? I don’t know how many roles it may be, either one or…
240 00:30:05.880 ⇒ 00:30:13.110 Adetoro Adedeji: whoever it’s in, maybe. What are the expectations the company have for the person? What’s the expectation of the person we didn’t see?
241 00:30:13.110 ⇒ 00:30:13.620 Awaish Kumar: Trishing…
242 00:30:13.620 ⇒ 00:30:14.630 Adetoro Adedeji: Right, CDs.
243 00:30:15.230 ⇒ 00:30:18.860 Awaish Kumar: Expectation is that it is someone who is,
244 00:30:19.850 ⇒ 00:30:22.860 Awaish Kumar: Open to, like, dynamic, open to learn.
245 00:30:22.960 ⇒ 00:30:26.739 Awaish Kumar: Fast learner, and can work with uncertainties.
246 00:30:27.060 ⇒ 00:30:43.590 Awaish Kumar: I think that’s… that’s what it is. So, someone comes in, and he’s assigned to a client, then the, like, the first 30 days is, like, you understand the client, understand their business, understand the… the existing code data infrastructure, code, whatever we have.
247 00:30:43.690 ⇒ 00:30:49.420 Awaish Kumar: And start delivering, like, Some tickets, like, small tickets for the client.
248 00:30:49.610 ⇒ 00:30:56.670 Awaish Kumar: That’s… that’s a success, but… But to grow in that role is more like, if you have,
249 00:30:57.080 ⇒ 00:30:59.920 Awaish Kumar: capability to… True.
250 00:31:00.140 ⇒ 00:31:06.710 Awaish Kumar: Without, like, to work with uncertainties, you have a data, and we can tell you, like, okay, this is the data.
251 00:31:06.910 ⇒ 00:31:14.869 Awaish Kumar: We don’t know questions. Client doesn’t know questions. What client is looking for, we don’t know. Just look at the data, come up with a case.
252 00:31:15.270 ⇒ 00:31:17.409 Awaish Kumar: That you can present to the client, and…
253 00:31:17.560 ⇒ 00:31:20.049 Awaish Kumar: That will help… that will help them, like.
254 00:31:20.280 ⇒ 00:31:23.579 Awaish Kumar: Set a strategy for their company, and then they can say, oh, wow, like.
255 00:31:23.770 ⇒ 00:31:30.260 Awaish Kumar: you have… you’ve done a great job, right? So this is something, like, you don’t… without any requirements, you must have a
256 00:31:30.600 ⇒ 00:31:34.810 Awaish Kumar: like, with your gut feelings, with using AI, whatever.
257 00:31:34.910 ⇒ 00:31:36.640 Awaish Kumar: Like, come up with something.
258 00:31:37.740 ⇒ 00:31:38.410 Awaish Kumar: Yep.
259 00:31:38.610 ⇒ 00:31:43.680 Adetoro Adedeji: All right, all right. Thank you so much. Thank you.
260 00:31:43.840 ⇒ 00:31:45.830 Adetoro Adedeji: I think those are my questions for now.
261 00:31:46.100 ⇒ 00:31:49.339 Awaish Kumar: Okay, thank you, for your time.
262 00:31:49.760 ⇒ 00:31:55.449 Awaish Kumar: And yeah, I will submit my feedback to our recruiter, Kayla, and then…
263 00:31:55.760 ⇒ 00:31:59.270 Awaish Kumar: She was… she will get back to you as soon as possible.
264 00:31:59.590 ⇒ 00:32:00.880 Awaish Kumar: Okay, thank you.
265 00:32:00.880 ⇒ 00:32:01.960 Adetoro Adedeji: Thank you so much.