Meeting Title: Data Engineer Interview (Ademola Adekilekun) Date: 2025-08-25 Meeting participants: Awaish Kumar, Ademola, read.ai meeting notes


WEBVTT

1 00:01:37.010 00:01:38.599 Ademola: Hi, Awaish, good morning.

2 00:01:45.280 00:01:47.149 Awaish Kumar: Hello, how are you doing?

3 00:01:48.710 00:01:49.849 Ademola: I’m fine, good morning.

4 00:01:50.010 00:01:51.380 Ademola: I’m trying to slow down my comment.

5 00:01:53.930 00:01:54.970 Ademola: Yeah, good morning.

6 00:01:57.070 00:02:03.709 Awaish Kumar: Hi, so… How are you doing?

7 00:02:04.970 00:02:06.460 Ademola: I’m fine, how are you?

8 00:02:07.410 00:02:11.969 Awaish Kumar: I’m good as well. So where are you located?

9 00:02:13.390 00:02:15.390 Ademola: I’m located in Nigeria, West Africa.

10 00:02:17.960 00:02:18.950 Awaish Kumar: Nigeria.

11 00:02:19.620 00:02:20.210 Ademola: Yes.

12 00:02:20.720 00:02:23.240 Awaish Kumar: Okay, so you’re currently living in Nigeria?

13 00:02:24.100 00:02:24.910 Ademola: Yes, Nigeria.

14 00:02:24.910 00:02:25.450 Awaish Kumar: Okay.

15 00:02:27.730 00:02:30.669 Awaish Kumar: So, by the way, how is it?

16 00:02:32.410 00:02:37.769 Ademola: … So, we are currently on WATC, which is what?

17 00:02:38.070 00:02:39.229 Ademola: That’s the time zone.

18 00:02:39.770 00:02:40.710 Ademola: Right?

19 00:02:42.470 00:02:48.930 Ademola: And the weather is always great, you know. I work at one of the biggest fintechs in Africa right now.

20 00:02:49.190 00:02:50.560 Ademola: As a data engineer.

21 00:02:51.270 00:02:52.090 Ademola: So….

22 00:02:52.690 00:02:57.479 Awaish Kumar: No, I mean, like, how is it just generally living in there, like….

23 00:02:57.640 00:03:07.000 Ademola: Oh, okay, okay, okay, okay, so… oh, oh, okay, so Nigeria is a great place, right? There’s a lot of hustle and bustle here.

24 00:03:07.790 00:03:16.980 Ademola: We, in the, … In this area, usually, like, we… we… …

25 00:03:17.120 00:03:20.380 Ademola: Just basically try to do our day-to-day activities.

26 00:03:21.880 00:03:23.100 Ademola: Don’t go to church.

27 00:03:23.540 00:03:24.610 Ademola: This kind of stuff.

28 00:03:24.820 00:03:26.600 Ademola: And the weather is usually great.

29 00:03:28.080 00:03:30.199 Ademola: Yeah, that’s basically it, you know.

30 00:03:32.900 00:03:44.159 Awaish Kumar: Yeah, okay, … So… I would just like to share the… What’s going to happen?

31 00:03:44.300 00:03:51.570 Awaish Kumar: In this interview, … So, yeah, in this call, we are going to discuss

32 00:03:53.740 00:04:03.809 Awaish Kumar: And initially, I’m just going to describe what I do, and what Brainforge does, and then we are going to get your introduction, and then we are going to

33 00:04:03.990 00:04:09.680 Awaish Kumar: Deep dive, into your experiences, and what you have worked on.

34 00:04:10.790 00:04:18.259 Awaish Kumar: And then you can ask any questions if you have. Yeah, that will be the… then it will be the end of the session.

35 00:04:18.430 00:04:30.450 Awaish Kumar: So yeah, I will start. My name is Awaish Kumar, and I’m the engineering manager here at Brainforge. What… and what Brainforge does is provide

36 00:04:31.160 00:04:36.070 Awaish Kumar: Data… that’s the data and AI services.

37 00:04:36.320 00:04:40.820 Awaish Kumar: To their clients, … Spanning across different industries.

38 00:04:41.500 00:04:43.480 Awaish Kumar: So we are a consultancy firm.

39 00:04:43.660 00:04:45.430 Awaish Kumar: And, …

40 00:04:45.890 00:04:59.169 Awaish Kumar: Most of our clients right now are from the US, so we expect anyone who tries to be… at least overlap, have some overlapping hours with the US time zone, but, yeah.

41 00:04:59.640 00:05:05.129 Awaish Kumar: Yeah, that’s mainly it. Yeah, if you can introduce yourself.

42 00:05:06.010 00:05:24.140 Ademola: Okay. Alright, so my name is Ademola, Ademola Adekilekun. I’m based in Nigeria. I have a degree in finance, and I’ve been working for the past 4 years. I started my career as a data analyst, where I worked in a consulting firm, just like Brainforge.

43 00:05:24.250 00:05:31.569 Ademola: I then moved to a small fintech where I built reports and dashboards using Power BI and SQL.

44 00:05:31.710 00:05:38.810 Ademola: Right? From there, I moved to a bank here in Nigeria. There, I spent 10 months. Then I got my…

45 00:05:38.910 00:05:43.690 Ademola: current role as a data engineer at one of the biggest fintechs in Nigeria.

46 00:05:43.900 00:05:53.699 Ademola: and in West Africa to be… and I deal with a lot of transactional data, right? I work with tools like Airflow, PySpark.

47 00:05:54.450 00:06:02.620 Ademola: And I also build… reports on the side, just to… make our stakeholders happy.

48 00:06:02.730 00:06:09.960 Ademola: Right, and… I am great at communicating, and I have… Critical thinking skills.

49 00:06:10.080 00:06:11.200 Ademola: As an engineer.

50 00:06:11.510 00:06:12.640 Ademola: Yeah, that’s it for me.

51 00:06:14.240 00:06:15.000 Awaish Kumar: Okay.

52 00:06:15.470 00:06:17.730 Awaish Kumar: So, yeah.

53 00:06:17.840 00:06:23.860 Awaish Kumar: Nice to hear that. You have, like, around 4 years of experience?

54 00:06:24.520 00:06:25.230 Ademola: Yes.

55 00:06:25.380 00:06:25.890 Ademola: Yes.

56 00:06:25.890 00:06:30.010 Awaish Kumar: in total, right? From data analyst to… data…

57 00:06:30.010 00:06:30.780 Ademola: Engineering.

58 00:06:31.560 00:06:36.500 Awaish Kumar: So, why did you… choose to move from…

59 00:06:36.770 00:06:39.110 Awaish Kumar: Being a data analyst to a data engineer.

60 00:06:40.460 00:06:56.130 Ademola: Okay, so from my experience as an analyst, right, I didn’t usually care about where the data came from. Mine was just to plug into the data, build reports, build my dashboards and all, but I was curious to understand

61 00:06:56.420 00:07:06.810 Ademola: how data was being made, or how these datasets got into the databases, or got into where they are used. So I took up, courses online to

62 00:07:06.930 00:07:19.519 Ademola: understand how that is done, right? And I found it interesting, and since then, I’ve been applying for data engineering roles. That was how I was able to get my current role as a data engineer at Interswitch.

63 00:07:19.650 00:07:25.419 Ademola: And the experience, my learning experience, helped me, and that has also helped me with

64 00:07:25.930 00:07:33.640 Ademola: my current job, because I’m a very fast learner, right? And I pick up things fast, so it has been easy for me.

65 00:07:33.940 00:07:39.479 Ademola: So the main reason was just because I found it interesting, and I found it challenging at the same time.

66 00:07:40.070 00:07:45.320 Ademola: Right, so… It was more challenging than data analysis. [Awaish: Did you like being a data analyst?]

67 00:07:46.410 00:07:51.420 Ademola: Yeah, I liked being a data analyst, because I was, you know, I was…

68 00:07:51.480 00:08:06.990 Ademola: facing the client, or the stakeholders that I was presenting the dashboards to, so I’ll feel seen that way, right? They get to know who I am, they get to know the person behind the reports, right? But as a data engineer, I’m at the back.

69 00:08:07.530 00:08:25.090 Ademola: building the pipelines for the data analysts of nowadays. But even at my organization, we don’t just… data engineers don’t just build pipelines. We also build dashboards, right? So, I would say I love being an analyst, and I also love being a data engineer at the same time.

70 00:08:26.650 00:08:33.469 Awaish Kumar: Okay, and … So… What tech stack you are using right now?

71 00:08:34.679 00:08:41.929 Ademola: So right now, we use PySpark for our ETLs, because we deal with very large amounts of data.

72 00:08:42.069 00:08:46.799 Ademola: Right? How long do you use… So I’m talking about,

73 00:08:47.529 00:08:50.719 Ademola: 100 million to 200 million in a… in a week.

74 00:08:51.699 00:08:52.389 Ademola: Rows.

75 00:08:52.390 00:08:53.710 Awaish Kumar: 100 million rows?

76 00:08:54.500 00:08:55.970 Ademola: 200 million rows.

77 00:08:57.010 00:08:57.600 Awaish Kumar: Okay.

78 00:08:57.600 00:08:58.270 Ademola: week.

79 00:08:58.770 00:09:00.780 Ademola: Yeah, so we deal with transactions.

80 00:09:00.780 00:09:05.670 Awaish Kumar: I mean, if I ask in terms of terabytes, how….

81 00:09:07.200 00:09:15.769 Ademola: So it’s… so it’s a lot, right? We… it can be up to 10GB to 20 gigabytes.

82 00:09:16.290 00:09:23.359 Ademola: Right? But everything is being stored… stored in… we use file systems, actually, to store our data.

83 00:09:23.900 00:09:27.610 Ademola: So we use something called… we use file systems.

84 00:09:28.150 00:09:34.809 Ademola: Right? We use something called MapR File System, which is MapR-FS, to store our data sources.

85 00:09:34.920 00:09:42.700 Ademola: Right? Then, we then write PySpark scripts to extract those, …

86 00:09:43.070 00:09:46.070 Ademola: Data from the file system to a database.

87 00:09:46.310 00:09:50.380 Ademola: when our stakeholders request for a dashboard. So we build

88 00:09:50.640 00:09:52.440 Ademola: New data marts for every new data…

89 00:09:52.440 00:09:55.769 Awaish Kumar: Did you have, real-time streaming?

90 00:09:56.490 00:09:57.400 Ademola: Yes.

91 00:09:57.820 00:10:03.940 Ademola: Yes, so… yes, so we have a project we are currently working on that we use, Kafka.

92 00:10:04.220 00:10:08.999 Ademola: Right? We, we, we connected Kafka topic to our…

93 00:10:09.500 00:10:15.010 Ademola: applications, right? Then from our applications, we… had to introduce ClickHouse.

94 00:10:15.290 00:10:17.420 Ademola: I’m sure you know the ClickHouse database.

95 00:10:18.250 00:10:19.030 Awaish Kumar: No, yeah.

96 00:10:19.030 00:10:24.870 Ademola: Yeah, so we didn’t use SSMS, because SSMS might not handle real-time data

97 00:10:25.160 00:10:26.869 Ademola: as fast as ClickHouse would.

98 00:10:27.180 00:10:35.840 Ademola: Right, so we introduced ClickHouse to connect to the topics, then we connected Power BI to ClickHouse using DirectQuery, so that

99 00:10:36.040 00:10:42.969 Ademola: The dashboard can update in real time, instead of loading, like, maybe one day after, or two days after.

100 00:10:43.260 00:10:46.360 Ademola: Right? So that’s the use case right now.

101 00:10:47.610 00:10:56.860 Awaish Kumar: Okay, so how do you, like, how this streaming works, how data, goes to… Kafka, for example.

102 00:10:58.210 00:11:04.309 Ademola: Okay, so… For… for what’s… we are at work right now, we…

103 00:11:04.980 00:11:09.570 Ademola: have our applications, right? Our applications are, …

104 00:11:09.740 00:11:12.340 Ademola: We call it Super Switch and Mega Switch.

105 00:11:12.930 00:11:25.360 Ademola: Right? So, we use the ports. The ports from… from those applications. The ports will be connected to Kafka, so that once a transaction is done by a customer,

106 00:11:25.910 00:11:30.970 Ademola: it goes to the application. From the application, it goes to the Kafka topic.

107 00:11:31.120 00:11:34.579 Ademola: Then, from the Kafka topic, it goes to ClickHouse.

108 00:11:34.790 00:11:37.770 Ademola: Then from ClickHouse, once Power BI is refreshed,

109 00:11:39.140 00:11:50.879 Ademola: will get it immediately. And that will mean that when the customer makes a transaction, I’m able to see, or my stakeholders are able to see that transaction within 5 minutes of refresh, instead of…

110 00:11:51.240 00:11:52.910 Ademola: Waiting the whole day

111 00:11:53.330 00:11:54.480 Ademola: For them to see it.

112 00:11:56.900 00:11:59.639 Awaish Kumar: Okay, and … huh?

113 00:12:00.160 00:12:01.450 Awaish Kumar: So…

114 00:12:01.760 00:12:05.860 Awaish Kumar: Like, you’re not using PySpark for streaming.

115 00:12:07.640 00:12:15.680 Ademola: We use… so, we use PySpark for streaming. We use, … so I think there are two methods we try, because

116 00:12:15.810 00:12:17.520 Ademola: Right now, what we’re trying to do.

117 00:12:17.520 00:12:23.830 Awaish Kumar: For example, you described a pipeline, where PySpark fits in that pipeline?

118 00:12:24.540 00:12:30.279 Ademola: Okay, so the one that I am on, with the project I am on, because I’m not the one on the

119 00:12:30.390 00:12:38.160 Ademola: streaming project, right? The project I’m on, where we use PySpark, basically. We, …

120 00:12:38.500 00:12:46.859 Ademola: connect to our file systems, right, which is the MapR sources, then we introduce Spark SQL.

121 00:12:47.320 00:13:01.080 Ademola: Right? So, I know there are two ways to do transformation. It’s either you are using DataFrame APIs, or you’re using PySpark SQL, and that Spark SQL will need you to write your transformation using SQL.

122 00:13:01.300 00:13:18.060 Ademola: Right? So you’re writing a select statement, and you’re joining those tables with each other. So anywhere you need to add a column, you add a column. Anywhere you need to filter, use the where statement. Anywhere you need to join, you use the join statement, right? You are joining, or you are writing the queries just to

123 00:13:18.140 00:13:24.580 Ademola: gather data from different file systems, right? Then you then… …

124 00:13:24.900 00:13:28.389 Ademola: Save it into, a data frame.

125 00:13:28.670 00:13:33.999 Ademola: Basically, from that data frame, you then… we then move it into, …

126 00:13:34.580 00:13:38.839 Ademola: Our databases by either coalescing or repartitioning.

127 00:13:39.420 00:13:44.889 Ademola: Right? It’s either you are… so, most of the processes we do is either we are overwriting the table.

128 00:13:44.890 00:13:46.629 Awaish Kumar: From what database you’re using?

129 00:13:47.590 00:13:50.450 Ademola: SQL Server Management Studio, SSMS.

130 00:13:51.210 00:13:58.230 Ademola: So it’s either we are overwriting, or it’s either we are appending, right? But the transformation side has to do with SQL.

131 00:13:59.570 00:14:04.080 Awaish Kumar: So why, like, for one pipeline.

132 00:14:04.830 00:14:10.280 Awaish Kumar: Like, you said you have two projects now. One is real-time streaming…

133 00:14:10.700 00:14:14.809 Awaish Kumar: One is batch. When you’re streaming, it is going to ClickHouse.

134 00:14:15.040 00:14:17.939 Awaish Kumar: In batch, it is going to SQL Server.

135 00:14:18.400 00:14:20.360 Awaish Kumar: Yes. Why is that?

136 00:14:21.660 00:14:22.740 Ademola: Sorry?

137 00:14:23.090 00:14:26.669 Awaish Kumar: Why it is going to different, like, the databases.

138 00:14:26.670 00:14:34.190 Ademola: Okay. Okay, so, for the real time, ClickHouse has a way of handling columnar data

139 00:14:34.310 00:14:36.540 Ademola: compared to SSMS.

140 00:14:36.660 00:14:42.069 Ademola: Right? SSMS… handles… it in a different way.

141 00:14:42.580 00:14:49.250 Ademola: You understand? And when it comes to real-time also, … ClickHouse performs better.

142 00:14:49.350 00:14:50.529 Ademola: Based on how I….

143 00:14:50.530 00:14:54.830 Awaish Kumar: I mean, SQL Server is mainly used for OLTP,

144 00:14:54.940 00:14:58.999 Awaish Kumar: Why not also put me…

145 00:14:59.190 00:15:01.759 Awaish Kumar: The data which you are processing, you know.

146 00:15:02.050 00:15:07.360 Awaish Kumar: You know, like, as a batch processing, why not… that is going to ClickHouse as well.

147 00:15:08.150 00:15:15.669 Ademola: So, you know, the way it was built, you know, ClickHouse is more of a cloud data warehouse.

148 00:15:15.910 00:15:21.390 Ademola: Right? And our customers need to see their data in real time.

149 00:15:21.940 00:15:27.090 Ademola: So, like I said, it is easier for ClickHouse to handle real-time streaming

150 00:15:27.320 00:15:28.390 Ademola: Compared to SSMS.

151 00:15:28.390 00:15:29.549 Awaish Kumar: That’s my question, right?

152 00:15:29.900 00:15:32.969 Awaish Kumar: Why even using SQL Server? Like, why not…

153 00:15:33.160 00:15:35.250 Awaish Kumar: All the data goes to ClickHouse.

154 00:15:36.090 00:15:49.480 Ademola: Oh, okay, okay, okay. So, we have been using… before I joined my company, right, we have been using SSMS for the longest time. So, we are trying to pick up this Kafka project, right? So that was when ClickHouse got introduced.

155 00:15:49.990 00:15:58.969 Ademola: So, we’ve been using SSMS for over 10 years before I joined. So, they can migrate their data from SSMS to ClickHouse.

156 00:16:00.130 00:16:05.450 Ademola: So they… intentionally chose ClickHouse for the streaming purpose.

157 00:16:07.790 00:16:08.540 Awaish Kumar: Okay.

158 00:16:08.870 00:16:15.250 Awaish Kumar: So… and how the orchestration works?

159 00:16:15.580 00:16:16.120 Awaish Kumar: Fireball.

160 00:16:16.120 00:16:17.120 Ademola: Okay.

161 00:16:17.460 00:16:22.300 Ademola: So we use Airflow. So before I joined, we used to use cron.

162 00:16:22.600 00:16:29.380 Ademola: But cron was not too effective. Like, you have to go there manually to check when it fails,

163 00:16:29.500 00:16:37.520 Ademola: or you don’t get a notification, unlike Airflow. So we use Airflow to trigger, our commands.

164 00:16:37.700 00:16:52.599 Ademola: such that, on a daily basis, we set it to on a daily basis, because for our batch processing, it kicks off… for some of our pipelines, it kicks off by 12AM, some kick off by 8am, some kick off by 7am, right? So we wrote….

165 00:16:52.600 00:16:58.910 Awaish Kumar: Are you using, … Are you using open source or managed version of Airflow?

166 00:17:00.050 00:17:05.390 Ademola: So we are using the managed version, because we work on a cluster, we use a cluster, we have our own cluster.

167 00:17:06.410 00:17:07.329 Ademola: In our company.

168 00:17:07.339 00:17:09.929 Awaish Kumar: You are using open source Airflow?

169 00:17:11.060 00:17:12.590 Ademola: Yeah, open source airflow, yes.

170 00:17:13.520 00:17:16.599 Awaish Kumar: Okay, so who is managing that?

171 00:17:16.750 00:17:17.579 Awaish Kumar: who’s managing.

172 00:17:17.589 00:17:18.039 Ademola: You’re good.

173 00:17:18.040 00:17:19.030 Awaish Kumar: development.

174 00:17:20.329 00:17:23.629 Ademola: Our infra team manages the,

175 00:17:23.819 00:17:30.119 Ademola: Airflow side. So we just write the DAGs, right? Write the DAGs and write the commands.

176 00:17:30.259 00:17:36.909 Ademola: Then write the, task for the airflow job Right, when it comes to…

177 00:17:37.679 00:17:47.739 Ademola: the technical, technical back side, we don’t do that. I just handle the DAGs. Anytime a DAG fails, I go back to the scripts to see why.

178 00:17:48.249 00:17:49.119 Ademola: basically, basically.

179 00:17:49.120 00:17:55.899 Awaish Kumar: And how… fit for the… Can you describe Airflow architecture?

180 00:17:57.500 00:18:08.509 Ademola: Oh, okay, so the way Airflow works, from our understanding, or the way we, have Airflow work for us is we have multiple pipelines, right? And some of those pipelines depend on….

181 00:18:08.510 00:18:09.990 Awaish Kumar: Just airflow.

182 00:18:10.190 00:18:12.179 Awaish Kumar: architecture of Airflow itself.

183 00:18:13.680 00:18:17.900 Ademola: Okay, okay, so… Airflow is an orchestration tool.

184 00:18:18.300 00:18:22.040 Ademola: It is used to, trigger…

185 00:18:22.480 00:18:29.920 Ademola: It’s used to… it can be used for ETL at some point, and… but it’s mainly used for orchestration.

186 00:18:30.160 00:18:36.360 Ademola: Right? And in Airflow, we have the workers, we have the… …

187 00:18:36.840 00:18:38.719 Ademola: I think we have the workers.

188 00:18:39.270 00:18:42.319 Ademola: We have the operators, we have…

189 00:18:42.440 00:18:46.199 Ademola: I just can’t remember the technical side of…

190 00:18:46.650 00:18:51.229 Ademola: Those, those, airflow parts right now.

191 00:18:51.510 00:18:52.200 Ademola: Oh, fuck.

192 00:18:52.200 00:18:52.540 Awaish Kumar: picked up.

193 00:18:52.540 00:18:53.040 Ademola: Thank you.

194 00:18:53.040 00:18:59.399 Awaish Kumar: Okay, that’s… for example, I’m writing a DAG, and I have a task

195 00:18:59.690 00:19:06.870 Awaish Kumar: Which… gets the… updated weather information

196 00:19:08.150 00:19:09.350 Awaish Kumar: from the internet.

197 00:19:09.490 00:19:13.159 Awaish Kumar: And then downstream… there are the downstream tasks.

198 00:19:13.880 00:19:21.610 Awaish Kumar: So what I want… I have a, like, Airflow DAG. In there, there are multiple tasks. The first task is to get that

199 00:19:21.880 00:19:27.470 Awaish Kumar: updated weather information, and then there are some downstream tasks. So, for example.

200 00:19:27.660 00:19:36.599 Awaish Kumar: But that… that… that task, which is basically getting the updated weather information, sometimes gets stuck.

201 00:19:37.220 00:19:39.050 Awaish Kumar: Right? So…

202 00:19:40.300 00:19:49.619 Awaish Kumar: what I want is, if it succeeds, it updates the table, and my downstream task will use that table, everything is good. But, and if it fails.

203 00:19:49.840 00:19:53.929 Awaish Kumar: … for example.

204 00:19:53.930 00:19:54.859 Ademola: Because I’m dead on YouTube.

205 00:19:54.860 00:19:55.500 Awaish Kumar: Right.

206 00:19:55.960 00:20:00.290 Awaish Kumar: Whatever sends an alert or whatever, but that’s okay.

207 00:20:00.290 00:20:00.760 Ademola: Thank you.

208 00:20:00.760 00:20:03.079 Awaish Kumar: Mario, so… and it, …

209 00:20:08.100 00:20:13.280 Awaish Kumar: It still proceeds, but… but if, like, the pipeline still proceeds.

210 00:20:13.430 00:20:15.360 Awaish Kumar: Even if it fails.

211 00:20:15.500 00:20:20.110 Awaish Kumar: And it will use the table which was updated, whatever updated information it has.

212 00:20:20.240 00:20:27.649 Awaish Kumar: So… So, how can you do that, for example, like…

213 00:20:28.680 00:20:37.379 Awaish Kumar: like, what is the setting of, like, what is the parameter you provide so that the pipeline runs, even if the task succeeds or fails?

214 00:20:37.630 00:20:41.489 Awaish Kumar: All the downstream tasks should run. How would you do that?

215 00:20:43.170 00:20:51.340 Ademola: So… I also believe you use, in Python, I know it’s Python, Problem.

216 00:20:51.500 00:20:57.790 Ademola: You can use a… try… try-except.

217 00:20:57.790 00:20:58.699 Awaish Kumar: No, no, no, no.

218 00:20:59.200 00:21:04.969 Awaish Kumar: I’m not talking about Python as a… itself, as a language, like, I’m using Airflow right now.

219 00:21:05.450 00:21:09.940 Awaish Kumar: Airflow provides tasks, like, like, … for different.

220 00:21:09.940 00:21:10.350 Ademola: Yes.

221 00:21:10.350 00:21:11.949 Awaish Kumar: For our different operators, right?

222 00:21:12.180 00:21:12.830 Awaish Kumar: Yes.

223 00:21:12.830 00:21:13.460 Ademola: Yes.

224 00:21:14.190 00:21:19.660 Awaish Kumar: So, for example, I’m using, … Python operator, right?

225 00:21:19.660 00:21:20.180 Ademola: Okay.

226 00:21:20.180 00:21:23.269 Awaish Kumar: And then there are some other Python operators downstream.

227 00:21:23.270 00:21:24.509 Ademola: Yes. Yes.

228 00:21:24.510 00:21:27.520 Awaish Kumar: This Python operator gets data from somewhere.

229 00:21:28.320 00:21:32.140 Awaish Kumar: scrapes some data from the internet and loads it into a table.

230 00:21:32.640 00:21:49.250 Awaish Kumar: My task is that if it succeeds, my table is updated, downstream pipeline is running, but even if it fails, it does not matter much, because my table already has some weather information. I still want my pipeline to go.

231 00:21:49.620 00:21:53.730 Awaish Kumar: how… how can I do that? In… in use… using Airflow.

232 00:21:53.960 00:21:57.070 Awaish Kumar: Config… operators, like, parameters and things.

233 00:21:57.230 00:21:59.209 Awaish Kumar: Any setting, if you know.

234 00:22:02.500 00:22:06.560 Ademola: I’m not sure I’m familiar with… that side of Airflow.

235 00:22:06.560 00:22:07.010 Awaish Kumar: Okay.

236 00:22:07.010 00:22:07.710 Ademola: boots.

237 00:22:07.710 00:22:09.970 Awaish Kumar: I have one more question on it.

238 00:22:10.280 00:22:12.510 Awaish Kumar: If a, if a task gets stuck.

239 00:22:12.830 00:22:15.850 Awaish Kumar: So, this same task, same scenario, same task.

240 00:22:16.930 00:22:23.209 Awaish Kumar: It got stuck, like, it does not succeed, it did not fail. It just got stuck. How can I solve that?

241 00:22:24.190 00:22:29.250 Ademola: So, the way we usually do that at my organization is there’s an Airflow UI,

242 00:22:29.460 00:22:36.099 Ademola: Right? You go to the Airflow UI to check the reason why it’s stuck, or why it failed. Sometimes it could be….

243 00:22:36.100 00:22:42.289 Awaish Kumar: Like, for example, it is stuck, but I, I don’t… like, I don’t know, I’m not…

244 00:22:42.660 00:22:45.149 Awaish Kumar: I don’t want to, like, go to the UI.

245 00:22:45.670 00:22:46.659 Awaish Kumar: Hold on.

246 00:22:47.190 00:22:53.479 Awaish Kumar: to figure out, like, if it succeeded, if it failed, if it got stuck, what happened, I don’t know, I’m…

247 00:22:53.670 00:22:57.159 Awaish Kumar: like, I’m a… for example, I’m a sole data engineer in my company.

248 00:22:57.430 00:23:00.300 Awaish Kumar: I have lots of different tasks to do, so…

249 00:23:00.420 00:23:04.639 Awaish Kumar: It’s very hard for me to go in and continuously monitor the pipeline.

250 00:23:04.780 00:23:07.000 Awaish Kumar: So, what is the best way to…

251 00:23:08.330 00:23:11.940 Awaish Kumar: to handle that. If something is stuck, Huh.

252 00:23:12.480 00:23:15.730 Awaish Kumar: do something so that I know something happened.

253 00:23:19.300 00:23:21.539 Ademola: Oh, the best way I probably know

254 00:23:22.130 00:23:27.570 Ademola: is to go to the UI, to just check what the error is, then go back to fix your….

255 00:23:27.570 00:23:37.159 Awaish Kumar: Okay, it’s not, like, Airflow provides a way, a parameter, called SLA.

256 00:23:38.050 00:23:39.060 Ademola: Okay.

257 00:23:39.060 00:23:46.509 Awaish Kumar: If you provide the timeout, if your task is stuck for more than 10 minutes, For agility.

258 00:23:46.690 00:23:54.470 Awaish Kumar: 10 or whatever time you set, it can trigger a… it can fail automatically, it can succeed automatically, it can….

259 00:23:54.470 00:23:55.110 Ademola: Okay.

260 00:23:55.340 00:23:57.730 Awaish Kumar: Trigger, notification, and whatever.

261 00:23:57.920 00:23:59.199 Awaish Kumar: So, yeah, you can….

262 00:23:59.200 00:24:05.350 Ademola: Like, I’m not just… I’m not just familiar with, because maybe, probably because of the scope of my own job, we…

263 00:24:05.570 00:24:08.040 Ademola: We didn’t have, like, a use case for that yet.

264 00:24:08.330 00:24:11.730 Ademola: So, probably… Yeah, definitely true.

265 00:24:12.440 00:24:19.710 Awaish Kumar: Okay, yeah, I guess, I want to know, like, I just want to understand how complex pipelines…

266 00:24:20.370 00:24:27.600 Awaish Kumar: you have built using Airflow, Spark, PySpark, and whatever you are using.

267 00:24:29.270 00:24:30.710 Ademola: So, most of….

268 00:24:30.710 00:24:34.650 Awaish Kumar: Give me an example of a data pipeline, which is really, really complex.

269 00:24:35.860 00:24:38.859 Awaish Kumar: Not just simple, like, …

270 00:24:39.680 00:24:47.170 Awaish Kumar: Write a script or something, like, which is some level of… if you think there is some level of complexity, you can give that example.

271 00:24:48.920 00:24:54.460 Ademola: Okay, so, I would say it’s probably… which one can I think of right now?

272 00:24:54.840 00:24:58.390 Ademola: I would say probably the one where I had to extract data from an API.

273 00:24:58.720 00:25:16.919 Ademola: Right? Using, Python, because this one I was not connecting to… I was connecting to an external source, basically, so I had to, like, write, a lot of Python. I had to even implement using Postman to test first, to test the API. Then when testing, I wrote the script.

274 00:25:17.070 00:25:20.620 Ademola: In this case, I had to use, DataFrame APIs, and…

275 00:25:21.090 00:25:23.489 Ademola: I had to do a lot of complex cleaning.

276 00:25:24.000 00:25:31.180 Ademola: To be honest, because it was not in a… column form,

277 00:25:31.400 00:25:34.430 Ademola: it was more of JSON form, and I had to, like, do…

278 00:25:34.740 00:25:41.620 Ademola: a lot of flattening. You know, when… if you are familiar with JSON, sometimes…

279 00:25:41.880 00:25:44.109 Ademola: And they are always, …

280 00:25:44.680 00:25:52.470 Ademola: in, should I say, in an umbrella of different structure. It’s not in a structure that… SQL understands.

281 00:25:52.660 00:25:59.380 Ademola: Right? So I had to, like, write Python scripts to flatten the JSON data,

282 00:25:59.610 00:26:05.539 Ademola: So that it can then be moved to… to our database.

283 00:26:05.660 00:26:08.969 Ademola: But when it came to orchestration, we basically just… run

284 00:26:09.260 00:26:17.359 Ademola: our pipelines on a daily basis, so that it updates every day, and on days that it fails, we…

285 00:26:17.500 00:26:20.890 Ademola: just go back to the UI to pick it up and run it.

286 00:26:24.590 00:26:26.070 Ademola: That’s basically….

287 00:26:27.240 00:26:34.849 Awaish Kumar: How would you rank… rate yourself in Python and SQL, out of 10?

288 00:26:36.300 00:26:40.209 Ademola: Well, I’m, I’m going to say my Python, …

289 00:26:40.330 00:26:46.789 Ademola: I would say probably 5 over 10, if I’m being honest, right, because Python is…

290 00:26:47.440 00:26:56.490 Ademola: quite difficult sometimes. You know, there are a lot of problems that one can solve, but with my SQL, I’ll give myself, like, an 8, 9.

291 00:26:56.860 00:26:59.110 Ademola: Because I’ve been writing SQL for the longest time.

292 00:27:00.660 00:27:01.480 Awaish Kumar: Okay.

293 00:27:02.080 00:27:12.369 Awaish Kumar: So… and, our… … If, for example, If I ask…

294 00:27:13.730 00:27:21.900 Awaish Kumar: Quick question from Python, for example. Have you used context managers in Python.

295 00:27:22.490 00:27:23.360 Ademola: Sorry?

296 00:27:23.800 00:27:27.129 Awaish Kumar: Have you used context managers in Python?

297 00:27:29.140 00:27:34.059 Ademola: I haven’t… Okay. My Python skills are not really strong right now.

298 00:27:34.520 00:27:35.250 Ademola: Basically.

299 00:27:35.250 00:27:37.009 Awaish Kumar: So, have you used dbt?

300 00:27:38.800 00:27:42.550 Ademola: For… for… for learning, because I’ve not really…

301 00:27:43.490 00:27:47.329 Ademola: worked on real-world projects. I’ve used dbt when I was learning.

302 00:27:47.600 00:27:49.270 Ademola: My data engineering skills.

303 00:27:50.700 00:28:01.180 Awaish Kumar: Okay, how… so you have rated yourself in SQL as well, so is it just writing SQL, or you’re talking about databases as well?

304 00:28:01.860 00:28:09.199 Ademola: Oh, okay, so yeah, so, you know, I know, I understand that in SQL, there are two sides, right? It’s either you’re doing the analytics.

305 00:28:09.410 00:28:11.959 Ademola: Or we are doing, database.

306 00:28:11.960 00:28:21.200 Awaish Kumar: Is that, like, SQL means… Having knowledge of database, having knowledge about what is the normalization.

307 00:28:21.650 00:28:23.469 Ademola: Yeah, yeah, yeah, so I have knowledge of that.

308 00:28:23.470 00:28:30.210 Awaish Kumar: Where it is used, different optimizing… optimization strategies, right?

309 00:28:30.210 00:28:30.560 Ademola: Yeah.

310 00:28:30.560 00:28:39.610 Awaish Kumar: So, different data retrieval strategies. So, yeah, I wanted to under… I want to understand how… …

311 00:28:40.530 00:28:44.040 Awaish Kumar: Like, what do you know about indexes, and how they are used?

312 00:28:45.560 00:28:53.440 Ademola: Okay, so my knowledge of indexing has to do with making your… queries faster.

313 00:28:53.560 00:28:56.730 Ademola: Right? If you index a table.

314 00:28:57.100 00:29:07.519 Ademola: And compared to a table that is not indexed, right, it is faster when it is indexed. That’s your output when you query a table, right? So you index… Why is it faster? What is indexed?

315 00:29:07.550 00:29:09.860 Awaish Kumar: And why it makes queries faster.

316 00:29:10.230 00:29:13.240 Ademola: Okay, so the way indexes work, it’s just like…

317 00:29:13.690 00:29:17.470 Ademola: Helping you to search through your….

318 00:29:17.710 00:29:19.660 Awaish Kumar: Your values faster.

319 00:29:19.660 00:29:23.180 Ademola: You know, so the way I see it is, you have… you have a book.

320 00:29:23.480 00:29:28.660 Ademola: And the book shows you, like, an introduction to all the chapters in that book.

321 00:29:28.900 00:29:29.290 Awaish Kumar: I mean….

322 00:29:29.290 00:29:29.780 Ademola: to….

323 00:29:29.780 00:29:36.499 Awaish Kumar: So I understand, that, but it makes things faster.

324 00:29:36.690 00:29:37.570 Awaish Kumar: Right?

325 00:29:37.570 00:29:38.000 Ademola: Yay.

326 00:29:38.000 00:29:45.269 Awaish Kumar: But I… I want more technical answer, that how is that making indexes faster?

327 00:29:46.420 00:29:48.309 Awaish Kumar: What? Internally, what?

328 00:29:48.470 00:29:49.489 Awaish Kumar: It is doing.

329 00:29:49.490 00:29:51.939 Ademola: log rules are on. Yeah, okay.

330 00:29:52.400 00:30:00.089 Ademola: I know there are different methods for indexes. I know of one called B-tree.

331 00:30:00.390 00:30:10.479 Ademola: So, so it’s… It just basically points you to the unique Values in your…

332 00:30:10.590 00:30:12.909 Ademola: In your data sets, right?

333 00:30:13.230 00:30:17.459 Ademola: It makes you… so the unique values are the ones that are indexed as unique columns.
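
Illustrative sketch (not spoken in the interview): the interviewer is asking why an index speeds up a query. One way to see it, assuming Python's built-in sqlite3 module and a hypothetical transactions table, is to compare the query plan before and after creating an index. SQLite stores indexes as B-trees, so the engine can search for matching keys instead of scanning every row. The table name, column names, and index name below are assumptions made for this example only.

```python
import sqlite3

# Hypothetical table used only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE transactions ("
    "id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL)"
)
conn.executemany(
    "INSERT INTO transactions (customer_id, amount) VALUES (?, ?)",
    [(i % 1000, float(i)) for i in range(10_000)],
)

QUERY = "SELECT * FROM transactions WHERE customer_id = 42"


def query_plan(connection, sql):
    """Return the human-readable detail column of EXPLAIN QUERY PLAN."""
    rows = connection.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[3] for row in rows)


# Without an index the engine has to scan the whole table.
plan_before = query_plan(conn, QUERY)

# With an index on customer_id the engine can search the index instead.
conn.execute("CREATE INDEX idx_tx_customer ON transactions (customer_id)")
plan_after = query_plan(conn, QUERY)

matching_rows = conn.execute(QUERY).fetchall()

print("before:", plan_before)
print("after: ", plan_after)
```

The exact wording of the plan text varies between SQLite versions, so the sketch prints it rather than assuming a fixed string.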

334 00:30:17.460 00:30:22.050 Awaish Kumar: What is the difference between cluster indexing and non-clustered indexing?

335 00:30:23.720 00:30:33.029 Ademola: so… Clustered index, I want to believe, yeah, clustering Yeah, basically clustering

336 00:30:33.250 00:30:35.750 Ademola: The unique values, just one single column.

337 00:30:37.110 00:30:41.559 Ademola: While non-clustered, you are using multiple columns for your indexing.

338 00:30:43.150 00:30:44.810 Awaish Kumar: That’s multi-column index.

339 00:30:45.050 00:30:51.120 Awaish Kumar: But I’m not talking about… Yeah, cluster indexing, …

340 00:30:51.290 00:30:56.360 Awaish Kumar: Yeah, so okay, we can… Move ahead, …

341 00:30:59.580 00:31:00.520 Awaish Kumar: prom.

342 00:31:01.960 00:31:07.740 Awaish Kumar: Yeah, you told me about that you have worked a lot with transactions… transactional data.

343 00:31:08.150 00:31:16.180 Awaish Kumar: Yes, yes. So, when you are working with transactional data, What are the, like, properties, …

344 00:31:16.350 00:31:18.449 Awaish Kumar: A transactional system should have.

345 00:31:20.520 00:31:24.520 Ademola: So, when I say transaction, I mean financial transactions.

346 00:31:25.030 00:31:25.990 Ademola: Right?

347 00:31:26.120 00:31:30.090 Ademola: transactions, like, I work in a fintech. I work in a fintech.

348 00:31:30.290 00:31:35.350 Awaish Kumar: Okay, so you… okay, so you didn’t mean the transactional systems.

349 00:31:35.350 00:31:38.210 Ademola: I mean, ….

350 00:31:38.210 00:31:41.769 Awaish Kumar: So, we have some transactional system, and we have some analytical systems.

351 00:31:41.770 00:31:45.209 Ademola: Analytical, yes. So that’s OLTP and OLAP.

352 00:31:45.210 00:31:49.500 Awaish Kumar: Yeah, so for the transactional systems, what are some,

353 00:31:49.730 00:31:53.859 Awaish Kumar: Properties that the transaction system, should have.

354 00:31:54.970 00:31:58.099 Ademola: Okay, so transactional systems help you with…

355 00:31:58.410 00:32:01.940 Ademola: Fast reads and fast writes.

356 00:32:02.410 00:32:08.540 Ademola: Right? So it’s… it’s easier for you to see your line-by-line transactions.

357 00:32:08.800 00:32:16.509 Ademola: on your OLTP systems, right? You can easily query transactions. They are not aggregated on, like, OLAP systems.

358 00:32:16.660 00:32:21.399 Ademola: You know, in OLAP systems, your data… OLAP system actually gives you data from… Yeah, those are….

359 00:32:21.600 00:32:24.140 Awaish Kumar: There’s this called as imperative.

360 00:32:24.370 00:32:26.069 Awaish Kumar: You know, ACID proper….

361 00:32:27.150 00:32:27.920 Ademola: Sorry?

362 00:32:28.380 00:32:31.290 Awaish Kumar: You… do you know about ACID properties?

363 00:32:31.780 00:32:39.670 Ademola: Oh, ACID, okay, yes, so that’s atomicity… consistency, isolation and…

364 00:32:40.230 00:32:45.299 Ademola: durability, right? So, I think OLAP… those are what you’re talking about, OLAP has….
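
Illustrative sketch (not spoken in the interview): atomicity, the first of the ACID properties listed above, means a transaction either applies completely or not at all. A minimal way to observe this, assuming Python's built-in sqlite3 module and a hypothetical accounts table, is a transfer whose second step fails. The connection is used as a context manager, which commits on success and rolls back if an exception is raised. The table, the account names, and the transfer helper are assumptions made for this example.

```python
import sqlite3

# Hypothetical accounts table; the CHECK constraint forbids negative balances.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    "name TEXT PRIMARY KEY, balance INTEGER CHECK (balance >= 0))"
)
conn.executemany(
    "INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)]
)
conn.commit()


def transfer(connection, sender, receiver, amount):
    """Move money between accounts as one all-or-nothing transaction."""
    # "with connection" commits if the block succeeds and rolls back
    # every statement in the block if an exception is raised.
    with connection:
        connection.execute(
            "UPDATE accounts SET balance = balance + ? WHERE name = ?",
            (amount, receiver),
        )
        connection.execute(
            "UPDATE accounts SET balance = balance - ? WHERE name = ?",
            (amount, sender),
        )


try:
    # The credit to bob succeeds, but the debit would make alice's
    # balance negative, so the whole transfer is rolled back.
    transfer(conn, "alice", "bob", 500)
except sqlite3.IntegrityError:
    pass

balances = dict(conn.execute("SELECT name, balance FROM accounts"))
print(balances)
```

Both balances are unchanged afterwards, because the partial credit to the receiver was undone together with the failed debit.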

365 00:32:45.300 00:32:47.460 Awaish Kumar: What are different levels of isolation?

366 00:32:51.410 00:33:00.280 Ademola: The way it works is your transactions… transactions that happen Right? … isolated.

367 00:33:00.810 00:33:06.699 Ademola: Right? They actually happen on… What’s the word?

368 00:33:07.350 00:33:10.860 Ademola: They happen… any transaction that happens, right?

369 00:33:11.470 00:33:12.830 Ademola: Happens uniquely.

370 00:33:14.050 00:33:18.630 Awaish Kumar: I mean, that’s… Let’s talk about… …

371 00:33:20.390 00:33:23.119 Awaish Kumar: Are you familiar with the data… data modeling?

372 00:33:24.090 00:33:25.900 Awaish Kumar: Yeah, data modeling, yes.

373 00:33:26.210 00:33:27.540 Ademola: So we have….

374 00:33:28.710 00:33:29.620 Awaish Kumar: Wow.

375 00:33:30.330 00:33:39.370 Awaish Kumar: So… for… Transactional systems, we have, ….

376 00:33:39.480 00:33:43.680 Ademola: for example, ERDs and things like that, right?

377 00:33:43.930 00:33:44.600 Ademola: Yes.

378 00:33:45.130 00:33:48.989 Awaish Kumar: entity relationship modeling, and then we build the ERD diagram.

379 00:33:49.330 00:33:59.059 Awaish Kumar: And from… For data warehouse system, We have, … Different kind of modeling.

380 00:34:00.220 00:34:02.019 Awaish Kumar: So, if you can describe any….

381 00:34:03.600 00:34:07.140 Ademola: But for, analytical, we have the star schema.

382 00:34:07.290 00:34:08.980 Ademola: And the snowflake.

383 00:34:09.330 00:34:10.159 Ademola: schema.

384 00:34:10.639 00:34:16.430 Awaish Kumar: Okay, that is… that is part of the dimensional modeling. These both are…

385 00:34:16.639 00:34:19.130 Awaish Kumar: part of dimensional modeling. Do you know any

386 00:34:19.380 00:34:22.520 Awaish Kumar: Other data modeling than dimensional modeling.

387 00:34:29.090 00:34:33.890 Ademola: … Aside dimensional modeling.

388 00:34:35.469 00:34:44.219 Awaish Kumar: Yeah, dimension modeling is star schema, snowflake schema that you just… what you just said, but is there any other data modeling?

389 00:34:44.369 00:34:48.499 Awaish Kumar: For data… for, analytical systems.

390 00:34:51.080 00:34:55.959 Ademola: Can we say Inmon and Kimball? I know Kimball… Kimball is also, like, a dimensional.

391 00:34:55.960 00:34:57.540 Awaish Kumar: In ballers, 10 ballers.

392 00:34:58.250 00:34:59.650 Awaish Kumar: Sasukemonite.

393 00:35:00.390 00:35:01.100 Ademola: Yeah.

394 00:35:02.850 00:35:06.439 Ademola: I’m not sure I know any other type of modeling aside.

395 00:35:06.940 00:35:07.710 Awaish Kumar: So…

396 00:35:10.760 00:35:13.650 Awaish Kumar: When you should use star schema.

397 00:35:14.450 00:35:18.109 Awaish Kumar: And when you should use snowflake schema.

398 00:35:18.470 00:35:21.970 Awaish Kumar: Can you just give any examples for…

399 00:35:22.350 00:35:25.010 Awaish Kumar: One example for each of these.

400 00:35:25.690 00:35:30.940 Ademola: Okay, so for star schemas, star schemas usually have one fact table.

401 00:35:31.170 00:35:41.160 Ademola: Right, and multiple dimensions. So, I mean, sales, for example, we have… … Your transactional table.

402 00:35:41.570 00:35:47.120 Ademola: Right? And the dimensions that speak to those… transactions.

403 00:35:47.350 00:35:51.249 Ademola: And they are joined by a foreign key and a primary key.

404 00:35:51.470 00:35:58.590 Ademola: Right? But for a dimensional… sorry, for… but for a snowflake, Murio.

405 00:35:59.170 00:36:08.970 Ademola: One fact is also joined to a dimension, which is also joined to another dimension, which is known as a hierarchy.

406 00:36:09.010 00:36:23.819 Ademola: Right? There’s introduction of a new table. For example, in sales, product table, and the product category, right? The product category will be connected to the product table, then the product table will also be connected to

407 00:36:24.680 00:36:27.600 Ademola: The transaction table, which is the facts.

408 00:36:27.920 00:36:34.699 Ademola: So that’s, like, the difference. There’s an introduction of a new table that does a hierarchy.
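
Illustrative sketch (not spoken in the interview): the sales example above can be written out with Python's built-in sqlite3 module. In the star version the category name sits directly on the product dimension. In the snowflake version the category is split into its own table, so the same report needs one more join. All table and column names and the sample rows are assumptions made for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Star schema: the fact table joins to one flat product dimension.
    CREATE TABLE star_dim_product (
        product_id INTEGER PRIMARY KEY,
        product_name TEXT,
        category_name TEXT
    );
    CREATE TABLE star_fact_sales (
        sale_id INTEGER PRIMARY KEY,
        product_id INTEGER REFERENCES star_dim_product (product_id),
        amount REAL
    );

    -- Snowflake schema: the category is normalized into its own table.
    CREATE TABLE snow_dim_category (
        category_id INTEGER PRIMARY KEY,
        category_name TEXT
    );
    CREATE TABLE snow_dim_product (
        product_id INTEGER PRIMARY KEY,
        product_name TEXT,
        category_id INTEGER REFERENCES snow_dim_category (category_id)
    );
    CREATE TABLE snow_fact_sales (
        sale_id INTEGER PRIMARY KEY,
        product_id INTEGER REFERENCES snow_dim_product (product_id),
        amount REAL
    );

    INSERT INTO star_dim_product VALUES
        (1, 'Laptop', 'Electronics'),
        (2, 'Phone', 'Electronics'),
        (3, 'Desk', 'Furniture');
    INSERT INTO snow_dim_category VALUES (1, 'Electronics'), (2, 'Furniture');
    INSERT INTO snow_dim_product VALUES
        (1, 'Laptop', 1), (2, 'Phone', 1), (3, 'Desk', 2);
    INSERT INTO star_fact_sales VALUES
        (1, 1, 1000.0), (2, 2, 500.0), (3, 3, 200.0), (4, 1, 1200.0);
    INSERT INTO snow_fact_sales SELECT * FROM star_fact_sales;
    """
)

# Star: one join from the fact table to the dimension.
star_revenue = conn.execute(
    """
    SELECT p.category_name, SUM(f.amount)
    FROM star_fact_sales AS f
    JOIN star_dim_product AS p ON p.product_id = f.product_id
    GROUP BY p.category_name
    ORDER BY p.category_name
    """
).fetchall()

# Snowflake: fact -> product -> category, so two joins.
snow_revenue = conn.execute(
    """
    SELECT c.category_name, SUM(f.amount)
    FROM snow_fact_sales AS f
    JOIN snow_dim_product AS p ON p.product_id = f.product_id
    JOIN snow_dim_category AS c ON c.category_id = p.category_id
    GROUP BY c.category_name
    ORDER BY c.category_name
    """
).fetchall()

print(star_revenue)
print(snow_revenue)
```

Both queries return the same revenue per category. The trade-off is that the star version repeats the category name on every product row, while the snowflake version stores it once at the cost of the extra join.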

409 00:36:34.840 00:36:37.490 Awaish Kumar: I want to understand, like, when to use

410 00:36:37.660 00:36:42.040 Awaish Kumar: Which, if I am… for example, you are a data modeler.

411 00:36:42.990 00:36:48.330 Awaish Kumar: at Brainforge, a client comes in with his requirements.

412 00:36:48.520 00:36:51.719 Awaish Kumar: And you are expected to build some data model.

413 00:36:52.360 00:36:59.289 Awaish Kumar: So, as a data modeler, now it is your choice if you are going to make a star schema or a snowflake schema.

414 00:36:59.850 00:37:06.570 Awaish Kumar: Hot… Like, what will be your… Currently.

415 00:37:07.170 00:37:11.889 Awaish Kumar: Like, what will be your thinking, thought process? Like, how would you make decisions?

416 00:37:12.150 00:37:20.740 Awaish Kumar: Like, should I make a star schema? Should I make a snowflake schema? So what will be some, things which will drive your decision?

417 00:37:24.630 00:37:27.150 Ademola: Firstly, I’ll look at the size of the data.

418 00:37:27.490 00:37:31.790 Ademola: Right? So the size of the data also determines which

419 00:37:32.470 00:37:33.949 Ademola: model to use.

420 00:37:34.160 00:37:39.069 Ademola: And… While looking at the size of the data, also look at the categories involved.

421 00:37:39.730 00:37:43.019 Ademola: Right? When there are too many categories.

422 00:37:43.450 00:37:48.840 Ademola: I’ll just… Stick to using, … star schema.

423 00:37:48.990 00:37:52.790 Ademola: But when the categories are alerts, when they are…

424 00:37:53.370 00:38:03.600 Ademola: for example, when the product table also has more hierarchies, right, there are more… columns that describe the…

425 00:38:03.740 00:38:09.740 Ademola: product table. I will just introduce a new table for you to be joined to

426 00:38:10.260 00:38:14.510 Ademola: the product table so that we can have a star schema, because I believe that will be faster.

427 00:38:15.200 00:38:16.109 Ademola: In that case.

428 00:38:20.060 00:38:23.189 Awaish Kumar: So, you haven’t considered the system itself?

429 00:38:23.970 00:38:26.909 Awaish Kumar: how bigger the system bank has.

430 00:38:27.930 00:38:34.969 Awaish Kumar: You haven’t considered, like, the system, right? The… the… The database, the data warehouse.

431 00:38:35.300 00:38:45.900 Awaish Kumar: the power of data warehouse, what it is, then the size of the data, then what are the requirements from the client, like, what he needs? Is he… does he need a…

432 00:38:46.550 00:38:48.120 Awaish Kumar: reliable data.

433 00:38:48.650 00:38:56.260 Awaish Kumar: non-redundant, reliable, and with a lot of integrity, or he requires more, like, faster processing.

434 00:38:56.260 00:38:57.069 Ademola: Faster, yeah.

435 00:38:57.070 00:38:59.600 Awaish Kumar: Like….

436 00:39:00.460 00:39:10.020 Ademola: Those ones are important, so I think I forgot to add all of those. So, the first thing is to consider the, the data warehouse, like you said.

437 00:39:10.200 00:39:15.159 Ademola: Right? Because it is important to know Understand the requirement deeply.

438 00:39:15.620 00:39:23.540 Ademola: I know you guys do a lot of data gathering, data requirements, meetings with your stakeholders for them to understand their pain points.

439 00:39:23.700 00:39:33.760 Ademola: Right? Once all of those are understood, I can now… start… working on… requirements, basically.

440 00:39:36.980 00:39:37.760 Awaish Kumar: Okay.

441 00:39:38.130 00:39:46.110 Awaish Kumar: So have you worked on any project where you are directly gathering requirements from stakeholders?

442 00:39:47.360 00:39:48.780 Ademola: Oh, yes, yes I am.

443 00:39:50.580 00:39:51.300 Ademola: So….

444 00:39:51.300 00:39:51.830 Awaish Kumar: just built it.

445 00:39:51.830 00:39:52.470 Ademola: Anyways….

446 00:39:52.470 00:39:55.249 Awaish Kumar: Have you made a decision on data modeling?

447 00:39:56.750 00:40:00.870 Ademola: On data modeling, … So, TD…

448 00:40:01.070 00:40:08.890 Ademola: The type of modeling I’ve done is on Power BI. You know, Power BI does… we do a lot of data modeling when it comes to fact and dimensions.

449 00:40:09.500 00:40:13.600 Ademola: Alright, it’s… but… but typically….

450 00:40:13.600 00:40:16.569 Awaish Kumar: I’m more interested in, like, data warehouse modeling, right?

451 00:40:17.620 00:40:19.269 Awaish Kumar: Thank you.

452 00:40:19.680 00:40:23.900 Awaish Kumar: Where you have some scale, of your….

453 00:40:24.680 00:40:26.789 Awaish Kumar: system. The data is big.

454 00:40:27.140 00:40:31.059 Awaish Kumar: And we have quite a few hundreds of tables we are dealing with.

455 00:40:31.640 00:40:32.750 Awaish Kumar: Beautiful.

456 00:40:32.750 00:40:35.349 Ademola: Oh, okay, yeah, I’ve, I’ve not, I’ve not really…

457 00:40:35.770 00:40:39.600 Ademola: done that, like, on a face-to-face level. I’ve done that in the team.

458 00:40:39.810 00:40:41.059 Ademola: Where they lead.

459 00:40:41.260 00:40:56.079 Ademola: Right? So, when we gather the requirements, I’ve just told, okay, this is what the clients want, or this is what our stakeholders want. Then we are building based on the requirements. I’ve not been the one, like, doing the interfacing.

460 00:40:56.410 00:40:58.469 Ademola: with the stakeholders.

461 00:40:59.430 00:41:00.200 Awaish Kumar: Okay.

462 00:41:02.450 00:41:08.719 Awaish Kumar: But do you feel comfortable facing directly stakeholders.

463 00:41:09.190 00:41:13.049 Ademola: Oh, yes, yes, I feel comfortable, because I understand that

464 00:41:13.510 00:41:17.410 Ademola: To solve any problem, you have to be able to communicate really well.

465 00:41:17.660 00:41:28.239 Ademola: I’m not very good at that, right? And communication is not just about you speaking, it’s also about understanding your stakeholders, right?

466 00:41:28.620 00:41:34.129 Ademola: I feel if I’m even given time to, like, brush up on

467 00:41:34.470 00:41:36.260 Ademola: Many of the things you guys do.

468 00:41:36.520 00:41:41.939 Ademola: I’m sure I can pick it up in no time, because I’m a fast learner, and…

469 00:41:42.270 00:41:45.030 Ademola: when I watch you do it, because I know you’re a master in this field.

470 00:41:45.430 00:41:49.370 Ademola: I’ll be able to, like, dendribo.

471 00:41:50.540 00:41:51.220 Ademola: Trust.

472 00:41:53.800 00:42:01.440 Awaish Kumar: Okay, and … What a… what is your career path?

473 00:42:01.640 00:42:02.780 Awaish Kumar: from…

474 00:42:06.180 00:42:11.030 Awaish Kumar: Like, you made a decision to become a data engineer from data analyst?

475 00:42:11.370 00:42:15.740 Awaish Kumar: Where do you see yourself in next, 2 to 3 years?

476 00:42:16.890 00:42:22.100 Ademola: Yeah, so in the next 2-3 years, I also… I would love to continue as a data engineer.

477 00:42:22.340 00:42:26.200 Ademola: But right now, in my country, Nigeria, we…

478 00:42:26.410 00:42:30.080 Ademola: Are not too focused on the modern data stack.

479 00:42:30.260 00:42:37.379 Ademola: Right? Many of the tools we use are, like, … legacy systems, SSMS, you know.

480 00:42:37.490 00:42:47.699 Ademola: old, old tools. But these days, if you go to the West, in Canada, places like Canada, the US, many of them are adopting tools like dbt, Cloud.

481 00:42:48.370 00:42:53.139 Ademola: Snowflake, BigQuery, tools like that. I’m looking for where I can

482 00:42:53.500 00:42:57.679 Ademola: role so that I can be a master in that kind of… in those kind of tools.

483 00:42:57.910 00:43:04.580 Ademola: Right, so that in the next 3 years, I’ll probably be… Data engineer, just like you.

484 00:43:06.880 00:43:13.140 Awaish Kumar: Okay. … So that’s it from my side. Do you have any questions?

485 00:43:14.760 00:43:17.230 Ademola: Well, first question would be…

486 00:43:17.680 00:43:24.980 Ademola: So, this Brainforge, is it, like, a full-time role, or is it… Contractor, is it part-time?

487 00:43:25.470 00:43:29.690 Awaish Kumar: … Okay, if… yeah, like, if…

488 00:43:30.080 00:43:32.769 Awaish Kumar: If hired, like, it can be a full-time.

489 00:43:33.910 00:43:38.130 Awaish Kumar: Normally, it starts with a part-time.

490 00:43:40.020 00:43:42.880 Awaish Kumar: Who, like, as, as you…

491 00:43:43.260 00:43:49.149 Awaish Kumar: Like, as, like, part of hiring process, you can say, like, you join part-time, you work with us.

492 00:43:49.420 00:43:56.479 Awaish Kumar: It will be paid, right, obviously, but you are going to understand the company, and we are going to understand you, and then

493 00:43:56.810 00:43:58.740 Awaish Kumar: Yeah, you can be made.

494 00:43:58.890 00:44:01.340 Awaish Kumar: For Diamond and Lamb.

495 00:44:02.010 00:44:05.250 Ademola: Yeah, that makes sense. I think my second question would be.

496 00:44:05.530 00:44:10.849 Ademola: For you guys right now, what would you say is your biggest problems that you’re trying to solve?

497 00:44:11.050 00:44:12.909 Ademola: As a data team.

498 00:44:14.310 00:44:19.300 Awaish Kumar: Yeah, like, what are tickets? Like, we are looking, like, we are…

499 00:44:21.050 00:44:23.750 Awaish Kumar: A startup company, so there are a lot of

500 00:44:24.030 00:44:31.700 Awaish Kumar: Things, new, things coming every day, new challenges, new clients, new data.

501 00:44:31.840 00:44:33.759 Awaish Kumar: So, yeah.

502 00:44:34.050 00:44:35.679 Awaish Kumar: Like, we have…

503 00:44:35.800 00:44:47.090 Awaish Kumar: For example, new clients coming in, maybe from different industries, so we need someone who is a fast learner, can adapt to new client systems and their business.

504 00:44:47.220 00:44:56.460 Awaish Kumar: Quickly, and can, like, wrap up things faster and deliver value to the clients quickly.

505 00:44:56.770 00:45:01.710 Awaish Kumar: And then, like, we are making sure that,

506 00:45:02.390 00:45:08.590 Awaish Kumar: The comp… the team, in our company, like, performs well,

507 00:45:09.750 00:45:15.220 Awaish Kumar: Like, yeah, like, continuity solution with, like, best practices.

508 00:45:15.360 00:45:24.980 Awaish Kumar: And, … And deliver the best possible solutions in a reasonable… timeframe.

509 00:45:25.430 00:45:29.910 Awaish Kumar: So… That’s, like, yeah, it’s just usual…

510 00:45:30.070 00:45:44.269 Awaish Kumar: startup things, which we have, like, it’s a lot of… lot of things to grow, fast… it’s fast-pacing. Sometimes you have, obviously, pressure from clients, some deadlines to meet, but that’s the part of the…

511 00:45:44.700 00:45:45.979 Awaish Kumar: Part of the role, yeah.

512 00:45:47.760 00:45:55.539 Ademola: That makes sense. Like, is there also a structure to the data engineering team? Like, are there, like, other data engineers aside you?

513 00:45:55.970 00:45:58.219 Ademola: Doing the data engineering work.

514 00:46:02.810 00:46:04.179 Awaish Kumar: Sorry, can you repeat?

515 00:46:04.800 00:46:08.540 Ademola: So, I was saying, is there, like, a structure to the data engineering team?

516 00:46:08.800 00:46:14.380 Ademola: Like, are there other data engineers, or data analysts, or analytics engineer?

517 00:46:14.380 00:46:22.540 Awaish Kumar: I must… yeah, we have data analysts, we have data analytics engineers, we have data engineers, I, myself.

518 00:46:23.010 00:46:26.860 Awaish Kumar: like, as a data engineer, but I’m kind of an…

519 00:46:27.290 00:46:30.900 Awaish Kumar: From, like, what to say, Full stack.

520 00:46:31.040 00:46:36.109 Awaish Kumar: that I can do… all of… any of these, I can be in.

521 00:46:36.110 00:46:37.430 Ademola: Any of these roles.

522 00:46:37.430 00:46:37.900 Awaish Kumar: Yeah.

523 00:46:37.900 00:46:39.089 Ademola: End to end.

524 00:46:39.690 00:46:51.189 Awaish Kumar: And the same, like, with our CEO. He’s, he has been a data engineer himself, but he can be put in a role anywhere, in the data pipeline.

525 00:46:51.250 00:47:04.540 Awaish Kumar: So we… mainly, we are the data engineers here. Then we have some, specific data analytics engineers, then we have some, dedicated, data analysts as well.

526 00:47:04.690 00:47:12.730 Awaish Kumar: But it’s like, sometimes you can be put into the different roles, because, as I mentioned, we are a startup, and …

527 00:47:13.210 00:47:14.330 Awaish Kumar: …

528 00:47:14.490 00:47:23.819 Awaish Kumar: consultancy firm. So, for example, you are hired… if you are hired as a full-time, as a data engineer, but we don’t… if we don’t have, for example,

529 00:47:25.330 00:47:32.949 Awaish Kumar: But if we have more requirement from some client to do some AE work or something, like, we step…

530 00:47:33.170 00:47:35.970 Awaish Kumar: Into other roles as well, sometimes.

531 00:47:36.330 00:47:43.310 Awaish Kumar: So, while doing data engineering, you might get something… do some data modeling for your client, or something like that.

532 00:47:44.490 00:47:47.380 Ademola: Yeah, I like that, I like that. That makes sense.

533 00:47:48.200 00:47:50.609 Ademola: Let’s no… that would be all my questions.

534 00:47:50.880 00:47:57.360 Ademola: So, like, is there any, like, what’s the next stage to this? Will I get a response again?

535 00:47:57.360 00:48:07.910 Awaish Kumar: So… next stage would be the… I think, Rico from our… operations. He’s… he’s going to…

536 00:48:08.100 00:48:09.240 Awaish Kumar: …

537 00:48:10.170 00:48:23.900 Awaish Kumar: like, take the… like, we’re going to, like, tell you about the next steps, and what that be, and, maybe he, he, he will do it by, kind of, end of this week, maybe.

538 00:48:25.660 00:48:27.049 Ademola: That makes sense, alright.

539 00:48:27.420 00:48:32.130 Ademola: So, thank you so much. I connected with you on LinkedIn also, I hope you saw my connection request.

540 00:48:33.220 00:48:39.800 Awaish Kumar: Yeah, yeah, thank you. It was nice meeting you, and … yeah, I’ll pass my feedback.

541 00:48:40.080 00:48:43.839 Awaish Kumar: Today, and then we’ll see, like, our team is going to connect with you.

542 00:48:44.030 00:48:44.790 Awaish Kumar: Thank you.

543 00:48:44.790 00:48:47.300 Ademola: Alright, thank you so much, thank you. Bye-bye.