Meeting Title: Brainforge Interview w- Awaish Date: 2026-02-19 Meeting participants: Smiti Kothari, Awaish Kumar


WEBVTT

1 00:02:20.350 00:02:21.200 Awaish Kumar: Hello?

2 00:02:21.490 00:02:23.639 Smiti Kothari: Hi, Awaish, how are you doing?

3 00:02:24.010 00:02:25.360 Awaish Kumar: I’m good. How about you?

4 00:02:25.700 00:02:28.880 Smiti Kothari: I’m doing good. I just finished a class.

5 00:02:29.800 00:02:34.800 Awaish Kumar: Okay, okay, are you, like, in university or something?

6 00:02:35.190 00:02:37.400 Smiti Kothari: Yeah, I’m in my library.

7 00:02:38.760 00:02:42.110 Awaish Kumar: I mean, like, are you… what are you studying right now?

8 00:02:42.110 00:02:45.700 Smiti Kothari: Oh, okay, I’m doing a Master’s in Computer Science, it’s my last sem.

9 00:02:46.390 00:02:48.420 Awaish Kumar: Okay, and where are you studying?

10 00:02:48.960 00:02:53.819 Smiti Kothari: At North Carolina State University. It’s in Raleigh, North Carolina.

11 00:02:54.330 00:02:54.980 Awaish Kumar: Okay.

12 00:02:55.890 00:02:59.409 Awaish Kumar: Yeah, nice to meet you, and…

13 00:02:59.920 00:03:05.090 Awaish Kumar: Yeah, in this… basically, in this interview, I’m just going to…

14 00:03:05.610 00:03:13.200 Awaish Kumar: I’ll give a little… a brief intro of myself and the company, and then we can start with your introduction, and

15 00:03:13.540 00:03:25.800 Awaish Kumar: After that, we are just going to chat about your experiences and your… what you have been doing so far. So, my name is Awaish Kumar, and I’m kind of leading data engineering.

16 00:03:27.390 00:03:28.360 Awaish Kumar: here.

17 00:03:28.900 00:03:29.850 Awaish Kumar: Oh, no doubt.

18 00:03:30.960 00:03:35.610 Awaish Kumar: Basically, I have around 8 to 10 years of experience working as a data engineer.

19 00:03:35.840 00:03:42.210 Awaish Kumar: At Brainforge, I’ve been for a year now, and

20 00:03:42.470 00:03:48.469 Awaish Kumar: Basically, Brainforge provides data and AI consultancy services

21 00:03:48.930 00:03:53.109 Awaish Kumar: To mid- to large-scale organizations.

22 00:03:53.300 00:03:56.750 Awaish Kumar: Most of our clients currently are in the United States.

23 00:03:56.750 00:03:58.289 Smiti Kothari: But we are…

24 00:03:59.010 00:04:08.980 Awaish Kumar: But, yeah, Brainforge works remotely, so we have employees across the world, so from US, Europe,

25 00:04:09.310 00:04:11.760 Awaish Kumar: Asia, from everywhere.

26 00:04:13.280 00:04:16.029 Awaish Kumar: Yeah, so… and because it’s a…

27 00:04:16.649 00:04:25.689 Awaish Kumar: like, the completely remote company. We normally are flexible with how people work, but yeah, we require some…

28 00:04:26.420 00:04:31.089 Awaish Kumar: Some few hours of overlap, with the clients and the team.

29 00:04:31.350 00:04:33.519 Awaish Kumar: So, yeah, that’s basically it.

30 00:04:34.370 00:04:36.180 Awaish Kumar: So, yeah, you can start.

31 00:04:36.370 00:04:51.300 Smiti Kothari: Yeah, sure. So, basically, I am a master’s student in computer science at NC State, and throughout my career, I have worked on a mix of startup and enterprise organizations, where my experience is basically focused on building data pipelines and analytical systems.

32 00:04:51.300 00:04:57.919 Smiti Kothari: So basically, recently, during this summer, I worked at Assured Guaranty, where I built Python- and Snowflake-based

33 00:04:57.920 00:05:04.059 Smiti Kothari: pipelines that automated hazard exposure analysis for over, like, 10,000 assets, and

34 00:05:04.240 00:05:09.399 Smiti Kothari: Before that, I also worked at Northern Trust, where I worked on Kafka-based validation systems.

35 00:05:09.490 00:05:28.590 Smiti Kothari: to improve reliability across financial data streams. So, like, I’m interested in building scalable data systems that directly support the business decisions, and I, went through the website, and, like, I saw that… that the company is trying to help companies use their data to actually move their business forward.

36 00:05:29.380 00:05:40.649 Smiti Kothari: And it also, like, works with clients and focuses on tailored solutions, so, like, this is interesting to me because I enjoy understanding the business problem first, and then designing a solution around it.

37 00:05:43.070 00:05:44.000 Awaish Kumar: Okay.

38 00:05:44.110 00:05:51.360 Awaish Kumar: Can we talk a little bit more about, your work experiences? Like… How…

39 00:05:51.670 00:05:54.050 Awaish Kumar: How many years of experience do you have, I know?

40 00:05:54.410 00:06:12.280 Smiti Kothari: So, basically, I, have internship experience, I have no work experience, basically. I did my bachelor’s from 2020 to 2024, and did my master’s from 2024 to 2026. So, there is no work experience, but I have worked, internship… I have done internships at various startups and enterprise organizations.

41 00:06:12.280 00:06:15.129 Awaish Kumar: Okay, and you have done your bachelor’s in the US also?

42 00:06:15.130 00:06:16.789 Smiti Kothari: No, Bachelor’s in India.

43 00:06:17.250 00:06:18.100 Awaish Kumar: Okay.

44 00:06:18.840 00:06:27.799 Awaish Kumar: So… Since you mentioned some… about some of your… projects, so can you…

45 00:06:28.020 00:06:31.340 Awaish Kumar: Like, just take one project as an example.

46 00:06:31.820 00:06:32.280 Smiti Kothari: Oh, got it.

47 00:06:32.280 00:06:36.880 Awaish Kumar: Just walk me through it, with exactly mentioning where you…

48 00:06:37.920 00:06:40.820 Awaish Kumar: you were working. Like, in a project, you might have…

49 00:06:41.040 00:06:44.389 Awaish Kumar: Multiple people, but you just, like, have to…

50 00:06:44.870 00:06:52.290 Awaish Kumar: Define where actually you have been hands-on doing the work, and where the team was supporting you.

51 00:06:53.020 00:07:09.559 Smiti Kothari: Yeah, so I will just list my summer internship experience, which I did in New York at a financial company. It was… the name of the company was Assured Guaranty. So, basically, the problem started, I had to build a geospatial risk analytics project.

52 00:07:09.560 00:07:21.209 Smiti Kothari: This started after a major flood event exposed that how manual and slow the assessment process was. Like, whenever something like a flood or wildfire happened in the US,

53 00:07:21.210 00:07:35.449 Smiti Kothari: The company had to manually reach out to asset owners to figure out that whether their insured facilities were affected. So, that process was obviously time-consuming. So, the goal was to automate it and make it scalable for the future event. So, I was the only intern working there.

54 00:07:35.450 00:07:42.870 Smiti Kothari: And I built a Python-based pipeline that would pull hazard data from external sources like NOAA and FEMA.

55 00:07:42.920 00:07:48.420 Smiti Kothari: These are the data sites that are used widely in the US for hazards.

56 00:07:48.500 00:08:05.579 Smiti Kothari: And since these APIs pulled… published both real-time and historical updates, I added logic to track whenever new hazard data was released, so the system would automatically detect new or updated datasets, and validate the schema to make sure nothing had changed unexpectedly.
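
[Editor's sketch, not spoken in the interview: one way the schema check described above could look. The column names and types in EXPECTED_SCHEMA are hypothetical placeholders; the interview does not say what the NOAA or FEMA records contain or how the real check was written.]

```python
from typing import Dict, List

# Hypothetical expected schema: column name -> Python type.
EXPECTED_SCHEMA: Dict[str, type] = {
    "event_id": str,
    "hazard_type": str,
    "severity": int,
}


def schema_problems(record: dict, expected: Dict[str, type] = EXPECTED_SCHEMA) -> List[str]:
    """Return human-readable schema problems for one record (empty list if none)."""
    problems = []
    # Every expected column must be present and have the expected type.
    for column, expected_type in expected.items():
        if column not in record:
            problems.append(f"missing column: {column}")
        elif not isinstance(record[column], expected_type):
            problems.append(
                f"wrong type for {column}: expected {expected_type.__name__}, "
                f"got {type(record[column]).__name__}"
            )
    # Columns that appear without warning usually signal an upstream change.
    for column in record:
        if column not in expected:
            problems.append(f"unexpected column: {column}")
    return problems
```

[A pipeline could refuse to load a batch whenever this list is non-empty, so that an upstream change is noticed before it reaches the warehouse.]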

57 00:08:06.350 00:08:10.309 Smiti Kothari: And then I would load that data into Snowflake in a structured format.

58 00:08:10.920 00:08:16.790 Smiti Kothari: And I also maintained the raw layer so that, we could always trace what was ingested and when.

59 00:08:17.030 00:08:29.419 Smiti Kothari: So this could help with reliability. And once the hazard data was stored, the pipeline ran spatial intersections between the hazard zones and the geolocated insured assets.

60 00:08:29.610 00:08:36.779 Smiti Kothari: So that allowed us to calculate impact metrics, such as whether an asset was affected, and how much of it overlapped with a hazard zone.
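
[Editor's sketch, not spoken in the interview: a simplified illustration of the overlap metric described above. It assumes assets and hazard zones are axis-aligned rectangles, which is a deliberate simplification; real footprints are polygons and would normally go through a geospatial library. The Box type and overlap_fraction function are hypothetical names.]

```python
from typing import NamedTuple


class Box(NamedTuple):
    """Axis-aligned rectangle standing in for a real footprint polygon."""
    min_x: float
    min_y: float
    max_x: float
    max_y: float

    def area(self) -> float:
        # A box with inverted bounds (no overlap) has zero area.
        width = max(0.0, self.max_x - self.min_x)
        height = max(0.0, self.max_y - self.min_y)
        return width * height


def overlap_fraction(asset: Box, hazard: Box) -> float:
    """Fraction of the asset's area that lies inside the hazard zone (0.0 to 1.0)."""
    intersection = Box(
        max(asset.min_x, hazard.min_x),
        max(asset.min_y, hazard.min_y),
        min(asset.max_x, hazard.max_x),
        min(asset.max_y, hazard.max_y),
    )
    asset_area = asset.area()
    if asset_area == 0:
        return 0.0
    return intersection.area() / asset_area
```

[An asset counts as affected when the fraction is greater than zero; the fraction itself is the "how much of it overlapped" figure.]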

61 00:08:40.710 00:08:41.720 Smiti Kothari: So that was…

62 00:08:42.760 00:08:51.050 Awaish Kumar: Yeah, my question is, I want to… yeah, like, although you mentioned Python-based scripts, my question is…

63 00:08:51.530 00:08:59.970 Awaish Kumar: like, how… like, that’s my question, like, how did you implement those scripts? What kind of… Oh, like, the…

64 00:09:01.360 00:09:03.660 Awaish Kumar: But you see, architecture you followed.

65 00:09:03.870 00:09:09.470 Awaish Kumar: How exactly did you ingest it, how those scripts were orchestrated, right?

66 00:09:10.120 00:09:21.010 Smiti Kothari: So, basically, the… I use those datasets, FEMA and NOAA. So, these are… these are publicly available datasets. So, whenever… whenever they publish new datasets, I, put…

67 00:09:21.010 00:09:21.989 Awaish Kumar: How are you gonna detect it?

68 00:09:23.070 00:09:28.569 Smiti Kothari: I… basically, I do this using tracking the APIs. They are the APIs.

69 00:09:29.370 00:09:32.479 Awaish Kumar: Okay, how do you track that? That’s the question.

70 00:09:33.750 00:09:35.330 Smiti Kothari: So I use those APIs.

71 00:09:36.950 00:09:41.680 Awaish Kumar: Yeah, yeah, like, you use APIs, I understand that, but, like,

72 00:09:42.350 00:09:49.479 Awaish Kumar: what was the thing that actually triggers? Like, okay, there is some data, but API doesn’t, like…

73 00:09:49.800 00:09:55.019 Awaish Kumar: to run on its own, like, you have to have some infra, Which used to run.

74 00:09:55.660 00:10:01.930 Smiti Kothari: So, basically, It fetches, like, whenever a new data comes, the API

75 00:10:02.110 00:10:08.439 Smiti Kothari: has some new data, so I detect that if a new data has come, I, push those data into a folder.

76 00:10:08.850 00:10:12.520 Awaish Kumar: Yeah, but how that detection happened, that’s the question.

77 00:10:18.640 00:10:19.320 Smiti Kothari: Thanks.

78 00:10:21.640 00:10:22.740 Awaish Kumar: Oh, okay.

79 00:10:22.740 00:10:23.370 Smiti Kothari: using it?

80 00:10:24.010 00:10:28.159 Awaish Kumar: Yeah, like, data is pulled through API, I understand that.

81 00:10:28.260 00:10:31.910 Awaish Kumar: But how do I know that that new data has arrived? I have to pull.

82 00:10:32.330 00:10:36.120 Awaish Kumar: I don’t know, right? I have to have a system to know that.

83 00:10:40.370 00:10:43.860 Smiti Kothari: So, basically, I read that whenever

84 00:10:44.470 00:10:49.730 Smiti Kothari: I use the API, and I track that whenever the new data comes, I push that data into the folders.

85 00:10:50.520 00:10:58.110 Awaish Kumar: No, no, that’s okay. Like, we are not managing their system, right? As you mentioned, few names, like NOAA and…

86 00:10:58.110 00:10:59.000 Smiti Kothari: Yeah.

87 00:10:59.000 00:11:01.960 Awaish Kumar: Yeah, that they push some data.

88 00:11:02.190 00:11:02.550 Smiti Kothari: Yeah.

89 00:11:02.960 00:11:07.760 Awaish Kumar: somewhere in their database, and we have access to API to pull it from…

90 00:11:07.940 00:11:22.269 Awaish Kumar: Yeah. From their database. But then, I don’t know, like, I have an API, I can access it using my… this API, I can access their system to get their data, but how do I know that the new system, new data has arrived?

91 00:11:22.370 00:11:27.019 Awaish Kumar: I can actually… I can run it on some schedule, but I don’t know the…

92 00:11:27.320 00:11:32.259 Smiti Kothari: So, like, as far as I remember, I used an asynchronous events pipeline to detect the new data.
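
[Editor's sketch, not spoken in the interview: the question above is how a pull-based pipeline knows that new data has arrived. One common answer is to poll on a schedule and compare a fingerprint of the response with the last one seen. The NewDataDetector class and the fetch callable are hypothetical; the interview does not establish which mechanism the project actually used.]

```python
import hashlib
import json
from typing import Callable, Optional


def fingerprint(payload: dict) -> str:
    """Stable hash of a payload; key order does not affect the result."""
    canonical = json.dumps(payload, sort_keys=True).encode("utf-8")
    return hashlib.sha256(canonical).hexdigest()


class NewDataDetector:
    """Remembers the last fingerprint seen and reports when it changes."""

    def __init__(self, fetch: Callable[[], dict]) -> None:
        self._fetch = fetch  # hypothetical: wraps the real API call
        self._last_seen: Optional[str] = None

    def poll(self) -> Optional[dict]:
        """Fetch once; return the payload only if it differs from the previous poll."""
        payload = self._fetch()
        current = fingerprint(payload)
        if current == self._last_seen:
            return None  # nothing new since the last poll
        self._last_seen = current
        return payload
```

[In a deployed setup, something outside the script (a cron job or an orchestrator) would call poll() on a schedule, and the last fingerprint would be stored durably rather than in memory.]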

93 00:11:33.830 00:11:40.330 Awaish Kumar: Okay, then, yeah, you can actually move on, from this. So, for example.

94 00:11:40.450 00:11:45.360 Awaish Kumar: how you were orchestrating your pipeline, for example, you wrote Python scripts.

95 00:11:45.690 00:11:48.189 Awaish Kumar: How… how they were being executed.

96 00:11:49.240 00:11:52.800 Awaish Kumar: Were you executing on your local machine, or… Yeah.

97 00:11:52.800 00:11:57.439 Smiti Kothari: So, like, I ran the code in VS Code, are you asking that?

98 00:11:58.230 00:12:01.219 Awaish Kumar: I mean, did you deploy your project?

99 00:12:02.450 00:12:05.149 Smiti Kothari: I deployed the dashboard.

100 00:12:05.620 00:12:06.600 Smiti Kothari: Which I created.

101 00:12:06.600 00:12:11.560 Awaish Kumar: I mean, that project needs to… run somewhere

102 00:12:11.870 00:12:17.040 Awaish Kumar: in production, right? In your local machine, you can run it once.

103 00:12:17.040 00:12:24.950 Smiti Kothari: So basically, I made the code in Python, and then I connected it to a dash… using a Streamlit dashboard. So I deployed that dashboard.

104 00:12:26.260 00:12:27.649 Smiti Kothari: If that makes sense.

105 00:12:29.140 00:12:36.560 Awaish Kumar: It does make sense. You deployed a Streamlit app to visualize your data, right?

106 00:12:36.560 00:12:51.639 Smiti Kothari: To visualize that… so, basically, the dashboard was to visualize how many obligors were affected, which counties were affected, the number of states, and it showed in the map the… how much, like, exactly how much percent it was.

107 00:12:51.640 00:12:55.400 Awaish Kumar: That Streamlit app was connected to what database?

108 00:12:57.100 00:12:59.299 Smiti Kothari: to the Snowflake database.

109 00:12:59.370 00:13:03.129 Smiti Kothari: Like, basically, what was happening was that

110 00:13:03.130 00:13:22.050 Smiti Kothari: there was a full Python script, the data, I stored it in Snowflake, the hazard data, which I took from FEMA and NOAA, and from that, the Python script ran, and it gave me the values, which… the values, I pushed it onto the Streamlit dashboard.

111 00:13:25.170 00:13:28.269 Awaish Kumar: Okay, and

112 00:13:30.470 00:13:38.920 Awaish Kumar: So, in that process of building that project, you were the only one working on that project?

113 00:13:39.220 00:13:40.030 Smiti Kothari: Yeah.

114 00:13:40.530 00:13:41.740 Awaish Kumar: Okay, and.

115 00:13:41.820 00:13:47.570 Smiti Kothari: My manager was there, but I was the one working on it, and I had to just give them updates.

116 00:13:49.270 00:13:54.870 Awaish Kumar: And… So how were you… how did you deploy the Streamlit app?

117 00:13:55.640 00:14:06.419 Smiti Kothari: So, they basically have a… the organization has their own website, and it was a… it was for internal use, so I just,

118 00:14:06.590 00:14:09.020 Smiti Kothari: deployed the dashboard on their internal website.

119 00:14:11.530 00:14:11.990 Awaish Kumar: Okay.

120 00:14:11.990 00:14:13.870 Smiti Kothari: Like, Assured Guaranty.

121 00:14:14.520 00:14:20.369 Awaish Kumar: organization might have their own internal website, right? As you have…

122 00:14:20.570 00:14:25.980 Awaish Kumar: But… and you deployed on that, so my question is, how did you deploy it?

123 00:14:26.750 00:14:35.780 Smiti Kothari: Okay, so using GitHub, so I pushed all the chain… all the things to GitHub, and then GitHub gives the link, right, when we try to deploy.

124 00:14:37.840 00:14:42.020 Smiti Kothari: So I just ingested that link into their website portal.

125 00:14:44.300 00:14:45.010 Awaish Kumar: Okay.

126 00:14:46.230 00:14:48.900 Smiti Kothari: Like, GitHub Pages, it gives the link, right?

127 00:14:50.020 00:14:51.189 Awaish Kumar: Yeah, yeah, okay.

128 00:14:51.820 00:14:55.389 Awaish Kumar: So you actually just created GitHub Pages?

129 00:14:57.140 00:15:07.570 Smiti Kothari: I wrote the code, I pushed it onto the GitHub repository, and then, we have the func- we have the thing of having the option to deploy.

130 00:15:08.440 00:15:14.189 Smiti Kothari: So I just did that, and pushed the link to the dashboard.

131 00:15:14.190 00:15:14.750 Awaish Kumar: Okay.

132 00:15:17.290 00:15:19.289 Awaish Kumar: So once,

133 00:15:19.470 00:15:27.040 Awaish Kumar: Okay, what would the… like, apart from Python and Snowflake, is there… was there any other tool used?

134 00:15:27.370 00:15:28.720 Awaish Kumar: In that project.

135 00:15:28.720 00:15:29.550 Smiti Kothari: No.

136 00:15:30.310 00:15:31.130 Awaish Kumar: Okay.

137 00:15:34.490 00:15:39.249 Awaish Kumar: Great. So, like, have you used… ever worked with SQL?

138 00:15:40.290 00:15:44.779 Smiti Kothari: Yeah, I have worked with, SQL, but,

139 00:15:45.990 00:15:50.199 Smiti Kothari: I don’t think I’ve worked at any organization, specifically.

140 00:15:51.170 00:15:51.790 Awaish Kumar: Hmm.

141 00:15:52.280 00:15:56.610 Awaish Kumar: So… Like, how would you rate yourself,

142 00:15:57.060 00:16:00.389 Awaish Kumar: out of 10, in both Python and SQL.

143 00:16:01.050 00:16:05.450 Smiti Kothari: So, I guess 8 and 7? Python 8, SQL 7?

144 00:16:07.640 00:16:08.370 Awaish Kumar: Okay.

145 00:16:09.500 00:16:15.169 Awaish Kumar: So, okay, moving on,

146 00:16:15.310 00:16:20.639 Awaish Kumar: How did… how were you sending… giving your updates to your manager?

147 00:16:20.820 00:16:25.960 Awaish Kumar: What was the… Your, like, the way to update your manager.

148 00:16:26.280 00:16:44.199 Smiti Kothari: Like, we had daily stand-up meetings in which I would share what has been done, what are some of the blockers that I am facing, and, like, if any blockers I am facing, I would just say that I was facing this problem, and I tried to find the solutions, either from Google or from using their Copilot, and…

149 00:16:44.200 00:16:49.310 Smiti Kothari: would explain the approaches that I have found, and we had a discussion based on that.

150 00:16:49.590 00:16:54.150 Awaish Kumar: Okay, so that was… that all just happens in a stand-up meeting?

151 00:16:55.030 00:16:58.180 Smiti Kothari: No, this was what… it… it depends.

152 00:16:59.240 00:17:03.679 Awaish Kumar: That’s… then I want to know how… what all means did you use?

153 00:17:04.750 00:17:05.510 Smiti Kothari: What all?

154 00:17:05.949 00:17:12.839 Awaish Kumar: What all means that you use to communicate with your manager, or you would use to communicate with your team.

155 00:17:13.249 00:17:15.949 Smiti Kothari: So it was an on-site internship.

156 00:17:16.260 00:17:16.859 Smiti Kothari: So…

157 00:17:16.869 00:17:20.159 Awaish Kumar: I don’t know, yeah, it could… it is on-site, I understand.

158 00:17:20.319 00:17:32.149 Awaish Kumar: on-site internship, you are working in the company, you’ve figured out some issues, you… one way to tell your team is, is this stand-up, right? What are other ways, like, that you’ve utilized?

159 00:17:33.060 00:17:37.140 Smiti Kothari: We, I can, tell it through Slack.

160 00:17:38.040 00:17:43.510 Smiti Kothari: So the company basically used Slack for, communication, so…

161 00:17:43.760 00:17:51.790 Smiti Kothari: If… if the… if the manager is busy, or I can’t have a meet in person, I would just let him know the updates through Slack.

162 00:17:52.900 00:17:57.160 Awaish Kumar: Okay. But did you wrote any documentation, or…

163 00:17:57.880 00:17:58.700 Smiti Kothari: Yeah, so before.

164 00:17:58.700 00:18:00.050 Awaish Kumar: Cool living?

165 00:18:00.050 00:18:04.419 Smiti Kothari: leaving the internship, the project which I built, I…

166 00:18:04.550 00:18:09.270 Smiti Kothari: had a… I added a README, which had all the steps that

167 00:18:09.650 00:18:12.779 Smiti Kothari: what I have done, and…

168 00:18:13.290 00:18:16.060 Smiti Kothari: How, how to basically run it.

169 00:18:18.450 00:18:33.050 Awaish Kumar: Okay, have you done any root cause analysis? Like, for example, in your intern… like, not just this one, but on any of your internships or your projects, if you find out some issues or some bug, or…

170 00:18:33.220 00:18:34.919 Awaish Kumar: or some critical…

171 00:18:35.260 00:18:43.229 Awaish Kumar: Thing. And then, somebody asked to, okay, do a root cause analysis to figure out what was… what happened, right?

172 00:18:43.530 00:18:46.130 Awaish Kumar: So how would you go for that?

173 00:18:49.560 00:18:57.299 Smiti Kothari: Basically, I can do testing of, like, I could… First, like, suppose

174 00:18:57.750 00:19:03.540 Smiti Kothari: I could just do testing to figure out that… where the exact issue is. Like, if…

175 00:19:03.970 00:19:07.769 Smiti Kothari: if the more… if the code is modular, then I could just…

176 00:19:07.970 00:19:15.830 Smiti Kothari: write tests for each module and check which module is actually causing the error, or…

177 00:19:16.270 00:19:25.910 Smiti Kothari: which module is basically failing, and then I could, go deep into that code and find what exactly is happening.

178 00:19:28.720 00:19:30.190 Awaish Kumar: Okay.

179 00:19:31.250 00:19:38.200 Awaish Kumar: And, you mentioned that you have a… data…

180 00:19:38.630 00:19:40.849 Awaish Kumar: You want to build data systems.

181 00:19:41.140 00:19:44.660 Awaish Kumar: In the data world, we have a lot of different…

182 00:19:46.880 00:19:55.380 Awaish Kumar: what you see, like, the roles you can work on. You can become a data engineer, analytics engineer, data analyst, strategist.

183 00:19:56.310 00:20:03.440 Awaish Kumar: business intelligence, right, person. So, what are you… Planning.

184 00:20:03.890 00:20:10.060 Awaish Kumar: To… to go in… in your, like, the… maybe what you’re planning to do in your next role?

185 00:20:12.140 00:20:21.060 Smiti Kothari: So, to be honest, I had applied, for the data engineer role in this, so I would definitely like to do that.

186 00:20:21.560 00:20:22.370 Awaish Kumar: Okay.

187 00:20:22.510 00:20:23.930 Awaish Kumar: No, I mean,

188 00:20:24.990 00:20:36.120 Awaish Kumar: Okay, for the data… like, you… like, that’s what I want to understand. What… what you would like to… what do you enjoy doing the most? Like, in a sense that if you…

189 00:20:37.510 00:20:42.529 Awaish Kumar: If you are working in a backend, how… how comfortable are you with your…

190 00:20:43.210 00:20:47.559 Awaish Kumar: With meeting with the clients, with presenting your work.

191 00:20:48.320 00:20:59.090 Smiti Kothari: Yeah, like, in this internship also, I worked… I explained my dashboard to non-technical stakeholders. So, like, I’m comfortable talking to the customers as well.

192 00:21:01.880 00:21:02.620 Awaish Kumar: Okay.

193 00:21:02.920 00:21:07.370 Awaish Kumar: Okay, you can give me an example. How… how would you…

194 00:21:07.870 00:21:12.940 Awaish Kumar: Communicate, with the non-technical stakeholders some of your findings.

195 00:21:13.330 00:21:32.599 Smiti Kothari: So basically, I learned this lesson from my manager as well. Like, when you are talking to non-technical stakeholders, then, like, you should focus on the impact, that is created. Like, what time would it save, what, money it could have been saved, rather than,

196 00:21:32.810 00:21:52.769 Smiti Kothari: Like, specifying the technical details. So basically, when I created this dashboard, like, I showed them how much total insured exposure would be saved if we already predicted how the obligors, the financial obligors, could be saved from the hazards.

197 00:21:54.190 00:22:00.329 Smiti Kothari: So I feel that we could just focus more on the impact that the project is trying to create.

198 00:22:00.840 00:22:17.640 Awaish Kumar: My question, then, next question would be, it is easy to communicate with your non-technical stakeholders that, okay, I’m just going to do this, and this will result in thousands of dollars

199 00:22:17.640 00:22:34.619 Awaish Kumar: In savings, or it could help us optimize, and do it much faster. Like, these things will actually make them excited about what you’re doing, but the hard part is, when you are at the end of it, and you didn’t meet

200 00:22:34.730 00:22:49.689 Awaish Kumar: the numbers you claimed initially, maybe because of data uncertainties. How would you then communicate with your non-technical stakeholders who does not understand data uncertainties? That, why do you didn’t reach

201 00:22:49.880 00:22:52.270 Awaish Kumar: On… with the numbers you claimed.

202 00:22:53.810 00:22:59.240 Smiti Kothari: So, like, I would try to explain in my reasoning in,

203 00:22:59.700 00:23:11.549 Smiti Kothari: in a layman language, basically not getting deep into that why data was, not appropriate for the task or something, but, like, I would explain my reasoning to why…

204 00:23:11.880 00:23:13.650 Smiti Kothari: This is the result?

205 00:23:14.250 00:23:19.820 Smiti Kothari: And… like, I wouldn’t basically blame on the data.

206 00:23:19.940 00:23:28.569 Smiti Kothari: Because they won’t understand what is going on, but I could just explain my reasoning behind why the result is this.

207 00:23:29.500 00:23:31.839 Awaish Kumar: How would you reason, like, that…

208 00:23:31.980 00:23:34.660 Awaish Kumar: You have to reason with…

209 00:23:34.910 00:23:44.879 Awaish Kumar: And you have to let them know, what exactly the issue is. Either it’s… it’s the team that didn’t perform, or it’s the data.

210 00:23:45.410 00:23:46.260 Awaish Kumar: So…

211 00:23:48.430 00:23:55.730 Smiti Kothari: That’s a tough one. I would… like…

212 00:23:59.620 00:24:03.230 Smiti Kothari: Maybe then, explain that…

213 00:24:03.850 00:24:12.860 Smiti Kothari: the data sources that I used had some missing data, and had some faulty data, which maybe

214 00:24:13.120 00:24:18.860 Smiti Kothari: led to these results, but I… like, I would have to explain them in…

215 00:24:19.340 00:24:24.550 Smiti Kothari: like, using examples that why I… Why the results were this?

216 00:24:25.700 00:24:26.620 Awaish Kumar: Okay.

217 00:24:26.990 00:24:38.190 Awaish Kumar: I think that’s okay. And then, on the data engineering side, I…

218 00:24:39.190 00:24:48.130 Awaish Kumar: like, I just would like to know if you… if you can… if you know about some of your… some of the ETL tools, or maybe have…

219 00:24:48.960 00:24:51.039 Awaish Kumar: Some little bit know-how of…

220 00:24:51.940 00:25:10.659 Awaish Kumar: of some of the tools, apart from Snowflake and Python that you already mentioned, but, you might not have used, as I know… as you mentioned, that you don’t have experiences, but are there any tools that you use in your… any of your projects, or in any, of your

221 00:25:10.830 00:25:12.420 Awaish Kumar: Home assignments, or whatever.

222 00:25:16.020 00:25:20.960 Smiti Kothari: So, I… I know the ETL tools, like Airflow and dbt.

223 00:25:21.270 00:25:37.129 Smiti Kothari: Airflow is basically used for orchestration, and dbt is basically used for transformations, but to be honest, I haven’t used any of those in any of my projects. And also, there are some big data tools like Hadoop, which I studied back in my undergrad.

224 00:25:37.820 00:25:47.129 Smiti Kothari: And I’ve also, done a project using Kafka. Like, it is not much relevant to data engineering, but somewhat… and, I’ve also used…

225 00:25:47.130 00:25:51.260 Awaish Kumar: So, did you use Kafka? Did you… Are you just…

226 00:25:52.690 00:25:58.439 Awaish Kumar: kind of read… like, did you actually, set up the Kafka, or you just read data from.

227 00:25:59.060 00:26:08.899 Smiti Kothari: I just read data from it. So, basically, this was my first internship at Northern Trust, where, like,

228 00:26:09.260 00:26:19.370 Smiti Kothari: I had to, like, where the comp… like, the company was basically facing problem, in,

229 00:26:19.720 00:26:23.250 Smiti Kothari: Data loss and downstream performance issues, like the…

230 00:26:23.470 00:26:26.590 Smiti Kothari: Basically, the thing was that…

231 00:26:26.870 00:26:41.609 Smiti Kothari: there were a lot of out-of-order messages coming in, which led to the data loss, so this project was basically to address that. So, for that, I just built a Kafka consumer that would read all the messages directly from a specific topic, and for that, I used offset.

232 00:26:41.940 00:26:58.469 Smiti Kothari: So basically, offset, helps in determining the position of the messages in a Kafka partition, and then, like, due to this offset management, it makes sure that the messages are not lost, or are even not re-duplicated.

233 00:26:58.910 00:27:10.819 Smiti Kothari: So, like, I… the… like, if a consumer fails, then what does the offset do is, the offset commits the last… suppose the consumer fails at 101 message, so the…

234 00:27:12.330 00:27:24.449 Smiti Kothari: it commits till 100 message, and then, after the consumer starts back, we note that the… till only 100 message we have sent. So, now we need to send from the 101th message.
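
[Editor's sketch, not spoken in the interview: a toy simulation of the resume-from-committed-offset behaviour described above. It uses an in-memory list in place of a Kafka partition and does not use any Kafka client library; InMemoryPartition and consume are hypothetical names. As in Kafka, the committed value here is the offset of the next message to read.]

```python
from typing import Callable, List


class InMemoryPartition:
    """Toy stand-in for one partition: an ordered message list plus a committed offset."""

    def __init__(self, messages: List[str]) -> None:
        self.messages = messages
        self.committed = 0  # offset of the next message to read


def consume(partition: InMemoryPartition, handle: Callable[[str], None]) -> None:
    """Process messages starting at the committed offset, committing after each success."""
    offset = partition.committed
    while offset < len(partition.messages):
        handle(partition.messages[offset])  # may raise; nothing is committed in that case
        offset += 1
        partition.committed = offset  # commit only after the handler succeeded
```

[Because the commit happens after processing, a crash between the two steps causes that one message to be processed again on restart. This pattern avoids losing messages but can produce duplicates, so the handler usually needs to tolerate a repeat.]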

235 00:27:25.340 00:27:26.050 Awaish Kumar: Okay.

236 00:27:26.790 00:27:29.599 Awaish Kumar: Boom. Okay, I think I’m… I’m with the…

237 00:27:30.560 00:27:36.139 Awaish Kumar: end of my questions. Now we still have a few minutes, if you have any questions.

238 00:27:36.780 00:27:53.600 Smiti Kothari: Yeah, sure. So, like, I would definitely ask that you have a lot of experience, as you told and I also went through on LinkedIn, so, like, what is one of the most challenging things you have faced during these careers?

239 00:27:55.320 00:27:58.310 Awaish Kumar: Challenging things, a lot, quite a lot of things.

240 00:28:00.400 00:28:05.879 Awaish Kumar: Like, during the… especially during the… the… whenever you have to do migration.

241 00:28:06.050 00:28:07.690 Awaish Kumar: Of your data pipelines.

242 00:28:10.230 00:28:15.960 Awaish Kumar: And, most… like, that is the most… always a challenging task?

243 00:28:16.620 00:28:24.480 Awaish Kumar: and then, yeah, the… in the data career, one of the hardest things is to manage data quality.

244 00:28:27.070 00:28:34.050 Awaish Kumar: There are quite a lot of tools, and that there is a lot of… conversations.

245 00:28:34.620 00:28:46.740 Awaish Kumar: happens on how we can have better data quality, how can we manage it, and there are… everything else can… we have a lot… lot better and mature tool.

246 00:28:46.900 00:28:50.439 Awaish Kumar: For data engineering, for modeling, for…

247 00:28:50.610 00:29:02.879 Awaish Kumar: In terms of data warehouses, we have databases, all these tools are, like, are mature, and… and, all those chunks of… of the… of the…

248 00:29:05.430 00:29:13.419 Awaish Kumar: of the… like, the… all those action items from the data engineering work are quite easy, I would say.

249 00:29:13.980 00:29:27.070 Awaish Kumar: And are really achievable if you just put a few hours of work. But the data quality has been the hardest part, because nobody is really… no tool is really,

250 00:29:27.440 00:29:45.949 Awaish Kumar: At the… at the point where we could say, okay, just use this tool, and you will have better data quality, because the data quality and the definition of data changes from company to company, data source to data source, and that’s why it’s… it’s… it’s just dynamic, and…

251 00:29:46.100 00:29:47.130 Awaish Kumar: and hard.

252 00:29:49.060 00:29:49.750 Smiti Kothari: Okay.

253 00:29:51.060 00:29:52.999 Awaish Kumar: Okay, are there any other question?

254 00:29:53.800 00:29:54.310 Smiti Kothari: Butter.

255 00:29:54.310 00:30:02.439 Awaish Kumar: Okay, I think, then… I would… after my… after I submit my feedback, Rico from our operations

256 00:30:02.590 00:30:08.510 Awaish Kumar: will be able to get back to you, and maybe in the next… hmm, week.

257 00:30:09.060 00:30:12.619 Awaish Kumar: With the… with the next steps, okay?

258 00:30:12.860 00:30:13.460 Awaish Kumar: Thank you.

259 00:30:13.460 00:30:14.300 Smiti Kothari: Nurses.