Meeting Title: Brainforge Data Engineering Interview
Date: 2026-03-11
Meeting participants: Awaish Kumar, Ahmed Alkarboly


WEBVTT

1 00:00:09.150 00:00:10.370 Awaish Kumar: Hi, Ahmed.

2 00:00:10.370 00:00:10.720 Ahmed Alkarboly: How are you?

3 00:00:10.720 00:00:11.150 Awaish Kumar: Thank you.

4 00:00:11.150 00:00:12.819 Ahmed Alkarboly: Hey, Awaish, how are you, man?

5 00:00:13.380 00:00:14.120 Awaish Kumar: I’m good.

6 00:00:15.060 00:00:15.690 Ahmed Alkarboly: Good.

7 00:00:16.079 00:00:18.139 Awaish Kumar: So, like, where are you located?

8 00:00:19.130 00:00:21.229 Ahmed Alkarboly: I’m in Austin, Texas, man. How about you?

9 00:00:21.880 00:00:23.750 Awaish Kumar: Okay, good. I’m enjoying.

10 00:00:25.250 00:00:26.400 Ahmed Alkarboly: You’re in the Emirates?

11 00:00:26.680 00:00:27.640 Awaish Kumar: Yes.

12 00:00:28.100 00:00:31.570 Ahmed Alkarboly: Oh, wow, sorry about the… the bombs, man.

13 00:00:32.759 00:00:35.099 Awaish Kumar: No worries, we are safe here.

14 00:00:35.390 00:00:35.980 Awaish Kumar: Oh.

15 00:00:35.980 00:00:37.279 Ahmed Alkarboly: I hear it. I’m glad.

16 00:00:38.930 00:00:43.340 Awaish Kumar: Okay, let’s get started on… on this.

17 00:00:43.460 00:00:46.789 Awaish Kumar: My name is Awaish Kumar, and I’m…

18 00:00:46.970 00:00:50.040 Awaish Kumar: leading the data engineering team at Brainforge.

19 00:00:50.510 00:00:51.360 Ahmed Alkarboly: Nice.

20 00:00:51.360 00:00:58.700 Awaish Kumar: Yeah, Brainforge is basically a data and AI consultancy services company, which provides

21 00:00:58.810 00:01:06.500 Awaish Kumar: services to mid- to large-scale enterprises. Most of our clients are

22 00:01:06.610 00:01:12.979 Awaish Kumar: in the US, but yeah, we operate remotely, and

23 00:01:13.110 00:01:17.619 Awaish Kumar: our employees are spread across the world.

24 00:01:18.610 00:01:19.170 Ahmed Alkarboly: Boom.

25 00:01:20.680 00:01:24.060 Awaish Kumar: Yeah, let’s dive into… Your introduction.

26 00:01:24.950 00:01:31.780 Ahmed Alkarboly: Sure, yeah, my name is Ahmed Alkarboly, I am a systems… industrial and systems engineer by education.

27 00:01:32.140 00:01:36.549 Ahmed Alkarboly: I’ve worked in manufacturing, telecom, and now government.

28 00:01:36.680 00:01:50.680 Ahmed Alkarboly: I started out building Power BI dashboards, and then naturally, you progressed your way up, like, where’s this data coming from? How’s it being generated? So that’s kind of how I transitioned into data engineering. Started managing data warehouses.

29 00:01:51.500 00:01:59.529 Ahmed Alkarboly: at Procter & Gamble. I moved to DISH, I manage the data warehouse for the research and development team. I got my hands dirty doing,

30 00:01:59.860 00:02:04.240 Ahmed Alkarboly: Basically, doing large-scale data migrations from vendors into…

31 00:02:04.530 00:02:09.870 Ahmed Alkarboly: AWS storing it in S3 buckets and stuff like that.

32 00:02:10.310 00:02:17.939 Ahmed Alkarboly: And now for the DMV, I’m a part of a cloud migration, basically the thing that runs the DMV, the mainframe.

33 00:02:18.110 00:02:32.820 Ahmed Alkarboly: is old, and they would like to move to something robust, obviously. So they’re moving to Azure, and so my part in that is configuring the data warehouse, configuring Databricks, all the networking, all of the RBAC rules that go with that.

34 00:02:32.960 00:02:41.700 Ahmed Alkarboly: And then setting up the Unity Catalog, making sure that their governance folks have full visibility and purview.

35 00:02:41.760 00:02:48.569 Ahmed Alkarboly: And I’m moving into, as far as the DMV’s concerned, starting to do some agentic workflows for them.

36 00:02:48.610 00:02:52.260 Ahmed Alkarboly: So, one feature that we would like is on the…

37 00:02:52.300 00:03:11.779 Ahmed Alkarboly: Department of Motor Vehicle website, which I’m not sure how familiar you are with it, but they’re the people that give you your driver’s license or permits for your car tags. If you buy a car or sell a car, you know, they’re the ones processing your title. So that’s all run on the mainframe, and we’re moving it to the cloud, and we’re…

38 00:03:11.940 00:03:15.079 Ahmed Alkarboly: I would say 3 quarters of the way done with that transition.

39 00:03:15.580 00:03:20.160 Ahmed Alkarboly: And so, like, literally my day-to-day now is… Hey.

40 00:03:20.370 00:03:28.380 Ahmed Alkarboly: We have this source that needs to be ingested into the data warehouse, because the analytics folks, our data analytics

41 00:03:28.650 00:03:39.420 Ahmed Alkarboly: team. There’s 7 people for the DMV. They do most of the reporting in Power BI, and then I would say there’s probably one data scientist who’s doing some statistics

42 00:03:39.620 00:03:42.070 Ahmed Alkarboly: Machine learning type stuff.

43 00:03:42.320 00:03:49.500 Ahmed Alkarboly: who we give access to the data through Synapse, actually, which is kind of like Databricks, just worse, a lot worse, in my opinion.

44 00:03:49.610 00:03:55.720 Ahmed Alkarboly: I don’t know if you’ve used Synapse before, but it’s… Very inferior to Databricks.

45 00:03:56.430 00:04:00.389 Awaish Kumar: And, why… are you looking for a new role right now?

46 00:04:01.240 00:04:05.610 Ahmed Alkarboly: You guys actually reached out to me. I’m not. So…

47 00:04:06.570 00:04:14.599 Awaish Kumar: Okay, yeah, I don’t know, like, yeah, our recruiters are actively hiring, so that’s the… that’s probably the reason.

48 00:04:15.160 00:04:24.660 Ahmed Alkarboly: Yeah, that is the reason. We had a conversation, and I told her that I am working full-time for the DMV, I’m a critical part of this process, but…

49 00:04:24.970 00:04:36.470 Ahmed Alkarboly: I told her, if you think it makes sense, if you think there’s an opportunity where I could actually come over and do good work, I’m open to learning about it, and I made it clear, as long as it’s not a waste of your time.

50 00:04:36.900 00:04:40.350 Ahmed Alkarboly: I’m more than happy to interview and see if it’s a good fit.

51 00:04:41.530 00:04:43.649 Awaish Kumar: Okay, yeah,

52 00:04:43.810 00:04:50.539 Awaish Kumar: So, yeah, if we… like, can you… can we dig deeper into one of your latest projects?

53 00:04:50.680 00:04:56.700 Awaish Kumar: So, like, is it a, like, team effort, or…

54 00:04:56.980 00:05:00.139 Awaish Kumar: How… what the team looks like, and what are your…

55 00:05:00.830 00:05:03.480 Awaish Kumar: We were, like, the contributions in it?

56 00:05:03.670 00:05:10.630 Ahmed Alkarboly: Definitely a team effort. So the people that are actually doing the migration as a third party,

57 00:05:11.380 00:05:19.580 Ahmed Alkarboly: what does a migration entail? You have a mainframe, they wrote some proprietary software that moves data from the mainframe into something that is like a SQL server.

58 00:05:19.850 00:05:27.569 Ahmed Alkarboly: The data warehouse, to me, that’s just one input into my data warehouse, of which I have 100 data sources that manage all kinds of stuff.

59 00:05:28.020 00:05:28.970 Ahmed Alkarboly: So…

60 00:05:29.150 00:05:41.729 Ahmed Alkarboly: That is just the SQL Server. My part in it is actually writing out all the pipelines in Synapse, and I’m moving everything over to Databricks, because Synapse is… I don’t know if you’ve ever used Synapse, it’s terrible, and I’ll continue to say it’s not a good product.

61 00:05:41.940 00:05:48.520 Ahmed Alkarboly: So we’re moving all of our workflows into Databricks, and so I’m actually the one writing all that PySpark, SQL.

62 00:05:48.620 00:05:51.210 Ahmed Alkarboly: Figuring out how we want to…

63 00:05:52.640 00:05:57.010 Ahmed Alkarboly: Build something that’s easy to manage, basically templatize our pipelines.

64 00:05:57.180 00:06:07.490 Ahmed Alkarboly: I’m doing the networking, putting in the requests to IT, so, you know, as far as my scope and that, I’m putting in requests to networking teams.

65 00:06:07.510 00:06:24.270 Ahmed Alkarboly: I am coordinating with Kyndryl, who’s actually doing the data migration. I am fully responsible for the data warehouse, which includes Databricks, it includes our blob storage, it includes all the networking infrastructure, it includes any virtual machines that we might need to run compute.

66 00:06:24.570 00:06:30.639 Ahmed Alkarboly: In addition to that, I… am the ad hoc admin for our Power BI tenant.

67 00:06:31.000 00:06:38.250 Ahmed Alkarboly: So I set up all the data flows, and then our analysts use the… they have access to the data flows, but they just use the models that we prepare for them.

68 00:06:38.500 00:06:44.909 Ahmed Alkarboly: So, all the way from right before the analyst picks up the data to the data’s ingested.

69 00:06:45.420 00:06:57.559 Ahmed Alkarboly: That is my purview. I also work with the governance team to make sure that the data warehouse is… that the catalog is structured correctly, so that they have easy governance, you know, to answer

70 00:06:57.810 00:06:59.360 Ahmed Alkarboly: Whatever question they might have.

71 00:07:00.560 00:07:05.720 Awaish Kumar: Okay, and do you have any, like, experience with Snowflake, dbt?

72 00:07:06.870 00:07:18.450 Ahmed Alkarboly: I mean, just never Snowflake explicitly, but, I mean, it was around in every single job I did. I was just never responsible for working inside of Snowflake. It was always…

73 00:07:18.910 00:07:27.409 Ahmed Alkarboly: we need some Python outside that is grabbing something from somewhere and somewhere. I never worked in a clean environment like Snowflake.

74 00:07:29.090 00:07:30.100 Awaish Kumar: How about…

75 00:07:30.100 00:07:31.090 Ahmed Alkarboly: snowflake.

76 00:07:32.000 00:07:34.540 Ahmed Alkarboly: I don’t even know what dbt is, can you say more?

77 00:07:35.230 00:07:42.460 Awaish Kumar: Oh, okay, so… it’s a tool for data transformations. Like, when you are writing SQL

78 00:07:42.610 00:07:46.310 Awaish Kumar: to transform your data, it is just… a way to…

79 00:07:46.470 00:07:51.320 Awaish Kumar: modularize and version-control your SQL changes.

80 00:07:52.550 00:07:55.769 Ahmed Alkarboly: I’ve never done that, no. I’m actually gonna Google that right now.

81 00:07:56.480 00:07:57.109 Ahmed Alkarboly: You know, if you don’.

82 00:07:57.110 00:07:57.720 Awaish Kumar: I’m land.

83 00:07:59.710 00:08:14.960 Awaish Kumar: Okay, so… for that migration project you’re working on, like, how did

84 00:08:15.530 00:08:28.239 Awaish Kumar: it all, like, start? Can you, like, walk me through it? Like, before the implementation? Like, were you responsible for creating the roadmap and planning out the…

85 00:08:28.540 00:08:36.919 Awaish Kumar: the migration, estimating the deadlines, and all of that? Or was it just… you’re the lead developer, and

86 00:08:37.049 00:08:39.740 Awaish Kumar: someone else is PMing that?

87 00:08:41.090 00:08:43.789 Ahmed Alkarboly: So, which migration? The one at the DMV?

88 00:08:43.799 00:08:44.499 Awaish Kumar: Yeah, yeah.

89 00:08:44.920 00:08:55.700 Ahmed Alkarboly: Yeah, there’s a huge team involved in that. They’ve… the department failed 3 times to do this migration. The third attempt, they brought in Kyndryl. Kyndryl are the folks that are doing more of the PMing.

90 00:08:56.010 00:09:01.960 Ahmed Alkarboly: I work with the networking team and their developers to implement our design, so I designed how

91 00:09:02.300 00:09:04.950 Ahmed Alkarboly: we want to use Databricks slash Synapse

92 00:09:05.840 00:09:12.700 Ahmed Alkarboly: you know, I hate Synapse, man, and if I haven’t made that clear, you know, I apologize. Synapse is not a good product, but…

93 00:09:12.970 00:09:28.659 Ahmed Alkarboly: The state of Virginia wants to use Synapse, so we designed and architected Synapse. From a high level, I did all the architecture diagramming, and then I worked with their… with Kyndryl, who is the third-party vendor. They have a product called Max, and so we just had to make some changes

94 00:09:28.800 00:09:35.280 Ahmed Alkarboly: for the state of Virginia’s cloud environment that was different. I think they also did it for the state of Phoenix.

95 00:09:35.710 00:09:42.939 Ahmed Alkarboly: So, they were the ones delivering the project planning and the timelines, because they’re contractually…

96 00:09:43.830 00:09:44.310 Awaish Kumar: Okay.

97 00:09:44.310 00:09:46.450 Ahmed Alkarboly: You know, responsible for the delivery of that.

98 00:09:46.800 00:09:51.799 Awaish Kumar: And for that, like, you mentioned that you’re writing the pipelines in Databricks.

99 00:09:52.250 00:09:56.259 Awaish Kumar: Does that mean you’re writing Python scripts, or…

100 00:09:57.230 00:10:00.590 Awaish Kumar: like, using PySpark or anything like that?

101 00:10:01.130 00:10:15.659 Ahmed Alkarboly: Yeah, it’ll be… so SQL, SQL goes into blob storage, and then from blob storage, you know, once it hits our bronze tier, all the transformations are done in notebooks, in PySpark slash Python. Right now, though,

102 00:10:15.660 00:10:30.140 Ahmed Alkarboly: those pipelines are activated inside of Synapse. Synapse is not a good tool for governance, it’s not a good tool for developers, and so my migration within the migration is moving our cloud transformations, the notebooks, from

103 00:10:30.160 00:10:34.200 Ahmed Alkarboly: Synapse into Databricks. That’s literally what I’m working on this week.

104 00:10:34.200 00:10:36.620 Awaish Kumar: Okay, and what exactly is the change?

105 00:10:36.910 00:10:38.509 Awaish Kumar: In that migration.

106 00:10:39.020 00:10:41.539 Ahmed Alkarboly: The change in that migration is…

107 00:10:42.200 00:10:45.329 Ahmed Alkarboly: Synapse is an old school… have you ever used Synapse?

108 00:10:45.740 00:10:46.390 Awaish Kumar: No.

109 00:10:46.880 00:10:52.719 Ahmed Alkarboly: It’s very old school, it requires a lot of configuration and networking. For example, like.

110 00:10:52.980 00:11:07.240 Ahmed Alkarboly: I have over 100 SQL servers that need to be ingested into the data warehouse. For each one of those, I need to configure a self-hosted integration runtime, a virtual machine that’s sitting on the same subnet as Synapse. So, that’s 100 virtual machines.

111 00:11:07.530 00:11:10.579 Ahmed Alkarboly: Because of the way that the…

112 00:11:10.820 00:11:28.559 Ahmed Alkarboly: state of Virginia DMV cloud environment is architected, that’s 100 virtual machines. Whereas Databricks, I don’t need 100 virtual machines. I believe networking is done, and this is the part I’m proving out, I believe networking is done at a subnet level, so my interaction with the networking team

113 00:11:28.620 00:11:40.559 Ahmed Alkarboly: should be a lot simpler. Instead of having to create a virtual machine, do the networking rules from the SQL server to the private endpoint in storage for every single

114 00:11:40.630 00:11:42.780 Ahmed Alkarboly: For every single.

115 00:11:43.660 00:11:44.160 Awaish Kumar: Yeah.

116 00:11:44.160 00:11:49.500 Ahmed Alkarboly: source, it simplifies, it simplifies that. Also, from a governance perspective.

117 00:11:50.250 00:11:53.390 Ahmed Alkarboly: the Unity Catalog inside of Databricks

118 00:11:53.520 00:12:03.690 Ahmed Alkarboly: allows me to integrate cleanly, smoothly with Purview, and so, you know, there is no concept of a catalog in Synapse. That’s part of the reasons why it’s terrible.

119 00:12:05.340 00:12:11.400 Ahmed Alkarboly: You know, the governance team came to us and said, okay, well, we actually need to know full data lineage

120 00:12:11.490 00:12:13.850 Ahmed Alkarboly: All the way from report to source.

121 00:12:13.860 00:12:32.399 Ahmed Alkarboly: I could only do that inside of Power BI for them without Purview, and the only way that I could give them complete lineage, so they could see, like, you know, they could have something like a data dictionary, so that they could have full lineage, was moving everything to Databricks and correctly configuring the Unity Catalog.

122 00:12:32.570 00:12:35.890 Ahmed Alkarboly: And the schema, and how we want to think about.

123 00:12:35.930 00:12:38.529 Ahmed Alkarboly: our particular data warehouse environment.

124 00:12:38.530 00:12:39.510 Ahmed Alkarboly: And then…

125 00:12:39.640 00:12:50.880 Ahmed Alkarboly: connecting Purview to that Databricks workspace so that they can have the complete… the view. So, really, it’s like a retooling, because Synapse is not the right tool to run a modern data warehouse.

126 00:12:51.670 00:12:52.040 Awaish Kumar: And.

127 00:12:52.040 00:12:53.849 Ahmed Alkarboly: My opinion. It’s happening?

128 00:12:54.400 00:12:55.260 Ahmed Alkarboly: What’s up?

129 00:12:56.080 00:13:04.269 Awaish Kumar: How is the ingestion happening now… like, from those SQL servers to S3, or…

130 00:13:05.310 00:13:17.689 Ahmed Alkarboly: Yeah, so literally the first step is a copy activity, where we dump it. We use the self-hosted integration runtime. Inside of Synapse, there’s a thing called Azure Data Factory. Azure Data Factory was an attempt

131 00:13:17.690 00:13:25.349 Ahmed Alkarboly: that they had at a low-code, no-code approach to basically doing data transformations, which I think is terrible, for the record. I don’t think it’s… I don’t…

132 00:13:25.730 00:13:27.440 Ahmed Alkarboly: I don’t think it’s the way to go.

133 00:13:28.250 00:13:32.119 Ahmed Alkarboly: Anyway, so the ingestion occurs with a copy activity.

134 00:13:32.550 00:13:43.360 Ahmed Alkarboly: that copy activity is actually a preset, like, leaflet, I think they call it, inside of Azure Data Factory. So you define a source, you define a sink, and then you can define

135 00:13:43.610 00:13:50.559 Ahmed Alkarboly: where… in that sink, like, a folder space that you want to dump, whatever, Parquet or CSV.

136 00:13:52.160 00:14:01.740 Ahmed Alkarboly: Currently, it’s CSVs, because there’s a networking issue with the self-hosted integration runtime that I’m working through with our IT team at the moment. Everything needs to be Parquet.

137 00:14:01.950 00:14:07.949 Ahmed Alkarboly: But that requires… the self-hosted integration runtime runs a version of Java.

138 00:14:08.530 00:14:20.600 Ahmed Alkarboly: Which is a part of the reason that Synapse is terrible. It runs Java, and that self-hosted integration runtime needs to have connectivity to the private endpoint in order for it to.

139 00:14:20.600 00:14:20.990 Awaish Kumar: living.

140 00:14:20.990 00:14:22.229 Ahmed Alkarboly: write to Parquet.

141 00:14:22.570 00:14:23.559 Ahmed Alkarboly: So anyway, yes.

142 00:14:23.560 00:14:26.709 Awaish Kumar: How are these copy commands running, like,

143 00:14:27.020 00:14:30.099 Awaish Kumar: are they orchestrated through a tool? They are…

144 00:14:30.550 00:14:37.270 Ahmed Alkarboly: Synapse is that. Yeah, Synapse is that. So it technically would be an Apache Spark pool, the compute that this stuff is running on.

145 00:14:37.600 00:14:42.139 Ahmed Alkarboly: And I guess the background of that, you know, the copy data activity.

146 00:14:42.760 00:14:46.920 Ahmed Alkarboly: Have you heard of n8n? n8n.io? You ever heard of that?

147 00:14:46.920 00:14:47.810 Awaish Kumar: n8n?

148 00:14:48.100 00:14:48.730 Ahmed Alkarboly: Yeah.

149 00:14:49.000 00:14:49.690 Awaish Kumar: Yeah, yeah.

150 00:14:50.000 00:15:03.599 Ahmed Alkarboly: Yeah, so it kind of looks like that, where you drag the blocks, and then there’s leaflets that you connect, so there’s literally a leaflet called a copy activity. The underlying compute to that, I’m pretty sure, is an Apache Spark, I think is an Apache Spark pool?

151 00:15:03.600 00:15:04.250 Awaish Kumar: Okay.

152 00:15:04.250 00:15:10.270 Ahmed Alkarboly: Or it might be an auto-resolve integration runtime. I’m not sure exactly which one that copy activity would run on.

153 00:15:11.080 00:15:18.770 Ahmed Alkarboly: Actually, I take that back. It’s running on the SHIR, right? That copy data activity is running on the self-hosted integration runtime.

154 00:15:19.700 00:15:23.029 Ahmed Alkarboly: I’m not sure I’m understanding your question, I don’t know if I’ve answered it.

155 00:15:23.730 00:15:25.099 Awaish Kumar: Okay,

156 00:15:25.260 00:15:38.270 Awaish Kumar: Yeah, like, there are two parts of data ingestion. One is that you are moving from your current warehouse, whatever you’re using, to your… to Databricks. Second thing is…

157 00:15:38.340 00:15:45.559 Awaish Kumar: Ingestion, where now your sources will point to the new warehouse for your incremental ingestions.

158 00:15:47.240 00:15:51.870 Awaish Kumar: Yeah, so, like, I understand one thing here, that now you are…

159 00:15:52.170 00:15:57.830 Awaish Kumar: The Data Factory data is… the data is being moved to S3…

160 00:15:57.840 00:16:08.200 Ahmed Alkarboly: And that is being used in Databricks. It’s Azure, it’s blob storage, but yeah, point taken. It’s blob storage, so what we do is we do an incremental window. So, I’ve set up everything to basically…

161 00:16:08.600 00:16:15.399 Ahmed Alkarboly: Some of our sources refresh hourly, some refresh daily. Regardless, there’s an overlap window.

162 00:16:15.400 00:16:28.679 Ahmed Alkarboly: So, that overlap window gets copied as a job into the bronze tier. The silver notebooks do the merge function so that I’m taking care of all the duplicates. A part of my brainpower and my focus is figuring out

163 00:16:29.230 00:16:46.140 Ahmed Alkarboly: I don’t want to have 100 notebooks. I don’t want to have 100 different pieces of code. I want to create some sort of, like, array or JSON file that defines all of the tables that I’m… for our SQL data sources. We also do API-based ingestion from, like, Splunk.

164 00:16:46.250 00:16:50.239 Ahmed Alkarboly: from Dynatrace, so there’s a couple API-based and…

165 00:16:50.370 00:16:56.269 Ahmed Alkarboly: that is different. Those are different, handle… I handle those differently, because we only have a couple of them.

166 00:16:56.630 00:17:06.230 Ahmed Alkarboly: But we’ve templatized, basically, how to pull in multiple tables that have multiple keys, maybe multiple dates, into one

167 00:17:06.380 00:17:13.409 Ahmed Alkarboly: Notebook template that we then pass those parameters into our notebook template, and then it applies all of the transformations.

168 00:17:13.960 00:17:15.739 Ahmed Alkarboly: into silver.
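
The templated approach described above, one table list feeding one parameterized notebook, can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the notebook used at the DMV: `TableSpec`, `TABLES`, `build_extract_sql`, `build_merge_sql`, the view name `staged_updates`, and every table and column name are hypothetical. The generated statement follows Delta Lake's `MERGE INTO ... WHEN MATCHED THEN UPDATE SET * WHEN NOT MATCHED THEN INSERT *` form.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class TableSpec:
    """One entry in the table list (all field names are hypothetical)."""
    source_table: str          # table on the source SQL Server
    target_table: str          # silver-tier table to merge into
    keys: Tuple[str, ...]      # one or more business-key columns
    watermark_column: str      # date column used for the incremental window


def build_extract_sql(spec: TableSpec, window_start: str) -> str:
    """Build the incremental extract query for one table.

    Plain string formatting is used only to keep the sketch short;
    real code should bind window_start as a query parameter.
    """
    return (
        f"SELECT * FROM {spec.source_table} "
        f"WHERE {spec.watermark_column} >= '{window_start}'"
    )


def build_merge_sql(spec: TableSpec, staging_view: str = "staged_updates") -> str:
    """Build a Delta-style MERGE that upserts the staged rows on the keys."""
    on_clause = " AND ".join(f"t.{k} = s.{k}" for k in spec.keys)
    return (
        f"MERGE INTO {spec.target_table} AS t\n"
        f"USING {staging_view} AS s\n"
        f"ON {on_clause}\n"
        "WHEN MATCHED THEN UPDATE SET *\n"
        "WHEN NOT MATCHED THEN INSERT *"
    )


# Hypothetical table list; in practice this could live in a JSON file.
TABLES = [
    TableSpec("dbo.vehicles", "silver.vehicles", ("vehicle_id",), "updated_at"),
    TableSpec("dbo.title_events", "silver.title_events",
              ("title_id", "event_seq"), "event_ts"),
]

if __name__ == "__main__":
    for spec in TABLES:
        print(build_extract_sql(spec, "2026-03-01 00:00:00"))
        print(build_merge_sql(spec))
        print()
```

Adding a source table then means adding one entry to the list rather than writing another notebook.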

169 00:17:16.270 00:17:21.229 Ahmed Alkarboly: And then what’s in silver, if we need to make a custom table or something more…

170 00:17:21.720 00:17:27.549 Ahmed Alkarboly: custom, then that’s when I would write, like, a one-off notebook to create a view or something like that.
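
The overlap window and silver-tier merge mentioned earlier in this answer can be illustrated with a small pure-Python sketch. It assumes each row carries a business key and a timestamp and that the newest version of a row should win; `window_start`, `merge_latest`, and the sample rows are hypothetical names and data, standing in for what a PySpark or Delta merge would do at scale.

```python
from datetime import datetime, timedelta
from typing import Any, Dict, Iterable, Tuple

Row = Dict[str, Any]
Key = Tuple[Any, ...]


def window_start(last_run: datetime, overlap: timedelta) -> datetime:
    """Start of the next extract: the last run time pushed back by the overlap."""
    return last_run - overlap


def merge_latest(
    target: Dict[Key, Row],
    batch: Iterable[Row],
    keys: Tuple[str, ...],
    ts_col: str,
) -> Dict[Key, Row]:
    """Upsert batch rows into a copy of target, keeping the newest row per key.

    Rows re-read because of the overlap window land on an existing key,
    so they overwrite instead of creating duplicates.
    """
    merged = dict(target)
    for row in batch:
        key = tuple(row[c] for c in keys)
        current = merged.get(key)
        if current is None or row[ts_col] >= current[ts_col]:
            merged[key] = row
    return merged


if __name__ == "__main__":
    last_run = datetime(2026, 3, 1, 12, 0)
    start = window_start(last_run, timedelta(hours=1))

    silver = {
        (1,): {"id": 1, "status": "pending",
               "updated_at": datetime(2026, 3, 1, 11, 30)},
    }
    bronze_batch = [
        # Re-read of a row already merged (falls inside the overlap window).
        {"id": 1, "status": "pending", "updated_at": datetime(2026, 3, 1, 11, 30)},
        # Newer version of the same row.
        {"id": 1, "status": "issued", "updated_at": datetime(2026, 3, 1, 12, 15)},
        # Brand-new row.
        {"id": 2, "status": "pending", "updated_at": datetime(2026, 3, 1, 12, 20)},
    ]
    silver = merge_latest(silver, bronze_batch, ("id",), "updated_at")
    print(start, len(silver), silver[(1,)]["status"])
```

The overlap trades some repeated reads for protection against late-arriving rows, and the keyed merge is what makes those repeated reads harmless.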

171 00:17:29.010 00:17:29.740 Awaish Kumar: Okay.

172 00:17:30.210 00:17:40.400 Awaish Kumar: Yeah, and yeah, last few questions regarding, how do you, like, explain your… findings or…

173 00:17:41.210 00:17:45.159 Awaish Kumar: analysis to the non-technical stakeholders?

174 00:17:45.600 00:17:59.499 Ahmed Alkarboly: Man, I’m a very visual person. I’m a very visual person, so I try to draw very simple architecture diagrams. I think a picture is worth a million… a million words, and so throughout my career, I’ve always focused on maintaining documentation.

175 00:17:59.690 00:18:10.640 Ahmed Alkarboly: that is technical documentation, but then also being able to show a picture that I can talk through. So, my litmus test is, if I show someone a picture, and then after a 2-minute conversation, they don’t… they still don’t…

176 00:18:11.510 00:18:20.079 Ahmed Alkarboly: have a basic grasp of the components I’m talking about, that’s feedback for me that my picture was inadequate. So, with non-technical business folks.

177 00:18:20.400 00:18:29.910 Ahmed Alkarboly: It’s really focused on their end result. You want data. They don’t care… they don’t care about connectivity not working, or the ticket that I’m waiting on. They just… they just… they don’t care.

178 00:18:30.370 00:18:47.619 Ahmed Alkarboly: So, for having an architectural conversation, like, how does this happen? I always bring a picture. If they’re just worried about their data, I try to just communicate where we are in the process, and usually I have a diagram to describe that, so they know roughly, hey, we’re 25% done, 50% done, 75% done.

179 00:18:48.320 00:18:57.449 Ahmed Alkarboly: That’s for the data engineering. For the data analysts, like, AI, like, if I’m building an agent, I don’t really have a good answer for that off the top of my head, you know? Like, the…

180 00:18:58.210 00:19:07.960 Ahmed Alkarboly: communicating with the business when building out a feature versus just, you know, bringing something into the data warehouse, that’s… that’s not something that I do too often these days.

181 00:19:09.150 00:19:10.480 Awaish Kumar: Okay.

182 00:19:10.850 00:19:13.920 Awaish Kumar: And, yeah, that’s it, I think.

183 00:19:14.410 00:19:18.880 Awaish Kumar: From my side, I’m happy to answer if you have any questions.

184 00:19:19.550 00:19:25.000 Ahmed Alkarboly: Yeah, I mean, I would just love to learn, sort of, like, you know, I’m trying to understand the culture

185 00:19:25.560 00:19:31.769 Ahmed Alkarboly: at Brainforge, and sort of, like, I would… in that, I would be interested to learn how you ended up with the company.

186 00:19:31.960 00:19:35.499 Ahmed Alkarboly: What your experience is. Also, like.

187 00:19:36.580 00:19:45.299 Ahmed Alkarboly: you said you manage a team, so what is the sort of composition of that team? What are the projects that you’re working on? And then also, I guess…

188 00:19:45.470 00:19:54.159 Ahmed Alkarboly: I understand from the recruiter, you know, you have kind of 3 main groups, or at least that’s what y’all are hiring for right now. The analysts, the AI engineers.

189 00:19:54.370 00:19:58.119 Ahmed Alkarboly: Machine learning engineers, whatever you want to call them, and then the data engineers.

190 00:19:58.960 00:20:01.630 Ahmed Alkarboly: What is your handoff point between those two?

191 00:20:01.890 00:20:18.129 Ahmed Alkarboly: And then also, are you involved in the initial contract acquisition? So are you meeting with clients? Let’s say they don’t have data. They don’t have the data that we need to effectively implement artificial intelligence in whatever capacity. Machine learning, agentic workflows, whatever.

192 00:20:18.300 00:20:28.959 Ahmed Alkarboly: Are you involved in that conversation, or is someone else involved in that conversation to establish a data presence to build out actual data systems that are useful? Sorry, that was a lot.

193 00:20:29.580 00:20:41.879 Awaish Kumar: Okay, yeah, no worries. At Brainforge, basically, we have… we are, as I mentioned, we are providing 3 different, types of services, like, AI services.

194 00:20:42.210 00:20:52.950 Awaish Kumar: And the data services, and in the data we have, like, you can say we have strategist or analyst, we have analytics engineers, and we have data engineers.

195 00:20:53.170 00:20:54.409 Awaish Kumar: And,

196 00:20:54.610 00:21:08.639 Awaish Kumar: that’s what, like, we provide the services in. So, I just mentioned that I’m leading data engineering part of it, so I’m mostly… so, yeah, anything related to data engineering comes,

197 00:21:08.870 00:21:11.150 Awaish Kumar: To me, I have experience.

198 00:21:11.150 00:21:13.160 Ahmed Alkarboly: Oh, so everything?

199 00:21:14.360 00:21:18.330 Awaish Kumar: Yeah, like, as a… as a lead, I… I just oversee

200 00:21:18.680 00:21:28.089 Awaish Kumar: the data engineering, and then to try to enforce best practices and things like that. Obviously, there’s a team who works on it, but, yeah.

201 00:21:28.410 00:21:32.429 Awaish Kumar: And, apart from that,

202 00:21:32.770 00:21:35.850 Awaish Kumar: You mentioned regarding client communication, so…

203 00:21:36.150 00:21:42.349 Awaish Kumar: We have, like, our sales team, which is basically, responsible, for…

204 00:21:42.540 00:21:54.499 Awaish Kumar: bringing in the deals, and also we have our CEO and the co-founder, who are also involved in those client meetings. And then we are…

205 00:21:54.750 00:22:11.449 Awaish Kumar: We are called into those meetings if required. For example, if we are talking to a client which is data-heavy, and we need to talk more about the data foundations and the data infrastructure, then we are called in to those meetings and talk about it.

206 00:22:11.460 00:22:23.940 Awaish Kumar: And if it is… the client is someone where we need someone from AI team, then we have people from AI team joining those calls. Same goes for every other project, yeah.

207 00:22:24.480 00:22:33.879 Ahmed Alkarboly: So the AI team, what kind of feedback are they giving you? And I ask because when I was at DISH, one of my responsibilities was managing the data warehouse for…

208 00:22:34.000 00:22:47.379 Ahmed Alkarboly: the research and development group, it was a bunch of people, to be frank, learning about AI, right? So, my job was to make sure they had clean data sets, that if they needed help building models, that I could answer questions on how I would kind of do it.

209 00:22:47.500 00:22:51.360 Ahmed Alkarboly: Is that the kind of interaction that you’re having with your AI folks?

210 00:22:52.030 00:23:08.669 Awaish Kumar: Yeah, right now, we have… like, we don’t… we are not doing any ML work. That is on the line, that’s in the… in the pipeline, which we want to provide as a service, but right now, we are only focused on providing, pure AI services, which is, like,

211 00:23:08.790 00:23:16.940 Awaish Kumar: providing AI features, say, a chatbot, or where you can actually

212 00:23:17.070 00:23:31.700 Awaish Kumar: make a booking, right? You can write in natural human language, "I want to book this table in a restaurant on this date and time," and it can actually make a booking for you without you going through the normal flow. So things like that.

213 00:23:31.850 00:23:37.379 Awaish Kumar: But, yeah, we had some conversations where, the…

214 00:23:38.020 00:23:48.679 Awaish Kumar: the client needs some features, for example, like the one I mentioned. It does not require any ML or something, but for… for any

215 00:23:48.870 00:24:03.519 Awaish Kumar: feature, there is some data required, right? And there’s… we have some clients which need the feature but don’t even have a data warehouse. They don’t have anywhere for us to look.

216 00:24:03.520 00:24:18.120 Awaish Kumar: Then we are involved, obviously, and we come up… we go back to the client with our new scope, for data engineering, that we need to bring in some data engineering folks here as well, and we need to do this before we move on to…

217 00:24:18.330 00:24:20.960 Awaish Kumar: Building that AI feature.

218 00:24:21.550 00:24:33.819 Ahmed Alkarboly: That makes sense. Yeah, that’s reassuring to hear. One of the things that I was hoping to hear from you was the reality that most people don’t have data that’s ready to go, and that’s, like, 80% of the work.

219 00:24:33.820 00:24:41.820 Ahmed Alkarboly: From my experience. Something you might find interesting is I’m trying to… I’m in the beginning stages of trying to figure out how to architect a safe

220 00:24:42.400 00:24:55.879 Ahmed Alkarboly: agent for the department of… for the DMV. We want a bot where you can… first of all, it’s like, hey, I need to renew my license, or I need a new license, what documents do I need? And then either it tells you, or it points you to the right webpage.

221 00:24:55.900 00:25:12.849 Ahmed Alkarboly: That’s a conversation of, like, where should the content actually live there, or should this bot know things that are not on the website? That’s a conversation that’s happening. And then the second one is, they want people to be able to take pictures of their documents filled out, and to see… to have the agent determine if everything was filled out correctly.

222 00:25:12.980 00:25:14.220 Ahmed Alkarboly: So, you know.

223 00:25:15.160 00:25:23.700 Ahmed Alkarboly: inside of a .gov tenant, how do you make all those things happen? It’s been a very fun thing for me to work out and learn, so that’s something I’m.

224 00:25:23.850 00:25:24.610 Awaish Kumar: Absolutely.

225 00:25:24.810 00:25:27.569 Awaish Kumar: We are doing similar things for our…

226 00:25:27.770 00:25:39.079 Awaish Kumar: for our own company, like, we are… we are, like, obviously hunting for clients all the time, building SOWs and things like that, but then we have built our…

227 00:25:40.370 00:25:57.809 Awaish Kumar: the, like, the pool of agents which can help us build those documents, so somebody… if I come up with an SOW, I need a reviewer to basically give me feedback, and it takes a lot of time for any person to go and review that, the whole document.

228 00:25:58.100 00:26:00.720 Awaish Kumar: So instead, we have built our own…

229 00:26:00.930 00:26:06.040 Awaish Kumar: Agents, a group of agents which can just review their document.

230 00:26:06.170 00:26:08.530 Awaish Kumar: And based on the context given.

231 00:26:09.040 00:26:14.410 Awaish Kumar: the… the guidelines we have defined. And it basically…

232 00:26:14.760 00:26:27.790 Awaish Kumar: reviews the chunks of the document, and gives you feedback on if you have… if your document is good enough to be shared with the client, or if it is not, and what needs to be changed, and how it should be filled out.

233 00:26:28.470 00:26:34.310 Ahmed Alkarboly: So, Kayla… Kayla, I believe the recruiter’s name is, Kayla had mentioned

234 00:26:34.450 00:26:45.329 Ahmed Alkarboly: That you guys manage, essentially, internal tooling. Everyone does that. Is this what… is this the internal tooling that… that she was referring to? A part of it, maybe?

235 00:26:45.540 00:26:48.540 Awaish Kumar: Yeah, we have a lot of internal things, that’s one of them.

236 00:26:53.510 00:26:54.900 Ahmed Alkarboly: Sorry, you cut out.

237 00:26:54.900 00:26:59.110 Awaish Kumar: Oh, I was saying that we do have a lot of internal

238 00:26:59.250 00:27:03.760 Awaish Kumar: Guidelines for using agents, internal platforms, and things like that.

239 00:27:03.870 00:27:07.169 Awaish Kumar: And yes, the thing that I talked about, there is part of it.

240 00:27:07.830 00:27:08.600 Ahmed Alkarboly: Okay.

241 00:27:08.850 00:27:14.450 Ahmed Alkarboly: Very cool. And then does your team… your team actively contributes to that? I guess it’s closed source, but…

242 00:27:14.690 00:27:16.650 Ahmed Alkarboly: Is everyone in Brainforge?

243 00:27:16.650 00:27:19.699 Awaish Kumar: Everybody, everyone in the company can see it.

244 00:27:19.830 00:27:21.330 Awaish Kumar: And contribute to it.

245 00:27:21.870 00:27:23.300 Ahmed Alkarboly: Okay, that’s very cool.

246 00:27:23.520 00:27:24.280 Ahmed Alkarboly: Very cool.

247 00:27:27.530 00:27:31.669 Ahmed Alkarboly: I don’t have any other questions for you. I guess, do you have my resume in front of you?

248 00:27:32.690 00:27:40.000 Awaish Kumar: I think I have the Notion page from my recruiters, where I have all your information.

249 00:27:40.720 00:27:52.380 Ahmed Alkarboly: Okay, cool. So in there, I don’t, you know, because I don’t put this on my resume, but I also do have my own LLC. It’s called Forerunner Technical. I have a client here in Austin who I do…

250 00:27:52.730 00:28:03.250 Ahmed Alkarboly: pretty much what you guys do, you know, AI consulting. It started off… they’re a heavy equipment rental company, so, like, excavators and bulldozers and stuff like that.

251 00:28:03.500 00:28:10.910 Ahmed Alkarboly: Built out, you know… they didn’t… they were processing documents, like credit checks, on paper.

252 00:28:11.020 00:28:15.700 Ahmed Alkarboly: Moved them over to Documenso, an open-source document signing

253 00:28:15.840 00:28:17.310 Ahmed Alkarboly: solution.

254 00:28:17.320 00:28:37.019 Ahmed Alkarboly: built them, like, forms, deployed some, like, internal applications for them. And then we also… I designed and implemented an agentic sales lead bot for them, so there’s data sources available to us that tell us when someone has started new construction or applied for a permit, and there’s…

255 00:28:37.350 00:28:51.700 Ahmed Alkarboly: a non-standard way to determine what kind of permit that is. So there’s a… we built a simple agentic workflow that looks at all the permits in all the counties that they have offices in. That’s over 11 counties in the state of Texas.

256 00:28:51.840 00:28:54.999 Ahmed Alkarboly: And it forwards to the salespeople

257 00:28:55.480 00:28:58.409 Ahmed Alkarboly: opportunities that it believes are relevant.

258 00:28:59.180 00:29:10.639 Ahmed Alkarboly: Aka a new customer, because it cross-checks the permit applicant against our internal… against their internal customer relationship management system.

259 00:29:10.760 00:29:14.459 Ahmed Alkarboly: And so that way, they can determine, is this an existing

260 00:29:14.950 00:29:23.149 Ahmed Alkarboly: customer, do I need to route it to the salesperson that… that manages that contract, or is this a new potential customer? So…

261 00:29:23.930 00:29:34.600 Ahmed Alkarboly: Building up that tool stack, if you want to call it, building up the capability in order to do more and more advanced things. You know, the next thing we’re going to move into is to…

262 00:29:34.760 00:29:52.149 Ahmed Alkarboly: they need a smarter inventory management system between their five locations. They actually don’t have a centralized way to track everything, and that’s because the inventory management system they used was… they just moved it to the cloud, but previously, it was a local deployment in each office.

263 00:29:52.520 00:29:53.520 Ahmed Alkarboly: So…

264 00:29:53.760 00:30:05.649 Ahmed Alkarboly: So there was a lot of manual tracking and just insufficient systems. And so, one of the things I do for them is help them build up their digital infrastructure, which does not include AI or ML.

265 00:30:05.650 00:30:11.270 Ahmed Alkarboly: So it was reassuring to hear you talk about the pre-work, like, that’s a reality, that’s what I was hoping to hear.

266 00:30:11.720 00:30:14.670 Awaish Kumar: We are all… we are all data engineers here.

267 00:30:14.780 00:30:21.260 Awaish Kumar: Including the CEO, so we all know what goes on, what goes into being a data engineer.

268 00:30:21.400 00:30:22.839 Awaish Kumar: Yeah. Yep.

269 00:30:24.250 00:30:29.520 Ahmed Alkarboly: Well, sounds good, man. I don’t want to hold you any longer than I have, 2 minutes over.

270 00:30:30.340 00:30:37.910 Awaish Kumar: Yeah, no worries, I’m just going to submit my feedback, and after that, our recruiters will get back to you.

271 00:30:38.380 00:30:42.209 Ahmed Alkarboly: Okay. Thanks, man, appreciate your time, brother. Stay safe over there.

272 00:30:42.600 00:30:44.200 Awaish Kumar: Yep, you too, bye.