Meeting Title: Brainforge Data Engineer Interview
Date: 2026-03-12
Meeting participants: Vinuthna Sandadi, Awaish Kumar


WEBVTT

1 00:02:53.670 00:02:54.550 Awaish Kumar: Hello.

2 00:02:59.690 00:03:03.850 Vinuthna Sandadi: Raish… Did I pronounce the name correctly?

3 00:03:04.970 00:03:05.880 Awaish Kumar: Yes.

4 00:03:06.550 00:03:13.010 Vinuthna Sandadi: Okay, this is Vinuthna. I hope I’m not late, I was just wondering if I joined the right link.

5 00:03:13.950 00:03:16.180 Awaish Kumar: Yeah, that’s okay. I will see you then.

6 00:03:16.740 00:03:17.930 Vinuthna Sandadi: Oh, okay, okay.

7 00:03:18.220 00:03:19.160 Vinuthna Sandadi: Nice.

8 00:03:20.230 00:03:21.910 Vinuthna Sandadi: Yeah, I mean.

9 00:03:22.820 00:03:24.369 Awaish Kumar: Where are you located?

10 00:03:25.320 00:03:33.920 Vinuthna Sandadi: I live in Jersey City, but, just, I’m in Chicago right now, I’m in this week, visiting some friends.

11 00:03:35.110 00:03:35.920 Awaish Kumar: Okay.

12 00:03:36.740 00:03:37.280 Vinuthna Sandadi: Yeah.

13 00:03:37.640 00:03:39.740 Awaish Kumar: Yeah, I’m… I don’t know, I’m just…

14 00:03:41.140 00:03:46.459 Awaish Kumar: The video is just, like, lagging, so I don’t know if it is your internet or mine.

15 00:03:47.350 00:03:48.160 Vinuthna Sandadi: Oh.

16 00:03:48.510 00:03:51.230 Vinuthna Sandadi: Mine is good.

17 00:03:52.480 00:03:59.169 Awaish Kumar: Yeah, it’s… I think the same about myself, but yeah, we can start,

18 00:03:59.310 00:04:03.245 Awaish Kumar: So, agenda for the… today’s meeting is…

19 00:04:03.840 00:04:14.150 Awaish Kumar: Like, getting to know more about you and your recent experiences, and what kind of projects you have worked on, and just doing a deep dive on, like, one of the projects.

20 00:04:14.860 00:04:15.789 Awaish Kumar: Yeah.

21 00:04:16.100 00:04:19.879 Awaish Kumar: So, I will just kick it off with my introduction.

22 00:04:19.990 00:04:25.230 Awaish Kumar: My name is Awaish Kumar, and I’m leading data engineering at Brainforge.

23 00:04:26.250 00:04:37.650 Awaish Kumar: And Brainforge is basically providing AI and data consultancy services to mid- to large-scale enterprises. Most of our clients right now are in the U.S.,

24 00:04:38.720 00:04:47.329 Awaish Kumar: We are operating remotely, so our employees are spread across the world. So, yeah, we have people from

25 00:04:47.430 00:04:53.530 Awaish Kumar: the U.S., Europe, Asia, so, yeah, that’s basically it.

26 00:04:53.770 00:04:57.300 Awaish Kumar: So, let’s start it… let’s get started with your introduction.

27 00:05:00.830 00:05:01.510 Vinuthna Sandadi: Yeah.

28 00:05:01.640 00:05:20.550 Vinuthna Sandadi: So, I’m a data analytics engineer with about 5 years of experience, building data pipelines, analytics infrastructure, and cloud-based platforms. I’m currently working with Galaxy. I work on data platforms and support analytics: trading activity,

29 00:05:20.550 00:05:35.220 Vinuthna Sandadi: client lifecycle data, product usage, yeah, and the sales team as well. So in my current… in my role, I build and, like, maintain ETL pipelines, I work… I work with Python, PySpark.

30 00:05:35.330 00:05:52.000 Vinuthna Sandadi: To orchestrate workflows, I work with Airflow. I develop SQL-based data models on platforms like Snowflake and Databricks to make… to basically make data reliable and accessible, so…

31 00:05:52.000 00:05:56.550 Vinuthna Sandadi: Prior to this, I worked with JP Morgan and CVS Health, where I… where I…

32 00:05:56.550 00:06:10.799 Vinuthna Sandadi: built ETL workflows, designed, data models, developed reporting layers that supported business and operational, analytics, with large, enterprise data sets.

33 00:06:11.030 00:06:15.419 Vinuthna Sandadi: Overall, I, I enjoy working with,

34 00:06:15.610 00:06:23.049 Vinuthna Sandadi: You know, like, big, big data, infrastructures, like, enabling,

35 00:06:23.270 00:06:37.470 Vinuthna Sandadi: you know, like, building robust data warehouses, data warehouse layers, and delivering, you know, like, end-to-end data solutions. I also work on the reporting side. I also, like, build dashboards. I work with…

36 00:06:37.610 00:06:51.030 Vinuthna Sandadi: tools like Power BI, Tableau, and Looker in the past. Yeah, I also have a bit of, data science and ML experience where, you know, I…

37 00:06:51.160 00:07:09.990 Vinuthna Sandadi: I worked on the financial projects, where we also build, like, scalable, ML data pipelines, and integrated data from, you know, multiple sources, and things like that. So, yeah, I guess these are the things that I’ve been, working for over the years.

38 00:07:11.080 00:07:17.830 Awaish Kumar: Yeah, can you give me an example, like, example of a project where you built, like, complex data pipelines?

39 00:07:18.130 00:07:25.170 Awaish Kumar: And, like, how it… like, an end-to-end, like,

40 00:07:25.810 00:07:35.240 Awaish Kumar: flow of, like, how it happened. It was basically implemented, and, like, where were you involved? Like, what were your contributions in that project?

41 00:07:36.290 00:07:38.880 Vinuthna Sandadi: Yeah, yeah.

42 00:07:39.080 00:07:49.170 Vinuthna Sandadi: Okay, so in my, in my recent project at Galaxy, so I’ve, I’ve led… I led, like, a project.

43 00:07:49.280 00:08:05.449 Vinuthna Sandadi: where I… I built and maintained, like, scalable ETL pipelines using PySpark and, ingested data from multiple sources, like, you know, we had multiple data sources that the data was sitting at, and then…

44 00:08:05.450 00:08:12.850 Vinuthna Sandadi: it was a transactional data, and, we ingested that data into Snowflake, and, we,

45 00:08:12.850 00:08:26.239 Vinuthna Sandadi: I mean, we… I also use, like, cleaning techniques. We… I implemented data quality checks, did data normalization using Spark. I also used AWS EMR, which got, like, I mean,

46 00:08:26.240 00:08:37.960 Vinuthna Sandadi: we… we did batch processing of the data, and, the data was ingested, on a daily basis. Our pipelines run every day, to ingest this data into our,

47 00:08:37.960 00:08:51.580 Vinuthna Sandadi: you know, in the data layer, we also curated data sets on top of that and built BI tables that were directly used by our… by the analytics teams to build reporting layers and dashboards on top of that.

48 00:08:51.580 00:08:59.150 Vinuthna Sandadi: And we also, like, reduced the manual deployment, with this process.

49 00:08:59.150 00:09:02.329 Awaish Kumar: Are you the only person in the team doing all of that, or…

50 00:09:02.340 00:09:12.560 Vinuthna Sandadi: So I… yeah, I mean, at Galaxy, we… we have multiple teams, but it’s just not me. We have, like, different projects, and we have, like, our…

51 00:09:12.560 00:09:15.200 Awaish Kumar: We’re talking about that project specifically.

52 00:09:15.200 00:09:18.599 Vinuthna Sandadi: So, sorry, what did you say?

53 00:09:19.190 00:09:26.250 Awaish Kumar: Yeah, I said we are just talking about the project that you are talking about. Weren’t you the only one doing that project, or…

54 00:09:26.390 00:09:30.930 Awaish Kumar: If it was a team effort, then, like, what was your contributions?

55 00:09:30.930 00:09:35.609 Vinuthna Sandadi: With this particular one, I was dealing with,

56 00:09:35.820 00:09:41.940 Vinuthna Sandadi: It was a team effort, and some of the sources were also handled by the engineering team.

57 00:09:42.040 00:09:55.170 Vinuthna Sandadi: And, but I… I kind of led the major, ingestion of the, major ingestion part, and then, like, I… we had, like, multiple sources of the data, and,

58 00:09:55.210 00:10:09.510 Vinuthna Sandadi: yeah, I mean, I could say, like, maybe I did, like, 70-80% of the work, and then I also took some help when we had to, like, ingest

59 00:10:09.510 00:10:21.379 Vinuthna Sandadi: data from, for the marketing team. We had, like, Google Analytics API, Apple ads, things like that, so I also had some of the help from the engineering team to help in just that part of the data.

60 00:10:21.510 00:10:26.729 Awaish Kumar: What sources were there, where you basically ingested the data?

61 00:10:27.530 00:10:33.709 Vinuthna Sandadi: So we had CRM sources, it… it wasn’t, like, a single source.

62 00:10:34.330 00:10:35.249 Vinuthna Sandadi: It was.

63 00:10:35.600 00:10:37.490 Awaish Kumar: The sources, that’s… name them all.

64 00:10:37.680 00:10:51.019 Vinuthna Sandadi: Oh, okay, we had Segment, HubSpot, Salesforce data that I was majorly working with, and, yeah, and some of the Google Analytics API data, Apple ads, as I mentioned earlier.

65 00:10:51.570 00:11:03.410 Awaish Kumar: Okay, and how was the data from these sources ingested into Snowflake? Like, I know you mentioned PySpark, but were you writing API calls in PySpark, or, like, what exactly were you doing?

66 00:11:04.350 00:11:06.830 Vinuthna Sandadi: Yeah, we were doing the API calls, that’s correct.

67 00:11:07.350 00:11:09.019 Awaish Kumar: To each individual source.

68 00:11:10.460 00:11:23.600 Vinuthna Sandadi: Yeah, yeah, right, that’s correct. I mean, Snowflake was… Snowflake was not the only data layer. We also had Databricks, where some of the sources were directly ingested into the tables, into Databricks.
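The per-source API ingestion described above can be sketched in plain Python. The endpoint shape, pagination fields, and function names here are hypothetical stand-ins, not the actual Galaxy pipeline; a real job would land each batch in a stage before loading it into Snowflake or Databricks.

```python
import json
import urllib.request

def fetch_page(url: str) -> dict:
    """Fetch one page of records from a (hypothetical) source API."""
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)

def ingest(first_page: str, fetch=fetch_page):
    """Walk a paginated API and yield raw records.

    `fetch` is injectable so the pagination logic can run without a
    network; a real pipeline would write each batch to a staging area
    before the warehouse load.
    """
    url = first_page
    while url:
        page = fetch(url)
        yield from page.get("records", [])
        url = page.get("next")  # None on the last page ends the loop

# Usage with a stubbed two-page source (no network needed):
pages = {
    "p1": {"records": [{"id": 1}], "next": "p2"},
    "p2": {"records": [{"id": 2}], "next": None},
}
rows = list(ingest("p1", fetch=pages.__getitem__))
```

Injecting the fetcher is the same trick that makes per-source connectors (Segment, HubSpot, Salesforce) testable against fixtures.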

69 00:11:25.250 00:11:32.820 Awaish Kumar: Yeah, and can you, like… like, what was the reason for having two different data

70 00:11:33.170 00:11:37.489 Awaish Kumar: warehouses, kind of, like, Databricks and Snowflake?

71 00:11:39.170 00:11:44.400 Vinuthna Sandadi: Well, some, some of the… we have both.

72 00:11:44.660 00:11:52.030 Vinuthna Sandadi: I mean, it was just about, like, leveraging different stages, or different parts of the data workflow, right? We use Databricks.

73 00:11:52.130 00:12:05.560 Vinuthna Sandadi: we use Databricks for its Spark engine and Delta Lake, like, it’s great for building, like, scalable and flexible ETL data pipelines, where we have to, like, maintain or process large,

74 00:12:05.560 00:12:22.490 Vinuthna Sandadi: huge data, both with batch and streaming data, where it’s easy and, it’s also a bit fast and more scalable than using, like, Snowflake, and, where we had to do, like, or where we had to, like, dump,

75 00:12:22.560 00:12:24.410 Vinuthna Sandadi: Data where it’s, where it.

76 00:12:24.410 00:12:31.100 Awaish Kumar: Were you using PySpark in Databricks, or were you using PySpark in AWS EMR?

77 00:12:31.610 00:12:32.670 Awaish Kumar: You mentioned both.

78 00:12:34.450 00:12:37.669 Vinuthna Sandadi: Yeah, yeah, in Databricks. I mean, we had,

79 00:12:38.460 00:12:53.680 Vinuthna Sandadi: I did use it in both environments, actually. We used PySpark on EMR to build the ETL pipelines, and for Delta Lakes and PySpark pipelines within the Medallion architecture, like data processing, data transformation, we use PySpark.

80 00:12:53.680 00:12:58.029 Awaish Kumar: You know, like, within… EMR is on the AWS, right?

81 00:12:58.030 00:12:58.600 Vinuthna Sandadi: Yeah.

82 00:12:59.330 00:13:04.590 Awaish Kumar: So you use PySpark in the EMR to… ingest the data, right? To…

83 00:13:05.020 00:13:08.589 Awaish Kumar: read the data from different sources, API calls, and…

84 00:13:08.940 00:13:13.139 Awaish Kumar: Process it and store it, to some destination, right?

85 00:13:13.510 00:13:25.190 Vinuthna Sandadi: Yes, exactly, yeah. I mean, I use EMR to run PySpark and Scala jobs, like, we use it for ingesting, cleansing, and transform… transforming the data.

86 00:13:26.230 00:13:29.520 Awaish Kumar: Okay, but when the data has to go to…

87 00:13:29.660 00:13:32.620 Awaish Kumar: When it’s already, like, processed, why then

88 00:13:32.800 00:13:37.419 Awaish Kumar: do we need to use Databricks, sorry, PySpark again in Databricks?

89 00:13:38.160 00:13:52.009 Vinuthna Sandadi: No, I was mentioning that we used, for different sources of the data, we’ve used sometimes AWS EMR and sometimes PySpark, but there were also, cases where, after the initial pre-processing.

90 00:13:52.010 00:14:11.890 Vinuthna Sandadi: Because for the… I mean, for the data refinement or advanced transformations or building ML models, so even after the data is ingested and cleaned, Databricks helps to apply, like, complex business log… logic, and we created, like, a curated data, I mean, for the curated data, we created the Medallion architecture even to, like.

91 00:14:12.070 00:14:17.720 Vinuthna Sandadi: to, like, productionize the ML workflows, or, like, you know, data.

92 00:14:17.770 00:14:23.699 Vinuthna Sandadi: Data that’s been used by the, analytics or the data scientist team.

93 00:14:23.740 00:14:37.420 Vinuthna Sandadi: So, I mean, I believe, like, I mean, with Databricks, you can also, like, version… version it, and then, you can also, like, it integrates easily, with the, you know, like, the…

94 00:14:37.420 00:14:43.800 Vinuthna Sandadi: direct… directly with the BI tools, like Tableau, so that’s much more, like, convenient, too.
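The bronze/silver/gold ("Medallion") layering referred to above can be illustrated with a toy, standard-library-only pipeline. Real implementations use Spark jobs writing Delta tables; the record shape and field names here are invented for illustration.

```python
# Bronze: raw records exactly as ingested (may contain duplicates and bad rows).
bronze = [
    {"trade_id": "t1", "amount": "100.5", "client": "acme"},
    {"trade_id": "t1", "amount": "100.5", "client": "acme"},  # duplicate
    {"trade_id": "t2", "amount": None, "client": "acme"},     # failed quality check
]

# Silver: deduplicated, typed, quality-checked records.
seen, silver = set(), []
for row in bronze:
    if row["amount"] is None or row["trade_id"] in seen:
        continue  # a real pipeline would quarantine these for review
    seen.add(row["trade_id"])
    silver.append({**row, "amount": float(row["amount"])})

# Gold: curated, business-level aggregate, the layer BI tables read from.
gold = {}
for row in silver:
    gold[row["client"]] = gold.get(row["client"], 0.0) + row["amount"]
```

The point of the layering is that each stage is rebuildable from the one below it, so analytics teams only ever touch the curated gold tables.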

95 00:14:44.590 00:14:48.150 Awaish Kumar: Okay, and… have you used dbt?

96 00:14:50.010 00:14:53.320 Vinuthna Sandadi: Yeah, I mean, in the past I have, yes.

97 00:14:54.540 00:14:57.570 Awaish Kumar: So, like, what is the… like, the…

98 00:14:58.420 00:15:02.030 Awaish Kumar: difference between using dbt or PySpark?

99 00:15:05.080 00:15:09.149 Awaish Kumar: If you have to just do… Like the…

100 00:15:09.430 00:15:12.310 Vinuthna Sandadi: I mean, there are different purposes.

101 00:15:13.790 00:15:17.919 Awaish Kumar: I mean, if you have to just transform your data for business

102 00:15:18.030 00:15:23.059 Awaish Kumar: logic and things like that. Both can be used in that case.

103 00:15:23.190 00:15:24.090 Vinuthna Sandadi: So.

104 00:15:24.600 00:15:29.770 Awaish Kumar: like, why you chose PySpark and over dbt?

105 00:15:33.510 00:15:41.219 Vinuthna Sandadi: Yeah. So, I mean, when the data transformation is easy… I mean, it’s easy to, like.

106 00:15:41.330 00:15:59.400 Vinuthna Sandadi: when we’re, like, transforming huge data, right, with large-scale data processing, where, like, when it goes, like, beyond SQL capabilities, like working with huge data sets, or applying advanced business logic, or integrating ML workflows, for those specific reasons, we’ve, like, we would work with PySpark.

107 00:15:59.430 00:16:02.949 Vinuthna Sandadi: And, I mean, over dbt, so…

108 00:16:03.930 00:16:15.240 Awaish Kumar: My question is more like, when you already have data in your Databricks, and then we are not moving it out from the Databricks, obviously, you process it and store it in the same…

109 00:16:15.700 00:16:18.419 Awaish Kumar: Databricks, right?

110 00:16:18.660 00:16:21.919 Awaish Kumar: in the same, destination. So…

111 00:16:22.430 00:16:32.079 Awaish Kumar: when it’s, like, the same place, like, you can use Databricks’ capabilities via dbt to process your data, and it can obviously process, like.

112 00:16:32.080 00:16:40.380 Vinuthna Sandadi: I mean, when the data is already in Databricks, yeah, PySpark definitely makes total sense. We wouldn’t, like, use dbt, right?

113 00:16:41.340 00:17:00.800 Vinuthna Sandadi: I mean, if you need to, like, scale, like, scale or process right where the data lives, right, you can apply, like, complex business transformation or even, like, ML models efficiently without, like, having any extra, like, data movement, which saves time and reduces errors. So, like, Databricks and Delta Lake… Delta Lake also, like, provides.

114 00:17:00.810 00:17:03.960 Vinuthna Sandadi: You know, like, versioning, like.

115 00:17:04.010 00:17:17.740 Vinuthna Sandadi: You know, and it’s… it’s easy data… I mean, it’s easy data management, so we can get, like, high data quality and consistent data, and it even, like, supports collaborative development, and it has quick iteration.

116 00:17:17.940 00:17:27.030 Vinuthna Sandadi: Yeah, I mean, it is also easy to, like, store data within Databricks, and it keeps everything streamlined and, performant.

117 00:17:27.380 00:17:29.040 Awaish Kumar: Yeah, so…

118 00:17:29.180 00:17:39.050 Awaish Kumar: Yeah, so the, like, the point I was trying to make is, like, once data is in a warehouse, dbt just runs on top of the warehouse. It won’t move data anywhere.

119 00:17:39.340 00:17:43.559 Awaish Kumar: So, it also makes sense to use dbt, because

120 00:17:44.200 00:17:47.009 Awaish Kumar: like, PySpark comes with its own…

121 00:17:48.240 00:17:54.949 Awaish Kumar: Like, it’s really heavy, so it comes with its own… The environment setup and everything.

122 00:17:55.290 00:17:55.980 Vinuthna Sandadi: Yeah.

123 00:17:57.640 00:18:08.160 Vinuthna Sandadi: I mean, yeah, because dbt is, like, great for, like, lightweight and SQL-based transformation, right? It runs directly in the warehouse without, like, having to move the data anywhere.

124 00:18:08.210 00:18:25.620 Vinuthna Sandadi: Whereas PySpark and Databricks, like, it’s used for, like, transformation, and when we’re… yeah, as I mentioned, like, when we’re doing, like, complex transformations or dealing with large-scale data and advanced business logic, yeah, and especially when applying AI or ML,

125 00:18:25.620 00:18:34.159 Vinuthna Sandadi: integration, so I… I mean, I guess these… this is a specific reason we integrated with, like, PySpark, because it has, like, distributed

126 00:18:34.300 00:18:39.750 Vinuthna Sandadi: Processing and handles, like, heavy, heavier data, and…

127 00:18:40.000 00:18:47.849 Vinuthna Sandadi: I feel like if it’s, like, semi-structured data, I think dbt is better, like, even with, like,

128 00:18:47.990 00:18:53.569 Vinuthna Sandadi: I mean, even Databricks, like, offers Delta Lake or versioning, things like that, but…

129 00:18:53.700 00:18:59.769 Vinuthna Sandadi: Yeah, I mean, when it’s analytics, or when we’re dealing with, like.

130 00:19:00.460 00:19:04.710 Vinuthna Sandadi: Sorry, can you hear me okay? Yeah, yeah, I can hear you.

131 00:19:05.990 00:19:12.640 Awaish Kumar: Okay, my next question was, like, why don’t you use tools like Fivetran or…

132 00:19:14.610 00:19:18.830 Awaish Kumar: Something, like, Airbyte for the ingestion, instead of writing your own

133 00:19:21.210 00:19:22.140 Awaish Kumar: API calls.

134 00:19:23.970 00:19:36.360 Vinuthna Sandadi: I mean, we usually, like… I mean, we usually use, like, custom ingestion pipelines, right? Like, whatever the business and the company supports. So, in my… in my experience, like.

135 00:19:36.640 00:19:45.080 Vinuthna Sandadi: Like, with JP Morgan or Galaxy, we’ve used, like, PySpark pipelines to handle, like, complex transformations or, you know, incremental loads.

136 00:19:45.280 00:19:52.660 Vinuthna Sandadi: And, like, custom validation checks, which, I mean, these tools might not, like, support well.

137 00:19:52.810 00:19:55.490 Vinuthna Sandadi: I, I, I believe that was, like, the spec…

138 00:19:55.490 00:20:01.620 Awaish Kumar: So, since you have mentioned, like, HubSpot, Segment, like, Facebook Ads, or Google Ads, and…

139 00:20:01.750 00:20:11.209 Awaish Kumar: these tools support all these sources, and they support it, like, quite well. I have used it a lot, so…

140 00:20:11.350 00:20:13.279 Awaish Kumar: I never had any issues with that.

141 00:20:14.370 00:20:32.529 Vinuthna Sandadi: I mean, we’ve also, like, had a direct integration set up, like, Databricks within HubSpot, and we’ve also used AppFlow within AWS to do the data ingestion part, which made it so much easier, and we didn’t… we didn’t need to, like, use any other tools.

142 00:20:32.660 00:20:34.370 Vinuthna Sandadi: And…

143 00:20:34.440 00:20:43.140 Vinuthna Sandadi: it was, like, it came from the management that we use this way, but I’m pretty sure, like, the other tools would also, like, be much more

144 00:20:43.190 00:20:59.189 Vinuthna Sandadi: you know, efficient, when doing these ingestions. But since these were, like, huge data, and we had to do, like, complex transformations, yeah, I mean, it did support, like, it did fully handle the data ingestion part, and then.

145 00:20:59.190 00:20:59.870 Awaish Kumar: Okay.

146 00:20:59.870 00:21:02.290 Vinuthna Sandadi: Yeah, so…

147 00:21:02.460 00:21:07.859 Awaish Kumar: Since you mentioned Airflow, like, can you explain a little bit, like, the architecture of Airflow?

148 00:21:09.190 00:21:15.600 Vinuthna Sandadi: Yeah. So, with Airflow…

149 00:21:15.860 00:21:22.770 Vinuthna Sandadi: See, I mean, it’s like, it’s built around, like, few key concepts, you know, there’s, like, a scheduler.

150 00:21:22.770 00:21:36.689 Vinuthna Sandadi: Which is, like, responsible for triggering tasks based on defined schedules or dependencies. Then comes, like, the web server, which, you know, like, provides the user interface where you can monitor, manage, and trigger workflows, and then.

151 00:21:36.720 00:21:50.209 Vinuthna Sandadi: Which is, like, the DAG, and then next comes, like, a meta… I guess, like, a metadata database, which stores all the state information about, like, DAGs and tasks, and then their runs. Like, like, we can say, like, usually it’s, like.

152 00:21:50.210 00:21:59.679 Vinuthna Sandadi: you know, when it’s a relational database, maybe, like, Postgres or MySQL, then comes, like, an executor, which is, like, actually, you know, like, runs the task.

153 00:21:59.780 00:22:19.290 Vinuthna Sandadi: And, it can be, like, local or, you know, in Kubernetes or other, depend… depending on the setup, actually. And finally, then comes, you know, like, the workers which execute the task assigned by the, you know, like, the execute… it is executed by the whole system and works by defining, workflows.

154 00:22:19.380 00:22:31.619 Vinuthna Sandadi: DAGs, like, you know, directed acyclic graphs, in Python, and, you know, like, the scheduler’s the,
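The components listed above are Airflow's standard ones: scheduler, web server, metadata database, executor, and workers. The DAG idea itself, the scheduler only releasing a task once everything upstream of it has finished, can be sketched with Python's standard library; the task names below are invented.

```python
from graphlib import TopologicalSorter

# Downstream task -> the upstream tasks it depends on,
# like `extract >> transform >> load >> report` in an Airflow DAG file.
dag = {
    "transform": {"extract"},
    "load": {"transform"},
    "report": {"load"},
}

# One valid execution order that respects every dependency, roughly
# what the scheduler works out before handing tasks to the workers.
order = list(TopologicalSorter(dag).static_order())
```

With a linear chain like this there is only one valid order; with branching dependencies, independent tasks could be dispatched to workers in parallel.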

155 00:22:31.910 00:22:36.840 Awaish Kumar: Yeah, then, like, the second question is, like,

156 00:22:37.530 00:22:40.789 Awaish Kumar: if… if I… I have,

157 00:22:41.190 00:22:43.569 Awaish Kumar: Added a task in Airflow.

158 00:22:44.350 00:22:51.879 Awaish Kumar: But, like, my task, sometimes it runs okay, but sometimes it gets stuck, and…

159 00:22:52.050 00:23:02.840 Awaish Kumar: it’s just stuck there for, like, infinite time. And I want to… I wanted to, like, get some response, like, if it does not succeed in, like,

160 00:23:02.950 00:23:05.900 Awaish Kumar: 10 minutes, I should get alerted in Slack.

161 00:23:06.080 00:23:11.880 Awaish Kumar: So, what you would normally do to… To implement this feature.

162 00:23:13.600 00:23:20.120 Vinuthna Sandadi: Yeah, Slack integration. See, I would, like, set up, like, a timeout,

163 00:23:20.170 00:23:37.000 Vinuthna Sandadi: I mean, on the task using, like, Airflow’s execution timeout parameter, and limit how long it can take. Let’s say, like, 10 minutes, right? And if a task exceeds that, Airflow will, like, mark it as a fail… mark it as failed, and then I would use, like, Airflow’s alert… alerting feature by adding, like, a…

164 00:23:37.000 00:23:50.610 Vinuthna Sandadi: failure callback, and I would configure Slack API or, you know, like, config… for configuring email or, like, a Slack update, then, you know, maybe, like, a custom Python function that will, like.

165 00:23:50.760 00:23:57.240 Awaish Kumar: The task fail if it is… would you make your task to fail if it does not succeed in 10 minutes?

166 00:23:59.300 00:24:07.999 Vinuthna Sandadi: Yeah, I mean, we will set, like, a timeout, right? So we can set a timeout, like, using execution timeout within Airflow.

167 00:24:09.410 00:24:11.780 Awaish Kumar: Yeah, that’s my question. How would you set that?

168 00:24:11.880 00:24:14.850 Awaish Kumar: And therefore… What exactly you will write.

169 00:24:15.200 00:24:16.429 Vinuthna Sandadi: Do you have one?

170 00:24:16.630 00:24:24.670 Vinuthna Sandadi: We have an execution timeout parameter in Airflow, right? So I would… I would… I agree.

171 00:24:24.670 00:24:27.250 Awaish Kumar: Any parameter with this name.

172 00:24:28.430 00:24:30.429 Vinuthna Sandadi: Sorry, what? Can you say that again?

173 00:24:30.430 00:24:35.369 Awaish Kumar: Yeah, I don’t remember any parameter with this name, like execution parameter timeout.

174 00:24:38.480 00:24:50.929 Vinuthna Sandadi: that’s with a timedelta, right? So it’s, like, we use… we use it to set, like, a max runtime of the task, it’s part, like, of a… it’s part… it’s part of, like, the base operator, I mean…

175 00:24:51.700 00:25:01.370 Vinuthna Sandadi: If you guys are, like, using anything, like, in Airflow, you know, like a… Sorry, can you say that again, maybe?

176 00:25:01.370 00:25:11.620 Awaish Kumar: Like, can you, like, just say, recall the parameter, something like SLA or something?

177 00:25:14.480 00:25:19.150 Vinuthna Sandadi: Sorry, do you mind repeating that again? Is it…

178 00:25:19.150 00:25:23.230 Awaish Kumar: Yeah, I’m saying, is there any parameter called SLA?

179 00:25:29.490 00:25:31.650 Vinuthna Sandadi: Yeah,

180 00:25:33.770 00:25:50.430 Vinuthna Sandadi: I think… I mean, if you guys are, like, using any older version of, the execution timeout parameter, you can… you can use… I mean, if it isn’t available, then we can, like, handle the timeout. I can use, like, Python’s signal module or a timeout wrapper to, like.

181 00:25:50.650 00:26:00.579 Vinuthna Sandadi: force the, function, you know, like… but I, I do believe there’s… I mean, there is, like, an execution timeout parameter.

182 00:26:00.580 00:26:01.100 Awaish Kumar: Cheers.

183 00:26:01.100 00:26:01.520 Vinuthna Sandadi: It doesn’t.

184 00:26:01.520 00:26:04.759 Awaish Kumar: There is one, and it is called SLA.

185 00:26:05.310 00:26:07.009 Awaish Kumar: Where you can basically set

186 00:26:07.300 00:26:14.100 Awaish Kumar: SLAs. It’s called Service Level Agreement, and where you can basically set the timeout for your tasks.

187 00:26:14.100 00:26:14.950 Vinuthna Sandadi: Yeah, that’s…

188 00:26:16.070 00:26:23.539 Vinuthna Sandadi: Yeah, we can configure an SLA callback function, right, to send an alert to Slack messages. Is that what you’re…

189 00:26:23.700 00:26:24.980 Vinuthna Sandadi: Is that what you mean?

190 00:26:24.980 00:26:27.119 Awaish Kumar: SLA is basically a parameter to…

191 00:26:27.120 00:26:30.709 Vinuthna Sandadi: also set, like, an SLA time delta with which, you know.

192 00:26:30.990 00:26:38.589 Vinuthna Sandadi: we can define an SLA, you know, SLA, a missed callback to, like, alert the… or handle the notifications, if that’s.

193 00:26:38.590 00:26:39.300 Awaish Kumar: I’m kidding.

194 00:26:39.430 00:26:48.229 Awaish Kumar: include the callback, but before that, you have to find… define the SLA parameter with some timeout, so it can fail, and then…

195 00:26:48.490 00:26:50.720 Awaish Kumar: And then calls to the function where.

196 00:26:50.720 00:27:01.120 Vinuthna Sandadi: Yeah, I mean, yeah, we would do something, like, we… we would set the SLA, which is, like, equal to, like, a timedelta in minutes, we would set it as 10 minutes, if you want to, like.

197 00:27:02.360 00:27:02.680 Awaish Kumar: Yeah.

198 00:27:02.780 00:27:06.540 Vinuthna Sandadi: That’ll, you know, handle the Slack and… Is that…
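For reference, classic Airflow's BaseOperator accepts both execution_timeout and sla (both timedeltas): the former kills a task that runs too long and marks it failed, while the latter fires the DAG's sla_miss_callback without stopping the task. The alerting decision itself can be sketched in plain Python; notify_slack here is a hypothetical stub, not an Airflow or Slack API.

```python
from datetime import datetime, timedelta

SLA = timedelta(minutes=10)  # the 10-minute budget discussed above
alerts = []

def notify_slack(message: str) -> None:
    """Stand-in for posting to a Slack webhook (hypothetical helper)."""
    alerts.append(message)

def check_sla(task_id: str, started: datetime, now: datetime) -> bool:
    """Return True (and send an alert) if a still-running task exceeded its SLA.

    In an actual DAG you would set execution_timeout=timedelta(minutes=10)
    and/or sla=timedelta(minutes=10) on the operator instead of hand-rolling
    this check.
    """
    if now - started > SLA:
        notify_slack(f"{task_id} exceeded {SLA} and is still running")
        return True
    return False

start = datetime(2026, 3, 12, 9, 0)
on_time = check_sla("ingest_daily", start, start + timedelta(minutes=5))
too_late = check_sla("ingest_daily", start, start + timedelta(minutes=12))
```

The same predicate could live in an on_failure_callback or sla_miss_callback, with the webhook call replacing the list append.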

199 00:27:07.770 00:27:08.520 Awaish Kumar: -

200 00:27:08.640 00:27:13.960 Awaish Kumar: Yeah, so, yeah, and moving on,

201 00:27:14.500 00:27:23.069 Awaish Kumar: So, yeah, I think I’m done with all the technical questions. I just want to understand how would you,

202 00:27:23.440 00:27:29.550 Awaish Kumar: Communicate, with your non-technical stakeholders regarding your work.

203 00:27:30.580 00:27:33.969 Awaish Kumar: If you have to explain anything to the non-technical stakeholders.

204 00:27:34.490 00:27:52.759 Vinuthna Sandadi: Yeah, I mean, I try to, like, I always try to keep things, like, simple and business-focused when, like, dealing with especially non-technical stakeholders, and I would… I would use, like, visuals, like dashboards, or, you know, like, PowerPoints, or…

205 00:27:52.850 00:27:58.899 Vinuthna Sandadi: To make… to make… make it simpler, and then I would also, like, trans… translate

206 00:27:59.180 00:28:06.469 Vinuthna Sandadi: technical terms into business, business terms, and I would… I would try to, like, avoid,

207 00:28:06.750 00:28:19.160 Vinuthna Sandadi: jargon, or I’ll try to, like, explain the data flow completely, and how we’re able to, like, improve the, improve, improve the requirement, and yeah, I would,

208 00:28:20.640 00:28:27.240 Vinuthna Sandadi: I would always try to, like, if they have any plans initially, I would always, like, try to go with their plan and doing…

209 00:28:27.240 00:28:42.689 Vinuthna Sandadi: doing things in a certain way, or, I would… I would also, like, come… come up with some recommendations, and help them understand which one to go with, or, which would have more impact, or help them make better decisions with the,

210 00:28:42.880 00:28:49.010 Vinuthna Sandadi: you know, with the decisions that, that we make, during, during our calls.

211 00:28:51.390 00:28:52.200 Awaish Kumar: Okay.

212 00:28:52.400 00:28:58.319 Awaish Kumar: And also, like, if there is a disagreement within your team, How would you resolve that?

213 00:29:01.730 00:29:20.750 Vinuthna Sandadi: with disagreements, firstly, I try to, like, understand their perspective on the decision, or why they would probably want to, like, go forward with their approach, and I would try to, like, validate both… both our approaches, and I’m not… I’m not, like.

214 00:29:20.940 00:29:28.540 Vinuthna Sandadi: I’m not, like… I should… I mean, it’s not that I am right always, or they are right always, so I try to, like,

215 00:29:28.620 00:29:40.960 Vinuthna Sandadi: Listen to their approaches, and try to find, like, a common ground, and try to think about it, and see if we could… if… see if we could also do, like, a hybrid approach.

216 00:29:41.080 00:29:53.530 Vinuthna Sandadi: And, yeah, I believe that that would be more optimized, but it depends. It depends on the problem and approaches that we’re… and the solutions that we’re, dealing with.

217 00:29:54.890 00:29:55.680 Awaish Kumar: Okay.

218 00:29:55.860 00:30:03.889 Awaish Kumar: And, like… Okay, yeah, the last question is, like, why are you looking for a new role, and…

219 00:30:04.530 00:30:07.300 Awaish Kumar: Why do you think, like, Brainforge would be a really good fit?

220 00:30:08.300 00:30:09.040 Awaish Kumar: for you.

221 00:30:11.980 00:30:20.119 Vinuthna Sandadi: Yeah, I… I’m… so I’m currently with Galaxy, and I…

222 00:30:20.120 00:30:39.810 Vinuthna Sandadi: So my current engagement with Galaxy, is not on a contractual basis, but, there was some, there was some restructuring done at, at the company, where the company is no longer, like, sponsoring visas to the employees, and we’re not sure, like, at this time,

223 00:30:39.880 00:30:53.470 Vinuthna Sandadi: where we are at, in terms of sponsorship, and I… I mean, given my situation, I do not need sponsorship immediately as well, but I’m just trying to, like, explore new roles,

224 00:30:53.790 00:30:56.530 Vinuthna Sandadi: And you know, like.

225 00:30:56.630 00:31:06.650 Vinuthna Sandadi: see how I might be a best fit, and I’m trying to, like, look for the right team. I mean, if the right role and right team comes across, and I’m like, why not, like, give it a try?

226 00:31:06.660 00:31:17.850 Vinuthna Sandadi: But especially, with Brainforge, I love working on MVC… MVPs and shipping, like, proof of concepts and,

227 00:31:17.850 00:31:30.630 Vinuthna Sandadi: I’m having… working with Galaxy, like, we’re a smaller team, and I would definitely want to, like, expand that, the knowledge that I’ve acquired over the years, and I feel like

228 00:31:30.630 00:31:45.540 Vinuthna Sandadi: working with smaller teams also, like, gives me that, you know, like, the experience that I need. And, yeah, I guess, I, I felt Brainforge would also provide me such,

229 00:31:45.540 00:31:51.600 Vinuthna Sandadi: Such space to grow and expand my skill set, and yeah.

230 00:31:52.280 00:31:53.490 Vinuthna Sandadi: And,

231 00:31:53.910 00:32:09.229 Vinuthna Sandadi: And I believe Brainforge takes on, like, clients' projects and works on them. It's not specifically, like, sticking with one particular client, or… is that how it works? I believe not.

232 00:32:09.440 00:32:10.280 Vinuthna Sandadi: Right?

233 00:32:11.710 00:32:15.629 Awaish Kumar: Brainforge is basically… it's a consultancy, obviously.

234 00:32:15.630 00:32:17.849 Vinuthna Sandadi: So you get to work on…

235 00:32:18.040 00:32:20.520 Awaish Kumar: On the clients, and

236 00:32:20.960 00:32:27.969 Awaish Kumar: Maybe you might be working on more than one client simultaneously, or you can also…

237 00:32:28.200 00:32:33.699 Awaish Kumar: Maybe, yeah, if one client goes, we can have another client coming in, and…

238 00:32:34.310 00:32:40.440 Awaish Kumar: And most of our clients are, like, mid to large enterprises. We have larger contracts, but

239 00:32:41.860 00:32:48.519 Awaish Kumar: Yeah, but our, like, people, basically, rotate.

240 00:32:48.690 00:32:50.360 Awaish Kumar: Because…

241 00:32:50.630 00:33:06.009 Awaish Kumar: Like, some clients might need your services for some time, and at other times, it might lean toward more data analyst work and not data engineering work. So we basically rotate based on the client needs.

242 00:33:06.220 00:33:07.680 Awaish Kumar: In the company.

243 00:33:09.240 00:33:18.540 Vinuthna Sandadi: Okay, yeah, sounds good. I mean, what kind of clients do you currently handle, or what… what kind of projects?

244 00:33:18.770 00:33:19.590 Vinuthna Sandadi: That the team handles?

245 00:33:19.590 00:33:24.310 Awaish Kumar: So we have, like, we have, like, obviously data engineering projects.

246 00:33:24.520 00:33:31.880 Awaish Kumar: But our clients are from different industries, like telehealth, and there are CPG companies.

247 00:33:32.470 00:33:39.100 Awaish Kumar: A lot of them are, like, e-commerce clients selling on Amazon, Shopify, and things like that, so…

248 00:33:39.530 00:33:44.230 Awaish Kumar: So yeah, these are our major clients in the space.

249 00:33:47.360 00:33:53.380 Awaish Kumar: Yeah, so that's basically it. And we also have, like, an AI

250 00:33:54.230 00:34:01.689 Awaish Kumar: department, so we have AI clients which basically need AI services, and we have a dedicated AI team, like, for that.

251 00:34:02.090 00:34:05.260 Awaish Kumar: So, we provide both data and AI consultancy services.

252 00:34:07.640 00:34:08.520 Vinuthna Sandadi: Okay.

253 00:34:08.670 00:34:17.540 Vinuthna Sandadi: So Brainforge is completely client-based, like a services company, not a product-based company, right?

254 00:34:17.540 00:34:22.740 Awaish Kumar: It's service-based, like, obviously; it provides services to clients.

255 00:34:22.739 00:34:24.630 Awaish Kumar: No, no, we are not product-based.

256 00:34:25.060 00:34:25.900 Vinuthna Sandadi: Okay.

257 00:34:26.070 00:34:28.569 Vinuthna Sandadi: Yeah, I mean, yeah, that makes sense.

258 00:34:29.320 00:34:47.149 Vinuthna Sandadi: Yeah, and is the data engineering team currently working on, like, any of the AI initiatives, or trying to, like, integrate any AI tools or services within data engineering?

259 00:34:48.100 00:34:53.960 Awaish Kumar: We are using AI services, like, for speeding up our own development.

260 00:34:54.020 00:34:55.739 Vinuthna Sandadi: For example, it was…

261 00:34:56.310 00:34:58.549 Awaish Kumar: AI tools to develop.

262 00:34:58.890 00:35:03.970 Awaish Kumar: And then, we have our AI team build some internal tools.

263 00:35:04.170 00:35:07.830 Awaish Kumar: To speed up, like, how we manage tickets, how we…

264 00:35:08.080 00:35:09.010 Vinuthna Sandadi: No.

265 00:35:09.010 00:35:12.210 Awaish Kumar: update our tickets, how PR gets reviewed.

266 00:35:12.820 00:35:20.500 Awaish Kumar: how QA is done. So we have… we are growing that data platform to speed up our…

267 00:35:20.720 00:35:23.000 Awaish Kumar: Internal, development.

268 00:35:23.260 00:35:30.179 Awaish Kumar: But apart from that, if there is any AI-related, like, client work, then it goes to the AI team.

269 00:35:30.330 00:35:37.050 Awaish Kumar: So, it can be, like, a collaboration between the data and AI teams. We have some clients that need AI…

270 00:35:38.110 00:35:42.149 Awaish Kumar: services, but then for that AI service, we need to bring in some data.

271 00:35:42.490 00:35:42.860 Vinuthna Sandadi: Yeah.

272 00:35:42.860 00:35:46.799 Awaish Kumar: The data team can be involved to bring in that data and build the foundation.

273 00:35:46.920 00:36:01.669 Awaish Kumar: For the AI, but we have a dedicated AI team, basically, to take over from there. But if you are interested and want to work there, we are flexible enough that, like, you can go

274 00:36:01.840 00:36:10.269 Awaish Kumar: and explore. Like, give your time wherever your expertise is, and… but then you can get some time

275 00:36:10.750 00:36:14.379 Awaish Kumar: working on AI projects, if you… if you are interested.

276 00:36:15.350 00:36:33.079 Vinuthna Sandadi: Yeah, yeah, that sounds good, yeah. I mean, I just wanted to know if the team… I mean, everybody's, like, so big on AI and using, like, AI tools within the team, you know, in terms of, like, prototyping or development. It's definitely become so much more productive

277 00:36:33.090 00:36:37.669 Vinuthna Sandadi: With the AI tools, so I just… I just wanted to, like, know if, if the team.

278 00:36:37.670 00:36:38.560 Awaish Kumar: Okay.

279 00:36:38.560 00:36:39.770 Vinuthna Sandadi: Yeah.

280 00:36:41.520 00:36:45.689 Awaish Kumar: Okay, we are over time, so I think we can end the interview here. Yeah.

281 00:36:46.160 00:36:49.929 Awaish Kumar: Nice talking to you, and thank you for your time.

282 00:36:51.190 00:36:57.069 Vinuthna Sandadi: Yeah, yeah, thanks. Yeah, same here. Yeah, thank you so much for taking the time to meet with me today, yeah.

283 00:36:57.710 00:36:59.039 Awaish Kumar: Okay, thank you.

284 00:36:59.040 00:37:00.009 Vinuthna Sandadi: Yeah, thanks, bye.