Meeting Title: Brainforge Interview w/ Awaish  Date: 2026-04-16  Meeting participants: Nikhil G, Awaish Kumar


WEBVTT

1 00:00:38.550 00:00:40.360 Awaish Kumar: Hello, Nikhil, how are you?

2 00:00:40.710 00:00:43.940 Nikhil G: I’m doing good, Avesh, how are you doing?

3 00:00:44.480 00:00:45.669 Awaish Kumar: I’m good as well.

4 00:00:46.170 00:00:49.930 Nikhil G: Nice, nice. Am I pronouncing your name correctly? Abeh?

5 00:00:50.180 00:00:54.899 Awaish Kumar: Yes, yes. So… Where are you located right now?

6 00:00:55.500 00:00:58.440 Nikhil G: I’m in Dublin, Ireland at the moment, yeah.

7 00:00:58.440 00:00:59.560 Awaish Kumar: Dublin, okay.

8 00:00:59.580 00:01:00.430 Nikhil G: Absolutely.

9 00:01:00.910 00:01:03.240 Awaish Kumar: Yeah, I’m in UAE right now.

10 00:01:03.750 00:01:04.509 Nikhil G: Oh, okay.

11 00:01:04.660 00:01:09.679 Awaish Kumar: So, for this interview, we are just going to talk about,

12 00:01:09.890 00:01:13.469 Awaish Kumar: Your background, what you’ve been doing so far.

13 00:01:13.690 00:01:16.690 Awaish Kumar: Talk more about, your…

14 00:01:16.960 00:01:19.940 Awaish Kumar: The projects you’ve worked on, and the…

15 00:01:20.280 00:01:27.990 Awaish Kumar: Yeah, so basically, just briefly talking about that, and then,

16 00:01:28.120 00:01:31.019 Awaish Kumar: Yeah, I’m here to answer any…

17 00:01:31.510 00:01:34.929 Awaish Kumar: You… yeah, questions you might have for me, okay?

18 00:01:36.130 00:01:37.090 Nikhil G: Sure, sure, yeah.

19 00:01:37.090 00:01:42.370 Awaish Kumar: Okay, let’s get started with the… Your introduction.

20 00:01:43.510 00:01:44.040 Nikhil G: Yeah.

21 00:01:44.280 00:01:52.510 Nikhil G: So, yeah, like, my name is Nikhil. I have been working in the same data industry for more than 10 years now.

22 00:01:52.510 00:02:08.509 Nikhil G: in last, like, 5 to 6 years with the Streamworks, I have been work… working on the latest, like, tools and technologies to build the data warehouse systems and, you know, the migration projects, and I have worked with the multiple clients, you know.

23 00:02:08.509 00:02:14.429 Nikhil G: So, with that one, like, I have got a very good experience with working on the,

24 00:02:14.450 00:02:33.189 Nikhil G: the tools like the Snowflake, dbt, Airflow, Python, all types of SQL, like traditional databases like Oracle, MySQL, Postgres, SQL Server, and on the cloud side, like Snowflake, Redshift, and then the Databricks, you know, so…

25 00:02:33.520 00:02:41.370 Nikhil G: Well, previous to that, like, I have worked on all the legacy Hadoop and all that sort of stuff, but in the last 5 years, like, I have…

26 00:02:41.480 00:02:58.449 Nikhil G: got very good traction with, like, open source ingestion tools, like Airbyte or dlt to load the data, and then from that, like, using dbt to transform data, and then, like, setting up Airflow as an orchestration, true orchestration, not an ETL.

27 00:02:58.450 00:03:09.939 Nikhil G: And then, like, doing the test-driven development, so that, like, we have data quality checks and all that sort of stuff, so that we know, like, the foundation of the data is true.

28 00:03:09.940 00:03:28.500 Nikhil G: So that you can leverage the AIs to build the meaningful business solutions on top of data. In the last project, I have worked on building the semantic models in Snowflake, because the architecture in the data itself was following the Medallion style, you know?

29 00:03:28.750 00:03:45.039 Nikhil G: bronze, silver, and gold layer, so that, like, it was, really available for the business users to directly query. So we built some AI agents as well on top of that, too, so that, like, stakeholders now directly interact with the semantic layers and get their answers, you know, so…

30 00:03:45.040 00:04:05.770 Nikhil G: Yeah, that kind of experience I have in the data industry. So, yeah, I was talking to Kyla, and then you guys are doing the similar work, so I think that’s the exciting part. So, definitely would like to learn, and then exchange my experience, and then learn from you guys as well, whatever you guys are building with your clients here.

31 00:04:06.590 00:04:14.750 Awaish Kumar: Yeah, let’s talk about any… one of the recent projects that you might have been working on that you think was really complex and…

32 00:04:14.890 00:04:17.929 Awaish Kumar: And the project where you have worked, like, end-to-end.

33 00:04:18.050 00:04:23.330 Awaish Kumar: We can discuss about, your tech stack, and the…

34 00:04:23.720 00:04:27.370 Awaish Kumar: And your responsibilities for that, and then.

35 00:04:28.440 00:04:31.240 Awaish Kumar: the outcome of the project.

36 00:04:32.180 00:04:42.199 Nikhil G: Definitely, so I was working for this client, you know, like, they had legacy system that was managed by the third-party vendor, like, vendor was, like, getting their data.

37 00:04:42.200 00:05:03.409 Nikhil G: And then they had built their warehouse, you know, like, it was completely managed on own by them, their custom things, and they were paying millions of license fee every year. So, they decided to build their own data warehouses, and then, like, they contacted us, and then, like, we started with the planning and all that stuff.

38 00:05:03.410 00:05:13.149 Nikhil G: So, we built this project from scratch, so I acted as a data architect, you know, and then, like, hands-on implementing all this stuff, so…

39 00:05:13.150 00:05:32.900 Nikhil G: starting from… from building the everything from scratch, like, let’s say GitHub, we had, like, then, like, we used Terraform, Snowflake, Airflow. Airflow was managed by Astronomer Airflow, and dbt Core, we didn’t go for dbt Cloud. Then.

40 00:05:33.240 00:05:47.769 Nikhil G: we used the Dockerized containers, you know, and all that sort of stuff, you know, so when we are talking about, like, building it from the scratch, so we started with the Terraform for both Astronomer and Snowflake, all the user setup,

41 00:05:47.770 00:05:59.380 Nikhil G: creation of the database objects, role management, RBAC, and then, like, everything was in Terraform. You have to give one single entry, create a pull request.

42 00:05:59.380 00:06:02.549 Nikhil G: CI/CD will check your TF format.

43 00:06:02.550 00:06:17.150 Nikhil G: And then if it is approved, then the Terraform apply happened after merging. And then, similarly, all the pull requests for the Airflow code, dbt code, like, we were doing the testing on the CI/CD first.

44 00:06:17.150 00:06:36.790 Nikhil G: And then once it is passed, then the human user will approve it after reviewing, obviously. And then it was getting deployed, you know? So dbt orchestration was happening in Airflow via Cosmos. So, Cosmos is an open-source package for running dbt in Airflow.

45 00:06:36.790 00:07:01.720 Nikhil G: So, we went ahead with that one. So, creation of manifest files, like dbt docs and all that stuff, it was being generated in the CI/CD, and then we were hosting it in our internal place where everyone can use that. So, as I mentioned, like, it was heavily driven using the dbt tests as well, you know? Like, we have utilized all the dbt-

46 00:07:01.720 00:07:11.740 Nikhil G: utils tests. Elementary was used to detect anomalies in the data, like volume. If there is a sudden change in the volume, Elementary

47 00:07:11.740 00:07:30.310 Nikhil G: triggers an alert, you know? Like, every day you are getting 100,000 records, but suddenly you got, like, 500,000 records. That's an anomaly, or, like, fewer records, you know? So, data freshness and all that sort of stuff. So, yeah, that was the end-to-end we deployed, and it was, like, in merely 6 months, and then, like, the vendor was also…

48 00:07:30.320 00:07:32.590 Nikhil G: Impressed by that one, yeah.
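The Cosmos setup mentioned in this answer could look roughly like the following sketch. It assumes astronomer-cosmos 1.x on Airflow 2; the dag id, project path, profile names, and Snowflake connection id are hypothetical placeholders, not details from the interview.

```python
# Illustrative sketch only: running a dbt Core project inside Airflow via
# Cosmos (astronomer-cosmos 1.x on Airflow 2). The dag id, project path,
# profile names, and Snowflake connection id are hypothetical placeholders.
from datetime import datetime

from cosmos import DbtDag, ProfileConfig, ProjectConfig
from cosmos.profiles import SnowflakeUserPasswordProfileMapping

profile_config = ProfileConfig(
    profile_name="analytics",
    target_name="prod",
    profile_mapping=SnowflakeUserPasswordProfileMapping(
        conn_id="snowflake_default",  # Airflow connection to Snowflake
        profile_args={"database": "ANALYTICS", "schema": "MARTS"},
    ),
)

# Cosmos parses the dbt project at DAG-parse time and renders each model
# (and its tests) as its own Airflow task.
dbt_daily = DbtDag(
    dag_id="dbt_daily",
    project_config=ProjectConfig("/usr/local/airflow/dbt/analytics"),
    profile_config=profile_config,
    schedule_interval="@daily",
    start_date=datetime(2026, 1, 1),
    catchup=False,
)
```

Because each dbt model becomes its own Airflow task, failures, retries, and alerts can be handled per model rather than for the whole `dbt build`.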

49 00:07:33.230 00:07:41.239 Awaish Kumar: Okay, so for… elementary, how did you set it up, and, how did you ensure that the…

50 00:07:41.980 00:07:44.419 Awaish Kumar: Alerts that you are getting are…

51 00:07:44.750 00:07:52.490 Awaish Kumar: are not, like, spamming you, and it’s more like, you are getting, like, the alerts that are… that really matter.

52 00:07:52.900 00:07:59.580 Nikhil G: Yeah, yeah, so, yeah, you got the point, like, it's basically noise, you know, like, whenever,

53 00:07:59.900 00:08:15.500 Nikhil G: If there is no issue, but still you're getting alerts, that's, like, the false positive. So in that case, definitely, you have to monitor your pipeline for a couple of days and set the benchmark, and depending upon that, you can adjust the sensitivity within that one, you know?

54 00:08:15.500 00:08:24.560 Nikhil G: So, they give you option to change the sensitivity from the 2 to 3, and then you can reduce it, increase it, and also you can increase the

55 00:08:24.560 00:08:42.779 Nikhil G: maximum look-back period, you know? So, by default, you look at 14 days of data, and then predict the next 2 days, you know? So, you can increase that period, and then, like, you can add extra sensitivity to assess it. So, basically, it's a heuristic approach you need to take, and then.
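The sensitivity and look-back knobs being described map onto a simple z-score check. The sketch below is plain stdlib Python with illustrative numbers, not Elementary's actual implementation: a day is flagged as anomalous when its volume falls more than `sensitivity` standard deviations outside the look-back benchmark.

```python
from statistics import mean, stdev

def is_volume_anomaly(history, today, sensitivity=3.0):
    """Flag `today` as anomalous if it deviates from the look-back window
    (`history`, e.g. the last 14 daily row counts) by more than
    `sensitivity` standard deviations."""
    if len(history) < 2:
        return False  # not enough data to set a benchmark yet
    mu, sigma = mean(history), stdev(history)
    if sigma == 0:
        return today != mu  # flat history: any change is a deviation
    return abs(today - mu) / sigma > sensitivity

# A 14-day benchmark of roughly 100,000 rows/day.
baseline = [100_000, 98_500, 101_200, 99_800, 100_400, 97_900, 102_100,
            100_900, 99_300, 101_700, 98_800, 100_200, 99_600, 100_700]
print(is_volume_anomaly(baseline, 500_000))  # True  (sudden 5x spike)
print(is_volume_anomaly(baseline, 101_000))  # False (within normal range)
```

In this framing, raising the threshold (say 3 to 4) suppresses borderline alerts, and a longer look-back window stabilizes the benchmark, which matches the tuning described here.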

56 00:08:42.780 00:08:46.930 Awaish Kumar: But, my question is more like, So, what are you…

57 00:08:47.810 00:08:52.059 Awaish Kumar: Did you face any challenges with the… with maintaining that?

58 00:08:52.490 00:09:09.739 Nikhil G: Yeah, yeah, definitely. It’s still, like, it’s an ongoing process. Sometimes, we… you receive, like, the true alerts, sometimes you get the false alerts. So, depending upon that, you know, like, if there is any holiday or something, like, definitely you get the false alerts because of the sudden change in the numbers or something.

59 00:09:09.740 00:09:17.330 Nikhil G: So, depending on that, like, you adjust that, sensitivity, based on, based on the…

60 00:09:17.330 00:09:23.270 Awaish Kumar: Sensitivity is more like, Obviously, a little bit of…

61 00:09:24.570 00:09:30.199 Awaish Kumar: You… if you decrease the sensitivity, that means, like, it might send you less alerts.

62 00:09:31.570 00:09:35.680 Awaish Kumar: But the thing is that, like, it will send you alerts.

63 00:09:36.480 00:09:39.550 Awaish Kumar: Like, and it will send you alert for…

64 00:09:39.670 00:09:41.289 Awaish Kumar: For example, if there is a…

65 00:09:41.820 00:09:47.330 Awaish Kumar: Change in number of… if you’re counting the number of rows, if there is a change, like, on…

66 00:09:47.890 00:09:50.490 Awaish Kumar: On, on, on, on any of the…

67 00:09:50.970 00:09:55.150 Awaish Kumar: holiday, or Sunday, you… the data didn’t land, and…

68 00:09:55.340 00:09:58.330 Awaish Kumar: Then you start getting those alerts.

69 00:10:00.690 00:10:09.720 Awaish Kumar: And when you have a big data mart, which has, like, hundreds of tables in the mart layer.

70 00:10:10.080 00:10:12.390 Awaish Kumar: And then you have to monitor

71 00:10:13.020 00:10:16.070 Awaish Kumar: Tables, then… but also, like, have to monitor

72 00:10:16.420 00:10:20.729 Awaish Kumar: Specific fields, columns, revenue, like the numbers.

73 00:10:21.380 00:10:24.210 Awaish Kumar: So, the monitoring becomes really…

74 00:10:25.520 00:10:34.329 Awaish Kumar: Like, what’d you say, scale becomes larger, and even if we reduce the sensitivity, we are still going to get a lot of alerts that are

75 00:10:34.470 00:10:36.510 Awaish Kumar: Yeah.

76 00:10:36.510 00:10:39.389 Nikhil G: You might face them, yeah, definitely, yeah, like, if it is…

77 00:10:39.390 00:10:44.189 Awaish Kumar: for time. The problem is… my question is more like, the…

78 00:10:44.760 00:10:47.820 Awaish Kumar: The number of alerts that you get,

79 00:10:49.010 00:10:59.310 Awaish Kumar: it's not possible for anyone to basically go and resolve them all, like, things get slipped. So how do you handle that?

80 00:10:59.990 00:11:20.419 Nikhil G: Yeah, definitely, we definitely, faced this kind of problems, you know, like, we’re getting lots of false alerts, you know, so we increased… decreased the sensitivity in that, like, we missed some, true positives as well, you know? So, what we have recently done is, like, we have built our custom AI prompt, you know, so whenever anything fails.

81 00:11:20.420 00:11:35.539 Nikhil G: then, like, Snowflake has a Cortex LLM layer, you know? So, we are getting that error, and then passing it to any LLM, you can pass it to Snowflake’s LLM, or, like, any OpenAI endpoint, you know?

82 00:11:35.540 00:11:43.049 Nikhil G: And then we let AI decide, you know, like, this is the information, this is the context, we get the certain patterns from it.

83 00:11:43.050 00:11:59.990 Nikhil G: And then we ask, like, this is the case I’m talking about the retry and not retry, you know? Like, when you have, let’s say, 4 retries, so we ask LLM to continue with this retry or not retry, you know? Like, if there is syntax error, even if you try 100 times.

84 00:11:59.990 00:12:19.339 Nikhil G: it’s not going to solve. So, similarly, for this anomaly detection, we are passing this information to the LLM prompts, and from the LLM, like, we get the feedback from it, whether, like, this could be the false positive or negative, you know? And then, like, that was the additional backup for, like, whatever we have implemented, you know? So…

85 00:12:19.340 00:12:38.570 Nikhil G: It’s definitely in the workflows, like, there will be more integration, AI integrations, will help us to maintain our, jobs and then the pipelines, you know, without much human interaction. And at the end, every morning, we get the… this report from the agent that, like, these many failures

86 00:12:38.570 00:12:45.650 Nikhil G: And these are the true positives, and then, like, it’s again the heuristic approach that LLMs, we still need to…

87 00:12:45.650 00:12:54.879 Nikhil G: Add this and, implement. But yeah, definitely, even if you tighten it, it’s… no system is going to be 100%

88 00:12:54.930 00:13:02.030 Nikhil G: proof, you know? There will be still, like, 1% or whatever percentage, like, you will get some wrong alerts.

89 00:13:02.030 00:13:05.350 Awaish Kumar: Yeah, even with the LLM’s approach, I’m…

90 00:13:06.740 00:13:08.499 Awaish Kumar: I want to ask, like.

91 00:13:08.960 00:13:11.710 Awaish Kumar: Even with LLM, like, LLM will take the…

92 00:13:12.230 00:13:23.009 Awaish Kumar: error, and then maybe build a query, and run it on Snowflake, get the result, and then analyze it, and then it can tell you, okay, this 20%

93 00:13:23.100 00:13:34.690 Awaish Kumar: higher number of rows alert that you received is correct, right? But how it will decide that… is it an anomaly or not? Is it a real 20% increase or not?

94 00:13:35.020 00:13:39.480 Nikhil G: Yeah, that’s a valid question. So, we have centralized date spine table.

95 00:13:40.060 00:14:04.320 Nikhil G: Okay, that spine, like, it has all the business days, you know, like the public holidays, and then all sorts of things, the day of the week, day of the month, and all that sort of stuff, so we provide that context to LLM as well. And then, based on that, it also has, like, more visibility into whether it was, like, a public holiday, a weekend, or, like, what

96 00:14:04.320 00:14:13.679 Nikhil G: Was there any special event? We have our own campaigns as well. If there is campaign running, then you might receive more orders. Let’s take in that case.

97 00:14:13.680 00:14:23.639 Nikhil G: So, LLM, since it has more context, it can take… it can, like, analyze that data and then, say, you know, because of… there was, like, campaign.

98 00:14:23.640 00:14:40.480 Nikhil G: So, you might expect more records in the orders tables, you know? So that kind of thing, you can integrate and then still, like, it’s better assistant. It won’t take the final decision, but it will give you all these things at the start of the day, so that you know, like, what happened

99 00:14:40.480 00:14:49.730 Nikhil G: Why this is, this happened, you get more context, for human, or whoever is on call, and then they can take a call, based on that.
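A date-spine table of the kind described here, with the calendar flags that give the LLM its context, can be sketched with the standard library alone. The holiday and campaign entries below are hypothetical placeholders.

```python
from datetime import date, timedelta

HOLIDAYS = {date(2026, 1, 1)}                  # hypothetical public-holiday calendar
CAMPAIGNS = {date(2026, 1, 5): "winter_sale"}  # hypothetical campaign calendar

def date_spine(start, end):
    """One row per calendar day, with the flags an LLM prompt would need."""
    rows, d = [], start
    while d <= end:
        rows.append({
            "day": d.isoformat(),
            "day_of_week": d.strftime("%A"),
            "is_weekend": d.weekday() >= 5,  # Saturday/Sunday
            "is_holiday": d in HOLIDAYS,
            "campaign": CAMPAIGNS.get(d),
        })
        d += timedelta(days=1)
    return rows

spine = date_spine(date(2026, 1, 1), date(2026, 1, 5))
print(spine[0]["is_holiday"])  # True  (Jan 1)
print(spine[2]["is_weekend"])  # True  (2026-01-03 is a Saturday)
```

Joined to a daily volume table, these rows explain expected dips (holidays, weekends) and spikes (campaigns) before an alert ever reaches a human.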

100 00:14:50.640 00:14:51.270 Awaish Kumar: Okay.

101 00:14:54.190 00:14:55.590 Awaish Kumar: Okay.

102 00:14:55.590 00:15:02.200 Nikhil G: I would like to, learn, like, if you have faced similar things, and how did you approach this?

103 00:15:04.700 00:15:08.050 Awaish Kumar: The thing is that, like, we are…

104 00:15:08.190 00:15:25.390 Awaish Kumar: a startup, which is really fast-paced, and we have tried to do similar tools, which gives you… and adjust the sensitivities, adjust the ranges, come up with custom ranges, such that maybe sometimes relying on that prediction is

105 00:15:25.850 00:15:30.200 Awaish Kumar: The predictive model that is trying to get the…

106 00:15:32.480 00:15:37.860 Awaish Kumar: trying to understand if it’s an outlier or not, using heuristic approaches and their own models, we…

107 00:15:38.040 00:15:40.520 Awaish Kumar: We try to, like,

108 00:15:41.140 00:15:49.859 Awaish Kumar: tighten it with the… with providing custom ranges and things like that, so it… yeah, we are… we get… we get less and less,

109 00:15:52.730 00:15:56.300 Awaish Kumar: alerts, and… False positives, basically.

110 00:15:58.210 00:16:04.260 Awaish Kumar: But the thing, that’s… that’s what I’m… I’m trying to… Understand, like…

111 00:16:04.550 00:16:07.059 Awaish Kumar: This thing is, is something that,

112 00:16:07.290 00:16:10.210 Awaish Kumar: We are continuously going to get alerts.

113 00:16:10.440 00:16:15.879 Awaish Kumar: There’s no way to… Find out, like, the exact 100% accuracy.

114 00:16:16.160 00:16:16.700 Nikhil G: Yep.

115 00:16:17.630 00:16:20.480 Awaish Kumar: And the way we are handling it, like.

116 00:16:20.600 00:16:31.319 Awaish Kumar: we went from, like, using these tools, now we are trying to come up with some… something of our own. We tried, like, using Metaplane and…

117 00:16:31.620 00:16:39.430 Awaish Kumar: which is similar to Elementary, and now we are looking to more… now that we have AI in our hands, like, we want to

118 00:16:39.540 00:16:42.540 Awaish Kumar: using dbt test, and then build our own

119 00:16:43.260 00:16:59.079 Awaish Kumar: model, given the context we have for the client, so… so that for each metric, the LLM knows what this metric means, like, it has a definition, not just the number, but has more information about that metric itself.

120 00:16:59.830 00:17:02.160 Awaish Kumar: And then it has, obviously, the…

121 00:17:02.640 00:17:10.550 Awaish Kumar: dbt tests, which are written, then it has access to data, but then we are trying to build out something that can work.

122 00:17:11.650 00:17:17.619 Nikhil G: Yeah, interesting, and impressive as well, you know, like, it’s a real-world problem, and then how we apply our…

123 00:17:19.869 00:17:39.280 Nikhil G: expertise at that one to solve that one, that really helps in the current and future projects. Also, as I mentioned, you know, like, we really implemented this AI retry gate in Airflow, so our… one of the jobs, let me give you one example. It was failing after 40 minutes for the duplicate records, because of the merge query, you know, in the dbt,

124 00:17:39.280 00:17:47.080 Nikhil G: If the merge is running, so until and unless that duplicate record is found, it will still run, and it was running after 40 minutes, so if you put

125 00:17:47.080 00:18:02.399 Nikhil G: the three retries, so 40 plus 40 plus 40, it was running for 120 minutes, and after that, we were getting alerts in Slack, you know, because there is a duplicate. So, we gave this query to the LLM, and the LLM will decide whether to retry or not retry. So first.

126 00:18:02.630 00:18:20.959 Nikhil G: try itself, it will stop the retries, and then we will get this alert in the Slack, like, 90 minutes before, you know? So we are saving that 90 minutes of Snowflake credit usage, and, like, we are getting alerts early so that we can fix that issue and then continue our pipelines, you know? So that was…

127 00:18:20.980 00:18:32.920 Nikhil G: a huge improvement using the AI workflow, and I… I can see, like, today, Airflow released an AI Adapter in the recent 3.2 version.

128 00:18:32.920 00:18:41.309 Awaish Kumar: Like, Airflow provides… Airflow has an operator for that, to make LLM calls, or you were writing Python scripts for that?

129 00:18:41.570 00:18:58.629 Nikhil G: It was a Python script, it was custom written, and in the… if you, like, in the retry callback, you know, we have failure callbacks. In the retry callback, you can call that function, and from that function, you can call any LLM, you know, we are using Snowflake because it's internal, it was already approved.

130 00:18:58.630 00:19:13.389 Nikhil G: So, you can just send this prompt, and then… let’s say it’s a timeout issue. So, for the timeout, you know, like, you can still retry. If it is syntax error, no retry. If it is, like, duplicate, no retry, you know? If there is any,

131 00:19:13.790 00:19:31.070 Nikhil G: column missing, no retry, you know? Like, we have created that prompt with the retries and not-retries, and the AI had its own brain, you know, to determine which way to go, and that really helped with the metrics. We saved the logs, and then it was implemented. So.

132 00:19:31.270 00:19:44.999 Nikhil G: So, similarly, I like to implement more such ideas in the other data engineering practices, along with how we use it for general use case in the coding and all that sort of stuff.
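The retry/no-retry gate can be sketched as a plain classifier. Here simple keyword rules stand in for the LLM call so the decision logic is visible; in the setup described, this function would be invoked from an Airflow retry callback, and the error markers below are illustrative assumptions.

```python
# Deterministic failure markers: retrying cannot fix these.
NON_RETRYABLE = ("syntax error", "duplicate", "invalid identifier", "column")

def should_retry(error_message: str) -> bool:
    """Transient errors (timeouts, network blips) are worth retrying;
    deterministic errors (bad SQL, duplicate keys, missing columns)
    will fail the same way every time, so stop early."""
    msg = error_message.lower()
    return not any(marker in msg for marker in NON_RETRYABLE)

print(should_retry("Read timed out after 300s"))                   # True
print(should_retry("Duplicate row detected during MERGE"))         # False
print(should_retry("SQL compilation error: syntax error line 3"))  # False
```

With an LLM supplying the verdict instead of a keyword list, a "no retry" answer short-circuits the remaining attempts, which is where the 90 minutes of warehouse time mentioned above is saved.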

133 00:19:45.940 00:19:46.670 Awaish Kumar: Okay.

134 00:19:46.840 00:19:54.590 Awaish Kumar: Okay, let’s… Okay, how… How, like, how much… how much… how many years you have been using Airflow?

135 00:19:56.150 00:19:59.890 Nikhil G: From 2017-18 here.

136 00:20:00.020 00:20:03.780 Nikhil G: Started with Airflow 1.

137 00:20:03.920 00:20:05.080 Nikhil G: 10, yeah.

138 00:20:05.690 00:20:06.350 Awaish Kumar: Okay.

139 00:20:06.720 00:20:18.070 Awaish Kumar: And then, like, so, like, how… how the… architecture of airflow looks like.

140 00:20:19.290 00:20:20.760 Awaish Kumar: What are the different components of it?

141 00:20:22.220 00:20:33.660 Nikhil G: Sure, yeah, definitely. Depending upon which Airflow you're talking about, like, the… that has evolved from 1 to 2 to 3, you know? Like, you have… the core is, like, the…

142 00:20:33.860 00:20:43.849 Nikhil G: the scheduler, you know, like, you can run everything in a single machine, or, like, you can have, like, multiple machines to do the different jobs. The scheduler is the core

143 00:20:43.960 00:20:57.959 Nikhil G: Then the DAG parsing, it does the DAG parsing, whatever time you set, every 3 minutes, 5 minutes, and then you can set the timeout for parsing as well. It will generate the DAGs using the dynamic DAG factory, or whatever you build.

144 00:20:57.960 00:21:12.699 Nikhil G: And then, like, it will schedule as per the cron, or datasets, or asset-driven scheduling, you know? And then you have the web server, then you have Celery, if you are using, like, Celery or Kubernetes executors, you know, to…

145 00:21:12.700 00:21:26.629 Nikhil G: to run your workers. Yeah, that's basically the idea of Airflow, you know, the decoupled thing, and then it can scale vertically, and horizontally as well, and…

146 00:21:27.260 00:21:30.450 Awaish Kumar: It also has a, like, metadata database, which…

147 00:21:30.700 00:21:32.190 Nikhil G: Yeah, that's, that's the Postgres.

148 00:21:32.190 00:21:35.699 Awaish Kumar: That thing includes, like, all the tasks and their statuses.

149 00:21:35.700 00:21:46.689 Nikhil G: Yeah, and I realized that in the Airflow 3, now you can’t read the metadata database directly, you can’t access that one, so that’s, again, becomes difficult in some workflows, but…

150 00:21:46.930 00:21:51.990 Nikhil G: It's good. Like, are you guys using Astronomer or MWAA, or your own?

151 00:21:51.990 00:21:55.060 Awaish Kumar: We are not using Airflow, we are using Dagster.

152 00:21:55.060 00:21:55.540 Nikhil G: Oh, good.

153 00:21:55.540 00:22:01.340 Awaish Kumar: Yeah, for our internal… Even things like, pipelines, and…

154 00:22:01.660 00:22:08.729 Awaish Kumar: Some of the client pipelines are supported by our internal orchestration tool, because

155 00:22:09.560 00:22:17.930 Awaish Kumar: Yeah, many of our clients, they are, like, Right now, they… they don’t… Go for having orchestration tool.

156 00:22:18.080 00:22:25.779 Awaish Kumar: There are a lot of tools already involved, like, we now have separate ingestion tools, like Fivetran, Polytomic.

157 00:22:26.010 00:22:32.169 Awaish Kumar: And then we have Snowflake, dbt, and then BI tools, and then,

158 00:22:33.220 00:22:37.279 Awaish Kumar: Yeah, people are more… then there are some AI…

159 00:22:37.550 00:22:45.599 Awaish Kumar: use cases like AI in Omni, AI in Snowflake, a lot of layers and tooling involved.

160 00:22:46.060 00:22:53.440 Awaish Kumar: So, yeah, not every client prefers that we include one more tool.

161 00:22:53.440 00:22:54.740 Nikhil G: Yeah. Definitely.

162 00:22:54.740 00:22:57.469 Awaish Kumar: And, pay for it, basically.

163 00:22:57.810 00:22:58.510 Awaish Kumar: Hmm.

164 00:22:58.900 00:23:01.249 Awaish Kumar: Yeah, that’s why we handle it using

165 00:23:01.700 00:23:08.950 Awaish Kumar: Orchestration, normally, we need mostly for… dbt, because…

166 00:23:09.690 00:23:14.080 Awaish Kumar: Most of our requests are handled by our ingestion partners.

167 00:23:14.320 00:23:25.999 Awaish Kumar: if we have any new API to ingest data, but if there is something more urgent, we use Dagster, as a… for our orchestration, but we write pipelines

168 00:23:26.190 00:23:30.710 Awaish Kumar: It's an internal tool, but we write pipelines for our clients and maintain them.

169 00:23:31.170 00:23:31.630 Nikhil G: Or something.

170 00:23:31.630 00:23:38.440 Awaish Kumar: It's really urgent, and we need to build it ourselves. But apart from that, mostly what you need

171 00:23:38.690 00:23:41.150 Awaish Kumar: Orchestration tool for is just,

172 00:23:41.330 00:23:44.900 Awaish Kumar: running dbt, or managing the order.

173 00:23:45.150 00:23:49.510 Awaish Kumar: So that things, like, run in order.

174 00:23:49.630 00:23:50.480 Nikhil G: Yep.

175 00:23:51.370 00:23:57.200 Awaish Kumar: And for that, like, we… right now, we are running dbt Core with GitHub Actions.

176 00:23:59.000 00:24:03.169 Awaish Kumar: Yeah, but for the… yeah, but that’s, like, that’s how…

177 00:24:03.440 00:24:05.830 Awaish Kumar: That’s how our tech stack looks like.

178 00:24:06.460 00:24:09.670 Awaish Kumar: Warehouse might change, ingestion tool might change.

179 00:24:09.850 00:24:12.629 Awaish Kumar: But this is how we normally…

180 00:24:13.440 00:24:27.709 Nikhil G: Yeah, yeah, definitely. In my last 10 years of experience, you know, like, I have worked on different tools and technologies, but the idea of core, basic data engineering principles, you crack that, you know, then, like, as you mentioned, you know, like.

181 00:24:27.710 00:24:38.490 Nikhil G: Take any tool, and then you can just get job done in an efficient and robust way, so it’s just… same idea, applied using different, things. So, yeah.

182 00:24:39.330 00:24:42.630 Awaish Kumar: Okay, sounds great, and I think we are…

183 00:24:44.060 00:24:50.580 Awaish Kumar: just 5 minutes left of our time, so I will leave this time for you to ask any questions.

184 00:24:50.960 00:25:03.839 Nikhil G: Sure, yeah, definitely, I would like to understand, like, how you guys work with the clients. If you have, like, multiple clients, then how do you allocate your time, or, like, do you work on multiple things at a time, or, like, how…

185 00:25:03.840 00:25:07.430 Awaish Kumar: Yes, both. So, basically, we have multiple clients that

186 00:25:08.410 00:25:14.760 Awaish Kumar: So, we are basically divided into multiple service teams in our company.

187 00:25:15.690 00:25:17.629 Awaish Kumar: We have an AI service.

188 00:25:17.810 00:25:35.249 Awaish Kumar: Then we have a data service. Inside data, because that's a really huge service kind of category, we basically divide it into data engineering, data analytics engineering, and data analysts.

189 00:25:35.490 00:25:39.410 Awaish Kumar: So… These three become different service lines.

190 00:25:40.440 00:25:43.540 Awaish Kumar: Then, in the data, and then we have clients.

191 00:25:45.560 00:25:56.070 Awaish Kumar: And for the clients, we have a dedicated, like, the customer success owner that is responsible for managing the account with the client.

192 00:25:56.190 00:25:58.640 Awaish Kumar: And then…

193 00:25:59.460 00:26:07.120 Awaish Kumar: for each of this, the client, like, obviously the CSO is someone working with the client, he comes up with, okay.

194 00:26:07.570 00:26:13.429 Awaish Kumar: This is the project, this is the task that we need to deliver, and based on that, he might…

195 00:26:13.860 00:26:16.110 Awaish Kumar: Decide, or he might get over…

196 00:26:16.230 00:26:21.059 Awaish Kumar: Certain, like, help on, on, on… Deciding, like, which…

197 00:26:21.380 00:26:37.619 Awaish Kumar: service needs to be basically implementing that, right? If he needs hours from engineering, or data engineering, or data analytics, or data analyst, or if he just needs some hours from AI team. So, basically, we devise a plan, and based on that.

198 00:26:37.740 00:26:52.139 Awaish Kumar: If it is something for data engineering, then it comes to the data engineering team, and in the data engineering team, we divide our time by, like, everyone has, like, 40 hours per week, and for that 40 hours, we divide it,

199 00:26:52.640 00:26:57.020 Awaish Kumar: between clients. So, if one client needs 10 hours, another needs 20,

200 00:26:57.140 00:27:00.519 Awaish Kumar: And another one needs 10, so it’s kind of like that.

201 00:27:00.950 00:27:04.360 Awaish Kumar: So we don’t have any clients right now.

202 00:27:04.600 00:27:07.209 Awaish Kumar: That only needs 40 hours.

203 00:27:08.730 00:27:16.640 Awaish Kumar: of a single engineer, right? Even if someone needs 40 hours, then we try to divide it in two people.

204 00:27:16.770 00:27:19.750 Awaish Kumar: 20-20, so that, that is,

205 00:27:20.380 00:27:28.370 Awaish Kumar: support, like, one is primary; if somebody goes on leave, somebody's off, so that we have coverage for that. So right now.

206 00:27:28.730 00:27:30.780 Awaish Kumar: But we… we… we…

207 00:27:31.350 00:27:37.280 Awaish Kumar: Try to divide time, like, basically in the… in the buckets of 10 hours, 20 hours.

208 00:27:37.450 00:27:48.150 Awaish Kumar: So that, like, everyone has at least two and, at most, three clients.

209 00:27:49.260 00:27:50.750 Awaish Kumar: To deliver for them, yeah.

210 00:27:51.670 00:27:53.370 Nikhil G: Okay, okay, nice, nice.

211 00:27:54.530 00:28:03.620 Nikhil G: Yeah, this, this sounds interesting, you know, like, definitely the fast-paced environment, I have worked in the similar way, I'm, I'm,

212 00:28:03.810 00:28:19.049 Nikhil G: To be honest, like, really like to work in a team, as a good team member, like, to learn from the others, and also share my, views in positive ways, and, yeah, looking forward for, like,

213 00:28:19.340 00:28:24.720 Nikhil G: whatever the outcome, will be. So…

214 00:28:25.220 00:28:30.870 Nikhil G: I believe the next steps would be another take-home test or something?

215 00:28:31.510 00:28:36.240 Awaish Kumar: Yeah, next steps are meeting with one of my… colleagues.

216 00:28:36.370 00:28:43.529 Awaish Kumar: And he's… that will also be something like a 30-minute discussion. Then there's a take-home test, and then the final

217 00:28:44.760 00:28:46.879 Awaish Kumar: Panel, interview, and that’s all.

218 00:28:47.810 00:28:53.860 Awaish Kumar: And yeah, after I submit my feedback with Kyla, she's going… she's going to come back with the…

219 00:28:54.430 00:28:56.520 Awaish Kumar: With the next steps, and yeah.

220 00:28:56.650 00:28:58.579 Awaish Kumar: She follows up quickly. She's going to follow up.

221 00:28:59.530 00:29:00.460 Nikhil G: Okay, okay.

222 00:29:01.370 00:29:02.620 Awaish Kumar: Okay, cool.

223 00:29:03.110 00:29:05.240 Awaish Kumar: No more questions, then…

224 00:29:06.030 00:29:24.999 Nikhil G: Yeah, I mean, there is no point in discussing tech stack, and now we, you know, like, it’s going to be advanced tech stacks, might be different tools and technologies, we are on the same page for that one, you know, like, the working style, I’m open with the US hours and all that, you know, like, I have already worked in the startup environment, you know, delivered the projects.

225 00:29:25.000 00:29:39.989 Nikhil G: client-facing, directly talk to the client, engaging as a single… as a single person, individual contributor, and a team player as well. So, yeah, that ticks most of the boxes from my end, you know, so…

226 00:29:39.990 00:29:43.569 Awaish Kumar: Okay, great. Yeah, thank you for your time today.

227 00:29:43.930 00:29:46.010 Awaish Kumar: Oh, good luck for the next steps.

228 00:29:46.140 00:29:46.770 Awaish Kumar: Yeah.

229 00:29:46.770 00:29:48.029 Nikhil G: Thank you, thanks, Awaish, yeah.