Meeting Title: Data Engineer Interview (Jonthan Gomendoza) Date: 2025-07-16 Meeting participants: Awaish Kumar, jonathan g
WEBVTT
1 00:01:13.160 ⇒ 00:01:16.100 jonathan g: Hey? I wish good day.
2 00:01:20.990 ⇒ 00:01:22.780 Awaish Kumar: Hello! How are you doing.
3 00:01:23.530 ⇒ 00:01:28.559 jonathan g: I’m fine. Thank you. Thanks for asking. Yeah. I just wanted to ask, How are you.
4 00:01:29.720 ⇒ 00:01:31.600 Awaish Kumar: Yeah, I’m good as well.
5 00:01:32.780 ⇒ 00:01:39.460 jonathan g: Hey, Awaish, is it all right if we do this off cam? Because I have Internet issues on my side.
6 00:01:41.470 ⇒ 00:01:43.190 Awaish Kumar: Sorry. Sorry. Can you please come again.
7 00:01:43.190 ⇒ 00:01:49.230 jonathan g: Hello, okay, I will repeat my request. Is it all right if we do this off cam,
8 00:01:49.994 ⇒ 00:01:55.660 jonathan g: because I have Internet issues on my side? But we can continue the call.
9 00:01:57.380 ⇒ 00:02:01.130 Awaish Kumar: Yeah, but I think it would be nice if we could have
10 00:02:01.630 ⇒ 00:02:06.459 Awaish Kumar: cameras on. It’s more like a conversation,
11 00:02:07.108 ⇒ 00:02:10.219 Awaish Kumar: not like a test.
12 00:02:10.870 ⇒ 00:02:19.359 Awaish Kumar: If it was some kind of test, then I could just say, you know, work on that while I’m waiting. We don’t need cameras for that.
13 00:02:19.780 ⇒ 00:02:25.159 Awaish Kumar: But this is like a two-way conversation where we are going to discuss
14 00:02:25.430 ⇒ 00:02:30.920 Awaish Kumar: Brainforge, your experiences, and how we can potentially collaborate.
15 00:02:31.180 ⇒ 00:02:37.570 Awaish Kumar: So I don’t have issues with rescheduling if you have Internet issues right now.
16 00:02:38.222 ⇒ 00:02:40.877 Awaish Kumar: you can reschedule it at any
17 00:02:41.450 ⇒ 00:02:44.899 Awaish Kumar: time in my calendar. We can do it later.
18 00:02:46.340 ⇒ 00:02:46.950 jonathan g: Oh.
19 00:02:47.862 ⇒ 00:02:56.030 jonathan g: Yeah, I think we could continue. But is it all right if I stay off cam on my side, or should I be on cam on my side as well?
20 00:03:01.210 ⇒ 00:03:05.460 jonathan g: Because right now, yeah, internet issues, I.
21 00:03:05.460 ⇒ 00:03:06.030 Awaish Kumar: Yeah.
22 00:03:06.370 ⇒ 00:03:07.760 jonathan g: We could continue, continue.
23 00:03:08.890 ⇒ 00:03:21.230 Awaish Kumar: That’s what I’m saying: this is a two-way conversation, and I would prefer that we have cameras on. If that is not possible right now because you have Internet issues, you can reschedule it
24 00:03:23.000 ⇒ 00:03:24.230 Awaish Kumar: for a later time.
25 00:03:28.100 ⇒ 00:03:28.990 Awaish Kumar: Okay.
26 00:03:29.539 ⇒ 00:03:40.730 Awaish Kumar: That’s what I would advise: if you are not able to turn on your camera, just reschedule it for any other time when you have a good Internet connection.
27 00:03:43.290 ⇒ 00:03:44.210 Awaish Kumar: Okay.
28 00:03:45.250 ⇒ 00:03:53.170 jonathan g: Okay, I will give my best effort to configure my
29 00:03:53.370 ⇒ 00:03:55.390 jonathan g: cam first. Is that all right?
30 00:03:56.300 ⇒ 00:03:58.069 jonathan g: I’ll just give my best effort.
31 00:03:58.340 ⇒ 00:03:59.740 jonathan g: Okay, I’ll just.
32 00:03:59.740 ⇒ 00:04:03.830 Awaish Kumar: No, my question is just: do you want to continue now?
33 00:04:03.830 ⇒ 00:04:04.200 Awaish Kumar: Yes.
34 00:04:05.090 ⇒ 00:04:05.810 jonathan g: We’ll proceed.
35 00:04:05.810 ⇒ 00:04:08.451 Awaish Kumar: Keep your cameras on, or
36 00:04:09.540 ⇒ 00:04:15.360 Awaish Kumar: we can do it at any time later. So, like, no pressure.
37 00:04:16.440 ⇒ 00:04:19.140 jonathan g: Sure. Thanks, thank you.
38 00:04:48.230 ⇒ 00:04:48.840 Awaish Kumar: Right.
39 00:04:51.510 ⇒ 00:04:53.400 jonathan g: Hey, Awaish! Can you see me?
40 00:04:54.220 ⇒ 00:04:58.613 Awaish Kumar: Yeah, I can hear you. Sorry. Just give me a moment
41 00:04:59.633 ⇒ 00:05:07.690 Awaish Kumar: Yeah, so we can start. My name is Awaish Kumar. I’m the engineering manager at Brainforge. And
42 00:05:08.895 ⇒ 00:05:12.870 Awaish Kumar: in this interview today, we are going to talk about
43 00:05:13.358 ⇒ 00:05:28.670 Awaish Kumar: what Brainforge does. And then we are going to get to know a little bit more about you: your experience, the kind of projects you have been working on, and what your contribution to those projects was. We’ll deep dive into the project
44 00:05:29.597 ⇒ 00:05:33.670 Awaish Kumar: architecture, and how you are building the pipelines and all.
45 00:05:33.820 ⇒ 00:05:51.470 Awaish Kumar: So yeah, let’s start. So my name is Awaish Kumar. What Brainforge does: Brainforge is a data and AI consultancy firm. We provide data services to different clients spanning across industries.
46 00:05:52.460 ⇒ 00:06:03.997 Awaish Kumar: Now we are also providing quite a few services building AI agents and bots based on LLMs, and supporting different
47 00:06:05.910 ⇒ 00:06:27.129 Awaish Kumar: natural language processing kinds of work. For example, somebody wants to build a chatbot to book hotels or something like that. So all of that, powered by AI, is the kind of service we are providing, along with
48 00:06:27.290 ⇒ 00:06:29.530 Awaish Kumar: all the data kind of work, like
49 00:06:29.630 ⇒ 00:06:39.130 Awaish Kumar: data analysis, data engineering, data analytics, everything data. And we don’t have a
50 00:06:39.270 ⇒ 00:06:43.980 Awaish Kumar: specific set of tools which we work with; it is more based on the clients.
51 00:06:44.230 ⇒ 00:06:49.086 Awaish Kumar: So for each client and for each use case, we are going to define what kind of
52 00:06:49.620 ⇒ 00:07:16.190 Awaish Kumar: tools are going to be needed, and also what the client can approve, and based on that we use different tools and technologies. But this gives us an edge, because we are able to explore more and more tools, like the new tools which are coming to market; we explore them and use them for our new clients.
53 00:07:17.370 ⇒ 00:07:25.890 Awaish Kumar: So that’s basically what Brainforge is doing. It is a team of like 10 to 15 people, including the engineering and the
54 00:07:26.596 ⇒ 00:07:33.030 Awaish Kumar: sales and marketing team. And yeah, that’s everything about Brainforge.
55 00:07:33.250 ⇒ 00:07:36.390 Awaish Kumar: So now, if you can introduce yourself, please.
56 00:07:37.700 ⇒ 00:07:51.899 jonathan g: All right. Thanks for that introduction, Awaish. I’m Jonathan. I’ve been working in the IT industry for 11 years, going on 12, so I’ve been exposed to industries like
57 00:07:52.000 ⇒ 00:08:08.510 jonathan g: financial services, telco, retail, food and beverage, government institutions, and conglomerates. Some are located in the Philippines, and some are located in
58 00:08:09.200 ⇒ 00:08:15.565 jonathan g: other parts of the world, like Europe and the US.
59 00:08:16.850 ⇒ 00:08:29.059 jonathan g: Yeah, those are the areas I’ve been working with. Aside from that, I was also exposed to the healthcare industry and, most recently, the HR industry.
60 00:08:29.390 ⇒ 00:08:48.739 jonathan g: Then for technologies, I’ve been exposed to cloud vendors like AWS. I also have knowledge of Google Cloud Platform and Microsoft Azure. Aside from that, there’s also infrastructure as code, such as AWS CloudFormation, which we know is owned by AWS.
61 00:08:48.880 ⇒ 00:08:51.789 jonathan g: Then there’s also open source, like data port.
62 00:08:52.290 ⇒ 00:09:01.290 jonathan g: Aside from that, there are also open source tools like dbt, or data build tool, where you can do some
63 00:09:01.750 ⇒ 00:09:05.600 jonathan g: orchestration within the data warehouse or the database itself.
64 00:09:06.030 ⇒ 00:09:24.159 jonathan g: And speaking of databases, I also have experience in Structured Query Language, or SQL. So I had exposure to database vendors like Microsoft SQL, MySQL, and PostgreSQL. Then for programming languages, I’ve been
65 00:09:24.370 ⇒ 00:09:33.629 jonathan g: exposed to Python and Node.js. Recently, I’ve been upskilling in Python, Snowflake, and Azure Fabric.
66 00:09:33.880 ⇒ 00:09:39.019 jonathan g: Then there’s also the
67 00:09:39.270 ⇒ 00:09:45.379 jonathan g: context of DevOps: continuous integration and continuous delivery, or CI/CD.
68 00:09:45.580 ⇒ 00:09:51.389 jonathan g: The tools I was exposed to were GitHub Actions, Bitbucket Pipelines, and Azure DevOps.
69 00:09:51.630 ⇒ 00:09:55.510 jonathan g: There are also project management tools like Jira and Confluence.
70 00:09:55.640 ⇒ 00:09:58.990 jonathan g: Recently I was exposed to Linear and Notion.
71 00:09:59.540 ⇒ 00:10:11.669 jonathan g: Then, apart from that, there is also ETL or ELT. ETL is extract, transform, and load. Recently I was also exposed to extract, load, and
72 00:10:11.770 ⇒ 00:10:12.540 jonathan g: transform.
73 00:10:15.540 ⇒ 00:10:25.329 Awaish Kumar: Okay, so talking about your recent experiences, can you give me an experience of a project where
74 00:10:26.360 ⇒ 00:10:31.510 Awaish Kumar: you optimized some existing data pipeline.
75 00:10:32.230 ⇒ 00:10:33.550 jonathan g: Alright.
76 00:10:33.950 ⇒ 00:10:45.309 jonathan g: Right, yeah, I can share a few. I’ll just give you the context. So yes, there’s an existing data pipeline.
77 00:10:45.420 ⇒ 00:10:51.800 jonathan g: The tech stack they were using was in AWS; to be specific, the service was Lambda,
78 00:10:51.910 ⇒ 00:11:01.000 jonathan g: using CloudFormation as the infrastructure as code. Then they were storing the data in Google Cloud Storage and BigQuery.
79 00:11:01.270 ⇒ 00:11:07.919 jonathan g: So it is good for the short term. But in the long term it’s not
80 00:11:08.070 ⇒ 00:11:11.670 jonathan g: feasible, because the
81 00:11:11.840 ⇒ 00:11:15.529 jonathan g: source data, which is stored in an AWS RDS instance,
82 00:11:15.940 ⇒ 00:11:25.429 jonathan g: the live environment or the production environment, keeps on accumulating. So there is an increase in the row count.
83 00:11:25.750 ⇒ 00:11:35.370 jonathan g: So there is like a timeout when loading data to Google Cloud Storage. Though it could populate in BigQuery, it is not
84 00:11:35.940 ⇒ 00:11:39.520 jonathan g: feasible for the long term, so
85 00:11:39.920 ⇒ 00:11:48.060 jonathan g: I did a proof of concept. There were two services I considered. One is in AWS, using AWS Glue.
86 00:11:48.200 ⇒ 00:11:57.130 jonathan g: The other one is using Google Cloud Datastream, which is owned by Google Cloud Platform. I did some research, and that included
87 00:11:57.300 ⇒ 00:12:07.159 jonathan g: the costing as well, and how it’s going to be used. And for the long term, based on my research, I chose to use
88 00:12:07.320 ⇒ 00:12:11.379 jonathan g: Datastream for the data pipeline.
89 00:12:11.480 ⇒ 00:12:15.259 jonathan g: The reason for using Datastream is that
90 00:12:15.360 ⇒ 00:12:22.669 jonathan g: it has the functionality of change data capture, where if there are
91 00:12:22.850 ⇒ 00:12:27.750 jonathan g: changes or an update in your source table or your schema,
92 00:12:27.960 ⇒ 00:12:31.740 jonathan g: then it will reflect that and process it
93 00:12:32.290 ⇒ 00:12:36.110 jonathan g: to load your data to Google Cloud Storage.
94 00:12:36.210 ⇒ 00:12:41.230 jonathan g: Then for BigQuery, you can just map or point it
95 00:12:42.000 ⇒ 00:12:47.489 jonathan g: to Google Cloud Storage. So that’s one of the pipeline enhancements I’ve done.
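Illustrative sketch (not part of the recorded conversation): the change data capture idea mentioned in the answer above can be pictured as replaying a stream of insert, update, and delete events against a copy of the source table. The event shape (`op`, `id`, `row`) and the `apply_change` helper are hypothetical and chosen only for illustration; they are not the format or API of Datastream or any other service.

```python
from typing import Any

# Toy change-data-capture replay. The event layout is an assumption made
# for this sketch, not the wire format of any real CDC product.
def apply_change(replica: dict[int, dict[str, Any]], event: dict[str, Any]) -> None:
    """Apply one insert/update/delete event to an in-memory replica keyed by primary key."""
    op, key = event["op"], event["id"]
    if op == "delete":
        replica.pop(key, None)
    elif op in ("insert", "update"):
        # Merge so that an update carrying only the changed columns keeps the rest.
        replica[key] = {**replica.get(key, {}), **event["row"]}
    else:
        raise ValueError(f"unknown operation: {op}")

replica: dict[int, dict[str, Any]] = {}
events = [
    {"op": "insert", "id": 1, "row": {"name": "a", "status": "new"}},
    {"op": "update", "id": 1, "row": {"status": "active"}},
    {"op": "insert", "id": 2, "row": {"name": "b", "status": "new"}},
    {"op": "delete", "id": 2},
]
for event in events:
    apply_change(replica, event)
```

Only the changed rows travel through the pipeline, which is why this style of loading scales better than re-exporting a growing table in full.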
96 00:12:47.660 ⇒ 00:12:54.299 jonathan g: Another enhancement I’ve done is that there was a security problem.
97 00:12:54.430 ⇒ 00:12:56.540 jonathan g: The problem was that
98 00:12:56.950 ⇒ 00:13:05.710 jonathan g: the password in the credentials was exposed. That is to say, it was stored in
99 00:13:06.580 ⇒ 00:13:11.370 jonathan g: a code script, in JSON format to be specific.
100 00:13:11.900 ⇒ 00:13:26.529 jonathan g: So is that the best practice when it comes to storing credentials? I don’t think that’s a best practice, because the security team will audit that
101 00:13:26.790 ⇒ 00:13:33.329 jonathan g: pipeline or that repository. They will just call you out and say this is not the best practice for storing your
102 00:13:33.840 ⇒ 00:13:35.070 jonathan g: password
103 00:13:35.180 ⇒ 00:13:48.399 jonathan g: at the code level. It doesn’t matter if that is a project-owned repository or a public repository, because we do not want to compromise the client’s data.
104 00:13:48.610 ⇒ 00:13:59.560 jonathan g: So I suggested: how about we use AWS Secrets Manager, which is a password manager owned by AWS? So that’s one.
105 00:13:59.680 ⇒ 00:14:04.479 jonathan g: There’s also GitHub secrets, which is owned by GitHub.
106 00:14:04.680 ⇒ 00:14:12.150 jonathan g: Another option is: how about we use Google Cloud Platform Secret Manager, which is owned by GCP?
107 00:14:12.690 ⇒ 00:14:24.089 jonathan g: So all options are taken into consideration. It varies as well: if you are going to use AWS services, then utilize
108 00:14:24.200 ⇒ 00:14:28.329 jonathan g: AWS Secrets Manager. If we are going to utilize
109 00:14:28.920 ⇒ 00:14:32.169 jonathan g: GCP services, then we use GCP secrets.
110 00:14:32.370 ⇒ 00:14:37.629 jonathan g: If this is like open source, let us utilize GitHub secrets.
111 00:14:37.870 ⇒ 00:14:41.370 jonathan g: So there are options to take into consideration.
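Illustrative sketch (not part of the recorded conversation): whichever secret store is chosen, the code-side pattern is the same, namely that the repository holds no password and the credentials are injected at run time. The sketch below assumes the deployment platform exposes the secret as an environment variable; the variable name `DB_CREDENTIALS`, the field names, and the `load_db_credentials` helper are hypothetical.

```python
import json
import os

def load_db_credentials(env_var: str = "DB_CREDENTIALS") -> dict:
    """Read JSON credentials from an environment variable injected at deploy time.

    Assumption for this sketch: a secret manager or CI system populates the
    variable, so nothing sensitive is committed to the repository.
    """
    raw = os.environ.get(env_var)
    if raw is None:
        # Fail loudly instead of falling back to a credentials file in the repo.
        raise RuntimeError(f"{env_var} is not set")
    creds = json.loads(raw)
    missing = {"username", "password"} - creds.keys()
    if missing:
        raise RuntimeError(f"credentials are missing fields: {sorted(missing)}")
    return creds
```

Failing when the variable is absent is a deliberate choice here: a silent fallback to a checked-in file would reintroduce the problem the audit is meant to catch.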
112 00:14:42.880 ⇒ 00:14:46.659 jonathan g: Then, aside from that one, there’s also
113 00:14:47.040 ⇒ 00:14:51.799 jonathan g: one of my team members doing manual deployment. So the manual.
114 00:14:51.990 ⇒ 00:14:54.260 Awaish Kumar: For example, I have a table
115 00:14:54.480 ⇒ 00:14:58.480 Awaish Kumar: with billions and billions of rows,
116 00:14:59.070 ⇒ 00:15:03.419 Awaish Kumar: and I want to run some queries on top of it.
117 00:15:03.870 ⇒ 00:15:07.470 Awaish Kumar: And my query is already very slow,
118 00:15:09.750 ⇒ 00:15:12.189 Awaish Kumar: because there’s a lot of data, obviously.
119 00:15:12.480 ⇒ 00:15:16.759 Awaish Kumar: I want to identify the user with the
120 00:15:17.687 ⇒ 00:15:25.230 Awaish Kumar: like, I want to search for some users. And the table has, like, a first name,
121 00:15:26.456 ⇒ 00:15:38.319 Awaish Kumar: email, phone number, address, and a few more fields such as, you know, for example,
122 00:15:39.530 ⇒ 00:15:42.890 Awaish Kumar: region or whatever, etc. So
123 00:15:43.740 ⇒ 00:15:50.959 Awaish Kumar: now I want to query for a user. I want to search for a user. I want to get the address, for example.
124 00:15:51.100 ⇒ 00:15:58.180 Awaish Kumar: So this query is taking more than a minute to execute.
125 00:15:58.460 ⇒ 00:16:01.220 Awaish Kumar: So how can I optimize?
126 00:16:01.720 ⇒ 00:16:02.640 Awaish Kumar: How?
127 00:16:02.930 ⇒ 00:16:08.560 Awaish Kumar: What steps can I take to optimize, to reduce this query time?
128 00:16:10.350 ⇒ 00:16:16.229 jonathan g: Oh, from what I understand, the problem is that the query takes a long time. Is that correct?
129 00:16:16.520 ⇒ 00:16:23.150 jonathan g: So what I would be doing as a solution for that one is that,
130 00:16:23.450 ⇒ 00:16:35.330 jonathan g: yes, you need to assess which columns are needed for your investigation or for your report. That will depend on your client requirements.
131 00:16:35.650 ⇒ 00:16:40.459 jonathan g: So if it says that all columns should be included,
132 00:16:40.640 ⇒ 00:16:47.180 jonathan g: then you need to assess as well what filters are needed for the business requirements.
133 00:16:47.400 ⇒ 00:16:55.239 jonathan g: So we will check on things like status or tagging: what are the
134 00:16:55.690 ⇒ 00:16:58.269 jonathan g: filters or the values that are needed?
135 00:16:58.380 ⇒ 00:17:04.269 jonathan g: Another option is that we can use a common table expression, or CTE.
136 00:17:04.410 ⇒ 00:17:11.320 jonathan g: So you have, like, a CTE getting all the columns. Then from there you could add some filters from the start.
137 00:17:11.740 ⇒ 00:17:12.789 jonathan g: After doing that.
138 00:17:13.534 ⇒ 00:17:19.599 Awaish Kumar: So even if I read in a CTE, that’s going to read the full table.
139 00:17:20.010 ⇒ 00:17:20.470 jonathan g: Hmm.
140 00:17:21.380 ⇒ 00:17:23.210 Awaish Kumar: So that’s like, that’s
141 00:17:23.560 ⇒ 00:17:52.410 Awaish Kumar: that’s the problem, right? If I read the full table, that’s why it takes longer. I want to search for myself by my name, and there are billions of rows in that table. I’m not looking to optimize for multiple queries, like, okay, we load this data into memory, and after that every search will be faster.
142 00:17:53.558 ⇒ 00:18:06.129 Awaish Kumar: That’s not the situation here. That’s a different case, where we want to optimize for multiple similar requests. So,
143 00:18:06.510 ⇒ 00:18:09.069 Awaish Kumar: search by name
144 00:18:09.430 ⇒ 00:18:22.950 Awaish Kumar: for 10 employees: we load the data into memory and then search in memory instead of directly querying the database. That’s one of the solutions, like what you want to do with the CTE. But my question is more like:
145 00:18:23.300 ⇒ 00:18:28.160 Awaish Kumar: how can I structure my table itself so I can reduce my query time?
146 00:18:30.700 ⇒ 00:18:35.990 jonathan g: I think I will. I will depend on the filter based on the tagging. So, for example, if you are.
147 00:18:35.990 ⇒ 00:18:41.700 Awaish Kumar: So my requirement, I’ve told you: I want to query for a user
148 00:18:41.930 ⇒ 00:18:49.590 Awaish Kumar: based on its number, for example, or name, or whatever. Right? So
149 00:18:51.640 ⇒ 00:19:01.120 Awaish Kumar: for example, say I have an events table. For example, if I
150 00:19:02.060 ⇒ 00:19:08.630 Awaish Kumar: make a full example: I have a mobile application. My mobile application has 1,000
151 00:19:09.222 ⇒ 00:19:15.230 Awaish Kumar: users, and those 1,000 users are making some
152 00:19:15.992 ⇒ 00:19:20.920 Awaish Kumar: clicks, some button clicks, some page views.
153 00:19:21.110 ⇒ 00:19:35.609 Awaish Kumar: So there are different activities happening on the app by all of these 1,000 users, and it builds up a table in the back end. I just have one table which is storing all these events,
154 00:19:36.150 ⇒ 00:19:40.080 Awaish Kumar: right? And that one table is storing all these events
155 00:19:40.669 ⇒ 00:19:53.440 Awaish Kumar: by 1,000 users, and it has grown so much that now it has millions and millions of rows. Now, when I want to search for myself by my name
156 00:19:53.590 ⇒ 00:19:57.779 Awaish Kumar: in that table, that’s basically the problem:
157 00:19:58.310 ⇒ 00:20:02.490 Awaish Kumar: now I’m not just searching 1,000 rows.
158 00:20:02.630 ⇒ 00:20:12.110 Awaish Kumar: I have a table which has grown because of event activities happening, and I’m adding new rows to the table. Now I have millions of rows. I want to search for rows
159 00:20:12.260 ⇒ 00:20:18.570 Awaish Kumar: for Awaish Kumar, and now, when I do that, it is taking more than 1 min to execute
160 00:20:19.130 ⇒ 00:20:20.470 Awaish Kumar: for a single query.
161 00:20:20.590 ⇒ 00:20:25.770 Awaish Kumar: How can I? Maybe you can propose restructuring of the table. You can propose
162 00:20:26.979 ⇒ 00:20:32.250 Awaish Kumar: any kind of optimization strategies, anything we can do in this situation.
163 00:20:34.670 ⇒ 00:20:42.599 jonathan g: Well, judging from your explanation, some would use indexing, the index function. Some would use that.
164 00:20:42.820 ⇒ 00:20:52.349 jonathan g: But yeah, that’s good for performance, but for the long term it’s not good for utilization. That’s one: using an index.
165 00:20:52.680 ⇒ 00:21:01.620 jonathan g: You’re going to index your name. Another thing is that you need to have, like, a WHERE statement so that only your name will be filtered.
166 00:21:01.940 ⇒ 00:21:03.960 jonathan g: Also, another thing is:
167 00:21:04.550 ⇒ 00:21:18.630 jonathan g: do you need all the columns? Because if you need all the columns, then, as expected, that will cost runtime, unless you eliminate some of the
168 00:21:19.450 ⇒ 00:21:25.950 jonathan g: columns that are not relevant. I would say that will also improve the efficiency.
169 00:21:26.060 ⇒ 00:21:33.030 jonathan g: Another option: if the company or the client
170 00:21:33.480 ⇒ 00:21:41.379 jonathan g: has a good budget or a high budget, then there will be times that we could increase the memory by some gigabytes.
171 00:21:41.520 ⇒ 00:21:54.000 jonathan g: But I don’t think that will be considered, since it needs approval from the higher-ups when it comes to increasing memory, because that will also increase cost.
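Illustrative sketch (not part of the recorded conversation): the effect of indexing the searched column can be observed with SQLite's `EXPLAIN QUERY PLAN`, using Python's built-in `sqlite3` module. The `events` table, its columns, the row counts, and the index name are all made up for this sketch; the exact wording of the plan text differs between SQLite versions, so it is captured rather than quoted.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE events (id INTEGER PRIMARY KEY, user_name TEXT, address TEXT, event_type TEXT)"
)
# 20,000 synthetic events spread over 1,000 hypothetical users.
conn.executemany(
    "INSERT INTO events (user_name, address, event_type) VALUES (?, ?, ?)",
    [(f"user_{i % 1000}", f"{i % 1000} Main St", "click") for i in range(20_000)],
)

def plan(sql: str, params: tuple) -> str:
    """Return SQLite's query plan detail text for a statement as one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[-1] for row in rows)

query = "SELECT user_name, address FROM events WHERE user_name = ?"

plan_before = plan(query, ("user_42",))  # no index yet: the whole table is scanned
conn.execute("CREATE INDEX idx_events_user_name ON events (user_name)")
plan_after = plan(query, ("user_42",))   # the planner can now search the index

matches = conn.execute(query, ("user_42",)).fetchall()
```

Printing `plan_before` and `plan_after` side by side shows the planner switching from a scan to an index search. The trade-off raised in the conversation still applies: every insert, update, or delete on `events` now has to maintain the index as well.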
172 00:21:54.450 ⇒ 00:21:59.450 Awaish Kumar: Yeah, we are not looking to increase memory. I just want to optimize,
173 00:21:59.620 ⇒ 00:22:05.390 Awaish Kumar: like, just want to work with SQL. So
174 00:22:05.570 ⇒ 00:22:16.129 Awaish Kumar: how would you architect your database, and how can you employ some optimization techniques provided by SQL? Everything else you can leave aside.
175 00:22:16.390 ⇒ 00:22:23.009 Awaish Kumar: All right, I’m just asking in the context of SQL, so we can leave everything else on the side.
176 00:22:26.000 ⇒ 00:22:30.220 Awaish Kumar: So you mentioned indexing. What else can we do?
177 00:22:31.310 ⇒ 00:22:37.179 jonathan g: Indexing, that’s one. There’s also, you need to use a WHERE statement.
178 00:22:37.370 ⇒ 00:22:43.539 jonathan g: Then you need to utilize the LIKE function, if I may. Some would use the LIKE function
179 00:22:43.830 ⇒ 00:22:45.279 jonathan g: or the LIKE percent.
180 00:22:45.380 ⇒ 00:22:47.830 jonathan g: There’s also some as well.
181 00:22:48.270 ⇒ 00:22:49.460 jonathan g: You need your phone.
182 00:22:49.460 ⇒ 00:22:55.989 Awaish Kumar: But how are functions going to help with optimizing the query execution time?
183 00:22:56.990 ⇒ 00:23:02.760 jonathan g: You need to check on your columns as well. What are the columns that is needed for your relevant investigation?
184 00:23:02.760 ⇒ 00:23:09.590 Awaish Kumar: Oh, like, yeah, I’m getting that. So my query just selects only the columns needed.
185 00:23:09.930 ⇒ 00:23:21.429 Awaish Kumar: I’m not reading from all the columns. I use the name of the user and the address, searching on the name of the user. That’s all I’m doing. I’m not selecting
186 00:23:22.740 ⇒ 00:23:26.280 Awaish Kumar: the extra columns which are present in the table,
187 00:23:27.860 ⇒ 00:23:29.890 Awaish Kumar: but still it takes that time.
188 00:23:34.430 ⇒ 00:23:40.569 jonathan g: Hmm! Another thing I can do is that I will use the name.
189 00:23:40.730 ⇒ 00:23:53.609 jonathan g: Then I will do a row count, so I will check how many rows it contains. It doesn’t matter if that is Awaish, Jonathan, or
190 00:23:53.780 ⇒ 00:24:05.559 jonathan g: John. I will check how many, because, let’s say, if it reaches 500, then we expect that
191 00:24:05.900 ⇒ 00:24:08.379 jonathan g: it should be running fast.
192 00:24:08.490 ⇒ 00:24:14.960 jonathan g: But if it reaches around 1 million, we’re already expecting that, yeah, there are a lot of
193 00:24:15.490 ⇒ 00:24:20.960 jonathan g: transactions being done by John, Awaish, or Jonathan, based on the filters.
194 00:24:21.100 ⇒ 00:24:27.839 Awaish Kumar: Okay, yeah. But this is more like auditing the data:
195 00:24:28.050 ⇒ 00:24:31.400 Awaish Kumar: how many rows we have for each person.
196 00:24:32.090 ⇒ 00:24:34.850 Awaish Kumar: But what about?
197 00:24:35.130 ⇒ 00:24:36.060 Awaish Kumar: Oh.
198 00:24:37.248 ⇒ 00:24:43.100 Awaish Kumar: Like, this is the first step: you audit the data, you get the number of rows. But I
199 00:24:43.210 ⇒ 00:24:44.897 Awaish Kumar: I’m saying that
200 00:24:46.940 ⇒ 00:24:53.449 Awaish Kumar: even if that’s reasonable, this is taking way longer than I would expect.
201 00:24:53.750 ⇒ 00:24:55.480 Awaish Kumar: I want to optimize it.
202 00:24:58.170 ⇒ 00:25:00.670 Awaish Kumar: So what about if we
203 00:25:01.600 ⇒ 00:25:07.739 Awaish Kumar: think about restructuring the table? Like, what if we split the table into 2?
204 00:25:10.540 ⇒ 00:25:11.900 jonathan g: More like partitioning.
205 00:25:13.140 ⇒ 00:25:18.949 Awaish Kumar: Yeah, that’s number 1. Partitioning can be one of the strategies.
206 00:25:19.280 ⇒ 00:25:22.649 Awaish Kumar: Number 2 could be, as I told you,
207 00:25:22.820 ⇒ 00:25:25.700 Awaish Kumar: that I’m searching for a person’s name.
208 00:25:25.860 ⇒ 00:25:30.830 Awaish Kumar: So the name is a constant; like, I’m making 1 million events.
209 00:25:30.930 ⇒ 00:25:37.269 Awaish Kumar: So the name Awaish Kumar is redundant in all of those 1 million rows. What if I create another table
210 00:25:37.928 ⇒ 00:25:48.929 Awaish Kumar: with string values, names and IDs? And instead of searching for the name in the big table, I search based on an integer.
211 00:25:50.870 ⇒ 00:25:54.119 Awaish Kumar: So searching on an integer is faster than searching on a string, right?
212 00:25:57.020 ⇒ 00:26:00.170 jonathan g: Well, we can consider that one. We could, like.
213 00:26:00.170 ⇒ 00:26:06.599 Awaish Kumar: That’s one of the strategies: if I search on a string, it’s very slow
214 00:26:07.040 ⇒ 00:26:09.940 Awaish Kumar: when you compare it with searching on an integer,
215 00:26:10.660 ⇒ 00:26:16.019 Awaish Kumar: and that’s very fast. And we know that we only have 1,000 names,
216 00:26:16.360 ⇒ 00:26:19.500 Awaish Kumar: so we can put them in a different table,
217 00:26:19.950 ⇒ 00:26:24.059 Awaish Kumar: and each will get some ID based on an auto-increment integer.
218 00:26:24.410 ⇒ 00:26:34.010 Awaish Kumar: Use that in the larger table. So now you can add indexing, the strategy you mentioned, on the integer column.
219 00:26:34.740 ⇒ 00:26:37.320 Awaish Kumar: So indexing that integer column
220 00:26:37.820 ⇒ 00:26:44.300 Awaish Kumar: and then searching on top of it will be really, really fast compared with searching on the string itself.
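Illustrative sketch (not part of the recorded conversation): the restructuring described above, with names moved to a small lookup table and the large table keyed by an indexed integer, could look like the following in SQLite via Python's `sqlite3` module. Table names, column names, and data volumes are assumptions made for the sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Small lookup table: one row per user, with an auto-increment integer key.
    CREATE TABLE users (
        user_id INTEGER PRIMARY KEY AUTOINCREMENT,
        name    TEXT UNIQUE,
        address TEXT
    );
    -- Large table: stores only the integer key, not the repeated name string.
    CREATE TABLE events (
        event_id   INTEGER PRIMARY KEY,
        user_id    INTEGER REFERENCES users(user_id),
        event_type TEXT
    );
    CREATE INDEX idx_events_user_id ON events (user_id);
""")
conn.executemany(
    "INSERT INTO users (name, address) VALUES (?, ?)",
    [(f"user_{i}", f"{i} Main St") for i in range(1000)],
)
conn.executemany(
    "INSERT INTO events (user_id, event_type) VALUES (?, ?)",
    [(i % 1000 + 1, "click") for i in range(20_000)],
)

# Resolve the name once in the small table, then follow the integer key
# into the large table through its index.
rows = conn.execute(
    """
    SELECT u.name, u.address, COUNT(*) AS event_count
    FROM users AS u
    JOIN events AS e ON e.user_id = u.user_id
    WHERE u.name = ?
    GROUP BY u.user_id
    """,
    ("user_42",),
).fetchall()
```

The string comparison now happens against 1,000 rows instead of millions, and the large table carries a compact integer rather than a repeated name, which also keeps its index smaller.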
221 00:26:44.420 ⇒ 00:26:47.290 Awaish Kumar: So this is, like, kind of
222 00:26:48.260 ⇒ 00:26:54.799 Awaish Kumar: how you can structure it. But yeah, we can move ahead. You mentioned indexing,
223 00:26:54.930 ⇒ 00:27:00.179 Awaish Kumar: so what are the different types of indexing, and
224 00:27:01.870 ⇒ 00:27:03.999 Awaish Kumar: like, what are their pros and cons?
225 00:27:05.850 ⇒ 00:27:12.760 jonathan g: So the pros and cons for indexing: yes, the pro is that it could speed up your query.
226 00:27:12.990 ⇒ 00:27:14.430 jonathan g: The con is that
227 00:27:14.590 ⇒ 00:27:27.539 jonathan g: there’s going to be a problem moving forward using the UPDATE, DELETE, and INSERT statements, especially if you apply the index there; that will also affect the performance and the utilization.
228 00:27:28.990 ⇒ 00:27:33.600 jonathan g: I would say that’s the most simplified version.
229 00:27:34.920 ⇒ 00:27:40.779 Awaish Kumar: So, like, for example, how does indexing work in databases?
230 00:27:43.190 ⇒ 00:27:49.469 jonathan g: From what I understand about indexing is that you could. It seems like you could do some.
231 00:27:49.910 ⇒ 00:27:52.080 jonathan g: You you query the table itself.
232 00:27:52.290 ⇒ 00:27:58.499 jonathan g: Then you’re trying to look some information or specific column. You’re trying to do some index.
233 00:27:58.670 ⇒ 00:28:02.550 jonathan g: Then, after that one, it was already.
234 00:28:02.550 ⇒ 00:28:05.780 Awaish Kumar: Like, how does the database implement indexing, right?
235 00:28:06.320 ⇒ 00:28:17.189 Awaish Kumar: For example, I have a column, and I want to add indexing on that. I added a default index on a column in a Postgres table.
236 00:28:18.050 ⇒ 00:28:24.359 Awaish Kumar: But how is indexing actually working? Like, how is it actually making the searching fast?
237 00:28:26.720 ⇒ 00:28:31.389 jonathan g: You need to type it in the, in your.
238 00:28:31.390 ⇒ 00:28:33.070 Awaish Kumar: Implements indexing in the back end.
239 00:28:35.760 ⇒ 00:28:48.829 jonathan g: And for that one, from what I understand about the back, the behavior of index, especially in the back end. So they are looking like needs to check on a specific column.
240 00:28:49.010 ⇒ 00:28:52.959 jonathan g: For example, use Id as your index. So from there
241 00:28:54.620 ⇒ 00:28:56.610 jonathan g: kind of row, row, reading row.
242 00:28:56.610 ⇒ 00:28:59.480 Awaish Kumar: Using some data structure right.
243 00:29:01.620 ⇒ 00:29:04.730 jonathan g: Yes, they do use data structure as well. But.
244 00:29:04.730 ⇒ 00:29:10.549 Awaish Kumar: That’s what I want to understand: what data structure do they use, and how does it work?
245 00:29:17.230 ⇒ 00:29:19.009 jonathan g: I think for that one
246 00:29:21.130 ⇒ 00:29:24.580 jonathan g: I need to review that one. Can I get back to you on that one?
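Illustrative sketch (not part of the recorded conversation): the default index type in most relational databases, including PostgreSQL, is a B-tree, a balanced structure that keeps keys in sorted order so a lookup takes a logarithmic number of steps instead of reading every row. The toy class below is a simplified stand-in, not a real B-tree: it keeps sorted `(key, row_position)` pairs in a list and uses binary search from Python's `bisect` module. The `SortedIndex` name and the sample rows are hypothetical.

```python
from bisect import bisect_left

class SortedIndex:
    """Toy index: sorted (key, row_position) pairs searched with binary search."""

    def __init__(self, rows, column):
        # Building the index costs a sort up front; real databases also pay
        # a maintenance cost on every write to keep the structure ordered.
        self._entries = sorted((row[column], pos) for pos, row in enumerate(rows))

    def lookup(self, key):
        """Return the positions of all rows whose indexed column equals key."""
        # (key, -1) sorts before every real entry for this key, so bisect_left
        # lands on the first matching entry, if there is one.
        i = bisect_left(self._entries, (key, -1))
        positions = []
        while i < len(self._entries) and self._entries[i][0] == key:
            positions.append(self._entries[i][1])
            i += 1
        return positions

rows = [{"user_id": i % 5, "event": f"e{i}"} for i in range(20)]
index = SortedIndex(rows, "user_id")
hits = [rows[pos]["event"] for pos in index.lookup(3)]
```

A real B-tree spreads the sorted keys across fixed-size pages arranged as a shallow tree, so the same ordered-search idea works when the data lives on disk and changes constantly.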
247 00:29:24.990 ⇒ 00:29:30.840 Awaish Kumar: Okay. And what is the difference between clustered and non-clustered indexing?
248 00:29:32.200 ⇒ 00:29:35.519 jonathan g: For a clustered index, it’s more like it is
249 00:29:35.930 ⇒ 00:29:40.289 jonathan g: consolidated in one area. So there’s like a cluster,
250 00:29:40.570 ⇒ 00:29:43.590 jonathan g: whereas for non-clustered it is a
251 00:29:43.760 ⇒ 00:29:57.149 jonathan g: spread-out index. So, for example, for clustered, you are using one table to do your index, whereas with a non-clustered index, you have multiple tables where you could do some indexes.
252 00:29:58.960 ⇒ 00:29:59.900 Awaish Kumar: Okay.
253 00:30:00.370 ⇒ 00:30:01.350 jonathan g: So.
254 00:30:01.810 ⇒ 00:30:05.570 Awaish Kumar: Hmm, what about?
255 00:30:10.120 ⇒ 00:30:17.099 Awaish Kumar: Okay, we can move ahead with a different set of questions. So what is the difference between ACID and BASE,
256 00:30:17.810 ⇒ 00:30:20.079 Awaish Kumar: 2 different types of systems?
257 00:30:22.370 ⇒ 00:30:24.980 jonathan g: Oh, can you repeat the question? ACID language?
258 00:30:26.060 ⇒ 00:30:34.980 Awaish Kumar: So, ACID: like, we know there are some transactional systems, and there are some analytical systems,
259 00:30:35.160 ⇒ 00:30:38.597 Awaish Kumar: right? And both of them have
260 00:30:40.350 ⇒ 00:30:48.200 Awaish Kumar: different properties to satisfy. So for transactional systems, we have the ACID properties, right?
261 00:30:48.370 ⇒ 00:30:54.709 Awaish Kumar: And for the analytical systems, they basically work on the properties of BASE.
262 00:30:55.210 ⇒ 00:30:58.939 Awaish Kumar: Can you elaborate more on ACID versus BASE?
263 00:31:00.480 ⇒ 00:31:16.620 jonathan g: Or ACID. Yeah, you mentioned about transaction. This is appropriate for transactional data. So more like a big data, there’s no need. So even though there’s like a complex join condition included, there will be.
264 00:31:16.760 ⇒ 00:31:29.410 jonathan g: That’s the transaction data that’s going to be used. That’s for. Whereas for BASE, this is more like the summary data or the aggregated data. So you or there’s also a there’s also
265 00:31:29.690 ⇒ 00:31:33.710 jonathan g: joining. Joining conditions include as well the complex ones.
266 00:31:33.910 ⇒ 00:31:40.540 jonathan g: This can be used for report dashboards or visualization for the
267 00:31:41.110 ⇒ 00:31:46.250 jonathan g: for the other one. That’s the transactional data that is more like simply like a
268 00:31:46.390 ⇒ 00:31:53.099 jonathan g: you could do like a raw table or like a source table that can be used by other team members
269 00:31:53.560 ⇒ 00:31:56.489 jonathan g: when trying to what I’m going.
270 00:31:56.490 ⇒ 00:31:58.709 Awaish Kumar: What do ACID and BASE stand for?
271 00:32:00.110 ⇒ 00:32:02.169 Awaish Kumar: ACID is an acronym. What does it.
272 00:32:02.170 ⇒ 00:32:02.540 jonathan g: Students
273 00:32:06.800 ⇒ 00:32:10.799 jonathan g: Well, for ACID, from what I, for letter C, this is CRUD.
274 00:32:10.920 ⇒ 00:32:13.559 jonathan g: I mean, sorry, it’s not CRUD. It’s create.
275 00:32:13.890 ⇒ 00:32:18.850 jonathan g: I is for insert, D is for delete, A is for append.
276 00:32:20.460 ⇒ 00:32:21.750 Awaish Kumar: Sorry, A is for what?
277 00:32:22.130 ⇒ 00:32:22.820 jonathan g: Append.
278 00:32:24.530 ⇒ 00:32:25.660 Awaish Kumar: Append.
279 00:32:26.420 ⇒ 00:32:26.910 jonathan g: Append.
280 00:32:27.360 ⇒ 00:32:32.830 Awaish Kumar: Okay. And so, and I is for?
281 00:32:34.440 ⇒ 00:32:35.320 jonathan g: Insert.
282 00:32:37.060 ⇒ 00:32:42.880 Awaish Kumar: Okay? So I stands for isolation. So isolation means.
283 00:32:43.640 ⇒ 00:32:47.409 Awaish Kumar: Okay, like, I can ask you more, like, what do you
284 00:32:47.810 ⇒ 00:32:52.460 Awaish Kumar: know, what isolation is in the context of SQL,
285 00:32:52.990 ⇒ 00:32:55.990 Awaish Kumar: and what are the different types of isolation levels?
286 00:32:57.600 ⇒ 00:33:06.270 jonathan g: Well, I’ll give an I’ll give a shot on answering this isolation question, but it’s I haven’t encountered in my
287 00:33:06.720 ⇒ 00:33:20.919 jonathan g: work experience when it comes to isolation. But okay, so for isolation, it’s more of a you are isolating a data error. That’s how I understand when it comes to isolation. So it’s more of a
288 00:33:21.110 ⇒ 00:33:23.460 jonathan g: is there like a data quality issue.
289 00:33:23.760 ⇒ 00:33:29.569 jonathan g: or say, formatting issue? Then from there you need to validate from your source.
290 00:33:29.980 ⇒ 00:33:31.790 jonathan g: That’s how I understand about it.
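Note on the ACID question above: ACID is commonly expanded as Atomicity, Consistency, Isolation, Durability, and isolation concerns what concurrent transactions can see of each other's uncommitted work. The following is a minimal sketch using Python's built-in sqlite3 module. The accounts table, the names, and the transfer helper are hypothetical and do not come from the interview, and durability is not demonstrated.

```python
import os
import sqlite3
import tempfile

# Hypothetical demo database; the table and values are illustrative only.
db_path = os.path.join(tempfile.mkdtemp(), "acid_demo.db")

writer = sqlite3.connect(db_path)
writer.execute(
    "CREATE TABLE accounts (name TEXT PRIMARY KEY, "
    "balance INTEGER CHECK (balance >= 0))"
)
writer.execute("INSERT INTO accounts VALUES ('alice', 100), ('bob', 50)")
writer.commit()


def transfer(conn, src, dst, amount):
    """Move money in one transaction: both updates apply, or neither does."""
    try:
        conn.execute(
            "UPDATE accounts SET balance = balance + ? WHERE name = ?", (amount, dst)
        )
        conn.execute(
            "UPDATE accounts SET balance = balance - ? WHERE name = ?", (amount, src)
        )
        conn.commit()
        return True
    except sqlite3.IntegrityError:
        conn.rollback()  # atomicity: the credit to dst is undone as well
        return False


def balances(conn):
    return dict(conn.execute("SELECT name, balance FROM accounts ORDER BY name"))


# Atomicity and consistency: overdrawing violates the CHECK, so nothing changes.
failed = transfer(writer, "alice", "bob", 500)
after_failed = balances(writer)

# Isolation: a second connection does not see the writer's uncommitted update.
reader = sqlite3.connect(db_path)
writer.execute("UPDATE accounts SET balance = balance - 30 WHERE name = 'alice'")
seen_before_commit = balances(reader)
writer.commit()
seen_after_commit = balances(reader)
```

Isolation levels in other databases (for example read committed or serializable) tune how strict this visibility rule is; SQLite's behavior here is only one point on that spectrum.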
291 00:33:32.950 ⇒ 00:33:37.139 Awaish Kumar: And have you used tools like Airflow?
292 00:33:39.580 ⇒ 00:33:44.500 jonathan g: Airflow. I heard about Airflow, this is more like an orchestration tool, open source.
293 00:33:44.500 ⇒ 00:33:46.479 Awaish Kumar: What orchestration tools have you used?
294 00:33:46.950 ⇒ 00:33:51.389 jonathan g: dbt, data build tool. Then there’s also Control-M,
295 00:33:51.520 ⇒ 00:33:55.479 jonathan g: and also for AWS, there is Step Functions.
296 00:33:56.420 ⇒ 00:33:57.150 Awaish Kumar: Okay?
297 00:33:58.640 ⇒ 00:34:03.779 Awaish Kumar: So in dbt, like, what kind of different features have you used.
298 00:34:05.450 ⇒ 00:34:13.330 jonathan g: In dbt, yeah, I do code refactoring as well, like you transform it into common table expressions.
299 00:34:13.440 ⇒ 00:34:22.390 jonathan g: There’s also the YAML file. So you need to add the project. For example, you’re going to add the data set. You’re going to add the
300 00:34:22.610 ⇒ 00:34:25.919 jonathan g: where it’s going to be stored, or which folder. It’s going to be stored.
301 00:34:25.920 ⇒ 00:34:29.460 Awaish Kumar: But I mean, what is a seed in dbt?
302 00:34:31.760 ⇒ 00:34:32.409 jonathan g: In.
303 00:34:33.150 ⇒ 00:34:34.869 Awaish Kumar: What is seed? Seed.
304 00:34:37.170 ⇒ 00:34:41.149 jonathan g: Oh, for seed haven’t encountered seed yet for.
305 00:34:41.803 ⇒ 00:34:44.070 Awaish Kumar: Have you used the macros.
306 00:34:45.370 ⇒ 00:34:51.969 jonathan g: Macros. Yes, the most common for macros is using ref and source.
307 00:34:53.900 ⇒ 00:34:56.700 Awaish Kumar: Okay. But have you implemented Macros?
308 00:34:58.159 ⇒ 00:34:59.169 jonathan g: The custom ones?
309 00:34:59.170 ⇒ 00:34:59.760 Awaish Kumar: Yes.
310 00:35:00.290 ⇒ 00:35:02.750 jonathan g: Not yet. Haven’t done a custom macro yet.
311 00:35:03.210 ⇒ 00:35:15.219 Awaish Kumar: Okay. And for dbt, like, what are some strategies for incremental data loading?
312 00:35:18.870 ⇒ 00:35:20.390 jonathan g: Can you repeat the question?
313 00:35:21.455 ⇒ 00:35:27.260 Awaish Kumar: In dbt, for incremental data loading, right,
314 00:35:27.420 ⇒ 00:35:31.319 Awaish Kumar: what are the different strategies that you can use in dbt?
315 00:35:32.990 ⇒ 00:35:39.880 jonathan g: Or incremental load. Normally, you need to check if there is a duplicate when it comes to loading stuff.
316 00:35:40.000 ⇒ 00:35:45.950 jonathan g: So from there you need to apply the row number option in SQL,
317 00:35:46.440 ⇒ 00:35:49.779 jonathan g: so that it will only get the latest data.
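The row-number approach described in this answer can be sketched as follows. This is a minimal illustration, assuming a hypothetical orders_raw table with order_id, status, and loaded_at columns, and using Python's built-in sqlite3 module (window functions need SQLite 3.25 or newer). It shows only the deduplication step, not how dbt itself configures incremental models.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical staging table that received overlapping loads.
conn.execute("CREATE TABLE orders_raw (order_id INTEGER, status TEXT, loaded_at TEXT)")
conn.executemany(
    "INSERT INTO orders_raw VALUES (?, ?, ?)",
    [
        (1, "placed", "2025-07-01"),
        (1, "shipped", "2025-07-03"),  # newer version of order 1
        (2, "placed", "2025-07-02"),
        (2, "placed", "2025-07-02"),  # exact duplicate from a re-run load
    ],
)

# Rank the rows of each order from newest to oldest and keep only rank 1.
latest_rows = conn.execute(
    """
    SELECT order_id, status, loaded_at
    FROM (
        SELECT
            order_id,
            status,
            loaded_at,
            ROW_NUMBER() OVER (
                PARTITION BY order_id
                ORDER BY loaded_at DESC
            ) AS rn
        FROM orders_raw
    )
    WHERE rn = 1
    ORDER BY order_id
    """
).fetchall()

print(latest_rows)
```

Each order_id ends up with exactly one row, the one with the most recent loaded_at value.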
318 00:35:52.150 ⇒ 00:35:52.960 Awaish Kumar: Okay,
319 00:35:54.570 ⇒ 00:35:57.690 Awaish Kumar: So do you know the concept of hooks in dbt?
320 00:36:01.150 ⇒ 00:36:02.600 jonathan g: Or can you repeat on that.
321 00:36:02.600 ⇒ 00:36:09.769 Awaish Kumar: Do you know the concept of hooks in dbt? So there are some pre-hooks and post-hooks.
322 00:36:10.080 ⇒ 00:36:13.310 Awaish Kumar: Do you know anything about them?
323 00:36:14.720 ⇒ 00:36:25.030 jonathan g: I haven’t encountered hooks yet, unless you are referring to webhooks, connecting to GitHub or Slack, but if that is not it, I think I haven’t.
324 00:36:25.190 ⇒ 00:36:39.050 Awaish Kumar: Okay, so I am talking about hooks like in dbt. You can, for example, when obviously we connect with the database, we work with the database in dbt. So the hook, pre-hook or post-hook, is something
325 00:36:39.180 ⇒ 00:36:52.285 Awaish Kumar: I want to run a query. I want to execute a model. And then I want to say that, okay, while running this model, I created some
326 00:36:54.720 ⇒ 00:36:59.290 Awaish Kumar: like, I want to give permission to someone. Like, I want to create a table,
327 00:36:59.890 ⇒ 00:37:05.610 Awaish Kumar: and after that I want to make sure that Jonathan has access to that table.
328 00:37:05.790 ⇒ 00:37:07.360 Awaish Kumar: I want to write some grant,
329 00:37:07.770 ⇒ 00:37:10.419 Awaish Kumar: like grant select on this table
330 00:37:10.550 ⇒ 00:37:21.059 Awaish Kumar: to user, something like that. So in the post-hooks basically you can do that. You run a query, like, which is a model in dbt;
331 00:37:21.270 ⇒ 00:37:29.650 Awaish Kumar: a model executes, and after that the post-hook executes, and basically in the post-hook we can run any SQL commands.
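The ordering described here (build the model first, then run the post-hook SQL) can be imitated in a few lines. This is only a simulation under stated assumptions: run_model is a hypothetical helper, not dbt's API, and because SQLite has no GRANT statement the post-hook records the permission in a hypothetical access_grants table instead.

```python
import sqlite3


def run_model(conn, name, select_sql, post_hooks=()):
    """Hypothetical helper: build a table from a query, then run post-hook SQL."""
    conn.execute(f"DROP TABLE IF EXISTS {name}")
    conn.execute(f"CREATE TABLE {name} AS {select_sql}")
    for hook_sql in post_hooks:
        # Post-hooks run only after the model's table exists.
        conn.execute(hook_sql.format(this=name))
    conn.commit()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE raw_events (user_id INTEGER, amount INTEGER)")
conn.executemany("INSERT INTO raw_events VALUES (?, ?)", [(1, 10), (1, 5), (2, 7)])

# Stand-in for real database permissions, which SQLite does not have.
conn.execute(
    "CREATE TABLE access_grants (table_name TEXT, grantee TEXT, privilege TEXT)"
)

run_model(
    conn,
    "user_totals",
    "SELECT user_id, SUM(amount) AS total FROM raw_events GROUP BY user_id",
    post_hooks=[
        "INSERT INTO access_grants VALUES ('{this}', 'jonathan', 'SELECT')",
    ],
)

totals = conn.execute(
    "SELECT user_id, total FROM user_totals ORDER BY user_id"
).fetchall()
grants = conn.execute("SELECT * FROM access_grants").fetchall()
```

On a warehouse that supports permissions, the post-hook statement would be the grant itself, as in the interviewer's example.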
332 00:37:34.320 ⇒ 00:37:36.909 jonathan g: If something looks I heard about it.
333 00:37:37.580 ⇒ 00:37:43.326 Awaish Kumar: Okay, what what about like?
334 00:37:45.070 ⇒ 00:37:47.160 Awaish Kumar: since you have worked with BigQuery, right?
335 00:37:50.080 ⇒ 00:37:51.500 Awaish Kumar: Alright.
336 00:37:51.960 ⇒ 00:37:53.488 Awaish Kumar: So what is like?
337 00:37:55.530 ⇒ 00:38:02.349 Awaish Kumar: Hmm, what the QUALIFY keyword does in BigQuery?
338 00:38:05.090 ⇒ 00:38:07.970 jonathan g: QUALIFY keyword? Did I hear it correctly?
339 00:38:08.290 ⇒ 00:38:16.560 Awaish Kumar: Yes, yes. So like SELECT, FROM, WHERE, GROUP BY, there is one more keyword. It’s called QUALIFY.
340 00:38:16.950 ⇒ 00:38:18.460 Awaish Kumar: Have you ever used it?
341 00:38:19.980 ⇒ 00:38:22.250 jonathan g: Haven’t used QUALIFY.
342 00:38:23.150 ⇒ 00:38:27.204 Awaish Kumar: Okay, and what about have you
343 00:38:30.324 ⇒ 00:38:33.645 Awaish Kumar: like, did you know anything about?
344 00:38:35.580 ⇒ 00:38:39.820 Awaish Kumar: you already mentioned, right, yeah, CTEs. What, like,
345 00:38:40.270 ⇒ 00:38:43.930 Awaish Kumar: what different window functions have you used in BigQuery?
346 00:38:46.930 ⇒ 00:38:54.319 jonathan g: Yeah, I’ve encountered row number. I’d say that’s the most common when it comes to window functions. I think all SQL,
347 00:38:54.929 ⇒ 00:39:01.070 jonathan g: doesn’t matter if that is dbt, BigQuery, or other database vendors, they’re using row number.
348 00:39:03.380 ⇒ 00:39:10.290 Awaish Kumar: Oh, and how would you? So do you know about different slowly changing dimension types.
349 00:39:13.100 ⇒ 00:39:17.130 jonathan g: And for the slowly changing types there is SCD type one.
350 00:39:17.130 ⇒ 00:39:20.569 Awaish Kumar: There is a concept of slowly changing dimensions.
351 00:39:21.160 ⇒ 00:39:25.320 Awaish Kumar: And for slowly changing dimension. There are different types of it.
352 00:39:27.650 ⇒ 00:39:30.469 Awaish Kumar: And it’s called SCD type
353 00:39:30.810 ⇒ 00:39:36.449 Awaish Kumar: 0, 1, 2, like that. So can you elaborate more on this?
354 00:39:42.800 ⇒ 00:39:50.029 jonathan g: Well, for the SCD type 0, I think, from what I understand is that it is just a
355 00:39:51.020 ⇒ 00:39:55.429 jonathan g: it’s simple extraction. So there’s no changes that needs to be included.
356 00:39:55.600 ⇒ 00:40:01.840 jonathan g: whereas for type one, there is a data that needs to be transformed.
357 00:40:02.030 ⇒ 00:40:06.019 jonathan g: and for the second one that applies also for
358 00:40:06.190 ⇒ 00:40:10.489 jonathan g: the not just the data, but also the column name as well.
359 00:40:11.200 ⇒ 00:40:13.530 jonathan g: When when you want to do something.
360 00:40:14.870 ⇒ 00:40:26.230 Awaish Kumar: And no, for the oops, slowly changing like for the slowly changing dimensions
361 00:40:28.790 ⇒ 00:40:34.930 Awaish Kumar: like type 2 like you mentioned that like what is.
362 00:40:36.200 ⇒ 00:40:41.160 Awaish Kumar: how, like, if I want to implement SCD type 2 for one of my tables,
363 00:40:41.930 ⇒ 00:40:44.259 Awaish Kumar: how how can I implement that.
364 00:40:46.530 ⇒ 00:40:51.979 jonathan g: So from your source, you need to extract, then you have, like a transformation in the middle.
365 00:40:52.100 ⇒ 00:41:00.200 jonathan g: From there you need to change your. You do need to do mapping. So originally, your source is just a
366 00:41:01.360 ⇒ 00:41:15.349 jonathan g: let’s say Id. Then there’s the integer, then for your target. You want to do it as string. Then there’s also an underscore on the 1st name of Id, so that will be like underscore Id.
367 00:41:15.460 ⇒ 00:41:17.730 jonathan g: Then it will be changed to string.
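Note on the SCD type 2 question above: type 2 is conventionally implemented by keeping history, so that a changed attribute closes the current row and adds a new one instead of overwriting it. The following is a minimal sketch of that convention in plain Python. The CustomerVersion record, its column names, and the apply_scd2 helper are hypothetical and do not come from the interview.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class CustomerVersion:
    customer_id: int
    city: str
    valid_from: date
    valid_to: Optional[date] = None  # None means the row is still valid
    is_current: bool = True


def apply_scd2(
    history: List[CustomerVersion], customer_id: int, new_city: str, change_date: date
) -> List[CustomerVersion]:
    """Close the current row and append a new version when the city changes."""
    current = next(
        (r for r in history if r.customer_id == customer_id and r.is_current),
        None,
    )
    if current is not None:
        if current.city == new_city:
            return history  # nothing changed, keep history as it is
        current.valid_to = change_date
        current.is_current = False
    history.append(CustomerVersion(customer_id, new_city, change_date))
    return history


history: List[CustomerVersion] = []
apply_scd2(history, 1, "Manila", date(2024, 1, 1))
apply_scd2(history, 1, "Cebu", date(2025, 3, 15))
apply_scd2(history, 1, "Cebu", date(2025, 6, 1))  # same value, no new row
```

After these calls the history holds two rows for customer 1: a closed Manila row valid until 2025-03-15 and a current Cebu row. Type 1, by contrast, would simply overwrite the city in place.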
368 00:41:20.080 ⇒ 00:41:22.708 Awaish Kumar: Okay, so yeah, like
369 00:41:24.040 ⇒ 00:41:26.720 Awaish Kumar: that was all the questions from my side.
370 00:41:27.040 ⇒ 00:41:33.010 Awaish Kumar: So now, if you want to ask anything about Brainforge and what we do here, or
371 00:41:34.220 ⇒ 00:41:35.600 Awaish Kumar: please go ahead.
372 00:41:36.160 ⇒ 00:41:38.030 jonathan g: Mind if I ask that,
373 00:41:38.360 ⇒ 00:41:43.850 jonathan g: is Brainforge like a startup company, or is this already an established company?
374 00:41:46.060 ⇒ 00:41:51.150 Awaish Kumar: Brainforge is a bootstrapped startup company.
375 00:41:51.700 ⇒ 00:41:56.180 Awaish Kumar: We are only a team of like, as I mentioned, 10 to 15 people
376 00:41:56.630 ⇒ 00:41:59.660 Awaish Kumar: working on data and AI consultancy services.
377 00:41:59.950 ⇒ 00:42:06.029 Awaish Kumar: And we provide flexibility to work from anywhere in the world.
378 00:42:06.420 ⇒ 00:42:13.600 Awaish Kumar: Also with the flexibility to work with any kind of engagement like full time, part time
379 00:42:14.370 ⇒ 00:42:16.029 Awaish Kumar: with with my information.
380 00:42:19.680 ⇒ 00:42:24.249 jonathan g: So, and from that one participant
381 00:42:24.410 ⇒ 00:42:32.579 jonathan g: is this like a new position? Or is this like an additional role, additional headcount for this role?
382 00:42:32.580 ⇒ 00:42:33.350 Awaish Kumar: Sorry.
383 00:42:33.940 ⇒ 00:42:37.620 jonathan g: Is this a new open role, or additional.
384 00:42:39.805 ⇒ 00:42:44.244 Awaish Kumar: So this is like, obviously new role
385 00:42:45.260 ⇒ 00:42:54.490 Awaish Kumar: in a sense that, as I mentioned, we are a data and AI consultancy firm. We continue to get different clients
386 00:42:55.070 ⇒ 00:43:02.689 Awaish Kumar: to work on their data. And for that, our, like, data and AI roles are always open.
387 00:43:03.550 ⇒ 00:43:07.849 Awaish Kumar: So we are always looking out for data people.
388 00:43:08.700 ⇒ 00:43:13.309 Awaish Kumar: Because we continue to get, like, data projects.
389 00:43:13.720 ⇒ 00:43:18.920 Awaish Kumar: And yeah, so we are always looking for new people to come and try us.
390 00:43:19.940 ⇒ 00:43:23.360 jonathan g: Oh, okay, I have a question. Speaking of AI.
391 00:43:23.500 ⇒ 00:43:28.449 jonathan g: Is your, is the company open for AI whenever you are.
392 00:43:28.550 ⇒ 00:43:35.839 jonathan g: do? If yes, do you utilize the AI, or do you have, like an existing tool, right.
393 00:43:38.090 ⇒ 00:43:49.880 Awaish Kumar: So there are two things in terms of AI. Number one is, like, as an AI engineer, you develop the systems for the clients, for the internal teams.
394 00:43:50.751 ⇒ 00:43:54.188 Awaish Kumar: That’s the development part. The second part is
395 00:43:55.030 ⇒ 00:44:03.140 Awaish Kumar: the use of AI tools to improve our work performance,
396 00:44:03.190 ⇒ 00:44:14.590 Awaish Kumar: which is a different thing. So we have both. So, like, our company has AI engineers which are building tools for our clients, which are making, like, these goals
397 00:44:14.640 ⇒ 00:44:36.720 Awaish Kumar: and different AI services for our clients and for our internal teams. But on the other side we are very actively using AI in any kind of development, data engineering, and data analytics. So if you want to build some models, you’re free to use any AI tool. We have, like, the Cursor IDE to do that. We have
398 00:44:37.120 ⇒ 00:44:42.490 Awaish Kumar: a ChatGPT subscription. We are using Azure OpenAI. But
399 00:44:42.995 ⇒ 00:44:52.489 Awaish Kumar: if there is something which is really useful for the team members, we are okay, and open to get that for our team.
400 00:44:54.320 ⇒ 00:45:07.119 jonathan g: Thanks for sharing that one. Also another question. So for this one, is the team very flexible on their work schedule? Like, what I mean to say is that they can work during their time zone.
401 00:45:07.510 ⇒ 00:45:17.480 jonathan g: For example, if you are in the US, you can work on your day shift schedule. For the Philippines, you can work on your time schedule. Then,
402 00:45:17.750 ⇒ 00:45:22.220 jonathan g: if there’s a meeting, then you just attend afterwards. Call it a day.
403 00:45:22.390 ⇒ 00:45:24.490 jonathan g: That’s something on your team.
404 00:45:25.780 ⇒ 00:45:31.420 Awaish Kumar: Yeah, like, we are very flexible with the time zone. You can work at any time wherever you are.
405 00:45:33.320 ⇒ 00:45:36.970 jonathan g: Okay, that’s good to hear. Then, also.
406 00:45:38.820 ⇒ 00:45:41.320 jonathan g: yeah, what will be the next step after this interview?
407 00:45:42.045 ⇒ 00:45:47.739 Awaish Kumar: Yeah, like my team like, our operation team will connect with you after some time
408 00:45:48.590 ⇒ 00:45:51.219 Awaish Kumar: on this, like for the next steps in this week.
409 00:45:52.090 ⇒ 00:45:54.439 jonathan g: Okay. How long should I wait for your feedback.
410 00:45:55.380 ⇒ 00:45:58.869 Awaish Kumar: Yeah, I mean, in this week our our team is going to reach out.
411 00:45:59.730 ⇒ 00:46:00.240 Awaish Kumar: Okay.
412 00:46:00.240 ⇒ 00:46:02.510 jonathan g: Right? Yeah. Thanks, Awaish, for your time.