2025-07-15_data_engineer_interview_fernando_rodrigu

Meeting Title: Data Engineer Interview (Fernando Rodrigues Nepomuceno)
Date: 2025-07-15
Meeting participants: Awaish Kumar, Fernando Rodrigues Nepomuceno

WEBVTT

1 00:01:13.480 ⇒ 00:01:14.340 Awaish Kumar: Hi.

2 00:01:15.330 ⇒ 00:01:16.450 Fernando Rodrigues Nepomuceno: Hello, Awaish.

3 00:01:18.340 ⇒ 00:01:19.580 Awaish Kumar: How are you doing.

4 00:01:20.410 ⇒ 00:01:22.530 Fernando Rodrigues Nepomuceno: I’m doing good. How about you?

5 00:01:23.170 ⇒ 00:01:26.359 Awaish Kumar: Yeah. How do you pronounce your name? Fernando?

6 00:01:26.910 ⇒ 00:01:32.489 Fernando Rodrigues Nepomuceno: Fernando, that’s perfect. And how do you pronounce your name?

7 00:01:33.070 ⇒ 00:01:34.699 Awaish Kumar: Yeah. It’s Awaish Kumar.

8 00:01:35.700 ⇒ 00:01:42.425 Fernando Rodrigues Nepomuceno: Awaish. Kumar is easier.

9 00:01:43.970 ⇒ 00:01:46.459 Awaish Kumar: Yeah, you can just call me Kumar.

10 00:01:46.580 ⇒ 00:01:47.090 Awaish Kumar: That’s.

11 00:01:47.090 ⇒ 00:01:47.910 Fernando Rodrigues Nepomuceno: Okay.

12 00:01:47.910 ⇒ 00:01:53.439 Awaish Kumar: What most people prefer. And where are you located, by the way?

13 00:01:54.750 ⇒ 00:01:55.630 Fernando Rodrigues Nepomuceno: Okay.

14 00:01:56.786 ⇒ 00:01:58.209 Awaish Kumar: Where are you located?

15 00:01:58.690 ⇒ 00:02:08.389 Fernando Rodrigues Nepomuceno: Oh, sorry! I’m located in Brazil. São Paulo city, to be more accurate.

16 00:02:09.350 ⇒ 00:02:10.100 Awaish Kumar: You know.

17 00:02:10.680 ⇒ 00:02:15.279 Awaish Kumar: Okay, what time is it in Brazil?

18 00:02:15.280 ⇒ 00:02:19.809 Fernando Rodrigues Nepomuceno: Now it is 10 AM.

19 00:02:24.510 ⇒ 00:02:26.520 Fernando Rodrigues Nepomuceno: Where? Where are you? From?

20 00:02:27.450 ⇒ 00:02:28.779 Awaish Kumar: I’m from Pakistan.

21 00:02:29.580 ⇒ 00:02:31.400 Fernando Rodrigues Nepomuceno: Oh, Pakistan, yeah.

22 00:02:31.870 ⇒ 00:02:40.430 Awaish Kumar: So in this interview today, I will just share the agenda for the

23 00:02:41.497 ⇒ 00:02:52.552 Awaish Kumar: interview and how we are going to move forward, and then we can start, right? So first of all, I can give you some overview of myself and the company, and then

24 00:02:52.980 ⇒ 00:02:57.630 Awaish Kumar: you can introduce yourself, and then we can start from there.

25 00:02:57.890 ⇒ 00:02:58.420 Awaish Kumar: Oh.

26 00:02:58.420 ⇒ 00:02:58.750 Fernando Rodrigues Nepomuceno: Okay.

27 00:02:58.750 ⇒ 00:02:59.690 Awaish Kumar: So far

28 00:03:00.680 ⇒ 00:03:12.970 Awaish Kumar: my name is Awaish Kumar, and I have been a lead data engineer. For 8 years now I’ve been working at different companies, startups and at the enterprise level,

29 00:03:14.360 ⇒ 00:03:19.924 Awaish Kumar: working in revenue departments and different places to basically

30 00:03:21.280 ⇒ 00:03:26.270 Awaish Kumar: handle the entire data pipeline and data engineering foundations.

31 00:03:26.940 ⇒ 00:03:32.979 Awaish Kumar: And I’m working as a kind of engineering manager here at Brainforge.

32 00:03:33.270 ⇒ 00:03:40.260 Awaish Kumar: And at Brainforge, what we are doing is, we are a data and AI

33 00:03:41.098 ⇒ 00:03:49.759 Awaish Kumar: consultancy. So we provide services to different clients, spanning across different industries.

34 00:03:49.870 ⇒ 00:03:50.919 Awaish Kumar: Oh, okay.

35 00:03:51.220 ⇒ 00:04:17.299 Awaish Kumar: So mostly we provide data-related services, setting up, for example, marketing analytics and product analytics, and whatever support is required to do that, we handle. And apart from that, as requirements for AI services are increasing, we have started offering

36 00:04:17.470 ⇒ 00:04:23.369 Awaish Kumar: things like creating chatbots using LLMs and different other agentic projects,

37 00:04:24.091 ⇒ 00:04:28.100 Awaish Kumar: for providing AI services to the clients.

38 00:04:28.700 ⇒ 00:04:40.689 Awaish Kumar: Yeah, that’s mostly it. We are a team of about 15 people, and we do have flexibility for people to join us full time or part time,

39 00:04:40.820 ⇒ 00:04:43.650 Awaish Kumar: and from across the world.

40 00:04:45.550 ⇒ 00:04:46.550 Awaish Kumar: Oh, nice.

41 00:04:46.550 ⇒ 00:04:49.469 Awaish Kumar: Yeah. So now you can introduce yourself.

42 00:04:51.250 ⇒ 00:04:52.090 Fernando Rodrigues Nepomuceno: Sure.

43 00:04:52.100 ⇒ 00:05:08.530 Fernando Rodrigues Nepomuceno: thanks for the opportunity. First of all, it’s a pleasure to talk to you. My name is Fernando. I’m 42 years old. I live in Brazil. I’m a seasoned data professional with a strong passion for turning raw data into

44 00:05:08.540 ⇒ 00:05:28.920 Fernando Rodrigues Nepomuceno: actionable insights. Over the past 11 years, I’ve developed deep expertise in the data engineering and data science areas, enabling businesses to harness their data effectively and implementing tools that can ease

45 00:05:29.020 ⇒ 00:05:32.593 Fernando Rodrigues Nepomuceno: their lives. It’s about making their business

46 00:05:33.200 ⇒ 00:05:39.129 Fernando Rodrigues Nepomuceno: more manageable by harnessing the power of technology.

47 00:05:39.250 ⇒ 00:06:02.060 Fernando Rodrigues Nepomuceno: I have a foundation in programming, software engineering, advanced data modeling, data pipelines, and query optimization, and I could work as well with machine learning and AI for a certain time, being split between data engineering and data science tasks.

48 00:06:02.830 ⇒ 00:06:28.039 Fernando Rodrigues Nepomuceno: My experience spans diverse projects, including my current project, which I’m working on now for HCL, for Apple, where I’ve been designing scalable data pipelines, optimizing Snowflake transformations, and doing real-time data processing as well.

49 00:06:28.801 ⇒ 00:06:44.700 Fernando Rodrigues Nepomuceno: And talking about the stack that I’m using now: I’m using Python, SQL, and Snowflake, and I recently started touching some

50 00:06:45.446 ⇒ 00:06:56.770 Fernando Rodrigues Nepomuceno: Kubernetes tasks as well, because there was a replacement of our on-premise Kubernetes with

51 00:06:56.870 ⇒ 00:07:04.670 Fernando Rodrigues Nepomuceno: a Kubernetes based on Aps cluster. I could also work

52 00:07:05.030 ⇒ 00:07:18.870 Fernando Rodrigues Nepomuceno: with Hadoop and Spark. Now I could work with Spark integration with Snowflake as well, leveraging some capabilities like UDFs and advanced

53 00:07:18.980 ⇒ 00:07:40.349 Fernando Rodrigues Nepomuceno: and complex data transformations. But in symmetric, because I could work directly with Hadoop and Spark as well, mainly, but still with Python, SQL, Linux bash scripting, Impala, and Hive

54 00:07:40.780 ⇒ 00:07:45.010 Fernando Rodrigues Nepomuceno: and in my technical stack

55 00:07:45.782 ⇒ 00:08:00.350 Fernando Rodrigues Nepomuceno: exit the season hardware. I could work more with data science and AI things, but still working on some ETL

56 00:08:01.088 ⇒ 00:08:19.599 Fernando Rodrigues Nepomuceno: implementations, and some analytics and machine learning initiatives, using cloud platforms like AWS, mainly, and GCP. Throughout my career, I’ve collaborated with global cross-functional teams. There’s

57 00:08:20.640 ⇒ 00:08:29.359 Fernando Rodrigues Nepomuceno: something like almost 5 years that I’ve been working with global teams, sometimes spread

58 00:08:29.480 ⇒ 00:08:57.269 Fernando Rodrigues Nepomuceno: across the globe, or sometimes located more in Israel or even in the U.S.A. And I could collaborate with them to deliver end-to-end solutions that enhance data quality, governance, compliance, and performance, with proficiency in Python, like I mentioned, SQL, Linux, and data warehouse engines like Snowflake,

59 00:08:57.590 ⇒ 00:09:06.699 Fernando Rodrigues Nepomuceno: Redshift, and BigQuery. And in terms of orchestration, ingestion, and processing tools, I have experience with

60 00:09:07.260 ⇒ 00:09:13.680 Fernando Rodrigues Nepomuceno: Docker, Kubernetes, and Apache Airflow as well.

61 00:09:14.240 ⇒ 00:09:20.529 Fernando Rodrigues Nepomuceno: That sums up my background, and feel free to ask me more about

62 00:09:21.370 ⇒ 00:09:25.540 Fernando Rodrigues Nepomuceno: whatever you need to ask.

63 00:09:25.540 ⇒ 00:09:27.159 Awaish Kumar: What is your like?

64 00:09:27.630 ⇒ 00:09:31.049 Awaish Kumar: What is your total data engineering experience?

65 00:09:31.850 ⇒ 00:09:38.010 Fernando Rodrigues Nepomuceno: 8 years. Actually, I’ve started, yeah, yeah, I’ve started working.

66 00:09:38.010 ⇒ 00:09:44.719 Awaish Kumar: Great. Your experience with these, like, how do you rate your experience with these tools, like

67 00:09:45.170 ⇒ 00:09:50.350 Awaish Kumar: Python, SQL, Spark, Snowflake, Hadoop, dbt?

68 00:09:50.750 ⇒ 00:09:53.210 Awaish Kumar: If you give, like, a number out of 10.

69 00:09:53.650 ⇒ 00:10:02.301 Fernando Rodrigues Nepomuceno: So, Python, I’d say that I have 7. Certainly I need to explore more things. But

70 00:10:03.020 ⇒ 00:10:08.987 Fernando Rodrigues Nepomuceno: I’m used to working even using the

71 00:10:10.200 ⇒ 00:10:19.635 Fernando Rodrigues Nepomuceno: functional approach and the object-based approach. With SQL, I’d say that

72 00:10:20.370 ⇒ 00:10:40.559 Fernando Rodrigues Nepomuceno: I’d put myself at eight, because I’ve used this since I had a business intelligence role, and that was my first language. Airflow, I’d say that it’d be 6.

73 00:10:43.820 ⇒ 00:10:53.629 Fernando Rodrigues Nepomuceno: Snowflake, I’d put 7 as well. Which other? Sorry.

74 00:10:53.960 ⇒ 00:10:54.840 Awaish Kumar: dbt.

75 00:10:55.520 ⇒ 00:11:11.589 Fernando Rodrigues Nepomuceno: dbt, oh, dbt, yeah. dbt, expert, because I’m lacking some professional experience with that. I’ve been doing some personal projects with it to understand how it works, but I’ve

76 00:11:13.600 ⇒ 00:11:16.310 Fernando Rodrigues Nepomuceno: Spark, I’d say is seven as well.

77 00:11:17.930 ⇒ 00:11:20.150 Awaish Kumar: Okay, so

78 00:11:20.390 ⇒ 00:11:27.499 Awaish Kumar: like, so you must have been writing, like, how would you have handled the transformations, mostly?

79 00:11:27.730 ⇒ 00:11:32.340 Awaish Kumar: If you can, give me an example of a project,

80 00:11:33.187 ⇒ 00:11:36.810 Awaish Kumar: along with its detailed architecture, of

81 00:11:37.880 ⇒ 00:11:41.799 Awaish Kumar: how it works end to end.

82 00:11:43.190 ⇒ 00:11:58.655 Fernando Rodrigues Nepomuceno: So most recently, I’ve deployed an important project for Apple, where they had 2 MySQL clusters. But the point here is that

83 00:12:00.096 ⇒ 00:12:08.593 Fernando Rodrigues Nepomuceno: those MySQL clusters had different behavior. One sends us

84 00:12:10.080 ⇒ 00:12:19.440 Fernando Rodrigues Nepomuceno: data in a milliseconds timeframe, and the other one receives data from

85 00:12:20.650 ⇒ 00:12:30.830 Fernando Rodrigues Nepomuceno: some systems by batch. And the challenge here was combining both, because

86 00:12:31.622 ⇒ 00:12:39.329 Fernando Rodrigues Nepomuceno: it was required to collect data from both clusters, with different workflows,

87 00:12:40.113 ⇒ 00:13:01.989 Fernando Rodrigues Nepomuceno: to have a unique dataset that they could use to measure some important figures about the quality process. Based on that, I’ve implemented a DAG connecting these

88 00:13:02.230 ⇒ 00:13:12.450 Fernando Rodrigues Nepomuceno: 2 different clusters by using the Airflow hooks that we have available in the latest

89 00:13:13.177 ⇒ 00:13:28.442 Fernando Rodrigues Nepomuceno: versions that we have. And based on that, I split the process based on a Kappa architecture, where I could treat

90 00:13:29.470 ⇒ 00:13:37.040 Fernando Rodrigues Nepomuceno: both the event-driven behavior and the batch behavior

91 00:13:37.380 ⇒ 00:13:48.719 Fernando Rodrigues Nepomuceno: with only one data pipeline, to guarantee that we are syncing all the data available at once.

92 00:13:49.481 ⇒ 00:14:01.279 Fernando Rodrigues Nepomuceno: In terms of the transformations, I’ve applied some specific functions based on business requirements. For example, it was necessary to.

93 00:14:02.000 ⇒ 00:14:09.134 Awaish Kumar: So like you’re saying that you handle both real time flow and the batch flow.

94 00:14:09.610 ⇒ 00:14:10.710 Fernando Rodrigues Nepomuceno: Only.

95 00:14:10.710 ⇒ 00:14:11.999 Awaish Kumar: Some same day.

96 00:14:12.680 ⇒ 00:14:16.370 Fernando Rodrigues Nepomuceno: The yeah, in the same, in the same way, in the same way.

97 00:14:16.940 ⇒ 00:14:30.840 Fernando Rodrigues Nepomuceno: because at the final of the day, independently of the time that the Aniro could be linked.

98 00:14:31.360 ⇒ 00:14:39.589 Fernando Rodrigues Nepomuceno: They wanted to have these numbers consolidated independently of the

99 00:14:39.770 ⇒ 00:14:57.959 Fernando Rodrigues Nepomuceno: the data source. That’s the point. And it was necessary to implement this in order to guarantee that they could have all the data synced into a final database

100 00:14:58.080 ⇒ 00:15:01.799 Fernando Rodrigues Nepomuceno: to generate their measures.

101 00:15:01.800 ⇒ 00:15:02.450 Awaish Kumar: Double proof.

102 00:15:02.450 ⇒ 00:15:02.790 Fernando Rodrigues Nepomuceno: That.

103 00:15:02.790 ⇒ 00:15:11.760 Awaish Kumar: From the source system the data which is going to some hmm somewhere for the real time.

104 00:15:11.870 ⇒ 00:15:18.039 Awaish Kumar: Application like how like airflow was solving that problem.

105 00:15:18.730 ⇒ 00:15:25.500 Fernando Rodrigues Nepomuceno: Yeah, yeah, I’ve used Airflow to orchestrate these

106 00:15:25.930 ⇒ 00:15:30.670 Fernando Rodrigues Nepomuceno: 2 parts of the data pipeline. That’s the.

107 00:15:30.670 ⇒ 00:15:35.679 Awaish Kumar: That’s still what I want to understand, for the real-time data.

108 00:15:36.456 ⇒ 00:15:40.770 Awaish Kumar: Like, Airflow triggers DAGs on a schedule.

109 00:15:41.450 ⇒ 00:15:42.260 Awaish Kumar: No.

110 00:15:43.390 ⇒ 00:15:47.869 Awaish Kumar: And so that so data is somewhere is being generated

111 00:15:48.601 ⇒ 00:15:52.120 Awaish Kumar: in the real time. And we want to move it.

112 00:15:52.280 ⇒ 00:15:56.620 Awaish Kumar: So, like, what else did you use along with Airflow?

113 00:15:57.370 ⇒ 00:16:07.029 Fernando Rodrigues Nepomuceno: One thing: the Airflow DAG was scheduled to run every minute.

114 00:16:07.940 ⇒ 00:16:20.259 Fernando Rodrigues Nepomuceno: That’s the point. And to ensure that we wouldn’t have duplicated rows

115 00:16:20.260 ⇒ 00:16:20.860 Awaish Kumar: Strange.

116 00:16:20.860 ⇒ 00:16:23.609 Fernando Rodrigues Nepomuceno: from the real-time part,

117 00:16:23.750 ⇒ 00:16:27.589 Fernando Rodrigues Nepomuceno: we used a delta time,

118 00:16:27.890 ⇒ 00:16:34.190 Fernando Rodrigues Nepomuceno: being the updated field in the data source.

119 00:16:35.780 ⇒ 00:16:43.750 Awaish Kumar: Okay. So if you’re loading every minute, how did you handle the duplications? Sorry.

120 00:16:44.790 ⇒ 00:16:51.290 Fernando Rodrigues Nepomuceno: Yeah, to guarantee that we wouldn’t have duplications,

121 00:16:51.500 ⇒ 00:16:57.130 Fernando Rodrigues Nepomuceno: we used the date field called updated,

122 00:16:58.250 ⇒ 00:17:08.749 Fernando Rodrigues Nepomuceno: so that, for example, in the service, if the row was inserted or even updated,

123 00:17:09.190 ⇒ 00:17:27.929 Fernando Rodrigues Nepomuceno: the updated field would be the control to show us if it’s a new row or an updated one. Every time that we had a new row,

124 00:17:28.130 ⇒ 00:17:38.019 Fernando Rodrigues Nepomuceno: this updated field would be filled, and every time that we had an updated row,

125 00:17:38.150 ⇒ 00:17:44.949 Fernando Rodrigues Nepomuceno: this same field would be changed as well.

126 00:17:45.110 ⇒ 00:17:58.029 Fernando Rodrigues Nepomuceno: That’s the field that was used to guarantee that we started the load from

127 00:17:58.440 ⇒ 00:18:03.909 Fernando Rodrigues Nepomuceno: a specific point in each data source.
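
The incremental load described here (pick up only rows whose updated timestamp is newer than the last run) can be sketched as follows. This is a minimal illustration using in-memory SQLite rather than MySQL and Airflow; the `events` table, its columns, and the `incremental_sync` helper are hypothetical names, not anything stated in the interview.

```python
import sqlite3


def make_db():
    # Hypothetical schema: one table with an `updated` timestamp column.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE events (id INTEGER PRIMARY KEY, payload TEXT, updated TEXT)"
    )
    return conn


def incremental_sync(source, target, last_watermark):
    """Copy rows changed since last_watermark and return the new watermark."""
    rows = source.execute(
        "SELECT id, payload, updated FROM events "
        "WHERE updated > ? ORDER BY updated",
        (last_watermark,),
    ).fetchall()
    # INSERT OR REPLACE keyed on the primary key, so an updated row
    # overwrites its earlier copy instead of creating a duplicate.
    target.executemany(
        "INSERT OR REPLACE INTO events (id, payload, updated) VALUES (?, ?, ?)",
        rows,
    )
    target.commit()
    return rows[-1][2] if rows else last_watermark


source, target = make_db(), make_db()
source.executemany(
    "INSERT INTO events VALUES (?, ?, ?)",
    [(1, "a", "2025-07-15 10:00:00"), (2, "b", "2025-07-15 10:00:30")],
)

# First scheduled run: everything is new.
watermark = incremental_sync(source, target, "1970-01-01 00:00:00")

# Before the next run, row 1 is updated and row 3 is inserted.
source.execute(
    "UPDATE events SET payload = 'a2', updated = '2025-07-15 10:01:10' WHERE id = 1"
)
source.execute("INSERT INTO events VALUES (3, 'c', '2025-07-15 10:01:20')")

# Second scheduled run: only the two changed rows are read.
watermark = incremental_sync(source, target, watermark)
final_rows = target.execute("SELECT id, payload FROM events ORDER BY id").fetchall()
```

One caveat of this sketch: with a strict `>` comparison, a row committed later with the same timestamp as the stored watermark would be missed, so real pipelines often re-read a small overlap window and rely on the upsert to absorb the repeats.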

128 00:18:05.900 ⇒ 00:18:19.729 Awaish Kumar: Okay. So you’re saying the data is coming from some source, Airflow is running on a schedule, and then you are using some field to handle deduplication. But what was

129 00:18:20.130 ⇒ 00:18:27.170 Awaish Kumar: the flow? What exact connectors did you use? What was that deduplication logic? Was it, like,

130 00:18:27.400 ⇒ 00:18:30.059 Awaish Kumar: a SQL query? Like, what is the code?

131 00:18:30.060 ⇒ 00:18:42.964 Fernando Rodrigues Nepomuceno: Yeah, the deduplication logic was put into the silver layer, where I could use a SQL script,

132 00:18:43.820 ⇒ 00:19:04.729 Fernando Rodrigues Nepomuceno: a piece of a SQL script inside the function, ensuring that I had all the fields, making a count of this at the end, and using a HAVING clause to

133 00:19:05.190 ⇒ 00:19:16.900 Fernando Rodrigues Nepomuceno: get only those where my count equals one.
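
The actual query is not given in the conversation, so the following is only one possible reading of "group by all the fields, count, and keep count equals one", run against in-memory SQLite. The `bronze` table and its columns are hypothetical. A second query, which is not the candidate's method, is included for contrast: it keeps the latest version of each id.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE bronze (id INTEGER, status TEXT, updated TEXT)")
conn.executemany(
    "INSERT INTO bronze VALUES (?, ?, ?)",
    [
        (1, "open", "2025-07-15 10:00:00"),
        (1, "open", "2025-07-15 10:00:00"),    # exact duplicate of the row above
        (2, "open", "2025-07-15 10:00:05"),
        (2, "closed", "2025-07-15 10:01:00"),  # newer version of id 2
        (3, "open", "2025-07-15 10:00:10"),
    ],
)

# Reading of the described logic: group by every field and keep only
# the groups that occur exactly once.
having_rows = conn.execute(
    """
    SELECT id, status, updated
    FROM bronze
    GROUP BY id, status, updated
    HAVING COUNT(*) = 1
    ORDER BY id, updated
    """
).fetchall()

# Contrast (an editorial alternative): keep the most recent row per id,
# collapsing exact duplicates with DISTINCT.
latest_rows = conn.execute(
    """
    SELECT DISTINCT b.id, b.status, b.updated
    FROM bronze AS b
    JOIN (
        SELECT id, MAX(updated) AS max_updated
        FROM bronze
        GROUP BY id
    ) AS m ON b.id = m.id AND b.updated = m.max_updated
    ORDER BY b.id
    """
).fetchall()
```

Under this reading, the `HAVING COUNT(*) = 1` query drops id 1 entirely (its two copies form a group of size two) and keeps both versions of id 2, whereas the second query returns exactly one row per id.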

134 00:19:17.840 ⇒ 00:19:27.920 Awaish Kumar: And so, like, what kind of Airflow operators have you used?

135 00:19:28.580 ⇒ 00:19:34.530 Fernando Rodrigues Nepomuceno: I’ve used the MySQL hook operators, actually.

136 00:19:37.900 ⇒ 00:19:46.800 Fernando Rodrigues Nepomuceno: Yeah, only that one, because even having 2 different data sources,

137 00:19:46.800 ⇒ 00:19:47.210 Awaish Kumar: Sorry.

138 00:19:47.210 ⇒ 00:19:50.090 Fernando Rodrigues Nepomuceno: both were from MySQL.

139 00:19:50.330 ⇒ 00:19:50.570 Awaish Kumar: Different.

140 00:19:53.540 ⇒ 00:19:57.200 Fernando Rodrigues Nepomuceno: No, no, I guess I guess.

141 00:19:58.130 ⇒ 00:19:59.820 Fernando Rodrigues Nepomuceno: Oh, sorry. Go ahead, please.

142 00:19:59.870 ⇒ 00:20:13.050 Awaish Kumar: Yeah, so there are Airflow hooks. And then there are connectors... operators, sorry. And then there are sensors. So what exactly, like, to build a complete workflow,

143 00:20:13.280 ⇒ 00:20:18.099 Awaish Kumar: what exact combination of all these have you utilized?

144 00:20:20.430 ⇒ 00:20:37.070 Fernando Rodrigues Nepomuceno: So I could use the connectors to ensure that all the information that I had in my secrets would be used to connect, for example, my MySQL ID,

145 00:20:37.210 ⇒ 00:20:50.540 Fernando Rodrigues Nepomuceno: that is, a specific key used with what I have as credentials in the token.

146 00:20:50.830 ⇒ 00:20:52.859 Awaish Kumar: So I understand that. But I’m I’m

147 00:20:52.960 ⇒ 00:21:01.670 Awaish Kumar: like, you have some connection IDs, and usually you have some keys stored against them. But in Airflow, when we are building a DAG,

148 00:21:01.930 ⇒ 00:21:05.889 Awaish Kumar: a DAG is a combination of different tasks, right?

149 00:21:06.560 ⇒ 00:21:07.010 Fernando Rodrigues Nepomuceno: See.

150 00:21:07.040 ⇒ 00:21:10.800 Awaish Kumar: There are some operators, some sensors.

151 00:21:11.440 ⇒ 00:21:20.090 Awaish Kumar: You might have utilized those to build a complex DAG or workflow. So

152 00:21:20.462 ⇒ 00:21:25.909 Awaish Kumar: what exactly have you used? Like, for example, I can give you an example: there’s one called PythonOperator.

153 00:21:26.130 ⇒ 00:21:29.580 Awaish Kumar: So that is used to run Python functions.

154 00:21:29.980 ⇒ 00:21:36.820 Awaish Kumar: So similarly, MySQL: if you have used MySQL operators, MySQL provides some

155 00:21:37.230 ⇒ 00:21:43.830 Awaish Kumar: operators like there are names of different operators, so have you used them.

156 00:21:45.140 ⇒ 00:21:50.690 Fernando Rodrigues Nepomuceno: Yeah, I’ve used them, but to be sincere, I don’t remember the

157 00:21:50.810 ⇒ 00:22:03.550 Fernando Rodrigues Nepomuceno: exact naming of these. I can tell you overall how I’ve used them, but the exact naming I don’t remember. Sorry.

158 00:22:03.550 ⇒ 00:22:08.649 Awaish Kumar: Okay, so is it a recent experience? Or have you just used it in the past?

159 00:22:09.650 ⇒ 00:22:19.709 Fernando Rodrigues Nepomuceno: No, no, not in the past, but the point here is that I don’t remember the exact naming of the operators.

160 00:22:19.890 ⇒ 00:22:25.329 Awaish Kumar: No, I’m not being specific to one operator. My question is general, like,

161 00:22:25.470 ⇒ 00:22:30.839 Awaish Kumar: in Airflow, what kind of different operators have you used? You can name any.

162 00:22:30.840 ⇒ 00:22:49.020 Fernando Rodrigues Nepomuceno: Yeah, I could pass parameters from one task to the following one, and use Jinja templates to

163 00:22:49.150 ⇒ 00:23:02.909 Fernando Rodrigues Nepomuceno: make some filters dynamically, recovering information from the systems. I could use all of that.

164 00:23:03.480 ⇒ 00:23:07.930 Awaish Kumar: How many workflows have you built in Airflow?

165 00:23:09.720 ⇒ 00:23:14.200 Fernando Rodrigues Nepomuceno: Oh, I’d say more than twenty.

166 00:23:14.420 ⇒ 00:23:16.309 Fernando Rodrigues Nepomuceno: Actually, I could work.

167 00:23:16.310 ⇒ 00:23:16.740 Awaish Kumar: Working.

168 00:23:17.409 ⇒ 00:23:29.459 Fernando Rodrigues Nepomuceno: with the AWS environment. And now I’m working more on orchestrating processes involving Snowflake, making this integration.

169 00:23:29.700 ⇒ 00:23:31.640 Awaish Kumar: So why have you?

170 00:23:32.260 ⇒ 00:23:35.760 Awaish Kumar: Why have you used, like, when you have Snowflake,

171 00:23:36.050 ⇒ 00:23:47.519 Awaish Kumar: why did you choose Airflow for real-time streaming instead of directly using Snowflake’s features?

172 00:23:48.270 ⇒ 00:23:55.450 Fernando Rodrigues Nepomuceno: Yeah, actually, this is one thing that was already determined by the Apple team.

173 00:23:55.890 ⇒ 00:24:01.109 Fernando Rodrigues Nepomuceno: We didn’t have too much

174 00:24:01.550 ⇒ 00:24:30.309 Fernando Rodrigues Nepomuceno: space to decide this. It was implemented by the customer, and it was determined that, okay, you need to work in this architecture. But I had another project where I could use Snowpipe to connect to S3 buckets without using Airflow as the orchestrator.

175 00:24:30.400 ⇒ 00:24:47.818 Fernando Rodrigues Nepomuceno: The point is that most of the data pipelines that we have at the customers today use this predefined architecture, using Airflow as the orchestrator, triggering

176 00:24:48.650 ⇒ 00:24:55.020 Fernando Rodrigues Nepomuceno: tasks in Snowflake in order to do data processing, or something like that.

177 00:24:56.400 ⇒ 00:25:01.979 Awaish Kumar: Okay. And can you describe the architecture of Airflow itself?

178 00:25:04.190 ⇒ 00:25:15.500 Fernando Rodrigues Nepomuceno: So the architecture of Airflow itself is composed of a data source where

179 00:25:15.670 ⇒ 00:25:21.700 Fernando Rodrigues Nepomuceno: you have the information about the created DAGs,

180 00:25:22.050 ⇒ 00:25:36.639 Fernando Rodrigues Nepomuceno: credentials, and other things. You have some options to host this

181 00:25:36.800 ⇒ 00:25:40.659 Fernando Rodrigues Nepomuceno: and use Kubernetes to accelerate

182 00:25:42.028 ⇒ 00:25:53.731 Fernando Rodrigues Nepomuceno: the processing as well. What more can I say is that it’s organized

183 00:25:55.534 ⇒ 00:26:01.620 Fernando Rodrigues Nepomuceno: into DAGs, which are acyclic

184 00:26:01.820 ⇒ 00:26:12.892 Fernando Rodrigues Nepomuceno: graphs, actually, where we have the data lineage, the task lineage of what is happening, and the way

185 00:26:15.040 ⇒ 00:26:16.220 Fernando Rodrigues Nepomuceno: came to me.

186 00:26:17.540 ⇒ 00:26:24.050 Awaish Kumar: So, for example, if we build a data pipeline in Airflow,

187 00:26:24.480 ⇒ 00:26:27.740 Awaish Kumar: and we have a task basically.

188 00:26:27.870 ⇒ 00:26:29.350 Awaish Kumar: And

189 00:26:30.170 ⇒ 00:26:43.770 Awaish Kumar: if that runs, sometimes it runs fine, and sometimes it just gets stuck, and it stalls our pipeline

190 00:26:44.000 ⇒ 00:26:46.449 Awaish Kumar: and it’s stuck forever, like,

191 00:26:47.078 ⇒ 00:26:50.499 Awaish Kumar: so how can we resolve that situation?

192 00:26:53.320 ⇒ 00:27:02.260 Fernando Rodrigues Nepomuceno: So just to recap this situation: we have a DAG that sometimes is. How about you know.

193 00:27:02.260 ⇒ 00:27:09.320 Awaish Kumar: The workflow. In the DAG, for example, I have 5 different tasks.

194 00:27:10.230 ⇒ 00:27:11.750 Awaish Kumar: No 1st one.

195 00:27:11.980 ⇒ 00:27:18.830 Awaish Kumar: Sometimes it just executes successfully and full pipeline just executes successfully without any

196 00:27:19.010 ⇒ 00:27:24.170 Awaish Kumar: problems, but sometimes it gets stuck, for example, on some

197 00:27:25.149 ⇒ 00:27:32.820 Awaish Kumar: memory-heavy calculation or something; it just gets stuck, and it’s very slow. And now

198 00:27:33.527 ⇒ 00:27:37.170 Awaish Kumar: it is just sitting there,

199 00:27:37.480 ⇒ 00:27:42.619 Awaish Kumar: and our pipeline is waiting on that task.

200 00:27:42.810 ⇒ 00:27:50.469 Awaish Kumar: And we don’t know, because it has not failed yet. So we didn’t get the notification

201 00:27:51.316 ⇒ 00:27:58.940 Awaish Kumar: of failure either. So how can we solve this issue? So we have much more

202 00:27:59.621 ⇒ 00:28:02.990 Awaish Kumar: observability, and we can, like,

203 00:28:03.130 ⇒ 00:28:07.659 Awaish Kumar: take action on that. So what can we do, basically, in this situation?

204 00:28:09.100 ⇒ 00:28:18.759 Fernando Rodrigues Nepomuceno: So first, it would be simply sending a notification to

205 00:28:19.260 ⇒ 00:28:31.420 Fernando Rodrigues Nepomuceno: the DAG owner by putting his email in the defaults.

206 00:28:32.180 ⇒ 00:28:48.160 Fernando Rodrigues Nepomuceno: It could solve this part of the problem. Or even, if we use Slack, we can also send some notification to a Slack channel.

207 00:28:48.290 ⇒ 00:29:02.670 Fernando Rodrigues Nepomuceno: After that, trying to increase the retries of this task would be a good approach, if it’s not being considered, putting

208 00:29:03.070 ⇒ 00:29:13.950 Fernando Rodrigues Nepomuceno: 2 or 3. And yeah, in a more advanced approach, I’d say that another thing could.

209 00:29:14.570 ⇒ 00:29:24.485 Fernando Rodrigues Nepomuceno: Yeah, depending on whether it can be skipped or not, maybe we could

210 00:29:26.550 ⇒ 00:29:38.450 Fernando Rodrigues Nepomuceno: include a configuration to bypass this task. But I guess that if this task

211 00:29:38.450 ⇒ 00:29:46.099 Fernando Rodrigues Nepomuceno: is, let’s say, depended on by the other ones, I’d say that it wouldn’t be possible to apply.

212 00:29:46.100 ⇒ 00:29:59.749 Awaish Kumar: So let’s say we can skip this, basically. So this is one task which is the start of the pipeline, but it is optional. We want to get some data from some API endpoint, but we already have some data,

213 00:29:59.900 ⇒ 00:30:15.190 Awaish Kumar: some snapshot already. So we just want to see if we can refresh: if we get new data, that’s really nice. If we don’t, we just want to move forward with the existing snapshot. So we really don’t want to get stuck.

214 00:30:15.430 ⇒ 00:30:19.139 Awaish Kumar: We just want to see: if it succeeds, done. If it is

215 00:30:19.694 ⇒ 00:30:31.420 Awaish Kumar: stuck, or if it fails, just skip it and move on to run the other tasks. But the problem is, it’s not succeeding and it’s not failing.

216 00:30:31.610 ⇒ 00:30:38.160 Awaish Kumar: It just gets stuck in, like, a running state. So how do we handle that?

217 00:30:40.440 ⇒ 00:30:51.940 Fernando Rodrigues Nepomuceno: Yeah, maybe use Celery. But to be sincere, I’ve never implemented this kind of thing, actually.

218 00:30:52.420 ⇒ 00:30:57.130 Fernando Rodrigues Nepomuceno: But maybe implementing Celery to

219 00:30:57.370 ⇒ 00:31:01.679 Fernando Rodrigues Nepomuceno: deal with that kind of problem.

220 00:31:02.840 ⇒ 00:31:03.180 Awaish Kumar: Okay.

221 00:31:03.180 ⇒ 00:31:05.820 Fernando Rodrigues Nepomuceno: But this is just an assumption.
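
The scenario in this exchange (an optional first step that neither succeeds nor fails, with an existing snapshot to fall back on) comes down to putting a time limit on the step and letting the pipeline continue when the limit is hit. The sketch below shows that idea with the Python standard library only; it is not an Airflow DAG, and `run_optional_step`, `refresh_or_fallback`, and the commands are hypothetical. In Airflow itself, a per-task `execution_timeout` and a downstream `trigger_rule` are the usual knobs for this, though neither is named in the conversation.

```python
import subprocess
import sys


def run_optional_step(cmd, timeout_s):
    """Run a step with a hard time limit and report how it ended."""
    try:
        # subprocess.run kills the child process when the timeout expires,
        # so a hung step cannot block the pipeline indefinitely.
        subprocess.run(cmd, timeout=timeout_s, check=True)
        return "success"
    except subprocess.TimeoutExpired:
        return "timed_out"
    except subprocess.CalledProcessError:
        return "failed"


def refresh_or_fallback(cmd, timeout_s, snapshot):
    """Use fresh data if the refresh finished in time, else the snapshot."""
    status = run_optional_step(cmd, timeout_s)
    data = "fresh data" if status == "success" else snapshot
    return status, data


# Simulated API refresh that hangs: it is cut off after one second and
# the pipeline continues with the snapshot it already has.
hung = [sys.executable, "-c", "import time; time.sleep(30)"]
status_hung, data_hung = refresh_or_fallback(hung, 1.0, "existing snapshot")

# Simulated refresh that finishes promptly.
quick = [sys.executable, "-c", "pass"]
status_quick, data_quick = refresh_or_fallback(quick, 30.0, "existing snapshot")
```

The point of the design is that a timeout converts "stuck in a running state" into an explicit outcome, which can then trigger both an alert and the skip-and-continue path.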

222 00:31:06.780 ⇒ 00:31:13.959 Awaish Kumar: Yeah, we can move ahead. So in terms of SQL, like,

223 00:31:15.410 ⇒ 00:31:18.970 Awaish Kumar: what are you really familiar with,

224 00:31:19.600 ⇒ 00:31:27.669 Awaish Kumar: like, for example, BigQuery, as BigQuery is standard SQL, or PostgreSQL, or

225 00:31:28.110 ⇒ 00:31:32.390 Awaish Kumar: like, which variant are you comfortable with?

226 00:31:33.990 ⇒ 00:31:42.089 Fernando Rodrigues Nepomuceno: So I’ve already worked with AWS Athena and BigQuery as well, a lot,

227 00:31:42.998 ⇒ 00:31:48.749 Fernando Rodrigues Nepomuceno: and with Redshift. I could work in another tool.

228 00:31:49.210 ⇒ 00:31:53.690 Awaish Kumar: What are, like, what are CTEs?

229 00:31:54.920 ⇒ 00:32:10.550 Fernando Rodrigues Nepomuceno: Yeah, I’ve already used them in order to modularize complex queries, when there is a lot of business logic behind them, until we have

230 00:32:10.690 ⇒ 00:32:13.449 Fernando Rodrigues Nepomuceno: the final dataset.

231 00:32:13.450 ⇒ 00:32:15.650 Awaish Kumar: Yeah. But what is a CTE?

232 00:32:16.850 ⇒ 00:32:17.410 Fernando Rodrigues Nepomuceno: It’s

233 00:32:17.410 ⇒ 00:32:32.819 Fernando Rodrigues Nepomuceno: Yeah. A CTE is a feature used to put the preliminary results into memory in a way that we can use. The

234 00:32:32.950 ⇒ 00:32:36.130 Fernando Rodrigues Nepomuceno: 1st is typing into the the code.
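
For reference, a CTE (common table expression) is a named intermediate result defined with `WITH` and usable within a single statement; whether the engine actually holds it in memory is engine-specific. A small sketch on in-memory SQLite follows, where the `orders` table and the threshold of 100 are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer_id INTEGER, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [(1, 50.0), (1, 70.0), (2, 20.0), (3, 200.0)],
)

# The CTE `customer_totals` names the per-customer aggregation so the
# outer query can filter on it without repeating the subquery.
big_spenders = conn.execute(
    """
    WITH customer_totals AS (
        SELECT customer_id, SUM(amount) AS total
        FROM orders
        GROUP BY customer_id
    )
    SELECT customer_id, total
    FROM customer_totals
    WHERE total >= 100
    ORDER BY customer_id
    """
).fetchall()
```

This is the modularizing use mentioned in the answer: each business-logic step gets a name, and later steps read from it as if it were a table.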

235 00:32:40.160 ⇒ 00:32:51.680 Awaish Kumar: Okay? And so and SQL, like, what

236 00:32:52.050 ⇒ 00:32:56.660 Awaish Kumar: in general? Yes, not not specific to any kind of database.

237 00:32:57.020 ⇒ 00:33:03.440 Awaish Kumar: What are some strategies you can use to optimize your

238 00:33:05.630 ⇒ 00:33:09.619 Awaish Kumar: queries in terms of both speed and memory?

239 00:33:11.780 ⇒ 00:33:19.640 Fernando Rodrigues Nepomuceno: So 1st of all, I need you to check the the query plan

240 00:33:20.521 ⇒ 00:33:42.670 Fernando Rodrigues Nepomuceno: most of the sequel engines that I use. You can. Cover that by the the use. For example, in in Snowflake you can see where is happening the bottleneck about the the query and then identify what would be the parts that we need to

241 00:33:42.810 ⇒ 00:33:48.929 Fernando Rodrigues Nepomuceno: tackle in order to optimize it. It would be the the the 1st thing

242 00:33:49.842 ⇒ 00:33:58.830 Fernando Rodrigues Nepomuceno: after that. Try to understand you how to partitioning

243 00:33:59.833 ⇒ 00:34:14.230 Fernando Rodrigues Nepomuceno: the this in in certain way, could be a good thing, recognizing what would be the nature key that we can use to partitioning in some case.

244 00:34:14.350 ⇒ 00:34:24.298 Fernando Rodrigues Nepomuceno: Another good strategy is to use only the fields that we really need to report, and

245 00:34:25.250 ⇒ 00:34:37.449 Fernando Rodrigues Nepomuceno: try to avoid using the asterisk, because it will provoke a full scan of the table when we try to

246 00:34:38.090 ⇒ 00:34:43.799 Fernando Rodrigues Nepomuceno: cover all the fields in a select query.

247 00:34:48.590 ⇒ 00:35:01.319 Fernando Rodrigues Nepomuceno: Yeah. When we are using, for example, a left join, try to understand the natural direction

248 00:35:01.530 ⇒ 00:35:05.469 Fernando Rodrigues Nepomuceno: from the tables and

249 00:35:06.492 ⇒ 00:35:12.999 Fernando Rodrigues Nepomuceno: how it can match in the right table correctly.

250 00:35:13.110 ⇒ 00:35:17.375 Fernando Rodrigues Nepomuceno: I think this is a good strategy, and

251 00:35:18.940 ⇒ 00:35:26.421 Fernando Rodrigues Nepomuceno: I think applying CTEs is a good approach instead of using

252 00:35:27.670 ⇒ 00:35:41.760 Fernando Rodrigues Nepomuceno: energy, not energize, but self joins or other kinds of joins. I guess we can leverage this feature very well today.

253 00:35:42.090 ⇒ 00:35:49.609 Fernando Rodrigues Nepomuceno: And yeah, I think those are my main strategies, actually.

254 00:35:51.966 ⇒ 00:35:54.400 Awaish Kumar: So, like, what is indexing?

255 00:35:56.620 ⇒ 00:36:11.179 Fernando Rodrigues Nepomuceno: Yeah, indexing is a strategy that you can use when we identify that a certain field is being used very heavily in

256 00:36:11.610 ⇒ 00:36:19.149 Fernando Rodrigues Nepomuceno: a SQL query. Let’s say that we have identified that

257 00:36:20.079 ⇒ 00:36:28.020 Fernando Rodrigues Nepomuceno: a column like ID is being used a lot for group-by purposes.

258 00:36:28.280 ⇒ 00:36:40.240 Fernando Rodrigues Nepomuceno: In order to improve performance, we can create an index using this ID column

259 00:36:40.470 ⇒ 00:36:44.899 Fernando Rodrigues Nepomuceno: to have some improvement in performance.
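
The idea described above can be sketched as follows, assuming Python's built-in `sqlite3` module. The `events` table, the `idx_events_id` index name, and the `plan` helper are hypothetical names chosen for illustration. SQLite's `EXPLAIN QUERY PLAN` stands in here for the query plan view of whichever engine is actually in use.

```python
import sqlite3

# Hypothetical table with a frequently grouped `id` column.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER, payload TEXT)")
conn.executemany(
    "INSERT INTO events VALUES (?, ?)",
    [(i % 100, "x") for i in range(1000)],
)

def plan(sql):
    """Return SQLite's query plan for `sql` as one string (detail is column 3)."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[3] for row in rows)

query = "SELECT id, COUNT(*) FROM events GROUP BY id"

plan_before = plan(query)  # no index exists yet
conn.execute("CREATE INDEX idx_events_id ON events (id)")
plan_after = plan(query)   # the planner can now use the index

print("before:", plan_before)
print("after: ", plan_after)
```

Comparing the two printed plans shows whether the planner picked up the new index for the `GROUP BY`.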

260 00:36:45.980 ⇒ 00:36:50.689 Awaish Kumar: So what is the difference between a clustered and a non-clustered index?

261 00:36:54.333 ⇒ 00:36:56.400 Fernando Rodrigues Nepomuceno: I’m not sure about that.

262 00:36:56.520 ⇒ 00:37:11.459 Fernando Rodrigues Nepomuceno: This is a specific question about clustering. I have this concept, but I’m not sure how it directly impacts the index.

263 00:37:12.410 ⇒ 00:37:12.960 Awaish Kumar: Okay.

264 00:37:13.070 ⇒ 00:37:22.209 Awaish Kumar: And so, like, have you worked with SCDs? Like, what are the different types of SCD?

265 00:37:25.304 ⇒ 00:37:27.970 Fernando Rodrigues Nepomuceno: The different types of, sorry?

266 00:37:28.670 ⇒ 00:37:32.740 Awaish Kumar: SCD. Like, slowly changing dimensions.

267 00:37:32.740 ⇒ 00:37:39.250 Fernando Rodrigues Nepomuceno: Oh, slowly changing dimensions. Yeah, I’ve already worked with type 1 and 2,

268 00:37:39.720 ⇒ 00:37:49.800 Fernando Rodrigues Nepomuceno: actually. In type 1 you basically make an append

269 00:37:49.950 ⇒ 00:38:02.639 Fernando Rodrigues Nepomuceno: in an auxiliary table that holds all the changes that we have for a specific

270 00:38:03.020 ⇒ 00:38:14.950 Fernando Rodrigues Nepomuceno: item in the table. Let’s say that we have, for example, a customer table where it’s changing

271 00:38:15.610 ⇒ 00:38:22.743 Fernando Rodrigues Nepomuceno: their address. For example, every time that we have a new address, it will be updated with

272 00:38:24.610 ⇒ 00:38:42.770 Fernando Rodrigues Nepomuceno: a new row with the newest information about the customer. SCD type 2 is where we have a different structure, where we are having different columns

273 00:38:43.150 ⇒ 00:38:48.890 Fernando Rodrigues Nepomuceno: being, but every time that we have any change in the main table.
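
For reference, here is a minimal sketch of the two SCD types mentioned above, under the commonly used Kimball definitions: Type 1 overwrites the old value in place, while Type 2 closes the current row and appends a new versioned row. It assumes Python's built-in `sqlite3` module. The table names, the column names (`valid_from`, `valid_to`, `is_current`), and the addresses are hypothetical, chosen only for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer_scd1 (customer_id INTEGER PRIMARY KEY, address TEXT);
    CREATE TABLE customer_scd2 (
        customer_id INTEGER, address TEXT,
        valid_from TEXT, valid_to TEXT, is_current INTEGER
    );
    INSERT INTO customer_scd1 VALUES (1, 'Old Street 1');
    INSERT INTO customer_scd2 VALUES (1, 'Old Street 1', '2024-01-01', NULL, 1);
""")

def apply_scd1(customer_id, new_address):
    # Type 1: overwrite in place, so no history is kept.
    conn.execute(
        "UPDATE customer_scd1 SET address = ? WHERE customer_id = ?",
        (new_address, customer_id),
    )

def apply_scd2(customer_id, new_address, change_date):
    # Type 2: close the current row, then append a new current row.
    conn.execute(
        "UPDATE customer_scd2 SET valid_to = ?, is_current = 0 "
        "WHERE customer_id = ? AND is_current = 1",
        (change_date, customer_id),
    )
    conn.execute(
        "INSERT INTO customer_scd2 VALUES (?, ?, ?, NULL, 1)",
        (customer_id, new_address, change_date),
    )

apply_scd1(1, "New Avenue 2")
apply_scd2(1, "New Avenue 2", "2025-07-15")

scd1_rows = conn.execute("SELECT * FROM customer_scd1").fetchall()
scd2_rows = conn.execute(
    "SELECT address, valid_to, is_current FROM customer_scd2 ORDER BY valid_from"
).fetchall()
print(scd1_rows)
print(scd2_rows)
```

After the change, the Type 1 table holds only the new address, while the Type 2 table holds both the closed historical row and the new current row.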

274 00:38:50.310 ⇒ 00:38:56.070 Awaish Kumar: And what are the different types of isolation?

275 00:38:59.850 ⇒ 00:39:15.620 Fernando Rodrigues Nepomuceno: Different types of isolation. So I just know one, which I got from the concept of ACID. And this is what I’m more familiar with,

276 00:39:15.840 ⇒ 00:39:22.259 Fernando Rodrigues Nepomuceno: where we need to guarantee that every transaction has its own

277 00:39:22.570 ⇒ 00:39:26.860 Fernando Rodrigues Nepomuceno: final without impacting the next one.

278 00:39:28.060 ⇒ 00:39:37.560 Fernando Rodrigues Nepomuceno: That would be the concept of isolation. I’m not familiar with the other types.

279 00:39:38.700 ⇒ 00:39:41.709 Awaish Kumar: Okay, okay, I don’t have any

280 00:39:42.090 ⇒ 00:39:45.599 Awaish Kumar: more, like, technical questions. So I have just a few

281 00:39:46.280 ⇒ 00:39:51.370 Awaish Kumar: generic questions now. Like, what are your career goals?

282 00:39:53.340 ⇒ 00:39:56.050 Fernando Rodrigues Nepomuceno: What are my key goals in?

283 00:39:57.720 ⇒ 00:40:07.509 Fernando Rodrigues Nepomuceno: Okay. Yeah. So, I mean, I’m trying to advance in things

284 00:40:07.750 ⇒ 00:40:16.680 Fernando Rodrigues Nepomuceno: mainly more related to real-time processing. That’s one thing that I’d like to

285 00:40:17.470 ⇒ 00:40:27.925 Fernando Rodrigues Nepomuceno: go deep inside in this topic. dbt, of course. I’m getting familiar with that, but I know that this is

286 00:40:30.008 ⇒ 00:40:40.369 Fernando Rodrigues Nepomuceno: a great requirement that is being demanded in the data engineering field, and so I’d like to advance more.

287 00:40:40.890 ⇒ 00:40:50.340 Fernando Rodrigues Nepomuceno: And in infrastructure topics as well. I think these are

288 00:40:51.910 ⇒ 00:40:59.939 Fernando Rodrigues Nepomuceno: three of my goals that I have for the maximums.

289 00:41:02.000 ⇒ 00:41:02.780 Awaish Kumar: Okay

290 00:41:03.030 ⇒ 00:41:12.339 Awaish Kumar: And, like, what are the things which you are not good at and are not interested in doing

291 00:41:12.770 ⇒ 00:41:13.860 Awaish Kumar: professionally.

292 00:41:16.050 ⇒ 00:41:24.570 Fernando Rodrigues Nepomuceno: So I’m more interested in turning myself into a specialist, actually,

293 00:41:25.542 ⇒ 00:41:34.089 Fernando Rodrigues Nepomuceno: being able to abstract the business needs more and put them into solutions. I think this is

294 00:41:34.250 ⇒ 00:41:41.170 Fernando Rodrigues Nepomuceno: what I’m trying to do

295 00:41:41.410 ⇒ 00:41:44.729 Fernando Rodrigues Nepomuceno: in the last months, and

296 00:41:45.010 ⇒ 00:41:53.850 Fernando Rodrigues Nepomuceno: I don’t see myself too much in a manager role, you know, because I really love

297 00:41:54.260 ⇒ 00:42:00.710 Fernando Rodrigues Nepomuceno: to do technical things, to get my hands dirty, and

298 00:42:01.150 ⇒ 00:42:05.869 Fernando Rodrigues Nepomuceno: build solutions based on the knowledge I have.

299 00:42:06.150 ⇒ 00:42:12.109 Fernando Rodrigues Nepomuceno: This is why I don’t see myself too much in a management role.

300 00:42:14.470 ⇒ 00:42:18.220 Awaish Kumar: Okay and cool.

301 00:42:19.160 ⇒ 00:42:26.160 Awaish Kumar: So, like, out of the last 5 or 3 bosses you have worked with,

302 00:42:26.290 ⇒ 00:42:35.220 Awaish Kumar: if I go and ask them about your performance, and also about you yourself as a

303 00:42:35.410 ⇒ 00:42:36.530 Awaish Kumar: colleague,

304 00:42:36.710 ⇒ 00:42:38.320 Awaish Kumar: how would they rate you?

305 00:42:39.900 ⇒ 00:42:54.039 Fernando Rodrigues Nepomuceno: So I’d say that they could mention that I’m really a problem solver

306 00:42:54.430 ⇒ 00:43:02.100 Fernando Rodrigues Nepomuceno: and really focused on the business, and always open-minded to

307 00:43:02.250 ⇒ 00:43:15.679 Fernando Rodrigues Nepomuceno: hear, listen to other ideas, and put my ideas together with them in order to have the best final solution.

308 00:43:16.416 ⇒ 00:43:30.080 Fernando Rodrigues Nepomuceno: I think they could say that I’m a really, really good team worker, being able to help and support

309 00:43:30.250 ⇒ 00:43:43.939 Fernando Rodrigues Nepomuceno: everyone who needs my support in a timely way, and having good communication and good comprehension of the business and its needs.

310 00:43:45.490 ⇒ 00:43:46.090 Awaish Kumar: Right?

311 00:43:47.210 ⇒ 00:43:50.419 Awaish Kumar: Okay, yeah, that’s it from my side.

312 00:43:50.630 ⇒ 00:43:54.609 Awaish Kumar: Now you can ask if you have any questions for us.

313 00:43:55.800 ⇒ 00:44:12.239 Fernando Rodrigues Nepomuceno: Oh, so the first one would be: what is your roadmap for the medium and long term in terms of implementations, the challenges that you are having in your team, and

314 00:44:12.390 ⇒ 00:44:18.989 Fernando Rodrigues Nepomuceno: what you are expecting for the next steps

315 00:44:19.100 ⇒ 00:44:27.199 Fernando Rodrigues Nepomuceno: to get the necessary maturity in terms of data quality, data processes, and other stuff?

316 00:44:28.950 ⇒ 00:44:42.300 Awaish Kumar: Yeah, so, like, we are looking for a person like a data engineer slash analytics engineer, who can work on the data engineering side, plus also on the

317 00:44:42.500 ⇒ 00:44:48.027 Awaish Kumar: analytics side, like working with Snowflake activity, data transformations. And

318 00:44:49.100 ⇒ 00:45:02.008 Awaish Kumar: mostly, like, we have some internal platform where we basically work to improve how we handle data quality. Like, for example, we use tools like Metaplane, and we have Dagster.

319 00:45:03.010 ⇒ 00:45:17.900 Awaish Kumar: We utilize Dagster instead of Airflow for our workflow orchestration on the data engineering side. But most of the tech stack is based on clients. So this is for our internal work,

320 00:45:17.930 ⇒ 00:45:37.769 Awaish Kumar: and for our clients, like, it really depends. Obviously, we have a big say in what tools and technologies they should use. But obviously, for the big enterprises it’s already defined, or, like, somebody else

321 00:45:38.758 ⇒ 00:45:42.249 Awaish Kumar: or they already have a data infrastructure.

322 00:45:42.590 ⇒ 00:45:48.430 Awaish Kumar: So we have to work within that, right? But otherwise,

323 00:45:48.809 ⇒ 00:46:18.459 Awaish Kumar: mostly, for most of the clients, we figure out what kind of tools and technologies are going to be good for our client and satisfy their needs, and we use those. So what we are looking for is someone who is a fast learner. If we have to use some new tools, new technologies, someone who can learn fast and adapt to the changing requirements. So, in a startup

324 00:46:19.309 ⇒ 00:46:26.140 Awaish Kumar: environment, you know, we talk to clients and things can change. And also

325 00:46:26.639 ⇒ 00:46:30.229 Awaish Kumar: someone who can do some context switching, because

326 00:46:31.106 ⇒ 00:46:42.439 Awaish Kumar: if we hire you as a full-time person, and one of our clients only requires 20 hours of your time, then obviously you have to work for some other client for your other 20 hours.

327 00:46:42.460 ⇒ 00:47:06.399 Awaish Kumar: And so it might require some context switching, like on the same day you’re working for someone for 2 hours, and then you’re working for someone else for another 2 hours. Things like that. And maybe you can get something urgent. Like, you know, as a data engineer, something fails, a data pipeline fails, something is not refreshed. You get pinged a lot for urgent requests.

328 00:47:06.895 ⇒ 00:47:32.704 Awaish Kumar: So you’re working on something, and we get some urgent request for another client which utilizes a completely different tech stack. So it’s a big switch; you have to switch your mind completely. That’s the context switching part. So someone who is mentally strong, who can do context switching, who is a fast learner, who can adopt different tools and technologies. And

329 00:47:33.730 ⇒ 00:48:00.657 Awaish Kumar: yeah, that’s it. And, like, as I mentioned, we are doing a lot of work with AI. We have a lot of flexibility; you can explore whatever way you want to. You want to be a tech engineer, data analyst, data analytics engineer, you want to be an AI engineer; there’s a lot of AI work there. So the people who come to Brainforge basically can experiment and work along different

330 00:48:02.780 ⇒ 00:48:14.560 Awaish Kumar: like, what to say, different careers, and then they can settle into whatever they like most. So yeah, in terms of that, there’s, like,

331 00:48:15.210 ⇒ 00:48:21.370 Awaish Kumar: we are utilizing new tools and technologies. So yeah... I think I can’t hear you.

332 00:49:02.980 ⇒ 00:49:05.495 Awaish Kumar: I can’t hear you.

333 00:49:11.140 ⇒ 00:49:12.209 Fernando Rodrigues Nepomuceno: Can you hear me?

334 00:49:12.540 ⇒ 00:49:13.859 Awaish Kumar: No yes.

335 00:49:14.630 ⇒ 00:49:41.060 Fernando Rodrigues Nepomuceno: Oh, sorry. I don’t know, it’s happening every time, but it’s okay. Maybe it’s an incompatibility between my headphones and Zoom. But yeah, just one more question. I could understand the next steps. Are your data products mostly used internally, or by the customers?

336 00:49:43.080 ⇒ 00:49:44.920 Awaish Kumar: Data... what do you?

337 00:49:45.150 ⇒ 00:49:47.480 Awaish Kumar: What do you mean by data project?

338 00:49:47.650 ⇒ 00:49:56.609 Fernando Rodrigues Nepomuceno: No, no, not the data project. But I understood that you have different data products.

339 00:49:56.610 ⇒ 00:49:59.830 Awaish Kumar: So we have clients. So it’s a consultancy firm.

340 00:49:59.950 ⇒ 00:50:01.099 Awaish Kumar: We have clients.

341 00:50:01.100 ⇒ 00:50:02.550 Fernando Rodrigues Nepomuceno: Which for which we work?

342 00:50:02.550 ⇒ 00:50:02.939 Fernando Rodrigues Nepomuceno: Oh, okay.

343 00:50:03.230 ⇒ 00:50:10.700 Awaish Kumar: Right. So it’s a consultancy firm. So basically, if we don’t have a client, we won’t be here, so

344 00:50:11.100 ⇒ 00:50:18.040 Awaish Kumar: we work for the clients. Our main goal is to work for the client and satisfy them.

345 00:50:18.210 ⇒ 00:50:23.399 Awaish Kumar: But then, like, we do have some internal work, which basically

346 00:50:23.840 ⇒ 00:50:27.499 Awaish Kumar: we are doing to optimize the work which we do for clients.

347 00:50:27.630 ⇒ 00:50:44.680 Awaish Kumar: So, like, building some platforms or tools, like automating our project management, handling data quality. So these are things which we are working on internally. But that’s to, like, complement the client work. So

348 00:50:44.920 ⇒ 00:50:46.020 Awaish Kumar: it’s to support.

349 00:50:46.020 ⇒ 00:50:46.400 Fernando Rodrigues Nepomuceno: All right.

350 00:50:46.690 ⇒ 00:50:48.430 Awaish Kumar: Yeah, so.

351 00:50:48.430 ⇒ 00:50:53.620 Fernando Rodrigues Nepomuceno: Totally understood. Now, thanks. Thanks for clarifying this.

352 00:50:53.990 ⇒ 00:51:04.199 Fernando Rodrigues Nepomuceno: So I’m okay with my questions. Just another one: what would be the next steps in the process?

353 00:51:05.750 ⇒ 00:51:11.299 Awaish Kumar: Yeah, next steps would be, like, I think someone from operations is going to

354 00:51:11.540 ⇒ 00:51:16.832 Awaish Kumar: get back to you this week with more feedback

355 00:51:17.850 ⇒ 00:51:20.590 Awaish Kumar: on what would be the next step.

356 00:51:21.100 ⇒ 00:51:22.100 Awaish Kumar: I’m going

357 00:51:23.133 ⇒ 00:51:38.879 Awaish Kumar: to send my feedback today. And then it’s up to operations to, like, decide on what it is going to be, like, if you can meet... I don’t know if you have met anyone before me.

358 00:51:40.140 ⇒ 00:51:43.140 Fernando Rodrigues Nepomuceno: No, no, this is my first interview.

359 00:51:43.140 ⇒ 00:51:49.200 Awaish Kumar: This is the first one. So after that you might meet one other guy, maybe

360 00:51:49.680 ⇒ 00:51:55.933 Awaish Kumar: like, more like a kind of culture interview, or something like that, and

361 00:51:56.805 ⇒ 00:52:02.289 Awaish Kumar: operations will let you know and then schedule that with you.

362 00:52:03.350 ⇒ 00:52:05.790 Fernando Rodrigues Nepomuceno: Okay, Kumar. Sounds good.

363 00:52:06.380 ⇒ 00:52:07.649 Awaish Kumar: Okay. Thank you.

364 00:52:08.630 ⇒ 00:52:09.630 Fernando Rodrigues Nepomuceno: Thank you.

365 00:52:09.790 ⇒ 00:52:11.500 Fernando Rodrigues Nepomuceno: I appreciate that. Take care.

366 00:52:11.850 ⇒ 00:52:12.380 Awaish Kumar: Right, good.
