Meeting Title: Brainforge Team Member Introduction
Date: 2026-03-13
Participants: Brylle Girang, Awaish Kumar


Brylle Girang: Hello, Wish.
Awaish Kumar: Hey.
Brylle Girang: Thank you for taking the time.
Awaish Kumar: No worries.
Brylle Girang: Okay, I just wanted to meet with you officially and get to know each other. I know we've worked together for almost four weeks now, and we haven't had the chance to get to know each other personally. You good with that?
Awaish Kumar: Yeah, yeah, sure.
Brylle Girang: Okay, I'm going to start with: how did you end up at Brainforge?
Awaish Kumar: Okay, so I was active in the dbt community Slack channel.
Brylle Girang: Let me guess, Kukota messaged you.
Awaish Kumar: Yes. I was messaging in that channel, and then he DM'd me saying we could connect, and that's basically how we got connected.
Brylle Girang: Okay, okay. That's not new. I think Otam has messaged almost everyone on the team right now, and that's how they ended up at Brainforge. How did you start in the data field?
Awaish Kumar: Sorry, I just went to drink some water.
Brylle Girang: No worries.
Awaish Kumar: So, when I was graduating, data science was kind of a hype. I graduated in 2017, so data science was hot at that time, data science and ML and all of these things. I wanted to become a data scientist, to work in the data field, but in Pakistan we didn't have many companies actually doing data science. It was very new, especially in Pakistan; it was just getting started. So the companies in Europe and the US that were working on data science projects would find the labor to collect and prepare their data in Pakistan.
Awaish Kumar: So they found me. I started working at a consultancy that was basically providing data for a company in the UK that was doing the actual data science. I thought, that's the closest I can be to the data science world. So I ended up there, doing web scraping and all the backend work, the whole web scraping setup, things like that. That's how I started, and I continued there; from a web scraping engineer, I became a data engineer. After some time I joined another company, a product company building their own product, a data science startup in Denmark. So I started working for them. One task was still to do the web scraping and collect data, but I grew there to become a data engineer, so cleanup, data pipelines, aggregations, Airflow, all these tools and the cloud side started to end up on my plate, and I continued to learn and develop those pipelines. Then I moved to Denmark and worked there for three years. I worked in the vacation rental industry, and then at a shipping and logistics company. For both of them I basically set up the data foundations. They were pretty new to the data world, and although they had some data scientists or data analysts doing their work, there was no data engineering infra; it was all legacy tools, SQL Server, queries, FTP servers. So I went there to build and modernize that infrastructure, move directly to the cloud, and move data off the FTPs using tools like Airflow. Then I started working for a Canadian startup, because I didn't like the enterprise environment so much. I went back to being in a startup.
Awaish Kumar: That was also a data science startup, you could say, in the sense that the whole purpose of the startup was to build dashboards using AI, so that you don't have to ask an analyst; it was an AI analyst, for example.
Brylle Girang: Yeah.
Awaish Kumar: We were building an AI analyst which, given the data, would give you the answers. If revenue is down 5%, why is that? What are the reasons, and what actions can you take? I was a senior data engineer there, and I trained one other person. Then they moved me to their sister company, Meno Games, a gaming startup. I moved there to build their infra, because it was all broken. After that, I joined Brainforge.
Brylle Girang: Did a lot of stuff change in your journey within the data engineering field? Are lots of things changing?
Awaish Kumar: A lot of things, right? Initially it was writing Python scripts to get the web scraping done. You're writing Python scripts to scrape websites, the website has CAPTCHAs and all of that, and you're using your mind to bypass those. Now there are a lot of tools on the market: do web scraping with AI, connect the website, get the data, and basically that's it. It has become very easy. Back then you had to do all the engineering yourself. I did reverse engineering to figure out how they secured a site, and I had to find those paths and mimic those steps in code to reach the data. It was tough back then. Similarly, the tools evolved. We were using Airflow at version 1.10, a really legacy version.
Awaish Kumar: Now it has become really advanced; there's a managed cloud version of Airflow, from Astronomer, which you can just log into, start, and use. Before, you were using the open source version and managing all the infra yourself. Then Fivetran and Airbyte and ingestion tools like that entered the market, and that made things even simpler for ingestion. Before, I had to write a script to move data from Postgres to somewhere else, from one source to another; you had to write all the code to hit those endpoints. Now those tools take care of edge cases and things like that. It's really simple: you just configure the source, hit enter, and that's all.
Brylle Girang: Okay, okay. So, Awaish, I personally want to understand more about data engineering, because I feel I'll be able to help you guys out better if I at least know the things you're doing, right? Where do you think I should start?
Awaish Kumar: Do you want to learn what we are doing, or how it is being done?
Brylle Girang: I guess both. I want to learn what you're doing so I can understand the goals, and at the same time I want to learn how you're doing it so we can focus on improving the workflows, etc. I know it might take me, like, 10 years, but I want to start somewhere.
Awaish Kumar: It depends on where you want to go. If you're thinking about just being better at project management because you know everything around data, that's one thing. If you really want to become a data engineer and start working as one, that's another story. But obviously you can start reading about data engineering; I forgot the name, but there was a nice book on that.
Awaish Kumar: That book covers all the basic stuff data engineers do, and the best practices. I can find it for you. The second thing is you can get an intro on YouTube to what exactly we are doing. For our purposes, there are a lot of tools involved. For example, in the data engineering world, one section is ingestion, and there are a lot of ways to do ingestion. The number one tool we use in the company is Polytomic. We're just using a third-party tool: we ask them to build a connector for us, and once it's there, I just go in, set up a connection, and start the sync. It's just two or three clicks.
Brylle Girang: Polytomic is similar to, like, Fivetran, is that right?
Awaish Kumar: Yes. So if you want to learn that, it's nothing, right? You just have to log into Polytomic, select the connector, add some keys which you get from your client, and hit the run button. Anybody can do that; there's no rocket science in it. The only rocket science is figuring out what to ingest. Polytomic will ask, for example: for Amazon, there are a lot of API endpoints, let us know exactly what you need. Then you have to go to Amazon's API documentation, figure out which endpoints you actually need data from, and tell Polytomic, "I need data from these endpoints," and they will build the solution for you. That's one thing. The second thing on the data ingestion side is that, for EDAM for example, we had so many small connectors that we didn't ask Polytomic to build. Instead, I wrote my own scripts in Python, and those are orchestrated using Dagster.
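To make that concrete, here is a rough sketch of what one of those small hand-written ingestion scripts might look like. The paging logic and the stubbed data source below are hypothetical illustrations, not Brainforge code; a real script would hit an HTTP endpoint and handle auth, retries, and rate limits, which are exactly the edge cases managed ingestion tools now absorb.

```python
from typing import Callable

def ingest(fetch_page: Callable[[int], list], max_pages: int = 100) -> list:
    """Pull pages from a source until an empty page signals the end."""
    rows = []
    for page in range(max_pages):
        batch = fetch_page(page)
        if not batch:  # empty page means the source is exhausted
            break
        rows.extend(batch)
    return rows

# Stub standing in for a paginated API: two pages of data, then empty.
pages = [[{"id": 1}, {"id": 2}], [{"id": 3}], []]
orders = ingest(lambda p: pages[p])
print(len(orders))  # -> 3
```

In a real pipeline a function like this would be wrapped in an orchestrator task so failures and retries are visible in one place.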
Awaish Kumar: So we use a tool called Dagster, an orchestration tool similar to Airflow, where I have my own scripts running for ingestion, and there are also some scripts for the AI team. It's an orchestration tool where the compute is on the cloud, so Brainforge pays for the compute, whatever it is, at the end of the month. It's cloud-based, serverless you could say; we don't deploy any servers. Dagster isn't deployed on any servers of ours; we just use the cloud-hosted version, your scripts run on their cloud compute, and we get the results. So, data engineering is basically learning the full end-to-end pipeline. What I've been talking about is just the ingestion part, or in ETL terms the extract part, where we get the data from the sources. And that is really just the atomic unit of what I'm describing: ETL is part of data engineering, not all of it, and ingestion is just one part of ETL. Even within ingestion, it's not all just using a tool or an API call. There are a lot of different things you might have to do. You might have to write SQL if you have to get data from a legacy database, something like an AS/400 server. There's no connector for that in any of the tools, and there's no API for it. You just have to connect, write queries, run those queries, maybe inside a Python script, and move the data around. So you can basically start by learning the basics of data engineering, the terminology. What is ETL? What is extract, transform, load?
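The legacy-database case above can be sketched as follows. An in-memory SQLite database stands in for the legacy server here (table and values are made up for illustration); the point is the same: no connector, no API, just a connection and plain SQL run from inside a Python script.

```python
import sqlite3

# In-memory SQLite standing in for a legacy database (e.g. an AS/400
# or an old SQL Server) that no managed ingestion tool can reach.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, total REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [(1, 9.50), (2, 20.00), (3, 42.75)],
)

# The "extract" step: a hand-written SQL query run from the script.
rows = conn.execute(
    "SELECT id, total FROM orders WHERE total > 10 ORDER BY id"
).fetchall()
print(rows)  # -> [(2, 20.0), (3, 42.75)]
```

From here the script would write the rows out to the warehouse or to files, which is the "move it around" part of the job.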
Awaish Kumar: What is infrastructure? What is CI/CD? All those keywords around what we are doing: what is modeling, what is dbt, what is Snowflake. Learn about all these tools and what they do, and all the keywords, like what we mean when we talk about a model, a dbt model. What is dbt and how does it work? Most of our work in the company right now is dbt modeling, so you can learn more about that: how we have different sources, like Shopify and Amazon, how we do modeling for each of those, what standard models we create for each of these sources, and what is similar between multiple clients. I have already talked about this. Shopify, for example.
Brylle Girang: Yeah.
Awaish Kumar: We have done Shopify modeling for 10 clients, so why do I have to do it again and again? Right now maybe we keep doing it because we're using AI and it's really easy that way: I grab the descriptions from the Shopify API and ask it to create the models for me, and then I just give feedback on the business domain knowledge of that particular client. Apart from that, it's all standard, what's coming from a standard API. So can it be reusable? Yes. Given the context for the client, the metric definitions, and what columns are coming from Shopify for that particular client, it can be automated, at least with Cursor. There are so many things that can be standardized and automated via Cursor. But I don't really have the bandwidth to go after those.
Brylle Girang: Okay, that's where I'm most interested.
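The reuse idea Awaish describes can be sketched as a template: given the column list a client's Shopify source exposes, generate the boilerplate of a dbt-style staging model, leaving only the client-specific pieces for human review. The function, source names, and model shape below are hypothetical illustrations, not the team's actual models.

```python
def staging_model(source: str, table: str, columns: list) -> str:
    """Render a minimal dbt-style staging model for one source table.

    Client-specific work (renames, metric definitions, null columns)
    would be layered on top of this generated boilerplate.
    """
    select_list = ",\n    ".join(columns)
    return (
        f"select\n    {select_list}\n"
        f"from {{{{ source('{source}', 'orders' if table == 'orders' else table) }}}}"
        .replace("'orders' if table == 'orders' else table", repr(table))
    )

sql = staging_model("shopify", "orders", ["id", "created_at", "total_price"])
print(sql)
```

A real generator would read the column list from the warehouse's information schema rather than take it as an argument.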
Brylle Girang: I think the best thing I can do to help you out is try to build these automations, but I know that for me to build them, I should have a good understanding of what you're doing. Do you store those ideas in one place, or how do you collect them? Like, "hey, this could be automated, I wish this could be automated," etc.?
Awaish Kumar: We have discussed this a lot of times with Utam, right? I've shared that I can build this automation. We could also create, say, a Python package, a library you install, which in the backend maybe uses AI or whatever. Maybe it's a CLI, and I can say, "okay, add Shopify models for me, for this client." It should know where the context is, right? Now that we have standardized our platform, the Brainforge platform, it knows where the clients and the context live. So it should know everything, and it can create the models for me. You can take one of the Shopify modeling projects as an example: given these raw tables, how we end up with these models. Right now, maybe Urban… I don't know. I'm doing it for Element. But we previously also had clients where we did similar work, like Javi Coffee, or Urban Stamps, maybe. We can also find that code in our…
Brylle Girang: Oh, okay.
Awaish Kumar: …Supabase or somewhere. I also used to work on those things back then, so we moved all of that code. The client is no longer with us and we don't have access to the GitHub repo, but we have the version of the repo we worked on in our Supabase somewhere.
Awaish Kumar: And maybe elsewhere, too. So we can grab all that code, and it can build us the models. Obviously there will be a few changes, because each client sets up their Shopify environment differently. Based on that, the data that comes through the API might differ: maybe one field has data for one client and is null for another, something like that. But those are nuances you have to gather manually, after you have the data. We can't generalize the client-specific configuration.
Brylle Girang: Yeah. Okay, okay, gotcha. That's really helpful. Thank you so much for your time, Awaish. I appreciate you walking me through that. If you have an idea of what to automate, or you're finding something really time-consuming that could easily be automated and you don't have the bandwidth for it, please let me know.
Awaish Kumar: I have lots of those. One thing I'm struggling with right now is telling Cursor which connection to use. There are two parts. One: whenever I build a model, I want to run it with dbt, and after it creates the model, I want to run some SQL queries on top of it to validate it, or compare how it looks in QA versus how it looks in prod. To do that, Cursor sometimes forgets the context, and then I have to point it to scripts written in my environment: "use this to connect to Snowflake." So there's a connection configured for Cursor, with the username and everything else it should use.
Awaish Kumar: But sometimes when I ask it to query Snowflake, even though it has used that same connection earlier in my chat, it goes back to "for user A, I now have to find the connection," and then it tries to use my connection name. It can't use my name as a login, because for a person there's multi-factor authentication; it can't authenticate with Snowflake that way. I wanted to use a service account. I have to tell it to use my exact connection name, or there's a script it can use to connect, and then it works fine. So maybe we can have some kind of guideline for that, like, for each client…
Brylle Girang: Yeah, yeah.
Awaish Kumar: …it should know what connection to use. The connections should also be standardized. Connection names can be standardized, so everybody knows, "okay, for this client, this connection." Then we can just say, "for this client, execute this query on Snowflake," and it should know what connection name to use and execute it.
Brylle Girang: Yeah, yeah, I think that's entirely possible using Cursor rules. Is it okay if I ask how it looks on your end? What do you tell Cursor, etc.? Maybe you can screen-record that if you have the time. Would that be fine?
Awaish Kumar: Yup, I can do that.
Brylle Girang: Okay, perfect. I just want to see how it looks, so that the workflow I build is exactly how it should be. I think that's totally possible: we can build a rule to map specific clients to specific scripts, and that should lessen the time the agent spends searching for things. Okay.
Awaish Kumar: Okay, yeah, thank you. I just have to finish up on the CTA. Thank you.
Brylle Girang: Of course.
Awaish Kumar: Thank you. Thank you. Bye.
Brylle Girang: Bye-bye.
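The connection-name standardization discussed above could start as a single client-to-connection registry that both a Cursor rule and any helper script consult. The client names and connection names below are made up for illustration; the real registry would use the team's actual service accounts.

```python
# Hypothetical client -> Snowflake connection registry. With one
# service account per client (no MFA prompt), an agent can resolve
# the right connection deterministically instead of guessing.
CONNECTIONS = {
    "element": "SNOWFLAKE_SVC_ELEMENT",
    "javi_coffee": "SNOWFLAKE_SVC_JAVI_COFFEE",
}

def connection_for(client: str) -> str:
    """Return the standardized connection name for a client."""
    key = client.strip().lower().replace(" ", "_")
    if key not in CONNECTIONS:
        raise KeyError(f"no standardized connection registered for {client!r}")
    return CONNECTIONS[key]

print(connection_for("Element"))  # -> SNOWFLAKE_SVC_ELEMENT
```

A Cursor rule could then say "always call `connection_for(<client>)` before querying Snowflake," which removes the ambiguity the agent currently trips over.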