Meeting Title: Brainforge Interview w/ Pranav Date: 2026-03-24 Meeting participants: Dan Hartley, Pranav Narahari
WEBVTT
1 00:01:00.990 ⇒ 00:01:02.120 Dan Hartley: Pranav?
2 00:01:03.260 ⇒ 00:01:04.750 Pranav Narahari: Hey Dan, how’s it going?
3 00:01:05.110 ⇒ 00:01:07.210 Dan Hartley: Doing great! What about you?
4 00:01:07.590 ⇒ 00:01:09.500 Pranav Narahari: That’s awesome, I’m doing good as well.
5 00:01:09.700 ⇒ 00:01:10.090 Dan Hartley: Hmm.
6 00:01:10.090 ⇒ 00:01:16.939 Pranav Narahari: Sorry I’m just a little bit late, just coming from back-to-back meetings, just give me, like, 2 seconds.
7 00:01:17.190 ⇒ 00:01:18.469 Dan Hartley: No worries at all.
8 00:01:18.680 ⇒ 00:01:19.940 Pranav Narahari: I appreciate that.
9 00:01:58.930 ⇒ 00:02:02.800 Pranav Narahari: Okay… Cool.
10 00:02:03.010 ⇒ 00:02:03.670 Dan Hartley: Cool.
11 00:02:03.720 ⇒ 00:02:04.580 Pranav Narahari: Yeah.
12 00:02:05.180 ⇒ 00:02:07.440 Pranav Narahari: I think I’m all set. How’s your day been?
13 00:02:08.220 ⇒ 00:02:14.220 Dan Hartley: It’s been great! I’m like, yeah, it’d be nice. Got to just, like,
14 00:02:14.970 ⇒ 00:02:16.779 Dan Hartley: Buzz, just like, pretty much.
15 00:02:16.930 ⇒ 00:02:22.739 Dan Hartley: Unlike every… unlike every other Tuesday, the work was just, like, normal today, so yeah.
16 00:02:23.130 ⇒ 00:02:24.039 Pranav Narahari: Okay, cool.
17 00:02:24.290 ⇒ 00:02:24.840 Dan Hartley: Yeah.
18 00:02:24.840 ⇒ 00:02:25.250 Pranav Narahari: That’s dope.
19 00:02:25.250 ⇒ 00:02:25.780 Dan Hartley: bye.
20 00:02:26.250 ⇒ 00:02:30.499 Pranav Narahari: What does your current day-to-day look like? Are you a student? Are you working?
21 00:02:30.810 ⇒ 00:02:45.580 Dan Hartley: No, so, I am… I’ve been a machine learning engineer for the past 6 years. I am currently working as a team lead, and… yeah, at HatchBricks, so my current day-to-day looks like, just like whenever I hop in, just, like, conduct some stand-ups.
22 00:02:45.580 ⇒ 00:02:57.590 Dan Hartley: go by with approving some PRs and check them out, and then just, like, go by the tickets, assign them to just, like, some other devs, check out if there’s, like, any current problems, and then just, like, hands-on code. So, yes.
23 00:02:57.730 ⇒ 00:02:58.570 Dan Hartley: That’s.
24 00:02:58.570 ⇒ 00:03:01.479 Pranav Narahari: Awesome. That’s awesome.
25 00:03:01.950 ⇒ 00:03:18.990 Pranav Narahari: how I kind of like to structure interviews is, yeah, just kind of, like, a quick, like, introduction, and then I like to deep dive into a specific project, an AI-related project, and then ask you, like, some specific questions, and then we can, like, run through, like, maybe some specific examples of
26 00:03:19.010 ⇒ 00:03:35.110 Pranav Narahari: problems or just scenarios that happen here at Brainforge, and then we can have some time at the end to, like, go through questions. I know we’re starting a little bit late here, and I have a hard stop in 27 minutes, so I’m happy to, like, answer any additional questions, like, over email, though, just to kind of make sure you have.
27 00:03:35.110 ⇒ 00:03:35.750 Dan Hartley: Over email, yeah.
28 00:03:36.290 ⇒ 00:03:37.020 Pranav Narahari: Perfect.
29 00:03:37.020 ⇒ 00:03:38.079 Dan Hartley: Sorry, that one.
30 00:03:38.430 ⇒ 00:03:39.610 Pranav Narahari: Perfect, perfect.
31 00:03:39.750 ⇒ 00:03:58.199 Pranav Narahari: But yeah, to start off with, if, you can tell me a little bit about, in, like, 60 to 90 seconds, so we can kind of, like, dive into it further into the interview, about, like, some AI project, maybe even a RAG system that you built, and then just tell me a little bit
32 00:03:58.300 ⇒ 00:04:02.669 Pranav Narahari: Just at a product level, not technical implementation yet, what’d that look like?
33 00:04:03.470 ⇒ 00:04:07.629 Dan Hartley: Alright, so, you want it to be specific to RAGs?
34 00:04:08.250 ⇒ 00:04:09.380 Pranav Narahari: That’d be great.
35 00:04:10.020 ⇒ 00:04:18.570 Dan Hartley: Alright, so, I mean, like, the most recent one that I could recall, was, I believe…
36 00:04:18.570 ⇒ 00:04:35.599 Dan Hartley: It’s been… I think it’s been quite a while since I’ve built a RAG, but, yeah, I’m like, let’s just consider a Q&A for, what I would say, answering some HR-related policies that they had to build. So, I’ve been majorly working with clients, so with, with their,
37 00:04:35.600 ⇒ 00:05:00.580 Dan Hartley: like, with HatchBricks, they have, like, a huge client base, they used to hop in with some products or some ideas that they want to build off, and then, yeah, so it was basically, I built countless RAGs, considering them in the medical niche or something like that. Let’s consider TAM. So TAM was an AI chatbot that was essentially a RAG. So what happened, the problem statement was that the client, they wanted to build a mobile application on a chatbot that would give, provide nutritional advice to a
38 00:05:00.580 ⇒ 00:05:01.639 Dan Hartley: A lot of users.
39 00:05:01.640 ⇒ 00:05:20.810 Dan Hartley: And it would be really, what I would say, customized based on whatever the AMDR ranges are there, because the AMDR ranges, it’s just, like, really, really specific. So, that was the initial structure and the initial problem statement. So, what I did was create an entire,
40 00:05:20.940 ⇒ 00:05:35.180 Dan Hartley: like, AI bot, that would just, like… I crafted, like, a huge, huge prompt, I know. The context window was just, like, really small, so I decided to, like, divide it into multiple prompts, into multiple system prompts, so that in order for it to just, like,
41 00:05:35.220 ⇒ 00:05:48.830 Dan Hartley: grab it all, and then for each and every response that it had to give, I created an informational RAG: for each and every query that it used to just, like, go by, it used to just, like, have all those,
42 00:05:48.840 ⇒ 00:06:01.199 Dan Hartley: stored there, so that each and every response that the model is generating is based on how a nutritionist, or a professional nutritionist in that space, with that information, is going to respond.
43 00:06:01.200 ⇒ 00:06:01.550 Pranav Narahari: Oh, gee.
44 00:06:01.550 ⇒ 00:06:02.560 Dan Hartley: way.
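[A minimal sketch of the multi-prompt structure Dan describes, assuming a simple keyword router in place of whatever routing the real bot used; all prompt text, names, and the routing logic here are illustrative, not taken from the project.]

```python
# Illustrative only: the instruction set is split into several small system
# prompts, and each query is routed to one of them plus retrieved reference
# material, so no single call outgrows a small context window.

SYSTEM_PROMPTS = {
    "macros": "You are a nutritionist. Answer using the user's AMDR ranges.",
    "allergies": "You are a nutritionist. Suggest safe substitutions only.",
    "weight_loss": "You are a nutritionist. Give calorie-aware advice.",
}

def route(query: str) -> str:
    """Pick one small system prompt per query; keyword matching stands in
    for whatever classifier or router the real system used."""
    q = query.lower()
    if "allergic" in q or "intolerant" in q:
        return SYSTEM_PROMPTS["allergies"]
    if "weight" in q or "lose" in q:
        return SYSTEM_PROMPTS["weight_loss"]
    return SYSTEM_PROMPTS["macros"]

def build_messages(query: str, retrieved_docs: list[str]) -> list[dict]:
    """One compact call: routed system prompt + retrieved context + question."""
    context = "\n\n".join(retrieved_docs)
    return [
        {"role": "system", "content": route(query)},
        {"role": "user", "content": f"Reference material:\n{context}\n\nQuestion: {query}"},
    ]
```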
45 00:06:02.560 ⇒ 00:06:11.039 Pranav Narahari: So, you kind of, and I just want to maybe make sure I get this part clear, is that source of truth that’s giving that nutritional information.
46 00:06:11.450 ⇒ 00:06:20.269 Pranav Narahari: What is that, what is that document store? And where… how did you guys collect it? Or was that already set up with the client?
47 00:06:20.270 ⇒ 00:06:38.559 Dan Hartley: Yeah, yeah. So, the… how was it collected, and for source of truth, I kind of just, like, forgot the name of that website, but it was just, like, a huge database of foods and everything that used to contain that. There were 4 professional nutritionists that were onboarded for this specific one, and they generated
48 00:06:38.560 ⇒ 00:06:42.459 Dan Hartley: Just for, like, multiple, multiple scenarios, and then it was from there.
49 00:06:42.460 ⇒ 00:06:50.809 Dan Hartley: Because, obviously, there are, some… a lot of, like, you know, compliance and regulations that we need to just, like, make sure that we have.
50 00:06:51.550 ⇒ 00:06:54.070 Pranav Narahari: Gotcha, and you, and you were saying that…
51 00:06:54.210 ⇒ 00:06:58.679 Pranav Narahari: Who is, like, the target user, for this, for this application?
52 00:06:58.680 ⇒ 00:07:03.540 Dan Hartley: General… general users, day-to-day users, even me and you.
53 00:07:03.960 ⇒ 00:07:09.540 Pranav Narahari: Okay, so someone that’s just interested in learning about nutrition, so, like, what’s, like, an example… example query.
54 00:07:10.180 ⇒ 00:07:10.780 Pranav Narahari: Into the system.
55 00:07:11.590 ⇒ 00:07:29.559 Dan Hartley: hey, I want to lose weight, but I can’t just stop eating shawarmas, or I can’t just, like, stop eating some great Indian food. I really want to just, like, lose weight. How do I go by it? So, that is a… hey, I’m allergic to nuts, but I really love
56 00:07:29.580 ⇒ 00:07:48.170 Dan Hartley: having peanuts, is there any other alternative that I could just, like, use for it? Hey, let’s just say I love a burger, but I am allergic to cheese, but I love cheese, so is there any specific alternate, or any specific cheese, that I could just, like, use based on…
57 00:07:48.360 ⇒ 00:07:57.410 Dan Hartley: And then it would go by asking multiple questions based on their medical history. That was stored somewhere else, so that it could…
58 00:07:57.410 ⇒ 00:07:58.770 Pranav Narahari: the application…
59 00:07:59.310 ⇒ 00:08:00.100 Dan Hartley: Yeah.
60 00:08:00.530 ⇒ 00:08:08.350 Pranav Narahari: Gotcha. So the application was targeted to ask about medical history, or something, a lifestyle…
61 00:08:09.080 ⇒ 00:08:09.670 Dan Hartley: Pre-stored.
62 00:08:10.580 ⇒ 00:08:11.740 Pranav Narahari: Sorry, what was that?
63 00:08:12.290 ⇒ 00:08:13.150 Dan Hartley: Pre-stored.
64 00:08:14.050 ⇒ 00:08:15.480 Dan Hartley: At the time of sign-up.
65 00:08:16.390 ⇒ 00:08:18.900 Pranav Narahari: Oh, okay, so that, that was, already asked.
66 00:08:19.130 ⇒ 00:08:27.670 Pranav Narahari: Okay, so… okay, and then that information was saved as additional context for the application? Yep. Okay, I gotcha.
67 00:08:27.930 ⇒ 00:08:30.759 Pranav Narahari: Okay, so yeah, that’s…
68 00:08:31.100 ⇒ 00:08:43.340 Pranav Narahari: Now, I think what I’d like to… I know you were kind of in the process of describing a little bit of, like, the architecture. I think what would be great is if we could work through one of those examples, and you can tell me how we would go through the system.
69 00:08:43.340 ⇒ 00:08:53.740 Pranav Narahari: And I think that would get a good understanding of the architecture that way. So, let’s say with, that example of, hey, I love cheeseburgers, but I have…
70 00:08:53.740 ⇒ 00:09:09.119 Pranav Narahari: a dietary restriction against lactose. A user asks that, say you can make up some medical history, lifestyle history, that they already pre-added via some form. Now, walk me through what happens next on the back end.
71 00:09:10.060 ⇒ 00:09:14.439 Dan Hartley: So what happens next is that, what we have is
72 00:09:14.510 ⇒ 00:09:18.150 Dan Hartley: Let’s just say, I have an entire database.
73 00:09:18.210 ⇒ 00:09:28.800 Dan Hartley: Right? I have an entire database of what are some foods and what are some healthy alternatives to it, right? So based on those specific lifestyles and,
74 00:09:28.810 ⇒ 00:09:31.290 Dan Hartley: unlike those dietary restrictions.
75 00:09:31.290 ⇒ 00:09:53.119 Dan Hartley: Right? So what happens is that it goes under the lactose restrictions section, it checks out, it searches out, and this is just, like, an entire algorithm that’s working there on the back end. So what it does is it searches out… there was, like, an entire map of those other restrictions, for, like, for example, if you have some, like, lactose restrictions.
76 00:09:53.120 ⇒ 00:10:00.839 Dan Hartley: And let’s just say you say cheese. So there is an entire search that goes… happens on the backend, and then it searches and filters out all those
77 00:10:00.840 ⇒ 00:10:07.569 Dan Hartley: cheeses that a person can have if it’s lactose intolerant. Let’s just say a person cannot have any cheese at all.
78 00:10:07.570 ⇒ 00:10:31.670 Dan Hartley: Right? So, is there a way that he could use any specific alternate to it? So, it would come back, and just, like, with all this information, it would go to the vector store. There was just, like, an entire search over the entire vector database, it was all stored there: how, nutritionally, what’d you say, let’s just say for someone lactose intolerant… I can have cheese, but I’m just, like, lactose intolerant, so it would just, like, go there,
79 00:10:31.700 ⇒ 00:10:43.330 Dan Hartley: just, like, search it out, and then it would hop back on. So, with that information, it would say that, okay, if you cannot have cheese, let’s just go by it with this one. So.
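[A minimal sketch of the two-stage lookup Dan outlines, under assumed data: filter the food catalog by the user's restriction first, then rank the survivors against the query by embedding similarity. The vectors, tags, and the `safe_alternatives` helper are all hypothetical stand-ins.]

```python
import numpy as np

# Toy catalog: each food carries restriction tags and a (tiny) embedding.
FOODS = {
    "cheddar": {"tags": {"dairy"}, "vec": np.array([0.9, 0.1])},
    "vegan cashew cheese": {"tags": set(), "vec": np.array([0.8, 0.3])},
    "lactose-free mozzarella": {"tags": {"lactose_free_dairy"}, "vec": np.array([0.85, 0.2])},
}

RESTRICTION_BLOCKLIST = {"lactose": {"dairy"}}  # restriction -> forbidden tags

def safe_alternatives(query_vec: np.ndarray, restriction: str, k: int = 2) -> list[str]:
    """Drop foods whose tags violate the restriction, then return the k
    nearest remaining items by cosine similarity to the query embedding."""
    blocked = RESTRICTION_BLOCKLIST.get(restriction, set())
    candidates = {name: f for name, f in FOODS.items() if not (f["tags"] & blocked)}

    def cos(a: np.ndarray, b: np.ndarray) -> float:
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

    ranked = sorted(candidates, key=lambda n: cos(query_vec, candidates[n]["vec"]), reverse=True)
    return ranked[:k]

print(safe_alternatives(np.array([0.9, 0.15]), "lactose"))
# -> ['lactose-free mozzarella', 'vegan cashew cheese']  (cheddar filtered out)
```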
80 00:10:43.330 ⇒ 00:10:55.589 Pranav Narahari: That, like, searching algorithm, how much, how much of that did you work on? Is this something that you just… oh, you knew that it worked, it was just kind of, like, input-output? Or is this something that you did, like, you had active development on?
81 00:10:56.090 ⇒ 00:11:15.629 Dan Hartley: I was just, like, there throughout. I was involved end-to-end, so with this search algorithm, it was just, like, an entire hash map that used to be there, right? And then, with this information, it used to just, like, hop on, and then it used to be passed. Now, one thing that I forgot is whether it was MCP,
82 00:11:15.630 ⇒ 00:11:23.490 Dan Hartley: whether there was an MCP that used to just, like, fetch it out, or were we passing it on in real time? I think it was an MCP.
83 00:11:23.580 ⇒ 00:11:29.440 Dan Hartley: We connected it with an MCP, so that it would just, like, come back and just, like, search it out.
84 00:11:30.720 ⇒ 00:11:32.539 Pranav Narahari: Sorry, I think,
85 00:11:32.810 ⇒ 00:11:41.150 Pranav Narahari: when you said, MCP, where were you… where were you using MCP or something? For what was the context for using something like that?
86 00:11:41.740 ⇒ 00:11:45.009 Dan Hartley: So you’re asking what was the context for using in MCP inside of this?
87 00:11:45.210 ⇒ 00:11:51.030 Pranav Narahari: Yes, yeah, or were you saying you were using it for… For extracting what information?
88 00:11:51.410 ⇒ 00:11:57.369 Dan Hartley: So, for extracting… so, let’s just say… inside of the vector database.
89 00:11:57.370 ⇒ 00:12:19.180 Dan Hartley: So if you go by a normal query, there are millions and millions of food combinations, right? So essentially storing them all inside of the vector database, it’s not, what I would say… it was not ideal at that point, because the cost and the scalability that we used to have, it was just, like, huge, right? So what we did was, let’s just say we connected an external database.
90 00:12:19.200 ⇒ 00:12:28.829 Dan Hartley: Right? We connected the database with the model, with the MCP. So that, what it does is, whenever it’s just, like, querying it out, it just, like, goes by, it just, like, comes out.
91 00:12:29.350 ⇒ 00:12:33.980 Pranav Narahari: Gotcha. So, like, an MCP or something was able to generate a query to then…
92 00:12:34.200 ⇒ 00:12:38.030 Pranav Narahari: pull the data from the database. Okay, cool.
93 00:12:38.030 ⇒ 00:12:50.450 Dan Hartley: Parallel to searching the vector database, if it used to find it. So inside of the vector database, there were different food combinations and different stuff that was stored there, and those were just, like, the most recurrent ones, right? So we stored the most recurrent and the most, I mean, like,
94 00:12:50.740 ⇒ 00:12:53.149 Dan Hartley: what I would say,
95 00:12:54.330 ⇒ 00:13:06.809 Dan Hartley: most recurring and highest-priority queries inside of the vector database, and the rest that were just, like, generic ones, we used to store them inside of the external database, because it was huge.
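[A hedged sketch of the tiered storage he describes: a small store of recurrent queries is checked first, and misses fall through to the big external database via a tool call, which is the MCP connection he mentions. Both backends below are dict-backed stand-ins, and the promotion rule is an assumption.]

```python
class RecurrentQueryStore:
    """Stand-in for the small vector store that holds the hot, recurrent queries."""
    def __init__(self) -> None:
        self.docs: dict[str, str] = {}

    def lookup(self, query: str) -> str | None:
        return self.docs.get(query)

    def add(self, query: str, doc: str) -> None:
        self.docs[query] = doc


class TieredRetriever:
    def __init__(self, hot_store: RecurrentQueryStore, external_db: dict,
                 promote_after: int = 3) -> None:
        self.hot_store = hot_store
        self.external_db = external_db      # huge source of truth, reached via the tool call
        self.promote_after = promote_after
        self.hit_counts: dict[str, int] = {}

    def retrieve(self, query: str) -> str:
        doc = self.hot_store.lookup(query)  # cheap path: recurrent queries
        if doc is None:
            doc = self.external_db[query]   # fallback: external DB via the MCP-style tool
            self.hit_counts[query] = self.hit_counts.get(query, 0) + 1
            if self.hit_counts[query] >= self.promote_after:
                self.hot_store.add(query, doc)  # promote once it recurs enough
        return doc
```

The design choice this illustrates is the cost trade-off Dan raises: only high-traffic queries earn a place in the expensive store, while the long tail stays in the cheap, huge database.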
96 00:13:07.200 ⇒ 00:13:09.849 Pranav Narahari: Gotcha. So, yeah, it’s sounding like it’s…
97 00:13:10.130 ⇒ 00:13:19.700 Pranav Narahari: Potentially caching, but also just whichever embeddings were retrieved the most, you were tracking that.
98 00:13:20.080 ⇒ 00:13:20.710 Dan Hartley: Yes.
99 00:13:20.710 ⇒ 00:13:30.620 Pranav Narahari: Which, which makes sense. Okay. How were you able to evaluate the accuracy and the output of this, application? What systems did you put into place?
100 00:13:31.150 ⇒ 00:13:35.949 Dan Hartley: That’s a great question. That’s a great question. So what we kind of did was,
101 00:13:36.110 ⇒ 00:13:41.399 Dan Hartley: There are three… no. So, there, there’s… there were just, like, two layers to it.
102 00:13:42.520 ⇒ 00:13:57.670 Dan Hartley: There were, like, two layers to it. The very first layer was: I used to check it against a preset, how correct, how accurate the information was, but essentially that was not something that we could do, that I could do. So what I would do is, I would just, like, search it out for just, like, specific answers.
103 00:13:57.670 ⇒ 00:14:21.290 Dan Hartley: And I had already created a list. I used to just, like, map them and compare with fuzzy string matching. With fuzzy string matching, I used to just, like, compare how accurate or how near these responses are based on the retrieved context. And obviously, how was it evaluated: all of these answers went to those nutritionists, because everything needed to be run by them, because they were the, what I would say,
104 00:14:24.470 ⇒ 00:14:31.200 Dan Hartley: evaluators, and the potential factors, for this specific domain. So, they used to just, like, go over all those responses and just, like, check them out.
105 00:14:31.200 ⇒ 00:14:54.419 Dan Hartley: So, while the entire… they used to check it for constraint adherence, like, did it avoid lactose, or did it not? It was majorly rule-based. And the second option was one that I suggested, but they were just, like… they said that it’s kind of risky, and what I said was, let’s just use an AI agent as a judge.
106 00:14:54.430 ⇒ 00:14:56.169 Dan Hartley: I could just, like,
107 00:14:56.280 ⇒ 00:15:21.260 Dan Hartley: craft prompts, or just, like, fine-tune a specific large language model, and just, like, ask it to judge and just, like, rate them for helpfulness, correctness, and personalization, and to check if it’s just, like, really near to the ground truth. But they said that, no, it could be prone to errors, so we need to just, like, make sure that we are running through each and everything for them. So what I did was, I did create that, but with the entire options. So, for everything, we set a threshold: each and every
108 00:15:21.260 ⇒ 00:15:22.400 Dan Hartley: answer that was
109 00:15:22.400 ⇒ 00:15:38.620 Dan Hartley: nearly 75% accurate, according to the judge agent, used to pass on through, it used to just, like, get passed, and each and every answer with accuracy lower than that, those checkpoints and those answers were just, like, shortlisted for the humans in the loop. So they used to go back and, like, check it out.
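[A sketch of the gating he describes, with assumed pieces throughout: `difflib` standing in for a fuzzy string matching library, a stubbed judge score where the fine-tuned judge model would be called, and the 75% threshold below which answers are queued for the nutritionists' review.]

```python
from difflib import SequenceMatcher

THRESHOLD = 0.75  # answers scoring below this go to human review

def fuzzy_score(response: str, gold: str) -> float:
    """Cheap stand-in for fuzzy string matching against a gold answer
    (a real system might use a dedicated fuzzy-matching library)."""
    return SequenceMatcher(None, response.lower(), gold.lower()).ratio()

def judge_score(query: str, response: str) -> float:
    """Placeholder for the LLM-as-judge call that would rate helpfulness,
    correctness, and personalization on a 0-1 scale."""
    return 0.0  # pessimistic default: unscored answers route to humans

def triage(query: str, response: str, gold: str, review_queue: list) -> bool:
    """Auto-pass answers at or above the threshold; queue the rest for the
    human-in-the-loop review described above."""
    score = max(fuzzy_score(response, gold), judge_score(query, response))
    if score >= THRESHOLD:
        return True
    review_queue.append({"query": query, "response": response, "score": score})
    return False
```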
110 00:15:39.170 ⇒ 00:15:55.809 Pranav Narahari: Gotcha. So, one other question that I have for you is, for this project, or maybe some other AI, RAG system that you built, what is something that, after you deployed it, maybe you mentioned how, like, working with specific clients,
111 00:15:55.840 ⇒ 00:16:03.150 Pranav Narahari: maybe you got feedback that something was broken in production, and then… what would be your next step after getting that feedback? How does your…
112 00:16:03.580 ⇒ 00:16:08.390 Pranav Narahari: How does your brain diagnose the issue and then eventually patch the issue?
113 00:16:09.190 ⇒ 00:16:29.450 Dan Hartley: That is a great question. So, what I kind of do is just, like… and I’ve once just, like, fallen into this problem as well. So, what I kind of do, and what I like to do, is I structure how I think about it. So, for example, what I try to do is just, like, the very first thing is just, like, I take a step back, I’m just, like, okay,
114 00:16:29.510 ⇒ 00:16:47.759 Dan Hartley: what happened? Is this reproducible? What I try to do is, I reproduce it. So I get the exact query, the context, I pull logs, right? So, what was the retrieved document? What was the final LLM prompt plus response? So basically, this first checks what the system saw.
115 00:16:47.760 ⇒ 00:17:10.889 Dan Hartley: And then I try to just, like, check out where the system broke. So I mentally walk through the pipeline, just, like, the input, the retrieval, the LLM, the post-processing, the MCP section. I mean, like, where was the exact issue that it faced, right? Was it wrong embeddings? Was it bad ranking? Was it something that happened due to…
116 00:17:10.890 ⇒ 00:17:24.810 Dan Hartley: hallucinations? Was it a post-processing issue where we were filtering out all of those retrievals? Yeah, and then comes prevention: just, like, adding on some guardrails, and that is how we just, like, go by.
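[One cheap way to support the "pull the logs, check what the system saw" step he walks through is to record every stage of the pipeline per request, so a bad answer can be replayed later. The field names below are assumptions, not any project's actual schema.]

```python
import json
import time
import uuid

def log_trace(query: str, retrieved_docs: list[str], final_prompt: str,
              response: str, path: str = "traces.jsonl") -> None:
    """Append one pipeline trace: input, retrieval, final LLM prompt, output."""
    trace = {
        "id": str(uuid.uuid4()),
        "ts": time.time(),
        "query": query,                    # what the user asked
        "retrieved_docs": retrieved_docs,  # what the retriever returned
        "final_prompt": final_prompt,      # exactly what the LLM saw
        "response": response,              # what it produced
    }
    with open(path, "a") as f:
        f.write(json.dumps(trace) + "\n")
```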
117 00:17:25.859 ⇒ 00:17:39.669 Pranav Narahari: Gotcha. Let’s maybe move on to that second part that I was talking about in the beginning, which is, I’ll bring up some scenarios that we’ve run into at Brainforge, small hiccups that we ran into, and we had to kind of put on the…
118 00:17:40.249 ⇒ 00:17:50.239 Pranav Narahari: the… the helmet of, like, thinking of, like, how you’re thinking right now, which is how to diagnose an issue. So, let me paint the picture for you briefly.
119 00:17:51.239 ⇒ 00:18:02.209 Pranav Narahari: we had this e-commerce company that had MCP connections in a… you can just think of it as a chatbot, just think of ChatGPT. And these MCPs were…
120 00:18:02.529 ⇒ 00:18:08.299 Pranav Narahari: There’s… there’s various ones, but let’s just focus on one of them, which is Shopify.
121 00:18:08.719 ⇒ 00:18:24.479 Pranav Narahari: So, the Shopify one, it was meant to pull the data from Shopify, and you’re able to ask questions about Shopify data, which is, let’s just say, revenue, just to keep it super simple. One issue that we saw…
122 00:18:25.030 ⇒ 00:18:30.379 Dan Hartley: Sorry, sorry, it used to pull questions from Shopify, and we were able to, like, ask it.
123 00:18:31.020 ⇒ 00:18:40.079 Pranav Narahari: Yeah, yeah, just think of, like, a chatbot that has integration with Shopify data, right? And the MCP is able to get live data. So…
124 00:18:40.850 ⇒ 00:18:47.519 Pranav Narahari: One issue that we were having is that when we would ask questions into this chatbot, we were getting data that
125 00:18:49.000 ⇒ 00:18:51.929 Pranav Narahari: That was, not real data.
126 00:18:52.310 ⇒ 00:19:05.170 Pranav Narahari: And so when we look at the thinking block, right, we’re able to see the thinking steps that the AI is making, we see that there’s a section of generating data for XYZ date.
127 00:19:05.570 ⇒ 00:19:09.010 Pranav Narahari: Now, that’s an issue, if…
128 00:19:09.950 ⇒ 00:19:18.680 Pranav Narahari: I guess, just given that information, what does that lead you to believe? What are the next steps that you would go in to look into to diagnose the issue? And then…
129 00:19:18.800 ⇒ 00:19:21.689 Pranav Narahari: Yeah, we can start there, and then maybe go into, like, how you’d fix it.
130 00:19:23.680 ⇒ 00:19:39.670 Dan Hartley: That’s a good… that’s a good point. So, when we say generating data, that is not something that should happen. The very first thing that comes into my mind is just, like: is this right? For existing data, it should be there, but it should not generate it. Is that correct, right?
131 00:19:40.120 ⇒ 00:19:40.880 Pranav Narahari: Yeah.
132 00:19:42.320 ⇒ 00:19:47.040 Dan Hartley: Alright, so this immediately, what leads me to just, like,
133 00:19:47.390 ⇒ 00:20:07.109 Dan Hartley: believe is that the model is hallucinating itself. So, instead of calling the Shopify MCP, right? This should never exist in a system where real data is available, right? So, the MCP should be the source of truth here. So, this is how… the very first hypotheses that I would just, like, put on is
134 00:20:07.320 ⇒ 00:20:24.720 Dan Hartley: that I would say is that this is majorly the large language model considering this as a reasoning task instead of just, like, a tool-usage task. So that is the very first thing that I would check. So, uh, do you want me to just, like, go by how I would debug this?
135 00:20:24.720 ⇒ 00:20:26.750 Pranav Narahari: Yeah, I think you’re, you’re…
136 00:20:27.030 ⇒ 00:20:33.230 Pranav Narahari: you’re totally on the right track there. That’s absolutely what it was. I think…
137 00:20:33.490 ⇒ 00:20:36.470 Pranav Narahari: You basically diagnose the issue.
138 00:20:36.660 ⇒ 00:20:49.850 Pranav Narahari: Now… and you’re totally right, so I think you really got the whole thing. What we realized here was that, yeah, we were using, Vercel’s AI SDK, we were using, I believe, like, Sonnet 4, and…
139 00:20:50.090 ⇒ 00:21:00.730 Pranav Narahari: one of the parameters that we had set was reasoning. So… Now, given that information, like…
140 00:21:01.370 ⇒ 00:21:11.050 Pranav Narahari: What would you change, and what is maybe another parameter that you would also modulate to reduce the amount of free thinking?
141 00:21:12.860 ⇒ 00:21:22.080 Dan Hartley: Alright, so… Alright, so for modulation and, listen, for…
142 00:21:22.870 ⇒ 00:21:30.929 Dan Hartley: modulation, right? So… alright, let’s just, uh, resume. The first thing is just, like, was it chain of thought?
143 00:21:31.230 ⇒ 00:21:43.540 Dan Hartley: If it was, I would just, like, disable that. The very first thing is: disable reasoning. The second parameter is temperature. That’s the big one here, because I would reduce it, because the more…
144 00:21:43.790 ⇒ 00:22:03.220 Dan Hartley: the more you take it towards 1, the more you increase the generative quality. So I would disable the reasoning, and decrease the temperature, because I need answers near to the ground truth. The generative element should be really less. That is how I would just, like, change it.
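[The Brainforge fix was made in Vercel's AI SDK in TypeScript; as a rough Python analogue using the Anthropic SDK, the two knobs under discussion look roughly like this, with extended thinking simply not enabled and temperature pulled to 0 so the model leans on tool results rather than free generation. The model id and tool schema below are illustrative, not the project's actual config.]

```python
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

response = client.messages.create(
    model="claude-sonnet-4-20250514",  # illustrative model id
    max_tokens=1024,
    temperature=0.0,                   # minimize free generation; stick to tool output
    # no `thinking` parameter passed -> extended reasoning stays disabled
    tools=[{
        "name": "shopify_revenue",     # hypothetical tool wrapping the Shopify MCP
        "description": "Fetch real revenue figures from Shopify for a date range.",
        "input_schema": {
            "type": "object",
            "properties": {
                "start": {"type": "string"},
                "end": {"type": "string"},
            },
            "required": ["start", "end"],
        },
    }],
    tool_choice={"type": "any"},       # force a tool call instead of a free answer
    messages=[{"role": "user", "content": "What was revenue last week?"}],
)
```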
145 00:22:03.560 ⇒ 00:22:10.230 Pranav Narahari: That’s great. And so, what would then be…
146 00:22:10.440 ⇒ 00:22:18.660 Pranav Narahari: your pro… and I think you kind of described it, but I’ve just kind of maybe full circle. Okay, you’ve figured out the issue, you know what the fix is,
147 00:22:19.890 ⇒ 00:22:25.150 Pranav Narahari: What would you then tell your engineers to do to then get this into production?
148 00:22:26.880 ⇒ 00:22:37.849 Dan Hartley: That’s a good one. The very first thing, I would just, like, ask them… I would just, like, create this entire document, right? I would document the issue clearly. I would also create an incident report.
149 00:22:38.050 ⇒ 00:23:02.939 Dan Hartley: And do I need to tell… I’ll divide it. So the very first thing is, I would just, like, initially hop onto a meeting, I hop onto a call. So I do this in two steps. The very first step is going to be a really quick fix, because it’s something in production, it needs to go live in a really short time, right? So, I would just, like, create a document and just, like, hop into a meeting. I would enter the parameter adjustments there, the prompt, the guardrail improvements. I would also introduce an optional safety check,
150 00:23:02.940 ⇒ 00:23:18.089 Dan Hartley: if it is applicable inside of the section. I would ask them to test it before deployment, and we would just, like, deploy instantly. The next thing I would do is just, like, create an entire incident report. I would add in what went wrong, what was the issue,
151 00:23:18.090 ⇒ 00:23:22.439 Dan Hartley: how did we debug it, steps to reproduce, each and everything. And I would document that
152 00:23:22.530 ⇒ 00:23:33.220 Dan Hartley: to our stores, so that it’s really easy to pick up from the very, very point where we just, like, went by. Because the key here is
153 00:23:33.400 ⇒ 00:23:57.539 Dan Hartley: not to just, like, fix the immediate bug, but to ensure that the pipeline is resilient, its guardrails are there, its parameters and the testing work together, and we test it out, so that we are working together to make sure that we are not having such issues inside of production any time in the near future.
154 00:23:58.250 ⇒ 00:24:00.659 Pranav Narahari: Yeah, that’s… that sounds great.
155 00:24:00.790 ⇒ 00:24:11.510 Pranav Narahari: My next question is, how do you prevent regression? This scenario, I think, is super straightforward. What you just outlined is, you know, there’s… I would…
156 00:24:11.590 ⇒ 00:24:17.829 Pranav Narahari: I would feel really confident about, okay, you don’t need to do a bunch of other tests, or do any other,
157 00:24:17.830 ⇒ 00:24:34.180 Pranav Narahari: you know, evaluation to prevent regression in this specific, scenario. I mean, we didn’t when we patched this issue, right? But I’m wondering, for situations where you think this is a fix, but you want to make sure that you are not regressing the system in terms of
158 00:24:34.180 ⇒ 00:24:38.949 Pranav Narahari: accuracy, sometimes it could be evaluation time,
159 00:24:39.140 ⇒ 00:24:43.440 Pranav Narahari: What would you put in place to, prevent that regression?
160 00:24:44.640 ⇒ 00:25:02.530 Dan Hartley: Hmm… that’s a good one. So, in order to just, like, prevent regressions inside of this… So, a regression basically happens when new changes are breaking something that previously worked, right? So, I would clearly build a regression test suite,
161 00:25:02.640 ⇒ 00:25:21.970 Dan Hartley: right? Include a test dataset with test cases. I would collect samples that were just, like, working before. I would also include tool-enforcement tests, so that for each and every step, we are always checking if the tool is called, if there are… I mean, like, if it’s all there. I would check the metrics,
162 00:25:21.970 ⇒ 00:25:30.269 Dan Hartley: because after deployment, for each and everything, I would check for, I mean, like, if you want continuous regression detection in production,
163 00:25:30.270 ⇒ 00:25:36.960 Dan Hartley: I would check the hallucination rate, I would check the tool usage, I would check even the user feedback, how all these metrics
164 00:25:36.960 ⇒ 00:25:51.040 Dan Hartley: serve as just, like, early warnings for regressions. I would do prompt versioning and parameter control, so that all of the prompt templates and all of the temperature settings, they are in version control,
165 00:25:51.040 ⇒ 00:26:02.619 Dan Hartley: so that we can roll back really quickly if any regression occurs. And I would make sure that we are logging all failures, all hallucinations,
166 00:26:02.620 ⇒ 00:26:14.929 Dan Hartley: and it needs to feed back into the regression test suite, whatever we created in the first one, for the prompt tuning, or model evaluation, so that this makes sure that the system learns from real production cases,
167 00:26:14.930 ⇒ 00:26:18.509 Dan Hartley: instead of manually just, like, checking each and everything.
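[A minimal pytest-style sketch of the suite he outlines, under assumed interfaces: a frozen set of previously working cases, a tool-enforcement check, and a content check. `run_pipeline` is a placeholder for the real chatbot entry point, and the case fields are invented for illustration.]

```python
import json

# Frozen cases that worked before the change; in practice these would be
# harvested from the production failure log Dan mentions.
GOLDEN_CASES = json.loads("""[
    {"query": "Revenue for 2026-03-01 to 2026-03-07?",
     "must_call_tool": "shopify_revenue",
     "expected_substring": "$"}
]""")

def run_pipeline(query: str) -> dict:
    """Placeholder: returns {'text': ..., 'tool_calls': [...]} from the chatbot."""
    raise NotImplementedError("wire this to the real pipeline under test")

def test_tool_enforcement():
    # Every golden case must hit its tool rather than generate data freely.
    for case in GOLDEN_CASES:
        result = run_pipeline(case["query"])
        assert case["must_call_tool"] in result["tool_calls"]

def test_no_regression_on_known_good_answers():
    # Answers that worked before the change must keep their key content.
    for case in GOLDEN_CASES:
        result = run_pipeline(case["query"])
        assert case["expected_substring"] in result["text"]
```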
168 00:26:19.190 ⇒ 00:26:36.850 Pranav Narahari: That sounds great. Those are most of the topics I usually like to talk about in interviews. One thing that I missed in the beginning was I just like to get to know, like, how you use AI in your daily workflow, specifically on some of this technical stuff, but then also in just other stuff as well. So let’s start off with just, like.
169 00:26:36.930 ⇒ 00:26:43.339 Pranav Narahari: as a… coming from an engineering background, like, what are some of the stuff that you use AI for?
170 00:26:43.930 ⇒ 00:26:59.229 Dan Hartley: That’s a great one. So, I am heavily directed towards, what I would say, AI usage, being an AI engineer. But I use it ethically, and I make sure that I’m using it in a way that it is basically…
171 00:26:59.230 ⇒ 00:27:09.269 Dan Hartley: Because I know, being an AI engineer, we know how prone it is to errors, what goes wrong, and, like, what the processes are when we’re building models from scratch, so it happens.
172 00:27:09.270 ⇒ 00:27:31.209 Dan Hartley: So what I kind of love to do is, I majorly just, like, let AI do the heavy lifting on the documentation side of things. While I’m coding, what I try to do is… I frequently use Cursor, I do use Windsurf. I do believe in AI-assisted coding, as long as I know what I am doing.
173 00:27:31.480 ⇒ 00:27:33.110 Dan Hartley: So there are…
174 00:27:33.110 ⇒ 00:27:51.800 Dan Hartley: three things. So what I kind of do is just, like… what I used to get and fetch from AI is the entire boilerplate code. I don’t let it do the entire business logic and each and everything that used to be there. I am making sure that I’m doing that myself, but for the entire heavy lifting and the boilerplate code, I try to just, like, get it out.
175 00:27:51.800 ⇒ 00:28:01.920 Dan Hartley: So, the very first… but how I do it is build the right context. So, very first thing, I would spend 5 to 10 minutes in building the context for this entire AI assistant on
176 00:28:02.550 ⇒ 00:28:17.949 Dan Hartley: what I’m trying to do, the problem statement, what I need, what is the expected output, how do I need to go by it, what are the relevant test cases here? And once I believe that it has the right architecture and the right information, I go by it.
177 00:28:17.950 ⇒ 00:28:27.979 Dan Hartley: Be it machine learning, I try to explore datasets using AI, so the entire EDA pipeline, I used to just, like, get it out, with temporal factors, each and everything.
178 00:28:28.050 ⇒ 00:28:46.109 Dan Hartley: Yes, but while I am evaluating the model, I try to just, like, make sure… the training pipeline, it used to just, like, run it out, and that won’t help, because that’s not something required, but the trust… But I try to make sure that the testing and the validation, that is something that I’m writing on my own, and evaluating myself, because I know
179 00:28:46.330 ⇒ 00:28:51.140 Dan Hartley: what’s going wrong, at what point, how do I fix it? So yes, that is how I go by.
180 00:28:51.620 ⇒ 00:29:04.279 Pranav Narahari: That’s awesome. I know we are short on time, and I have a hard stop at 2.30 Eastern, but yeah, for the… if you have any quick questions, I’m happy to answer them right now.
181 00:29:04.280 ⇒ 00:29:04.700 Dan Hartley: Yes.
182 00:29:05.050 ⇒ 00:29:05.590 Pranav Narahari: Yeah.
183 00:29:05.590 ⇒ 00:29:14.500 Dan Hartley: I asked my major questions with… I forgot… I forgot who I spoke to earlier, but I did… Huh?
184 00:29:14.500 ⇒ 00:29:15.569 Pranav Narahari: Was it Sam?
185 00:29:15.940 ⇒ 00:29:31.489 Dan Hartley: Yeah, Sam, Sam, I remember. I asked my major questions with them, so I don’t think I have a lot of ones, but what I really want to know is, what is the team size for AI here at Brainforge, and what are the… and are we majorly focused? Because we talked about
186 00:29:31.490 ⇒ 00:29:44.599 Dan Hartley: RAGs a lot. And the job description, when I read it, it was majorly about RAG. So, is it more RAG-focused and AI-agent-focused, or do we have some more customer-grade… let’s just say, more directed and diverse niches?
187 00:29:45.150 ⇒ 00:30:07.050 Pranav Narahari: I would say, the complexity can go all the way down to creating a model, right? And so, that’s not something that we do here at Brainforge, at least we haven’t done it yet. You know, we’re a growing company, with that becomes bigger clients that want more complex solutions. However, we don’t like to overcomplicate things.
188 00:30:07.120 ⇒ 00:30:26.929 Pranav Narahari: for 99% of clients, it doesn’t make sense to build them a model. There’s so many downsides to that, and you need to make sure you have the correct prerequisites for that. So, same thing with fine-tuning a model. There could be applications for that in the future, but right now, like you said, like.
189 00:30:26.930 ⇒ 00:30:31.900 Pranav Narahari: the importance of prompting is super… is really important.
190 00:30:31.900 ⇒ 00:30:35.969 Pranav Narahari: creating that correct context for then to let the AI
191 00:30:35.970 ⇒ 00:30:48.539 Pranav Narahari: properly execute is extremely valuable. And so, where is that data coming from for the context is super important. So, these RAG systems, MCP servers,
192 00:30:48.570 ⇒ 00:30:52.799 Pranav Narahari: that’s where we feel like on the AI team, that’s where a lot of
193 00:30:52.900 ⇒ 00:31:10.059 Pranav Narahari: our clients can benefit from. And so, yeah, right… right now, the team is about… it’s me that’s kind of client-facing, and then there’s Sam that kind of serves as, like, the technical leader, and then we have a few engineers as well that,
194 00:31:10.260 ⇒ 00:31:18.500 Pranav Narahari: help support the building. And so… We’re growing, like… Very quickly,
195 00:31:18.740 ⇒ 00:31:34.760 Pranav Narahari: But at the same time, you will have a lot of ownership here, considering how small the AI team specifically is. So, yeah, I have to hop, but feel free to email me any more questions, and yeah, I’m happy to respond to whatever you have.
196 00:31:35.560 ⇒ 00:31:40.610 Dan Hartley: I would love to, and yeah, it was great talking, Pranav. I really hope we talk soon again.
197 00:31:40.970 ⇒ 00:31:43.049 Pranav Narahari: Yeah, totally. Alright, see you now.
198 00:31:43.460 ⇒ 00:31:44.480 Dan Hartley: Cool, bye.