Meeting Title: Brainforge Interview w/ Pranav Date: 2026-03-05 Meeting participants: Pranav Narahari, Kaela Gallagher, Ned
WEBVTT
1 00:00:35.580 ⇒ 00:00:37.570 Ned: Hey! Hi there.
2 00:00:38.570 ⇒ 00:00:39.580 Ned: Hey, Pranav.
3 00:00:43.400 ⇒ 00:00:45.550 Ned: Hey, I can’t hear you.
4 00:00:45.840 ⇒ 00:00:47.539 Pranav Narahari: Oh, you can’t. Can you hear me now?
5 00:00:47.870 ⇒ 00:00:49.060 Ned: Yeah, I can hear you now.
6 00:00:49.060 ⇒ 00:00:51.260 Pranav Narahari: Perfect, perfect. Nice to meet you.
7 00:00:51.550 ⇒ 00:00:53.100 Ned: Yeah, nice meeting you.
8 00:00:53.620 ⇒ 00:01:05.070 Pranav Narahari: Yeah, so we only have 30 minutes here, so I don’t want to waste a ton of time, but I’d love to just kind of get to know you, a little bit about your background in, you know.
9 00:01:05.610 ⇒ 00:01:13.090 Pranav Narahari: however you kind of want to describe it, maybe in, like, 2-3 minutes, and then I can do a little bit of intro about myself, and then we can get into the rest of the interview.
10 00:01:13.810 ⇒ 00:01:15.060 Ned: Sure, sure, definitely.
11 00:01:15.130 ⇒ 00:01:32.280 Ned: Alright, so Pranav, I’m a lead AI/ML engineer with over 8 years of experience, all in data, focused basically on building and shipping end-to-end AI solutions and products all by myself, ones that solve the actual business problems. So, currently, as the lead data scientist at OneDI,
12 00:01:32.280 ⇒ 00:01:36.879 Ned: My focus has been, you know, heavily on the high-scale automations and operational efficiencies.
13 00:01:36.880 ⇒ 00:01:50.600 Ned: So I’ve built robust ETL pipelines using Apache Spark and AWS Kinesis, and deployed supervised models like XGBoost that actually achieved 91% accuracy, directly, basically, like, you know, cutting down the manual
14 00:01:50.700 ⇒ 00:02:06.220 Ned: intervention by 40%. So, prior to that at Pentex Solutions, I spearheaded the design of NLP automation tools, like, you know, for sentiment analysis and document matching as well. So I spent time at Visa, where I actually built the neural networks.
15 00:02:06.240 ⇒ 00:02:13.459 Ned: for call routing, and the language-to-SQL translation tools as well, to help non-tech users to query the databases.
16 00:02:13.490 ⇒ 00:02:32.380 Ned: Regarding the, you know, specific requirements for Brainforge: so, you know, what sort of, you know, mindset do I have? So I have been working at Pentex Solutions and at, you know, OneDI as well. Both of these, you know, basically both organizations are consultancies.
17 00:02:32.410 ⇒ 00:02:46.600 Ned: So I have been within consultation space for quite a long time. So I have been, like, you know, within the client meetings so that I can understand the problem statements. Right afterwards I can provide the solutions. I have been creating the MVPs all by myself, and right afterwards, running those through
18 00:02:46.770 ⇒ 00:02:59.890 Ned: to the end client so that we can make sure that everything is running as per their expectation. If I get a yes, we move towards the production-grade systems. So that’s, you know, what my mentality is.
19 00:02:59.890 ⇒ 00:03:08.329 Ned: So, you know, while my core, you know, is basically Python and ML frameworks like, you know, PyTorch and TensorFlow, I’m actually a…
20 00:03:08.330 ⇒ 00:03:17.500 Ned: full-stack engineer. Why do I say that? Because I love to capture the data within the data discovery phase. I love to, you know, create the ETL pipelines as well to…
21 00:03:17.500 ⇒ 00:03:38.439 Ned: cleanse the data, to extract the data, transform the data, and afterwards load the data as well. Then I, you know, utilize that particular, you know, gold, curated, like, I would say, data, right afterwards, you know, passing it through any sort of ML techniques or AI engineering as well. Maybe creating the right chatbots on top of it using vector databases, or creating the agentic workflows.
22 00:03:38.490 ⇒ 00:03:46.290 Ned: And right afterwards, I’d love to present the, you know, AI responses on a front end as well, like, you know, React.js, maybe.
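A minimal sketch of the extract-transform-load-then-embed flow Ned describes, assuming pandas and sentence-transformers; the file names, the "text" column, and the model choice are illustrative placeholders, not details from his actual projects.

```python
# Hypothetical ETL-then-embed sketch: extract raw records, cleanse them into a
# curated ("gold") layer, then embed the text for a RAG/chatbot layer.
import pandas as pd
from sentence_transformers import SentenceTransformer

# Extract: pull raw records from a source system (a CSV stands in here).
raw = pd.read_csv("raw_records.csv")

# Transform: drop rows with missing text and normalize the text field.
clean = raw.dropna(subset=["text"]).copy()
clean["text"] = clean["text"].str.strip().str.lower()

# Load: persist the curated layer for downstream consumers.
clean.to_parquet("gold_records.parquet")

# Embed the curated text so a vector database can index and search it.
model = SentenceTransformer("all-MiniLM-L6-v2")
embeddings = model.encode(clean["text"].tolist())
print(embeddings.shape)  # (n_rows, 384) for this particular model
```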
23 00:03:46.430 ⇒ 00:04:02.909 Ned: Yeah, so that I can, like, you know, control the end-to-end pipeline. Why Brainforge? Because, like, you know, at Brainforge, I have seen that you are still in the startup phase. I would definitely love to be all in with respect to a particular startup, because, you know, why? I love to work.
24 00:04:03.030 ⇒ 00:04:21.600 Ned: That’s one thing. I love to work day and night. That’s, you know, the second thing, and I know that, you know, only a startup can provide me that particular opportunity, where I can have the full end-to-end ownership. That’s one thing. I can definitely run through all the solutions with the team as well, because I need to have the approvals with respect to my manager.
25 00:04:21.600 ⇒ 00:04:24.390 Ned: whoever I’m going to report to at Brainforge.
26 00:04:24.390 ⇒ 00:04:38.569 Ned: And as well as, like, you know, having those particular discussions within the daily stand-up, so that we can discuss, you know, product roadmaps, or maybe, like, the projects’ roadmaps as well, with respect to, you know, whatever end clients we are catering to at Brainforge moving forward.
27 00:04:38.640 ⇒ 00:04:40.040 Ned: So that’s…
28 00:04:40.370 ⇒ 00:04:44.900 Ned: you know, the mentality I have, and that’s what I have been doing, like, you know, so far as well.
29 00:04:45.170 ⇒ 00:04:52.270 Pranav Narahari: Cool, yeah, and, you know, I’ll just take a minute or so just to give you a little bit of background about me, so you can ask me just the relevant questions.
30 00:04:53.080 ⇒ 00:05:00.910 Pranav Narahari: So, a little bit, like, prior to Brainforge, what I was working at, I was also working at a different agency in the AI automation space.
31 00:05:01.080 ⇒ 00:05:10.910 Pranav Narahari: Prior to that, I was working as just, like, a software engineer in the cloud engineering space at a regional bank in the US. And so, I have a pretty…
32 00:05:11.180 ⇒ 00:05:26.080 Pranav Narahari: traditional software engineering background, however, starting in 2023, got more into the AI engineering and AI-enabled, engineering path, and so that was really exciting for me. That was just a natural kind of…
33 00:05:26.320 ⇒ 00:05:41.240 Pranav Narahari: segue into me picking up freelance clients, and then working at an agency, and then coming to Brainforge, where I feel like we’re really operating that model of AI-enabled engineering at the highest level. We’re taking advantage of
34 00:05:41.270 ⇒ 00:05:50.289 Pranav Narahari: development tools that we’re building in-house, as well as just the top-of-the-line tools in the market right now, to really maximize our efficiency, to…
35 00:05:50.290 ⇒ 00:05:50.890 Ned: Amazing.
36 00:05:51.090 ⇒ 00:06:00.360 Pranav Narahari: to just deliver for our clients. And so, a couple things that were interesting to me about what you just said, Ned, was how you feel very comfortable talking to clients.
37 00:06:00.860 ⇒ 00:06:03.850 Pranav Narahari: And so, you’ve been part of…
38 00:06:04.190 ⇒ 00:06:14.680 Pranav Narahari: these client meetings, you’ve been part of taking in these problem statements, these SOWs, and then turning that into that
39 00:06:15.010 ⇒ 00:06:22.400 Pranav Narahari: full product at the end of the cycle. So, super interesting. That is…
40 00:06:22.580 ⇒ 00:06:36.430 Pranav Narahari: going to make you very successful here at Brainforge, if you can bring that here. So, how it works here at Brainforge is that you don’t really have a manager, per se. We’re a pretty flat organization, as it is right now.
41 00:06:36.770 ⇒ 00:06:40.210 Pranav Narahari: There is somebody, and…
42 00:06:40.560 ⇒ 00:06:44.809 Pranav Narahari: I’ll just be really brief right now, just so we can get into the actual meat of the interview, but…
43 00:06:44.810 ⇒ 00:06:45.190 Ned: Hmm.
44 00:06:45.190 ⇒ 00:07:01.970 Pranav Narahari: You basically have, like, a few other people that you work with. Sometimes, like, it’s a little bit of a push and pull, because, like, you’ll operate under different priorities, but it ends up making a very efficient system that makes the clients really happy.
45 00:07:01.970 ⇒ 00:07:12.060 Pranav Narahari: And so there’s different roles there, we can probably talk about that at the end if we have a little bit of time. But yeah, let me hop into, kind of, what I… like, the questions I want to ask you.
46 00:07:12.060 ⇒ 00:07:12.520 Ned: vicious.
47 00:07:12.520 ⇒ 00:07:16.070 Pranav Narahari: Just so, you know, we’re clear on, like, just timing and stuff.
48 00:07:16.190 ⇒ 00:07:22.589 Pranav Narahari: We have, what, like, 22 minutes left? There’s, like, 5 topics that I want to go through, so, you know…
49 00:07:22.810 ⇒ 00:07:28.369 Pranav Narahari: let’s… yeah, we’ll be, like, probably, like, around 4 minutes per topic, and, I think it’ll be good.
50 00:07:28.940 ⇒ 00:07:30.350 Pranav Narahari: Sure. And so…
51 00:07:30.570 ⇒ 00:07:49.829 Pranav Narahari: Yeah, the first thing that I want to ask you is just, how do you know that AI is the right solution for something? Also, how do you know that AI is not the right solution for a problem? And then also, what do you feel like is, a big misconception about LLMs in general?
52 00:07:50.400 ⇒ 00:07:59.970 Ned: Yeah, definitely, yep. So I think we can move forward with respect to your first question, you know, looking at why, you know, AI, or why not AI.
53 00:08:00.140 ⇒ 00:08:11.899 Ned: So, think of it in this way, like, I always tell my, you know, end clients in that sort of a way, where they are very, like, you know, enthusiastic about AI, or maybe, like, you know, they are very much skeptical of AI sometimes.
54 00:08:11.980 ⇒ 00:08:26.589 Ned: Because they know that, you know, hallucinations would come, and they do not have that much accuracy, just like, you know, we can achieve with the help of machine learning, or maybe, like, not machine learning, data science, because data science can provide you, you know, much greater accuracy than AI models itself.
55 00:08:26.640 ⇒ 00:08:39.210 Ned: But still, like, you know, why AI and why not AI? So if you’re definitely looking forward to have some sort of, like, you know, automations within your own processes, which can definitely bring down your, you know, let’s say, like, you know.
56 00:08:39.520 ⇒ 00:08:50.740 Ned: work, you know, hours itself, definitely we can utilize AI in that manner, if you need that AI to think something and come up with a response.
57 00:08:50.780 ⇒ 00:08:53.710 Ned: If just the clicks are there.
58 00:08:53.710 ⇒ 00:09:11.510 Ned: You know, there’s no AI within that. Even, like, you know, Power Automate can, you know, achieve this particular functionality to do the clicks, maybe, like, you know, open up something, add something, then right afterwards, you know, push it across in a CSV, then pick that particular CSV up and right away send that an email.
59 00:09:11.550 ⇒ 00:09:25.900 Ned: You don’t have to use AI for this one, just Power Automate can do this particular job. Whether AI is the right solution depends on the problem structure, data availability, and user impact as well. I have some examples that are coming up in my mind:
60 00:09:25.900 ⇒ 00:09:33.640 Ned: rule-based systems, where the AI can be, you know, worked on. Let’s say, like, you know, it definitely works well with respect to any sort of clearly defined
61 00:09:33.640 ⇒ 00:09:40.109 Ned: deterministic processes with limited variability. For example, simple data validations, or workflow routings as well.
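The deterministic checks Ned contrasts with AI need no model at all; a hypothetical sketch of plain rule-based validation, with made-up field rules:

```python
# Rule-based data validation: deterministic, auditable, and AI-free.
def validate_record(record: dict) -> list[str]:
    errors = []
    if not record.get("email") or "@" not in record["email"]:
        errors.append("invalid email")
    if record.get("amount", 0) < 0:
        errors.append("amount must be non-negative")
    return errors

print(validate_record({"email": "a@b.com", "amount": 10}))  # []
print(validate_record({"email": "nope", "amount": -5}))     # both errors
```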
62 00:09:40.330 ⇒ 00:09:55.579 Ned: Traditional, you know, models, like, you know, ML models, are ideal when there are structured data sets, and we can definitely learn different patterns when there is structured data. Like, you know, with respect to any sort of predictive analytics, we can definitely use AI within that:
63 00:09:55.580 ⇒ 00:10:05.259 Ned: anomaly detections, recommendation systems, that’s where AI can jump in. And, you know, the most famous one within the industry itself, that’s Gen AI, you know that.
64 00:10:05.260 ⇒ 00:10:18.139 Ned: Everybody’s crazy about Gen AI, regarding chatbots, having, you know, analysis done with respect to an AI itself. So those are appropriate when the task involves any sort of unstructured data, that’s one thing.
65 00:10:18.520 ⇒ 00:10:27.919 Ned: Then we have language understanding, and content generation as well. If content generation is involved, then definitely we can utilize GenAI.
66 00:10:27.920 ⇒ 00:10:40.149 Ned: Or maybe, like, you know, where rules or conventional ML would be… conventional ML would be brittle at this stage, or, like, definitely require massive feature engineering as well. For example, like, let’s say for summarization,
67 00:10:40.150 ⇒ 00:10:45.779 Ned: answering questions, or code, or, like, you know, having any sort of generation within that.
68 00:10:45.930 ⇒ 00:10:58.330 Ned: So, I usually validate by checking, like, you know, data readiness. That’s my very first check. Do we have enough data? Like, or maybe, like, quality data for any sort of ML or LLM training or fine-tuning?
69 00:10:58.710 ⇒ 00:11:18.370 Ned: Right afterwards, I just, like, you know, look at the trade-off between complexity and ROI. So, can the rules you’re giving me handle it effectively, or, like, would an LLM provide a measurable improvement in efficiency or accuracy itself as well? So we need to look at the trade-off
70 00:11:18.500 ⇒ 00:11:32.020 Ned: between the efficiency and the accuracy. And the last thing would be the user impact. So, will this particular solution significantly improve the decision-making or reduce the amount of effort? So we need to look at that as well.
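Ned's three checks, data readiness, complexity-versus-ROI, and user impact, could be captured as a simple triage helper; this is a hypothetical sketch of that thought process, not a method he names:

```python
# Hypothetical triage helper mirroring the three checks above.
def ai_is_worth_it(has_quality_data: bool,
                   rules_suffice: bool,
                   measurable_user_impact: bool) -> bool:
    if rules_suffice:
        return False  # deterministic clicks/validations: use rules or RPA
    return has_quality_data and measurable_user_impact

print(ai_is_worth_it(has_quality_data=True, rules_suffice=False,
                     measurable_user_impact=True))  # True
```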
71 00:11:32.130 ⇒ 00:11:41.709 Ned: For example, like, you know, I have done a project as well. I know that the time is very, very short. I definitely would love to, you know, explain the project that I did, and the journey it had.
72 00:11:41.930 ⇒ 00:11:46.809 Ned: Definitely. If you want to move forward, I can definitely, like, you know, move forward with respect to that particular journey.
73 00:11:46.810 ⇒ 00:12:04.289 Pranav Narahari: I actually really appreciate that. Let’s maybe move forward for now, and let’s save that at the end. I can stay for a few extra minutes, too, for questions and, yeah, you know, additional depth. But, yeah, kind of diving into a little bit more of, like, the technical. That’s what I’m very interested in here for this, is,
74 00:12:04.680 ⇒ 00:12:14.859 Pranav Narahari: when deciding between different LLMs from different providers, even open source, what do you… what do you use as metrics for making that decision?
75 00:12:16.080 ⇒ 00:12:24.130 Ned: Alright, so open source LLMs, or maybe, like, you know, the LLMs that have been provided by, let’s say, OpenAI, or maybe…
76 00:12:24.680 ⇒ 00:12:27.730 Ned: other providers, like Claude or Grok as well.
77 00:12:28.130 ⇒ 00:12:28.470 Pranav Narahari: Yep.
78 00:12:28.470 ⇒ 00:12:37.919 Ned: Alright, yep. There are different trade-offs that we definitely need to look at while, you know, making the selection of a proper LLM model itself.
79 00:12:37.940 ⇒ 00:12:49.160 Ned: So, when choosing between the LLMs, I evaluate them across, like, with respect to different dimensions. So, you know, the dimensions that I have set up within my own, like, you know, thought process.
80 00:12:49.320 ⇒ 00:12:53.200 Ned: Is… the very first thing would be, like, performance and accuracy.
81 00:12:53.480 ⇒ 00:13:10.500 Ned: So, how well do these, like, you know, models, you know, handle the tasks I care about? For example, like, in any sort of question answering, summarization, code generation, etc. So, I measure with task-specific benchmarks, like F1 scores, ROUGE, BLEU, or human evaluation for semantic quality as well.
82 00:13:10.930 ⇒ 00:13:30.470 Ned: The second thing is efficiency and latency. That’s a very important and key thing for me as well. So, like, inference speed, that would come up, because I do not want to provide any sort of solution which is very, very slow to the end client itself. The end client would, again, come back to me complaining that, you know, this model is very slow at the moment, what have you used?
83 00:13:30.560 ⇒ 00:13:36.520 Ned: So, inference speed, or, like, you know, moving forward, memory footprint as well, and cost.
84 00:13:36.820 ⇒ 00:13:54.119 Ned: So, I know definitely the cost as well. So, I know, like, you know, these sorts of APIs would be costly to the end clients. If they want to move forward with it, that’s great. If they do not care about the cost itself, then definitely we can move towards a particular, you know,
85 00:13:54.240 ⇒ 00:14:02.740 Ned: better model, which is, you know, within the industry itself. If not, then we can definitely move towards the, you know, open source models. Definitely, yes.
86 00:14:02.740 ⇒ 00:14:16.939 Ned: And the last thing would be, you know, for myself, adaptability and integrations. So fine-tuning or prompt, you know, customization capability, support for embeddings, I would definitely love to check that.
87 00:14:16.940 ⇒ 00:14:25.589 Ned: RAG pipelines, which would be more, you know, suitable for my RAG pipeline, and definitely, you know, API or open source flexibility.
88 00:14:25.590 ⇒ 00:14:34.920 Ned: I definitely, like, you know, consider safety, hallucination rates, and community support for open source models. For instance, like, you know, my GenAI agent approach,
89 00:14:34.920 ⇒ 00:14:48.839 Ned: which I actually had within my project itself as well. So I compared open source Llama and GPT-based APIs, you know, using output relevance, latency, and the vector store retrieval alignment before selecting the model itself
90 00:14:48.840 ⇒ 00:14:49.840 Ned: for production.
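A hedged sketch of the bake-off Ned describes: time each candidate and score output relevance against a reference answer via embedding similarity. The `generate` callables, reference answers, and embedding model are assumptions; a real comparison would also track cost and vector-store retrieval alignment, as he notes.

```python
# Compare LLM candidates on latency and answer relevance (illustrative only).
import time
from sentence_transformers import SentenceTransformer, util

embedder = SentenceTransformer("all-MiniLM-L6-v2")  # assumed scoring model

def evaluate_candidate(generate, prompt: str, reference: str) -> dict:
    """Time one model call and score its answer's similarity to a reference."""
    start = time.perf_counter()
    answer = generate(prompt)          # any callable wrapping an LLM API
    latency = time.perf_counter() - start
    relevance = util.cos_sim(embedder.encode(answer),
                             embedder.encode(reference)).item()
    return {"latency_s": round(latency, 3), "relevance": round(relevance, 3)}

# Usage idea: wrap each provider (an OpenAI model, a self-hosted Llama, etc.)
# in a generate(prompt) -> str function and compare the resulting dicts.
```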
91 00:14:50.860 ⇒ 00:14:59.670 Pranav Narahari: Yeah, that’s great. Maybe moving on to actually delivering products, so…
92 00:14:59.780 ⇒ 00:15:04.339 Pranav Narahari: What we can do is first maybe talk about, like, a real world example of, like.
93 00:15:04.590 ⇒ 00:15:14.280 Pranav Narahari: When you’ve had an issue with, like, a system, it’s in production, and so you’re not just in a sandbox environment, how do you go about…
94 00:15:14.280 ⇒ 00:15:27.880 Pranav Narahari: like, patching that issue. And you can use a real-life scenario that maybe has happened to you, you’ve deployed an AI system, the client has it in their hands, maybe there’s also the client’s customers using it.
95 00:15:27.880 ⇒ 00:15:28.430 Ned: Yeah.
96 00:15:29.350 ⇒ 00:15:38.199 Pranav Narahari: what do you go about doing once you find this production issue? What are, like, what are your next steps? Tell me about your cycle to then patching that issue.
97 00:15:39.210 ⇒ 00:15:54.390 Ned: Okay, so you’re talking about basically, like, you know, a technical issue. I would, like, you know, love to talk about other, like, you know, issues as well, moving forward, which actually happened within production, but still, like, you know, let’s move towards a technical issue, so that I can understand it more.
98 00:15:54.510 ⇒ 00:16:10.479 Ned: So, let’s say when the issue appears, so I know that you’re actually, like, looking at my thought process, so how would I think about a specific problem technically? Yeah. Because I have been taking the interviews myself as well, so I know how your thinking process is going at the moment.
99 00:16:11.070 ⇒ 00:16:14.479 Pranav Narahari: Yeah, yeah, I’d love to just let you think out loud, you know.
100 00:16:14.480 ⇒ 00:16:15.310 Ned: Sure.
101 00:16:15.310 ⇒ 00:16:16.000 Pranav Narahari: Yeah.
102 00:16:16.470 ⇒ 00:16:19.510 Ned: Alright, so I’ll go step by step, definitely.
103 00:16:19.580 ⇒ 00:16:28.949 Ned: So, firstly, let’s say, like, you know, when an issue appears in production, but not in the sandbox environment, just like you have told me, I follow, you know, a structured response.
104 00:16:28.950 ⇒ 00:16:39.899 Ned: Firstly. So, what I would do, I would just definitely take the immediate containment. Like, identify the impact, firstly, so that I can see, like, you know, what sort of an impact it is having on the production environment.
105 00:16:39.900 ⇒ 00:16:46.599 Ned: If possible, I can isolate the failing service or data pipeline to prevent the, you know, cascading issues itself.
106 00:16:46.780 ⇒ 00:16:59.040 Ned: Right afterwards, I’ll make sure, like, you know, the issue is contained, and the rest of the production is working as per the expectation. So firstly, I’ll contain the problem.
107 00:16:59.500 ⇒ 00:17:13.150 Ned: Right afterwards, what I’ll do, I’ll start the investigation of it. Like, you know, maybe collect the logs. I definitely love to have an audit log layer, so that I can just, like, you know, check whether there’s any problem within the production environment or not.
108 00:17:13.150 ⇒ 00:17:35.549 Ned: Because, like, you know, within our processes, we have to, you know, obtain the audit logs, you know, right after we move towards production, and right after we see if there’s any, you know, client usage within the system itself, just to monitor it, so that we can make sure that the production environment is working as per the expectation. But still, like, you know, within this particular case, let’s say, I have contained everything, right afterwards, within my investigation phase.
109 00:17:35.550 ⇒ 00:17:49.539 Ned: I’ll check out the logs, any sort of metrics and traces from the production environment to reproduce the issue. Often, like, you know, differences from the sandbox, let’s say, like, scale, concurrency, or data nuances as well, reveal the cause itself.
110 00:17:49.810 ⇒ 00:18:07.150 Ned: If necessary, if possible, I can definitely, like, go towards the end client as well, to ask him, like, you know, what’s the scenario, or what was the edge case? Can you please, like, reproduce the same issue in front of me, so that I can understand it? And right afterwards, I’ll apply this on the sandbox environment.
111 00:18:07.530 ⇒ 00:18:16.969 Ned: Yeah. Once I have a good understanding of this and investigated the whole problem, I’ll definitely, you know, provide a patch or a hot fix very quickly.
112 00:18:16.970 ⇒ 00:18:27.579 Pranav Narahari: Let me give you a, like, specific example that kind of happened at Brainforge. I’m really interested to hear how you would probably tackle this issue. And so we had a…
113 00:18:27.580 ⇒ 00:18:30.180 Pranav Narahari: I’ll go down to its most, like,
114 00:18:30.570 ⇒ 00:18:38.590 Pranav Narahari: essential information for this, for this question, but we have this chatbot that has certain MCP server integration,
115 00:18:38.590 ⇒ 00:18:39.090 Ned: Okay.
116 00:18:39.090 ⇒ 00:18:46.899 Pranav Narahari: the issue that the client was saying was happening was that these MCP servers that were bringing in, like.
117 00:18:47.090 ⇒ 00:18:50.569 Pranav Narahari: Shopify orders data, for example.
118 00:18:51.200 ⇒ 00:19:00.840 Pranav Narahari: was… not performing properly, and, well, actually, I won’t say that, actually. The output was using faulty data.
119 00:19:02.430 ⇒ 00:19:02.780 Ned: Okay.
120 00:19:02.780 ⇒ 00:19:03.620 Pranav Narahari: So…
121 00:19:04.380 ⇒ 00:19:11.189 Pranav Narahari: That’s what the client came back to us with, is like, okay, the response that… the insights that are driving,
122 00:19:11.420 ⇒ 00:19:14.600 Pranav Narahari: The insights that are provided back to the user are…
123 00:19:14.920 ⇒ 00:19:33.049 Pranav Narahari: using faulty data. And we can tell that just based on, like, you know, the thinking logs that are coming back to the user. This is like a form of hallucination, right? How would you go about investigating this, and then what are some potential fixes that come to mind?
124 00:19:33.880 ⇒ 00:19:49.870 Ned: So, some of the potential, like, you know, problems that I can see within this particular problem itself. Firstly, I would definitely love to see the responses and the queries that they have passed on to that particular chatbot, to see, like, you know, what the queries are, whether there are any sort of Jinja templates, and why the responses are not correct as well.
125 00:19:49.870 ⇒ 00:19:59.569 Ned: So that’s the first thing that is coming into my mind. We could just, like, you know, take the semantic approach, if I move, you know, professionally towards this, or maybe, like, you know,
126 00:19:59.570 ⇒ 00:20:05.790 Ned: chunking it down. So what I’ll do, I’ll just, like, you know, firstly, just like I said it across to you, I’ll verify the source.
127 00:20:05.790 ⇒ 00:20:17.650 Ned: Check the, like, you know, MCP server output against the ground truths or sampled data. Determine whether the faulty data is actually, like, corrupt, delayed, or any sort of, like, you know, misformatted.
128 00:20:17.650 ⇒ 00:20:25.820 Ned: Right afterwards, I’ll check the pipeline to trace any sort of, like, you know, problem within the pipeline itself.
129 00:20:25.830 ⇒ 00:20:44.719 Ned: So, the data pipeline would go straight through the stages which we actually have designed it for. Let’s say ingestion, pre-processing, embeddings, and LLM prompts, right? So, identify any sort of, like, you know, stage where the LLM might be interpreting any sort of bad or, like, you know, incomplete inputs. This…
130 00:20:44.720 ⇒ 00:20:54.410 Ned: if I can find out this problem, then this would definitely be a failure on our side, because it’s our job to firstly test it and send this across to, you know, the end client itself.
131 00:20:54.640 ⇒ 00:21:09.509 Ned: I can definitely explain to you my working style, but that would go out of the way. Let’s stick to this one, so that we can find out the root cause. Right afterwards, what I’ll do, I’ll just, like, you know, evaluate the hallucination within, you know, or maybe, like, with respect to the data issue itself.
132 00:21:09.510 ⇒ 00:21:24.039 Ned: Is it the, you know, LLM generating the misinformation? Or maybe, like, you know, this particular information not present in the input itself? Like, you know, which is a classic case of hallucination? Or is it just faithfully reflecting the incorrect input itself?
133 00:21:24.290 ⇒ 00:21:34.559 Ned: Right afterwards, once I have this investigated, I’ll just, like, you know, push up the fix. If the source is bad, let’s say… let me provide you the fixes as well. So, if the source is bad.
134 00:21:34.740 ⇒ 00:21:51.519 Ned: what I can do, I can implement the validations with respect to any, like, you know, let’s say, like, you know, schema checks or fallback logics as well, to prevent this particular hallucination. If, let’s say, if the LLM is, you know, really hallucinating, what I can do, I can improve the prompts.
135 00:21:51.590 ⇒ 00:22:02.209 Ned: That’s the first thing that I would do. Add any, you know, RAG approach within this, or limit the, you know, model exposure to any sort of unreliable, like, you know, fields itself.
136 00:22:02.480 ⇒ 00:22:07.400 Ned: And what I’ll do right afterwards, I’ll monitor. Well, I’ll first test.
137 00:22:07.400 ⇒ 00:22:25.489 Ned: So, my first, you know, strategy would be to test after providing the fix, and right afterwards, I’ll monitor. Add, you know, logging, because I can see that there is no logging at the moment. So, add logging and automated checks for both source integrity and LLM outputs itself to capture any sort of future occurrences as well.
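A minimal sketch of the "schema checks plus fallback logic" fix Ned proposes, assuming pydantic; the `Order` fields are hypothetical stand-ins, not the real Shopify schema or Brainforge code.

```python
# Validate tool output before it ever reaches the LLM; log and drop bad rows.
import logging
from pydantic import BaseModel, ValidationError

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("mcp_guard")

class Order(BaseModel):          # hypothetical schema for incoming rows
    order_id: str
    total: float
    currency: str

def validated_orders(raw_rows: list[dict]) -> list[Order]:
    good = []
    for row in raw_rows:
        try:
            good.append(Order(**row))
        except ValidationError as exc:
            log.warning("dropping malformed order %r: %s", row, exc)
    return good  # only validated rows are placed in the LLM's context
```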
138 00:22:26.320 ⇒ 00:22:37.269 Pranav Narahari: Yeah, that sounds great. Okay, so moving on to the next one is, I’ll give you another example of a product that we worked on, at a very high level.
139 00:22:37.450 ⇒ 00:22:40.630 Pranav Narahari: And so, it’s basically a RAG system.
140 00:22:40.950 ⇒ 00:22:41.810 Ned: Hmm?
141 00:22:41.810 ⇒ 00:22:42.719 Pranav Narahari: Okay. Cool.
142 00:22:43.370 ⇒ 00:22:50.439 Pranav Narahari: It doesn’t have just a static data source. It’s very dynamic, and it’s also being…
143 00:22:50.820 ⇒ 00:22:59.360 Pranav Narahari: It needs to be curated over time based on how the chatbot performs. And so… one…
144 00:22:59.540 ⇒ 00:23:01.250 Pranav Narahari: One question that I have…
145 00:23:01.600 ⇒ 00:23:14.630 Pranav Narahari: starting off is, how do you think about creating these embedding updates? Like, what type of system do you think can be put in place to maintain, let’s say,
146 00:23:14.850 ⇒ 00:23:17.710 Pranav Narahari: A client,
147 00:23:19.430 ⇒ 00:23:39.090 Pranav Narahari: a document that a client is maintaining. So, they’re non-technical, we can’t have them putting things into GitHub, they’re just, let’s say, updating, like, a Google Doc. Yeah. What system can we put in place so that we can create, like, the proper embedding updates if they’re changing this source of truth
148 00:23:39.090 ⇒ 00:23:52.359 Pranav Narahari: On a weekly basis. And also, I don’t want to have Google Docs as part of your design if it’s not necessary. If you can think of a better system for, like, a non-technical client to update the source of truth.
149 00:23:53.150 ⇒ 00:23:53.830 Ned: So wait.
150 00:23:55.990 ⇒ 00:23:57.230 Ned: Understandable.
151 00:23:59.810 ⇒ 00:24:01.110 Ned: Alright, yep.
152 00:24:01.360 ⇒ 00:24:15.370 Ned: Basically, I have, like, you know, worked on a very similar project, where actually, like, you know, we used SharePoint, which is very similar to Google Docs, you already know that. Right. I’m just thinking of another approach, where, you know, we can just, like, you know, put these…
153 00:24:15.580 ⇒ 00:24:28.239 Ned: Put simply, while I’m currently doing it, I’m just thinking more into RAG system design as well. So, let’s say in a, you know, dynamic RAG system, embeddings are never, like, you know, truly static itself, you know that.
154 00:24:28.240 ⇒ 00:24:35.590 Ned: So, they need to reflect the, you know, evolving knowledge base without tightly coupling to raw documents.
155 00:24:35.590 ⇒ 00:24:38.360 Ned: So, my particular approach would be:
156 00:24:38.360 ⇒ 00:24:51.879 Ned: I would definitely decouple the, you know, embeddings from the documents. Store the embeddings in the, you know, vector store, definitely, with respect to… with the metadata references. For example, document ID, version, timestamp, which I’m looking forward to have.
157 00:24:51.880 ⇒ 00:24:58.850 Ned: So that the, you know, LLM can retrieve the context without requiring the document access at, you know, the query time.
158 00:24:59.320 ⇒ 00:25:12.039 Ned: Right afterwards, what I can think of is the, you know, incremental updates. So, let’s say whenever the client updates or adds the content, it can be anywhere. Like, you know, we can provide a very small web portal to him.
159 00:25:12.040 ⇒ 00:25:19.999 Ned: So that the chatbot would be there, the storage can also be there, so that he can just, like, you know, update the storage itself as well.
160 00:25:20.190 ⇒ 00:25:28.810 Ned: So, with respect to the updates, because I’m thinking more into the RAG approach, so let’s say, like, you know, when the client updates or adds any sort of content.
161 00:25:29.120 ⇒ 00:25:41.859 Ned: What I can do, I can compute the embeddings only for the changed documents and update the vector store. This avoids re-computing the entire corpus, because I don’t want to do that. That would be an overburden, overkill.
162 00:25:42.310 ⇒ 00:25:51.679 Ned: Right afterwards, what I’ll do, because, like, this is very critical in this particular phase, I would have to, you know, manage the versioning and stale handling as well.
163 00:25:51.840 ⇒ 00:25:57.080 Ned: So, what I would do, I would maintain a version or timestamp for each embedding.
164 00:25:57.150 ⇒ 00:26:10.519 Ned: During retrieval, filter, or maybe, like, you know, prioritize the most recent embeddings to ensure that the chatbot sees up-to-date knowledge. And right afterwards, I’ll have an automated pipeline. So I’ll definitely implement a watcher service
165 00:26:10.520 ⇒ 00:26:28.520 Ned: that listens for content changes, any sort of changes, and triggers the embedding recalculations as well. So this would include, you know, combining it with caching and vector store indexing as well. So this ensures the performance and consistency, in my opinion, definitely.
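A minimal sketch of the incremental, versioned update Ned outlines: hash each document, re-embed only what changed, and stamp version and timestamp metadata. The in-memory dict stands in for a real vector store; the watcher service he mentions would simply call `upsert_if_changed` whenever the client's source of truth changes.

```python
# Incremental embedding updates keyed on a content hash (illustrative only).
import hashlib
import time
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")
store: dict[str, dict] = {}  # doc_id -> {hash, embedding, version, ts}

def upsert_if_changed(doc_id: str, text: str) -> bool:
    digest = hashlib.sha256(text.encode()).hexdigest()
    entry = store.get(doc_id)
    if entry and entry["hash"] == digest:
        return False                       # unchanged: skip re-embedding
    store[doc_id] = {
        "hash": digest,
        "embedding": model.encode(text),
        "version": (entry["version"] + 1) if entry else 1,
        "ts": time.time(),                 # lets retrieval prefer fresh docs
    }
    return True
```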
166 00:26:28.800 ⇒ 00:26:36.290 Pranav Narahari: Yeah, I think that’s great. Let’s think about the same example that I just gave out, like, just this product design. How would you.
167 00:26:36.290 ⇒ 00:26:36.760 Ned: Come on.
168 00:26:36.760 ⇒ 00:26:41.840 Pranav Narahari: create an evaluation framework for this, this type of RAG chatbot.
169 00:26:43.390 ⇒ 00:26:46.990 Ned: Evaluation framework, in terms of the responses itself.
170 00:26:47.480 ⇒ 00:26:55.450 Ned: There are multiple different, you know, segments, or multiple different tools or techniques which can act as an evaluation metric.
171 00:26:55.780 ⇒ 00:27:03.099 Ned: So what I normally do, like, you know, within these particular ones, so I take another…
172 00:27:03.360 ⇒ 00:27:14.600 Pranav Narahari: You don’t need to create an evaluation framework for this product. If you want to tell me about something that you built in the past, that may be easier and probably gives you more context for the entire product as well that you built. So, yeah, feel free to do that.
173 00:27:14.620 ⇒ 00:27:26.109 Ned: I have, like, utilized, you know, the technique that we call LLM-as-a-judge, because I have been taking another LLM, which can definitely judge the response of the previous one, and right afterwards provide the analysis. That’s what I have done.
174 00:27:26.110 ⇒ 00:27:45.729 Ned: I have applied the guardrails as well, so that I can make sure whatever that is coming out of AI or, like, you know, my LLM itself, it is up to the mark. And if not, do not pass it across. So those are the guardrails that I have, you know, worked on as well. Other thoughts that are coming up in my mind are, you know, having the ground truth
175 00:27:45.730 ⇒ 00:27:58.490 Ned: and synthetic queries as well. So what I can do, I can create a set of benchmark queries and expected answers derived from the authoritative, you know, context itself, and having the snapshots, curated examples.
176 00:27:58.540 ⇒ 00:28:12.139 Ned: And right afterwards, like, another example that is coming up in my mind is having the embedding and retrieval testing. So what I can do, I can measure the retrieval quality by checking if the correct embeddings are returned for each query or not.
177 00:28:12.220 ⇒ 00:28:24.969 Ned: Specifically sticking to this particular example. I have already explained to you the LLM output evaluation. So, let’s say, like, you know, what I can do, I can compare the model responses to any sort of ground truth
178 00:28:24.970 ⇒ 00:28:35.090 Ned: as well, using automatic metrics like ROUGE, BLEU, or embedding similarity as well, and, you know, supplement with human review for nuanced cases as well.
179 00:28:35.100 ⇒ 00:28:39.449 Ned: So that we can have the human in the loop as well. So that’s what I’m thinking at the moment.
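A hedged sketch of two checks from the framework Ned describes: retrieval hit rate over benchmark queries, plus ROUGE overlap of generated answers against ground truth. The `retrieve` and `answer` hooks and the benchmark format are hypothetical; LLM-as-a-judge and guardrails would sit alongside these.

```python
# Evaluate a RAG system on retrieval quality and answer overlap.
from rouge_score import rouge_scorer

scorer = rouge_scorer.RougeScorer(["rougeL"], use_stemmer=True)

def evaluate(benchmark, retrieve, answer) -> dict:
    # benchmark: [{"query", "gold_doc_id", "gold_answer"}, ...]
    hits, rouge_total = 0, 0.0
    for case in benchmark:
        retrieved_ids = retrieve(case["query"])         # ids of fetched docs
        hits += case["gold_doc_id"] in retrieved_ids    # retrieval hit?
        rouge = scorer.score(case["gold_answer"], answer(case["query"]))
        rouge_total += rouge["rougeL"].fmeasure
    n = len(benchmark)
    return {"retrieval_hit_rate": hits / n, "mean_rougeL_f1": rouge_total / n}
```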
180 00:28:39.960 ⇒ 00:28:48.569 Pranav Narahari: Perfect, that’s great. So, I know we started a little bit late, too, so if you have any questions, we can go over by a few minutes.
181 00:28:48.770 ⇒ 00:28:50.729 Pranav Narahari: But yeah, feel free to ask any questions.
182 00:28:52.330 ⇒ 00:29:08.059 Ned: I think Sam was great. He actually, like, you know, answered, you know, plenty of my questions within that particular call itself. That’s awesome. I must say, he’s a very, very humble tech lead. I’m saying this for the second time, but he’s a very humble tech lead.
183 00:29:08.060 ⇒ 00:29:10.599 Pranav Narahari: Yeah, no, Sam’s great. He’s been great to work with.
184 00:29:11.050 ⇒ 00:29:12.130 Ned: Yeah, exactly.
185 00:29:12.210 ⇒ 00:29:22.029 Ned: Exactly. Yeah, with respect to the team structure, he explained it to me; with respect to your clients, he explained to me, he actually explained to me some of the use cases that you currently have within your own team.
186 00:29:22.030 ⇒ 00:29:33.890 Ned: With your particular questions, I have picked up on, you know, some of the bottlenecks or the challenges that you have faced with your end clients. Maybe this may be the case because, like, you know, I was thinking in that particular context,
187 00:29:33.900 ⇒ 00:29:46.600 Ned: that you have faced these sorts of problems, which you have just asked me about with respect to your clients. Like, let’s say hallucinations and all. These are daily cases. We always, you know, see this happening in our daily lives, no problem in that.
188 00:29:47.100 ⇒ 00:29:55.200 Ned: With respect to this particular position, I’m very much motivated to come in and bring my energy to this, you know, position itself.
189 00:29:55.200 ⇒ 00:30:07.290 Ned: Sam already knows that I love to work, you know, from 8am in the morning till 8pm in the night. I know this is very necessary for a particular startup. That’s why I have applied to Brainforge.
190 00:30:07.290 ⇒ 00:30:15.459 Ned: Second thing is my, you know, client-centric approach. That’s why I have applied. And third thing, that you would definitely have, you know, me in the same space
191 00:30:15.460 ⇒ 00:30:33.120 Ned: as what I have worked within, you know, at OneDI and Pentex Solutions, so that would be amazing. So I’ll definitely look forward to, you know, the next round as soon as possible. I know it’s a panel interview. I’d, like, you know, definitely love to tell you about the projects that I have done at OneDI and Pentex Solutions.
192 00:30:33.120 ⇒ 00:30:36.550 Ned: But I think, like, we can keep that for the panel interview.
193 00:30:37.060 ⇒ 00:30:38.480 Pranav Narahari: Sure, sounds great.
194 00:30:38.680 ⇒ 00:30:43.490 Pranav Narahari: Great talking to you, Ned. Yeah, we will reach out to you very shortly.
195 00:30:44.060 ⇒ 00:30:46.030 Ned: Yep, thank you very much.
196 00:30:46.390 ⇒ 00:30:47.899 Ned: Yeah, have a great day.