Meeting Title: Brainforge Interview w/ Pranav Date: 2026-03-05 Meeting participants: Sowmya, Pranav Narahari, Kaela Gallagher


WEBVTT

1 00:00:23.690 00:00:24.600 Pranav Narahari: Hey, Sowmya.

2 00:00:25.090 00:00:26.200 Sowmya: Hi, Pranav!

3 00:00:27.110 00:00:29.120 Sowmya: Good morning, how are you doing today?

4 00:00:29.300 00:00:31.119 Pranav Narahari: I’m good, I’m good, I hope you’re doing well.

5 00:00:31.540 00:00:33.790 Sowmya: Yeah, I’m doing great, thank you for asking.

6 00:00:35.050 00:00:37.500 Pranav Narahari: Great. You ready to get started?

7 00:00:37.810 00:00:38.979 Sowmya: Yeah, sure.

8 00:00:38.980 00:00:39.640 Pranav Narahari: Cool, yeah.

9 00:00:39.640 00:00:40.600 Sowmya: Good stuff.

10 00:00:40.800 00:00:44.919 Pranav Narahari: To… before we get, like, you know, fully into the questions and everything,

11 00:00:45.160 00:00:55.800 Pranav Narahari: Can you tell me just, like, a little bit about yourself? Maybe, take 2 to 3 minutes there, just kind of giving me, like, your background from however long ago you want to start to up until now?

12 00:00:57.020 00:01:00.950 Sowmya: Sure, let me start, like, let me introduce myself.

13 00:01:00.970 00:01:19.270 Sowmya: I have close to, like, 5 years of experience building machine learning systems that actually make it into production and get used by real teams. Most recently at Bank of America, I have been working on customer analytics and document intelligence tools using large language models.

14 00:01:19.270 00:01:25.170 Sowmya: A big part of my role is taking ideas from experimentation to something stable and scalable.

15 00:01:25.170 00:01:40.120 Sowmya: So, like, working on the data pipelines, building APIs, and also deploying models, and making sure they are monitored properly. I have also spent, like, time, like, improving how to evaluate the AI systems, like, especially in the LLM-based features.

16 00:01:40.120 00:01:55.860 Sowmya: So that, like, they are reliable and also grounded in real data. And what I enjoy most, like, is building practical AI solutions that solve real business problems and iterating quickly based on the feedback. That's my experience

17 00:01:56.070 00:01:56.770 Sowmya: Cute.

18 00:01:57.240 00:02:02.069 Pranav Narahari: That’s awesome. Yeah, and just to, like, before we fully dive into things too, I’ll give you a little bit about my background.

19 00:02:02.410 00:02:02.770 Sowmya: Or…

20 00:02:03.130 00:02:13.049 Pranav Narahari: Here at Brainforge, I am, like, part of the delivery team, so I am writing code, shipping code, but then I’m also,

21 00:02:13.050 00:02:32.509 Pranav Narahari: sitting in front of clients, giving presentations about, okay, what did we complete this past week? How are our goals for the week, month, scope of the entire project looking? So, it’s a little bit of, like, that project management side, but it’s still the core thing that I do is that shipping code.

22 00:02:32.550 00:02:35.510 Sowmya: Great, that was great to…

23 00:02:35.510 00:02:43.339 Pranav Narahari: you know, start off on that… on that foot, and given my background, too, like, feel free to, like, ask me specific questions you think that I’d be best at answering.

24 00:02:44.340 00:02:45.360 Sowmya: No, I’m sure.

25 00:02:45.570 00:03:00.350 Pranav Narahari: Cool. So, I have, like, 5 different topics that I want to, like, kind of go into. Given that we have, like, only, what, 27 more minutes left, we can just try to be as brief as possible, given maybe, like, 5-ish minutes per topic.

26 00:03:00.430 00:03:16.069 Pranav Narahari: I guess the first topic that I want to talk about is how you think about AI judgment. And so, when do you feel that it’s right to use AI for a certain problem? When is it not the right solution?

27 00:03:16.200 00:03:20.179 Pranav Narahari: And then… Yeah, maybe we can start there, and then I can ask some follow-ups.

28 00:03:21.070 00:03:35.380 Sowmya: Sure, like, that’s really a good question. Like, honestly, like, one that comes a lot, like, when we are working with the business teams. So far, for me, AI is the right solution when the problem involves, like, patterns.

29 00:03:35.530 00:03:59.420 Sowmya: And also, that, and also, like, problems having a large amount of data, and also, like, having examples rather than fixed rules. It is especially useful for problems like prediction, natural language understanding, and also image recognition, recommendation systems, and also automation. So, for example, in my projects, we used AI models for churn prediction.

30 00:03:59.420 00:04:03.029 Sowmya: To identify the customers who are likely to leave.

31 00:04:03.030 00:04:27.640 Sowmya: like, and LLM-based document intelligence systems to automatically understand and retrieve insights from large volumes of unstructured data. So that's how, like, we usually follow. And… but I am also very cautious about not forcing AI where simpler automation or analytics can do the job. Like, if the problem is deterministic, like a fixed workflow or a simple threshold-based decision,

32 00:04:27.720 00:04:34.820 Sowmya: I would rather use a rule engine or SQL-based logic, because it's faster, right, and cheaper, and easy to maintain.
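
The rule-engine alternative mentioned here can be as small as the sketch below; the threshold and field names are made-up examples, not from any real project.

```python
# Toy illustration of the "deterministic problem" case: a fixed
# threshold-based decision calls for a rule, not a model. The threshold
# and transaction fields here are invented for the example.
def flag_for_review(transaction: dict) -> bool:
    """Simple rule-engine stand-in: deterministic, cheap, and easy to audit."""
    return transaction["amount"] > 10_000 or transaction["country"] not in {"US", "CA"}

flagged = flag_for_review({"amount": 15_000, "country": "US"})   # large amount -> flagged
cleared = flag_for_review({"amount": 50, "country": "US"})       # small, domestic -> cleared
```

Because the logic is explicit, it can be reviewed, versioned, and tested like any other code, which is exactly the maintainability advantage described above.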

33 00:04:35.220 00:04:37.770 Pranav Narahari: Definitely, that’s a great answer.

34 00:04:37.970 00:04:52.389 Pranav Narahari: I think, yeah, what’s also kind of, interesting to me, and I’d like to get your opinion on, is what do you think is, some misconceptions that people have with LLMs on the problems that it can solve, or what you feel like they can’t solve?

35 00:04:54.170 00:05:16.599 Sowmya: Yeah, that's a, like, really interesting one, and honestly, I have seen quite a few, like, misconceptions around LLMs, like, especially… so, like, there's several, like, that should be avoided, right? So, one is AI can replace humans completely, whereas in reality, it is designed to augment human decision-making.

36 00:05:16.600 00:05:33.409 Sowmya: And another misconception is AI models are always accurate, but their performance depends heavily on the data quality, training methods, and proper evaluation. People also assume that AI systems are unbiased by default.

37 00:05:33.410 00:05:36.670 Sowmya: But they can inherit bias from the data.

38 00:05:36.670 00:05:52.410 Sowmya: And fairness and governance are also important right here. And finally, many believe that AI systems don't require any maintenance, but, like, continuous updates are needed, and also continuous monitoring is needed.

39 00:05:53.300 00:05:55.249 Pranav Narahari: Yeah, I totally agree with that.

40 00:05:55.830 00:06:11.240 Pranav Narahari: Yeah, kind of moving on, how do you… with, like, the whole array of models that are continuously changing, that are out there in the public, like, all of OpenAI’s models, Anthropics, Gemini’s, you name it, even the open source ones.

41 00:06:11.240 00:06:11.680 Sowmya: Thank you.

42 00:06:11.680 00:06:19.059 Pranav Narahari: How do you decide on which model to use for what application? What are some of the key characteristics that you look into?

43 00:06:20.560 00:06:28.529 Sowmya: Yeah, that’s a great question, because honestly, like, it’s one of the practical challenges, like, when coming to the models.

44 00:06:28.530 00:06:51.430 Sowmya: Since LLMs like OpenAI and Gemini are continuously evolving, we design systems to be model-agnostic, or, it should not be tightly coupled, and also we create an abstraction layer so that different models can be swapped easily, and in my projects, I typically used models like OpenAI, GPT-4,

45 00:06:51.740 00:06:54.000 Sowmya: Or GPT-4,

46 00:06:54.080 00:07:01.500 Sowmya: for embedding generation, and also text embeddings, and also semantic search in RAG pipelines, depending on the requirements.

47 00:07:01.500 00:07:23.780 Sowmya: like, cost, latency, or privacy. We may also route, you know, like, requests to the other providers, such as Anthropic's Claude, or Azure OpenAI. Like, we also run the evaluation pipelines, like A/B testing, to compare model performance, and also, if at all the newer model performs better, like, we can upgrade without, like, disrupting the system, right?

48 00:07:24.620 00:07:26.909 Pranav Narahari: Sure, yeah.

49 00:07:27.230 00:07:34.389 Pranav Narahari: I guess a lot of these models, they are very similar. There's a lot of overlap. Maybe you shouldn't put a lot of…

50 00:07:34.580 00:07:49.890 Pranav Narahari: Not a ton of thought needs to be… not a ton of thought needs to be put into just, you know, the provider itself. However, they… when we’re using them, especially via API, you can really configure the parameters. How…

51 00:07:50.100 00:07:56.169 Pranav Narahari: How have you used this parameter configuration specifically for a problem that you were working with?

52 00:07:58.240 00:08:01.949 Sowmya: Okay, like, let me, like, that’s a really good one, because while…

53 00:08:01.950 00:08:24.839 Sowmya: coming to the parameter configurations, like, what we will do is, like, most LLMs, like OpenAI, Gemini, or Anthropic, provide similar capabilities, like, through APIs. The best approach is to integrate them using, like, a provider configuration layer, instead of, like, hardcoding a specific model.

54 00:08:24.840 00:08:33.980 Sowmya: Like, we define a provider configuration where we can specify the provider, model name, and also API key and parameters like temperature or,

55 00:08:34.200 00:08:51.600 Sowmya: like that, so that… so the application then calls the provider through the abstraction layer. Like, this allows us to easily, like, switch the models, for example, GPT-4 to Gemini, by simply just updating the configuration, rather than changing all the codebase.

56 00:08:51.600 00:09:03.120 Sowmya: It also enables, like, the fallback models, like, also the routing strategies, so that if the primary model fails, like, we can optimize the cost or latency. This, this works

57 00:09:03.120 00:09:05.520 Sowmya: Better, like, in a better way, right?
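
The provider configuration layer described in this answer can be sketched roughly as below. The field names, defaults, and model strings are illustrative assumptions, not tied to any specific SDK.

```python
from dataclasses import dataclass

# Hypothetical configuration layer -- provider names, models, and defaults
# here are made up for illustration, not taken from any real SDK.
@dataclass
class LLMConfig:
    provider: str             # e.g. "openai", "anthropic", "azure-openai"
    model: str                # e.g. "gpt-4"
    temperature: float = 0.2
    max_tokens: int = 1024

def build_request(cfg: LLMConfig, prompt: str) -> dict:
    """Turn one neutral config into a request payload, so call sites
    depend on the config, not on a hard-coded provider."""
    return {
        "provider": cfg.provider,
        "model": cfg.model,
        "temperature": cfg.temperature,
        "max_tokens": cfg.max_tokens,
        "prompt": prompt,
    }

PRIMARY = LLMConfig(provider="openai", model="gpt-4")
FALLBACK = LLMConfig(provider="anthropic", model="claude", temperature=0.0)

def route(prompt: str, primary_failed: bool = False) -> dict:
    """Fallback routing: use the primary provider unless it has failed."""
    cfg = FALLBACK if primary_failed else PRIMARY
    return build_request(cfg, prompt)
```

Swapping GPT-4 for Gemini, or falling back when the primary provider errors out, then means editing one config object rather than every call site, which is the point being made in the interview.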

58 00:09:06.330 00:09:14.270 Pranav Narahari: Yeah, so one common problem that we face as AI engineers is just the output that these LLMs provide, and

59 00:09:16.770 00:09:32.009 Pranav Narahari: updating, tuning that output such that it fits the exact use case and gives us the information, or gives us the output that we’re looking for. And a good way to do that is via these parameters. Hallucination’s a big issue, right?

60 00:09:32.010 00:09:38.520 Pranav Narahari: What are some parameters that you can configure that can mitigate hallucination?

61 00:09:39.980 00:09:50.210 Sowmya: Yeah, like, that’s really a good point, because while coming to the hallucination, like, it’s a key part to the AI, because, like, what we call, like,

62 00:09:50.590 00:10:09.310 Sowmya: like, from my experience, like, hallucination in LLMs, like, happens when the model generates information that is incorrect or not grounded in the input data. Like, several parameters and factors also influence hallucinations. One of the important parameters is temperature.

63 00:10:09.310 00:10:16.539 Sowmya: Like, where higher temperature increases randomness, and it can lead to, you know, like, more hallucinated responses.

64 00:10:16.540 00:10:30.690 Sowmya: So, another one is top-k, or, like, top-p, nucleus sampling, like, which controls the probability distribution of the tokens, where, you know, a higher value can increase the creativity, but also risks

65 00:10:30.690 00:10:44.379 Sowmya: incorrect outputs, so that will be the major challenge. While coming to the max tokens, it can also affect the hallucinations, because very long responses cause the model to generate, like, unsupported content.

66 00:10:44.380 00:11:00.900 Sowmya: In addition, the prompt design and also context quality play a key role here. If the retrieval context in a RAG system is weak or, like, irrelevant, the model is likely to hallucinate. So, to reduce that, like,

67 00:11:01.030 00:11:16.639 Sowmya: We typically use the lower temperature, or the stronger prompts, or, like, RAG retrieval grounding, and also response validation mechanisms. So, these are the key factors, you know, like, through which we can mitigate the, like, hallucinations.
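
The effect of temperature and nucleus sampling described here can be shown on a toy distribution; this is not a real decoder, and the logits are invented, but it demonstrates why conservative settings suppress low-probability (potentially hallucinated) tokens.

```python
import math

def nucleus_filter(logits: dict, temperature: float, top_p: float) -> dict:
    """Toy illustration of temperature + top-p (nucleus sampling).
    Lower temperature sharpens the softmax; lower top_p keeps only the
    smallest set of tokens whose cumulative probability reaches the
    threshold."""
    # temperature-scaled softmax (max-subtracted for numerical stability)
    scaled = {t: v / temperature for t, v in logits.items()}
    m = max(scaled.values())
    exps = {t: math.exp(v - m) for t, v in scaled.items()}
    z = sum(exps.values())
    probs = {t: e / z for t, e in exps.items()}
    # nucleus: keep highest-probability tokens until cumulative mass >= top_p
    kept, cum = {}, 0.0
    for tok, p in sorted(probs.items(), key=lambda kv: -kv[1]):
        kept[tok] = round(p, 3)
        cum += p
        if cum >= top_p:
            break
    return kept

# Invented logits: the model is confident in "Paris" but assigns some
# mass to a fabricated option.
logits = {"Paris": 5.0, "London": 3.0, "Atlantis": 1.0}
conservative = nucleus_filter(logits, temperature=0.5, top_p=0.9)   # only "Paris" survives
permissive = nucleus_filter(logits, temperature=2.0, top_p=0.95)    # all three stay in play
```

With low temperature and low top-p, only the well-supported token remains; raising both keeps the long-tail token in the candidate set, which is the creativity-versus-reliability tradeoff discussed in the answer.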

68 00:11:17.350 00:11:20.019 Pranav Narahari: Totally. Yeah, that’s a great answer.

69 00:11:20.480 00:11:23.659 Pranav Narahari: This is actually, like, a problem that I was just facing last week, and we had to.

70 00:11:23.660 00:11:24.060 Sowmya: Yeah.

71 00:11:24.080 00:11:28.840 Pranav Narahari: some deep research on it. And just for you, like, for some additional context on…

72 00:11:28.920 00:11:34.180 Pranav Narahari: What we found with some of these latest models is that they usually also have a parameter for thinking.

73 00:11:34.210 00:11:49.800 Pranav Narahari: And so, for example, like Anthropic, they actually won’t let you modulate the temperature parameter, if you’re using thinking. And it makes sense, because with thinking, you are trying to allow for more in-depth thought, maybe more broader…

74 00:11:49.800 00:11:58.619 Pranav Narahari: understandings of what the data is, being used as context, and so you need temperature actually always to stay at 1.

75 00:11:58.910 00:12:14.850 Pranav Narahari: So that’s another balance that, like, with, like, the later models that allow for thinking, it’s something that we need to figure out as engineers, is this… do we want to enable thinking, or do we actually want to enable, you know, not thinking, so we can have a more strict temperature?
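
The thinking-versus-temperature tradeoff described here could be enforced with a small pre-flight check like the one below. The `thinking` and `temperature` keys are assumptions for the sketch, not any provider's actual request schema.

```python
def validate_sampling(params: dict) -> dict:
    """Hypothetical pre-flight guard for the thinking/temperature tradeoff:
    if extended thinking is enabled, temperature must stay at 1 (as described
    above for some providers); otherwise a strict low temperature is allowed."""
    if params.get("thinking", False):
        if params.get("temperature", 1.0) != 1.0:
            raise ValueError("extended thinking requires temperature = 1")
        params.setdefault("temperature", 1.0)
    return params

# Strict temperature without thinking is fine; thinking pins temperature to 1.
strict = validate_sampling({"temperature": 0.1})
thinking = validate_sampling({"thinking": True})
```

Centralizing the check means the "do we want thinking, or a strict temperature?" decision is made explicitly per request rather than failing at the provider.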

76 00:12:14.970 00:12:18.029 Pranav Narahari: But yeah, great answer.

77 00:12:19.020 00:12:19.810 Sowmya: Thank you.

78 00:12:20.640 00:12:25.400 Sowmya: So, how long did it take to resolve the issue?

79 00:12:26.130 00:12:38.760 Pranav Narahari: Yeah, so… for that type of issue, it’s actually not a lot of programming, right? Because it’s just a parameter that we need to change, right? However, where there is a lot of time that we need to… what we needed to spend a lot of time on was…

80 00:12:39.120 00:12:49.879 Pranav Narahari: okay, what are the parameters that we have available? And how do those parameters influence the exact output that the client is looking for?

81 00:12:49.880 00:12:50.280 Sowmya: Okay.

82 00:12:50.280 00:12:56.510 Pranav Narahari: Because… the client, you know, is the e-commerce space. They were looking for,

83 00:12:56.960 00:12:59.709 Pranav Narahari: I think they were looking for certain…

84 00:13:00.050 00:13:17.780 Pranav Narahari: like, ads data to be synthesized and given, like, some actual insights on that, something of that nature. And so we had to take all that into consideration. What we were finding was that with a higher temperature and using thinking, for whatever reason, it was bringing in fake data.

85 00:13:18.030 00:13:21.189 Pranav Narahari: And so, that is not okay, right? So…

86 00:13:21.190 00:13:21.630 Sowmya: Okay.

87 00:13:21.630 00:13:36.589 Pranav Narahari: That’s where we had to be more strict on the temperature to make sure that even with, like, a great system prompt, sometimes we would find that it would slip through the cracks. I think the system prompt actually was, like, the main driver of, like, you know, fixing that issue.

88 00:13:36.590 00:13:36.970 Sowmya: Mm-hmm.

89 00:13:36.970 00:13:50.209 Pranav Narahari: That being said, it’s not a lot of, software development that was needed there. We didn’t have to, like, you know, patch, like, a crazy feature in GitHub and write tons of code, but there’s just a lot of research on the back end.

90 00:13:50.380 00:13:52.039 Sowmya: You know, it’s like…

91 00:13:52.510 00:13:52.920 Pranav Narahari: Yeah.

92 00:13:52.920 00:13:57.090 Sowmya: That's great to hear that, like, it's in production.

93 00:13:57.660 00:13:58.580 Pranav Narahari: Yeah.

94 00:14:02.050 00:14:12.599 Pranav Narahari: What are some of, like, your… the tools that you use currently as a… as an engineer that are some of, like, the latest AI-driven, like, development tools?

95 00:14:14.580 00:14:19.859 Sowmya: Yeah, that’s, like, honestly, like, coming to the tools, I use, like,

96 00:14:20.010 00:14:38.509 Sowmya: I can say, as an AI engineer, like, typically, from data processing to model development and monitoring, so while coming to the model development, I use, like, Python, like, frameworks like PyTorch, TensorFlow, and scikit-learn, and also for LLM applications.

97 00:14:38.510 00:14:45.710 Sowmya: I use LangChain, LangGraph, or similar orchestration frameworks or tools.

98 00:14:45.710 00:14:59.439 Sowmya: And similar RAG pipeline tools. For data processing and pipelines, I use Spark, Databricks, and also Airflow. Like, while coming to the deployment, I containerize the models using Docker and deploy them on Kubernetes.

99 00:14:59.440 00:15:10.099 Sowmya: And this, you know, like, we can use FastAPI services for inference APIs. For experiment tracking and model management, I use MLflow.

100 00:15:10.140 00:15:18.499 Sowmya: For monitoring production systems, I use tools like Prometheus and Grafana. Those are the tools, like, I use in my day-to-day work.

101 00:15:18.820 00:15:31.149 Pranav Narahari: Gotcha. So, at Brainforge so far, our clients haven't required us… or, like, the scope of the projects that we've had so far haven't required us to build our own LLMs, or our own models.

102 00:15:31.490 00:15:37.709 Pranav Narahari: The… the depth that we’ve gone to so far has been primarily just, like.

103 00:15:37.710 00:15:41.709 Pranav Narahari: complex, like, RAG systems. However…

104 00:15:41.710 00:15:44.440 Pranav Narahari: There may have been, like, main projects that we just

105 00:15:45.770 00:15:48.710 Pranav Narahari: But, yeah, how do you feel about…

106 00:15:48.920 00:16:00.230 Pranav Narahari: designing a RAG system, and specifically a RAG system at scale, that it’s going to be in front of a client, potentially going to a client’s customer base, so it’s very sensitive.

107 00:16:02.120 00:16:15.609 Sowmya: Yeah, that's right, like, whenever, like, designing a RAG system, like, that's customer-facing and handles the sensitive data,

108 00:16:15.610 00:16:28.590 Sowmya: like, I usually start by understanding the customer queries and also knowledge sources, such as, like, we use it to have, like, FAQs, or product documentation, support tickets, or, you know, policy documents.

109 00:16:28.590 00:16:37.609 Sowmya: So, firstly, like, we will pre-process the documents here by, like, splitting them into smaller chunks, and then generate the embeddings

110 00:16:37.610 00:16:49.920 Sowmya: for each chunk, so these embeddings can be stored in the vector database based on the client's use case, or for larger PDFs, like, or else we can go with the semantic search.

111 00:16:49.980 00:17:06.620 Sowmya: Or, you know, like, we can go with the hybrid search, like semantic plus keyword search. Like, here we can use that. Look, whenever, you know, like, a customer asks the question, system converts the query into the embeddings, and, you know, retrieves the most relevant document chunks.

112 00:17:06.619 00:17:13.609 Sowmya: So these retrieved documents are then passed, like, context to the LLM to generate the grounded response.

113 00:17:13.609 00:17:29.530 Sowmya: So, here, you know, the main challenge is that, like, to improve the reliability, we also implement the citation-based responses, like a prompt template, and also monitoring to track the response quality, and also we can reduce the hallucinations, as we discussed earlier.

114 00:17:29.530 00:17:30.660 Sowmya: About this. Yeah.
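
The retrieve-then-ground flow described in this answer can be sketched minimally as below. A toy bag-of-words vector stands in for a real embedding model, and the documents are invented for the example.

```python
import math
from collections import Counter

# Minimal retrieval sketch: "embed" query and chunks, rank by cosine
# similarity, and pass the best chunk to the LLM as grounding context.
def embed(text: str) -> Counter:
    """Toy bag-of-words stand-in for a real embedding model."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str, chunks: list, k: int = 1) -> list:
    """Rank stored chunks by similarity to the query."""
    q = embed(query)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]

# Invented knowledge-base chunks (FAQs, policy snippets, etc.)
chunks = [
    "Refunds are processed within 5 business days.",
    "Our support line is open 9am to 5pm on weekdays.",
]
context = retrieve("how long do refunds take", chunks)[0]
# The retrieved chunk then becomes the grounding context for the LLM:
prompt = f"Answer only from this context:\n{context}\n\nQ: How long do refunds take?"
```

In a production system the bag-of-words step is replaced by real embeddings in a vector database, and hybrid (semantic plus keyword) search can replace the single cosine ranking, as described above.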

115 00:17:32.180 00:17:33.030 Sowmya: Yeah.

116 00:17:33.030 00:17:40.360 Pranav Narahari: That sounds great, how do you think about… documents, or…

117 00:17:40.970 00:17:56.320 Pranav Narahari: sources of truth that have mixed media. So, let’s say if it has… or, mixed data types. Like, let’s say, text, images, maybe in some case, like, links to, like, other, text files or images.

118 00:17:57.090 00:18:07.219 Pranav Narahari: it’s totally fine if you haven’t, like, worked on something of that, like, complexity, but I guess my… I’m more interested on, like, how you would think about designing a system like that.

119 00:18:09.240 00:18:27.390 Sowmya: Yeah, definitely, like, that actually is something I have been exploring quite a bit, like, like, like, a lot of, let me think about this, and, like, rephrase the answer, like, for me. Yeah. And when I think about, like, designing.

120 00:18:28.150 00:18:39.600 Sowmya: When a document contains, like, mixed data types, like, text, images, and also links, I design this system as, you know, a multimodal RAG pipeline. This is very important, because

121 00:18:39.600 00:18:49.789 Sowmya: The first step would be pre-processing each data type differently, like, you know, the text is chunked and turned into embeddings, using embedding models, and, you know,

122 00:18:49.880 00:19:05.100 Sowmya: like, we do have the embedding models. So, for images, you know, like, we process them using vision models, like OCR, to extract the captions or text. So, for images without text, we generate, like,

123 00:19:05.100 00:19:22.740 Sowmya: image embeddings, like, using the multimodal embeddings, like, again. So, all these embeddings are stored in a vector database with the metadata linking them back to the original documents. Like, when a user comes and, like, queries… we retrieve the relevant context

124 00:19:22.740 00:19:35.449 Sowmya: across these different modalities, like, then we retrieve the text and also metadata, and then pass to the LLMs as a context, so that it can generate a grounded response.

125 00:19:35.450 00:19:39.079 Sowmya: So, this is the… while coming to the architecture, you know.

126 00:19:39.250 00:19:53.650 Sowmya: Firstly, the document ingestion, like PDFs, images, or, like that. And next would be, like, data preprocessing, text chunking, embeddings, and also image OCR, or visual embeddings. While coming to the links, we have,

127 00:19:53.980 00:20:03.469 Sowmya: crawl to extract the content, like, then generate the multimodal embeddings for each modality, and that must be…

128 00:20:03.470 00:20:14.259 Sowmya: stored in the vector store. So, to store the embeddings with the metadata, and also then comes the retrieval part, you know, retrieval, like, similarity search.

129 00:20:14.260 00:20:31.810 Sowmya: And then, after that, we can just do context filling, like, combining the retrieved text with the image descriptions. And also, like, lastly, the LLM will generate the grounded response. This is the architecture I would follow, like, whenever I face this kind of situation.
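
The per-modality ingestion step described in this architecture could be dispatched roughly as below. The handlers are stand-ins (a real pipeline would call an OCR engine, a multimodal embedding model, and a crawler here), and the record shape is an assumption for the sketch.

```python
# Sketch of per-modality preprocessing for a multimodal RAG pipeline.
# Each item is routed to a modality-specific handler, and every record
# keeps metadata linking it back to the source document.
def ingest(item: dict) -> dict:
    kind = item["type"]
    if kind == "text":
        content = item["data"]                                 # chunk + embed as-is
    elif kind == "image":
        content = item.get("ocr_text") or "[image embedding]"  # OCR first, else visual embedding
    elif kind == "link":
        content = f"[crawled: {item['url']}]"                  # crawl, then re-ingest the content
    else:
        raise ValueError(f"unsupported modality: {kind}")
    return {"content": content, "meta": {"source": item["source"], "type": kind}}

records = [
    ingest({"type": "text", "data": "Warranty lasts 2 years.", "source": "doc1"}),
    ingest({"type": "image", "ocr_text": "Serial: A-123", "source": "doc1"}),
]
```

The metadata field is what lets retrieval across modalities cite the original document, as the answer above emphasizes.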

130 00:20:32.170 00:20:47.709 Pranav Narahari: Yeah, that’s a… that’s a great architecture. That design makes a lot of sense. Yeah, one of the last topics I want to talk about, before we get into, like, some questions, if you have any, are, how do you… when we build, like, these complex LLM systems.

131 00:20:48.180 00:20:50.310 Pranav Narahari: How would you go about evaluating them?

132 00:20:53.680 00:21:03.220 Sowmya: Yeah, like, evaluation, I’d say, like, one of the most critical, right, like, often overlooked parts of building the reliable LLM systems.

133 00:21:03.220 00:21:17.309 Sowmya: So, when I think about evaluation, so, I, like, building the complex LLM systems, most important thing is the data pipelines, retrieval systems, and LLM orchestration, and also monitoring layers.

134 00:21:17.360 00:21:34.250 Sowmya: For example, you know, in a typical architecture, we may have, like, document ingestion, embedding generation, and also vector retrieval, and also, you know, LLM generation, all working together. We use, you know, like, frameworks like LangChain or agent orchestration tools

135 00:21:34.720 00:21:40.590 Sowmya: To help to manage the multi-step workflows such as tool usage, retrieval, and reasoning.

136 00:21:40.860 00:21:47.190 Sowmya: To validate these systems, like, we use the automated and human evaluation methods, so that

137 00:21:47.390 00:21:58.190 Sowmya: Automated evaluation methods, like, you can use the metrics like retrieval accuracy, semantic similarity, and also latency and hallucination detection.

138 00:22:00.290 00:22:06.900 Sowmya: Apart from these, we can use the A/B testing as well to benchmark these, to compare the outputs across the…

139 00:22:07.130 00:22:17.299 Sowmya: prompts and also models. And while coming to the critical systems, we had human-in-the-loop evaluation to review the correctness and also relevance of the responses.

140 00:22:17.420 00:22:35.710 Sowmya: So, while coming to the monitoring in production, like, we're continuously monitoring the production using logging and feedback loops to avoid, like, any issues, and we also make sure the systems are reliable and make sure the data is available.
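
One of the automated metrics mentioned above, retrieval accuracy, can be computed as a simple hit rate over a labeled evaluation set. The question and chunk ids below are made up for the sketch.

```python
# Tiny automated-evaluation sketch: retrieval hit rate over a labeled set.
# `retrieved` maps each eval question to the chunk ids the retriever
# returned; `expected` gives the chunk id a correct answer must come from.
def retrieval_hit_rate(retrieved: dict, expected: dict) -> float:
    """Fraction of eval questions whose gold chunk appears among the
    retriever's returned chunk ids."""
    hits = sum(1 for q, gold in expected.items() if gold in retrieved.get(q, []))
    return hits / len(expected)

retrieved = {"q1": ["c3", "c7"], "q2": ["c1"], "q3": ["c9"]}
expected = {"q1": "c7", "q2": "c1", "q3": "c2"}
rate = retrieval_hit_rate(retrieved, expected)  # 2 of 3 gold chunks retrieved
```

Real pipelines track this alongside semantic-similarity scoring, latency, and hallucination detection, and compare it across prompts and models in A/B tests, as the answer describes.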

141 00:22:36.740 00:22:38.330 Pranav Narahari: Gotcha. Yeah.

142 00:22:39.910 00:22:58.680 Pranav Narahari: Yeah, last question before we kind of get into any questions that you have. I kind of mentioned how we had an issue in production about hallucination, and how we went about mitigating that in the AI system. Do you have any examples of, AI systems that you’ve developed, or developed as part of a team?

143 00:22:58.770 00:23:07.139 Pranav Narahari: That were pushed into production, and then only then did you realize that there were some issues that needed to be patched, and what did you do to fix them?

144 00:23:09.210 00:23:27.209 Sowmya: Absolutely, like, that's… in my projects, like, I do have a couple of issues, you know? Like, while coming to one specific issue, I would like to say, you know, like, we have, like, an enterprise document intelligence system, like, a RAG system, where hallucination was a key consideration.

145 00:23:27.210 00:23:37.460 Sowmya: Like, like you, we have faced that. Like, we are more reliant on the accurate information. Sometimes the LLM would generate answers that sounded correct, but were not

146 00:23:37.810 00:23:56.520 Sowmya: actually supported by the retrieved documents. For example, like, if a user asks, like, about a policy in the policy document, it might generate a detailed explanation, even when the document does not contain the exact information. So, the hallucination.

147 00:23:56.520 00:24:04.710 Sowmya: the system, like, the system will… leads to the hallucination. Like, this is the typical hallucination, like, where the model…

148 00:24:04.900 00:24:12.069 Sowmya: fills in the answer with, like, its own knowledge, like, you know, whatever you call it,

149 00:24:12.070 00:24:26.710 Sowmya: like, randomly, it takes it randomly. In order to… how we solved this problem, you know, to reduce the hallucination, like, we strictly enforced retrieval grounding, like, so the LLM answers, like, only from the retrieved documents.

150 00:24:26.710 00:24:42.790 Sowmya: Also, we also focused on citation-based responses here, because the model had to reference only the document chunks it used, so that we… again, also, we also focused on lower temperature settings to reduce the creative generation, and also

151 00:24:43.580 00:24:49.859 Sowmya: We also focused on the prompt constraints, like, like the prompt which instructed the model to say, "I don't have"

152 00:24:49.940 00:24:59.689 Sowmya: like, "I don't have any information," if the model doesn't have, like, the correct documents or correct information.

153 00:24:59.710 00:25:14.599 Sowmya: And also supported answers, like, see, you know, this is our, while coming to a RAG-based document intelligence system, so hallucination occurred when the model generated answers, like, which are not supported by the documents.

154 00:25:14.600 00:25:26.030 Sowmya: So, we used the retrieval grounding, and citation-based responses, prompt constraints, and lower temperature settings here. So, this is the one I have worked on. So, to fix the issue.
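
The grounding-plus-refusal behavior described in this answer can be sketched as below. The substring check stands in for a real entailment or similarity test, and the chunks and fallback wording are invented for the example.

```python
# Sketch of the citation-and-refusal pattern: answer only when the claim
# is supported by a retrieved chunk, cite the chunk, and otherwise fall
# back to the constrained "I don't have that information" response.
FALLBACK = "I don't have that information in the provided documents."

def grounded_answer(claim: str, chunks: list) -> str:
    for i, chunk in enumerate(chunks):
        # Stand-in support check; real systems use entailment or
        # similarity scoring rather than a substring match.
        if claim.lower() in chunk.lower():
            return f"{claim} [source: chunk {i}]"   # citation-based response
    return FALLBACK

chunks = ["Refunds are processed within 5 business days."]
supported = grounded_answer("processed within 5 business days", chunks)
unsupported = grounded_answer("refunds are instant", chunks)
```

Combined with a low temperature, this is the shape of the fix described above: the model either cites a retrieved chunk or explicitly declines, instead of filling the gap from its own knowledge.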

155 00:25:27.080 00:25:42.109 Pranav Narahari: Yeah, that makes… that makes sense. So I have, in 5 minutes, I have, like, a strict drop. I have to… I can’t stay on for longer, but if you have, you know, any questions, I’m happy to answer them in the next 5 minutes, or, you know, after the call, you can email them to me.

156 00:25:43.610 00:25:54.229 Sowmya: Yeah, sure, like, I just have some couple of questions, like, I’ll make sure to complete it faster, and what is the biggest challenge the team is currently facing with the AI features?

157 00:25:54.930 00:25:57.140 Pranav Narahari: Yeah, that’s a great question.

158 00:25:57.710 00:25:59.350 Pranav Narahari: Let me… let me think.

159 00:26:03.470 00:26:13.880 Pranav Narahari: One, one thing that you mentioned that you’ve worked on that I feel like currently we probably don’t have a lot of support is, on actually…

160 00:26:14.380 00:26:17.250 Pranav Narahari: Building models, and having really…

161 00:26:17.440 00:26:24.200 Pranav Narahari: deep understanding of, like, what a good… and alternatively, like, when we’re not building a model,

162 00:26:24.950 00:26:34.579 Pranav Narahari: building RAG systems that are extremely complex, you know, I think we do have the skill set for that, but I think for future…

163 00:26:36.780 00:26:51.049 Pranav Narahari: Sorry, one second. Yeah, for, like, for future projects, that’s only gonna become more and more, technically complex. And so, that was… that’s definitely some area that we need to… that we need to fill for whoever is, like.

164 00:26:51.090 00:26:58.190 Pranav Narahari: AI engineers at Brainforge in the future, so, that was great to hear that you have some, you have some experience there.

165 00:26:59.060 00:27:12.369 Sowmya: Yeah, I got you, thank you for answering, and just, just one more question, like, if I were to join, like, what should I focus on first? Like, these challenges, or, like, any other issues, like, that need to be focused on?

166 00:27:12.930 00:27:28.559 Pranav Narahari: Yeah, I would say, from a technical perspective, it seems like you're very well versed. I think you would, you'd fit in great there. There wouldn't be a lot of upskilling needed there. I would want to make sure that you feel comfortable using

167 00:27:28.690 00:27:35.219 Pranav Narahari: like… in AI-enabled development, and so using tools like Cursor,

168 00:27:35.220 00:27:35.840 Sowmya: Yeah.

169 00:27:35.840 00:27:38.120 Pranav Narahari: using them…

170 00:27:39.470 00:27:55.849 Pranav Narahari: Via, like, feeling very comfortable using agents to help supplement your work, and you focusing on very… the strategic and creative efforts of your position, that’s gonna be huge, because that’s what allows us to scale, that’s what allows you to work in multiple different work streams.

171 00:27:55.850 00:28:03.519 Pranav Narahari: So that’s one… one aspect of, like, what being successful at Brainforge looks like. Another part of it is…

172 00:28:03.640 00:28:19.609 Pranav Narahari: having the technical side, but also being, agile enough to support certain client, operations. So, like I said, I’m an AI engineer, I have a technical background, I can ship code,

173 00:28:19.840 00:28:31.010 Pranav Narahari: But also, being able to describe your work to a non-technical leader at a different company who is a subject matter expert on the product that you’re

174 00:28:31.530 00:28:43.520 Pranav Narahari: that you’re building to solve their problem. If you can fill in that gap, that’s what’s gonna really, that’s gonna really help you out, but it’s also gonna make you much more successful at Brainforge.

175 00:28:44.460 00:28:49.389 Sowmya: Okay, that’s great, like, that’s a great,

176 00:28:49.600 00:28:57.299 Sowmya: which is something that we need to know whenever, like, we are working in any organization. Yeah, like, I will make sure to…

177 00:28:57.610 00:28:58.690 Sowmya: to that.

178 00:28:59.580 00:29:01.210 Sowmya: And that’s all I…

179 00:29:01.210 00:29:19.279 Pranav Narahari: everybody. So, it’s, it’s really about your interest, too. It’s not like everybody’s giving client presentations if they don’t… if they don’t want to, then they don’t do it, for sure. And there’s other people at the company that fill in that role. However, I guess it’s more of, like, an opportunity here.

180 00:29:19.280 00:29:20.210 Sowmya: Got it.

181 00:29:20.210 00:29:21.790 Pranav Narahari: Yeah.

182 00:29:22.010 00:29:38.099 Pranav Narahari: which I think for a lot of people, you know, engineering has become kind of like, you work in your own, just like, your own desk, just kind of glued to your laptop. Here, it’s a lot more interaction with people, discussing how we can build systems that can

183 00:29:38.100 00:29:44.310 Pranav Narahari: do a lot of that grunt work for you. However, like, we can have the really, like, high-value conversation.

184 00:29:45.010 00:29:46.649 Pranav Narahari: Like, power that grunt work.

185 00:29:46.770 00:29:50.419 Pranav Narahari: That’s… that’s really important here at Brainforge.

186 00:29:51.200 00:30:06.339 Sowmya: Okay, got you, got your point. Thank you so much for your time, and I think it’s… the time is running, so I think you have a meeting, so I don’t want to take too much time. Thank you so much, and you have a great rest of the day.

187 00:30:06.590 00:30:10.500 Pranav Narahari: Thank you, yeah, you too, and yeah, we will follow up, shortly.

188 00:30:11.020 00:30:11.959 Sowmya: Thank you! Bye!

189 00:30:11.960 00:30:12.999 Pranav Narahari: Have a good day. Bye.