Meeting Title: Pranjali-Jake: “Making Sense of RAG: How It Bridges AI and Your Data” Date: 2025-09-18 Meeting participants: Jake Nathan, Pranjali Basmatkar, Mike Klaczynski, Uttam Kumaran
WEBVTT
1 00:00:59.180 ⇒ 00:01:00.400 Jake Nathan: April and July.
2 00:01:03.680 ⇒ 00:01:05.160 Pranjali Basmatkar: Hey, Jay, how are you doing?
3 00:01:05.740 ⇒ 00:01:07.119 Jake Nathan: Good, how are you doing?
4 00:01:07.120 ⇒ 00:01:07.940 Pranjali Basmatkar: Good.
5 00:01:08.280 ⇒ 00:01:11.209 Jake Nathan: Yeah, thank you for making time to do this, I really appreciate it.
6 00:01:11.210 ⇒ 00:01:13.180 Pranjali Basmatkar: Oh, of course, of course.
7 00:01:13.310 ⇒ 00:01:15.330 Pranjali Basmatkar: Where are you based out of? Are you in the Bay Area?
8 00:01:15.810 ⇒ 00:01:20.729 Jake Nathan: I’m actually in Austin, Texas. Have you… have you been to Austin before?
9 00:01:20.930 ⇒ 00:01:26.170 Pranjali Basmatkar: Not really, I haven’t. I mean, I think I did, like, a layover once, but then nothing else.
10 00:01:26.170 ⇒ 00:01:30.540 Jake Nathan: I gotcha, yeah. Have you been in the Bay Area for a while?
11 00:01:30.810 ⇒ 00:01:37.719 Pranjali Basmatkar: Yeah, yeah, pretty much all of my time in the US has been in the Bay Area, so… yeah, this is home for now.
12 00:01:37.720 ⇒ 00:01:51.839 Jake Nathan: Nice, nice, that’s awesome. Yeah, I was there a few months ago, and I’ve been in Austin my whole life, but honestly, if I were to move somewhere else, I think it would be somewhere in the Bay Area, just because… I don’t know, I loved it. It was incredible.
13 00:01:52.220 ⇒ 00:01:53.730 Pranjali Basmatkar: Awesome.
14 00:01:53.940 ⇒ 00:02:10.500 Jake Nathan: Well, cool. Well, yeah, like I said, thanks for making time. I know I kind of sent you the questions beforehand, and I think what would be best is, you know, we’ll just see, with the time that we have, we’ll just kind of work through them, and go from there, but any other questions before we get started?
15 00:02:11.080 ⇒ 00:02:22.809 Pranjali Basmatkar: Not really. I mean, honestly, this is one of the first ones I’ve done in this setting, so I’ll let you take the lead and answer whatever questions you have. We’ll go from there.
16 00:02:23.500 ⇒ 00:02:36.580 Jake Nathan: Sounds good, and yeah, the process, so, like, once we do this interview, I’ll take some time, like, rewatch the interview, and… and come up with the first draft, and then I’ll make sure to, you know, like, anything that you say, like.
17 00:02:36.580 ⇒ 00:02:44.360 Jake Nathan: you’ll… you’ll get final say on everything before we publish, so it’s not like, I’m just gonna immediately publish this, and you’d be surprised, so…
18 00:02:44.360 ⇒ 00:02:47.019 Jake Nathan: Yeah, just wanted to make that note. So…
19 00:02:47.030 ⇒ 00:02:57.820 Jake Nathan: Yeah, I think you kind of got the sense of the article, and so, yeah, the first question is, you know, why does RAG exist, and what does it do that LLMs can’t do?
20 00:02:58.440 ⇒ 00:03:08.410 Pranjali Basmatkar: Right. So, I think we can sort of, like, break the question down, like, two parts. Why does RAG exist is probably, one that’s…
21 00:03:08.950 ⇒ 00:03:29.160 Pranjali Basmatkar: already sort of, like, something we’ll talk about through the interview. And then, I guess, the second one is, what are the limitations with the current LLMs? So, I mean, everyone in the world today is using all kinds of LLMs. All of these LLMs are really smart, they can really answer a lot of questions, but there are some key limitations that
22 00:03:29.160 ⇒ 00:03:41.349 Pranjali Basmatkar: sort of, like, prevent enterprises from just, like, adopting AI and adopting, LLMs in general. So, the very first thing is just, like, knowledge cutoff. So, LLMs are always trained on a set of
23 00:03:41.350 ⇒ 00:03:45.030 Pranjali Basmatkar: training data, and the training data has a cutoff at a certain date.
24 00:03:45.030 ⇒ 00:04:09.960 Pranjali Basmatkar: So, let’s say if the model was trained on data up till 2024, so any new information since after that point is not going to be part of that LLM’s parametric knowledge, and that’s a huge pitfall. So, if the information that the LLM has is not latest or not fresh, then that information can’t be trusted. That information might not even be accurate.
25 00:04:09.960 ⇒ 00:04:34.199 Pranjali Basmatkar: any longer. So, RAG comes in sort of, like, at the context layer, and Contextual talks about this a lot, I’m sure this has come up a lot. So, there’s the intelligence layer, where all of the LLMs exist, where, you know, they use their parametric knowledge, the data that they were trained on to answer all these questions. They’re really good at English, they’re really good at different languages, generalizable tasks. And then there’s the data layer, where all of the data from
26 00:04:34.200 ⇒ 00:04:36.569 Pranjali Basmatkar: Enterprises, organizations, is stored.
27 00:04:36.820 ⇒ 00:04:43.320 Pranjali Basmatkar: So RAG is sort of like the bridge between the intelligence layer and the data layer that connects both of these,
28 00:04:43.320 ⇒ 00:05:00.890 Pranjali Basmatkar: both of these really crucial layers in a way that it’s relevant. So, RAG will basically capture all of the relevant information from the data layer, and then send it to the LLM so that it can use its generalizable knowledge and, you know, give out fresh, relevant information.
29 00:05:00.890 ⇒ 00:05:24.939 Pranjali Basmatkar: So that’s one thing that is, like, a very important use case for RAG. Secondly, hallucination is something that I think everyone has been talking about for a very long time, and it’s still not a solved problem. So LLMs, while they have all of this information in their parametric knowledge, also are sometimes prone to coming up with information that might not be relevant, or might not be
30 00:05:24.980 ⇒ 00:05:49.600 Pranjali Basmatkar: appropriate, or maybe they’re just misunderstanding the information that they have been trained on. So, having RAG as, like, a middle layer or a context layer helps us ground all of the information that the LLM is coming up with into the data that’s been provided by the organizations. So it’s a way to guarantee groundedness, factuality, and you basically just get, like, an overall
31 00:05:49.600 ⇒ 00:05:52.360 Pranjali Basmatkar: much better response at the end.
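[Editor’s note: a minimal sketch of the retrieve-then-generate loop described above, for illustration only. The toy term-overlap retriever and the call_llm placeholder are assumptions, not Contextual AI’s implementation.]

```python
# Toy RAG loop: retrieve the most relevant snippets from a small in-memory
# "data layer" and prepend them to the prompt sent to the model.
# `call_llm` is a hypothetical placeholder, not a real client.

def score(query: str, doc: str) -> int:
    """Crude relevance score: number of query terms that appear in the doc."""
    q_terms = set(query.lower().split())
    return sum(1 for term in set(doc.lower().split()) if term in q_terms)

def retrieve(query: str, docs: list[str], k: int = 3) -> list[str]:
    """Return the top-k documents by the crude score above."""
    return sorted(docs, key=lambda d: score(query, d), reverse=True)[:k]

def call_llm(prompt: str) -> str:
    # Stand-in so the sketch runs end to end; swap in a real model client.
    return "[model response grounded in the supplied context]"

def answer(query: str, docs: list[str]) -> str:
    context = "\n\n".join(retrieve(query, docs))
    prompt = (
        "Answer using ONLY the context below. If the answer is not there, say so.\n\n"
        f"Context:\n{context}\n\nQuestion: {query}"
    )
    return call_llm(prompt)

docs = [
    "Q2 revenue grew 12% year over year.",
    "The onboarding guide covers SSO setup.",
    "Headcount was flat in 2024.",
]
print(answer("How much did revenue grow in Q2?", docs))
```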
32 00:05:53.440 ⇒ 00:06:03.199 Jake Nathan: That totally makes sense, and yeah, you kind of alluded to this, but, where do you see, like, right now, RAG being the most useful?
33 00:06:03.900 ⇒ 00:06:22.240 Pranjali Basmatkar: I think there’s… there’s use cases for RAG in pretty much every field, I believe, but some that can be really impact… that… that can have the most impact with RAG could be, like, financial services, or basically any services that rely on, you know.
34 00:06:22.240 ⇒ 00:06:37.849 Pranjali Basmatkar: fresh information to be used. So, imagine, like, a financial services firm which, you know, uses, really recent information. It can be, like, you know, trading information, stock prices, or,
35 00:06:37.850 ⇒ 00:06:49.950 Pranjali Basmatkar: blogs from financial companies or, other… other organizations out there. So, using this recent information and coming up with, like, cohesive
36 00:06:49.950 ⇒ 00:07:04.989 Pranjali Basmatkar: advice for a financial firm can be very useful. So I guess financial firms can be one. Healthcare is a very, important field that can really benefit from RAG. Having access to, you know, new,
37 00:07:05.240 ⇒ 00:07:17.409 Pranjali Basmatkar: new medical texts or new blogs from conferences where they’re talking about newer research can really benefit, the course of treatment. And, what else?
38 00:07:17.410 ⇒ 00:07:28.889 Pranjali Basmatkar: legal, compliance, everything. Honestly, I feel like, any use case where there’s new information that can go into the world, will benefit with RAG.
39 00:07:29.510 ⇒ 00:07:40.129 Jake Nathan: Yeah, and you kind of alluded to that at the end, but I’m curious if there’s been really any industry or any business problem where you’ve kind of thought, actually, this shouldn’t be a place for RAG.
40 00:07:40.570 ⇒ 00:07:53.929 Pranjali Basmatkar: So, there’s… there’s still sort of, like, some use cases where RAG in itself might not fit completely. Some, some use cases where you probably don’t,
41 00:07:53.930 ⇒ 00:08:05.720 Pranjali Basmatkar: don’t have access to the complete information. Say, you only have parts of information available, and you want to sort of, like, base your response on, just, like.
42 00:08:05.950 ⇒ 00:08:27.090 Pranjali Basmatkar: like, a large amount of historical information that can’t possibly be chunked or retrieved or fed into the context window of a model. So maybe those are the use cases where RAG could fall short, but then we have already reached a point where we are talking about, you know, agents and, dynamic agents that can, you know, overcome these
43 00:08:27.090 ⇒ 00:08:30.869 Pranjali Basmatkar: challenges and make RAG work as a context layer again.
44 00:08:31.670 ⇒ 00:08:47.940 Jake Nathan: Okay, that… and, yeah, do you, like, like you said, it seems like, there… pretty much any use case RAG would make sense, but, like, off the top of your head, like, is there, has there been, like, a business problem where you’ve actually, like.
45 00:08:48.670 ⇒ 00:08:53.409 Jake Nathan: told, like, a leader to… to not use RAG in that case.
46 00:08:53.940 ⇒ 00:09:02.259 Pranjali Basmatkar: So, honestly, RAG does apply to most use cases, but when we do advise an enterprise to,
47 00:09:02.260 ⇒ 00:09:26.430 Pranjali Basmatkar: maybe take, like, a few steps before, you know, engaging with RAG. It’s generally to do with other logistical things. So, with RAG, you kind of want to have access to the data you want to base your LLMs on. So, data readiness is really important, and having a clear picture of what use case you want to solve, and what business problem you want to solve, and what’s the overall metric you want to index on.
48 00:09:26.480 ⇒ 00:09:28.570 Pranjali Basmatkar: So, I mean…
49 00:09:28.660 ⇒ 00:09:42.079 Pranjali Basmatkar: Sort of, like, evaluating these systems and having a highly accurate system is one thing, but you also want to make sure that the business unit is saving time, or… or, you know, benefiting from this overall system.
50 00:09:42.080 ⇒ 00:09:51.309 Pranjali Basmatkar: So, a lot of times, the readiness to adopt RAG is, is generally evaluated using these things.
51 00:09:51.310 ⇒ 00:10:08.260 Pranjali Basmatkar: And if all of these things check out, it’s a much faster way to sort of, like, loop through the entire process, you know, set up POCs, set up pilots, do some human-in-the-loop evaluation, see how good the use case is looking, and then, you know, hit production and go zoom zoom.
52 00:10:08.840 ⇒ 00:10:27.850 Jake Nathan: That totally makes sense, and that kind of leads us to the next question. It seems like, from what you’re saying, one mistake that companies might make is they might just want RAG immediately, so they just go right into it, but like you’re saying, it seems like a mistake is you should actually take some time to prepare beforehand and actually figure out the
53 00:10:27.850 ⇒ 00:10:35.150 Jake Nathan: business problem that you’re trying to solve. So, on that note, are there other mistakes that you see companies make when they’re trying to
54 00:10:35.150 ⇒ 00:10:37.689 Jake Nathan: think about RAG, or ultimately implement it?
55 00:10:37.840 ⇒ 00:10:46.070 Pranjali Basmatkar: Yeah, so, a lot of enterprises that Contextual has worked with have always been the ones who have tried to build RAG in-house.
56 00:10:46.070 ⇒ 00:11:01.349 Pranjali Basmatkar: This almost always happens. But there’s a reason why a lot of in-house RAG systems sort of, like, never go past the POC stage. They’re always stuck there, they never are production ready. So, I mean, I’ve noticed a few different,
57 00:11:01.480 ⇒ 00:11:21.220 Pranjali Basmatkar: challenges that most in-house RAG systems could face. One is, basically, they see RAG as sort of, like, a static system. They do, like, a one-time ingestion, they would do one-time extraction, and they basically use just the same static information for their LLM to sort of, like, call upon.
58 00:11:21.220 ⇒ 00:11:25.939 Pranjali Basmatkar: And that kind of completely defeats the whole purpose of, you know, RAG getting fresh information.
59 00:11:25.940 ⇒ 00:11:41.269 Pranjali Basmatkar: So, it’s basically not a one-time ingestion process or one-time extraction process. It needs to be, like, a continuous process that needs to be fueled with newer information every now and then. So that’s one thing. And then,
60 00:11:41.270 ⇒ 00:11:47.770 Pranjali Basmatkar: RAG is really sort of, like, broken down into two, two phases. One is the retrieval phase, where you’re
61 00:11:47.770 ⇒ 00:12:01.220 Pranjali Basmatkar: you know, scanning through all of these many documents, and then retrieving the most relevant ones. And then there’s the generation phase, where you use these retrieved chunks and send them to the LLM. So,
62 00:12:02.500 ⇒ 00:12:24.650 Pranjali Basmatkar: there’s few ways retrieval can go wrong. Basically, you could do just, like, bad chunking, you could not preserve a lot of context in each of your chunks, and then when ultimately you’re trying to do retrieval and you’re trying to do generation, you’ll just have, not very contextful chunks, and then those, those won’t really lead to any good results at the end.
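[Editor’s note: an illustrative sketch of the chunking point above, assuming a simple word-based splitter with overlap so neighbouring context is preserved; sizes are arbitrary and not a recommendation from the interview.]

```python
def chunk_text(text: str, chunk_size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into word-based chunks of ~chunk_size words that overlap by
    `overlap` words, so context isn't lost at hard chunk boundaries."""
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        piece = words[start:start + chunk_size]
        if piece:
            chunks.append(" ".join(piece))
        if start + chunk_size >= len(words):
            break
    return chunks

# Example: a 500-word document becomes 3 overlapping chunks of up to 200 words.
print(len(chunk_text("word " * 500)))  # 3
```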
63 00:12:24.820 ⇒ 00:12:49.559 Pranjali Basmatkar: I guess one other thing is, once you do have a system ready, you kind of also want to have a very solid plan for evaluation. So, again, that just comes down to how to technically evaluate your retrieval system, how to technically evaluate your generation system, how do you check against hallucination and groundedness, and those sort of things.
64 00:12:49.560 ⇒ 00:13:03.030 Pranjali Basmatkar: all of these things can sort of, like, add up to, create, like, a RAG system that kind of works, but will never be production ready. So, yeah, these are just some of the things that I’ve seen in the last few years.
65 00:13:03.410 ⇒ 00:13:13.410 Jake Nathan: That… that makes sense, and especially it’s… it is pretty ironic that, you know, people implement RAG, but then, like you’re saying, use the same static data, because that’s, you know.
66 00:13:13.600 ⇒ 00:13:14.060 Pranjali Basmatkar: That’s fair.
67 00:13:14.060 ⇒ 00:13:20.339 Jake Nathan: So the problem you’re trying to solve in the first place, so, that’s interesting. And, to your last point,
68 00:13:20.360 ⇒ 00:13:34.130 Jake Nathan: when someone actually implements a RAG system, how do you measure? Like you said, you want to measure each part of it, like, what… what metrics are you specifically measuring? How do you keep track of them? Like, what’s that measurement process like?
69 00:13:34.460 ⇒ 00:13:45.759 Pranjali Basmatkar: Right. So, I guess when you’re setting up, sort of, like, a RAG system, in the POC stage, you kind of want to evaluate your retrieval as well as your generation. So, you can evaluate your retrieval
70 00:13:45.770 ⇒ 00:13:59.429 Pranjali Basmatkar: by using just, you know, any of the metrics that we have been playing with for many years now, like precision at K, recall at K. You just want to make sure that the top-k retrieved chunks are actually, you know, high on precision and recall.
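[Editor’s note: a small sketch of the precision@k / recall@k metrics mentioned, computed against a hand-labeled set of relevant chunk IDs for a single query.]

```python
def precision_at_k(retrieved: list[str], relevant: set[str], k: int) -> float:
    """Share of the top-k retrieved chunks that are actually relevant."""
    top_k = retrieved[:k]
    return sum(1 for c in top_k if c in relevant) / k if k else 0.0

def recall_at_k(retrieved: list[str], relevant: set[str], k: int) -> float:
    """Share of all relevant chunks that show up in the top-k."""
    top_k = retrieved[:k]
    return sum(1 for c in top_k if c in relevant) / len(relevant) if relevant else 0.0

# Example: 3 of the top-5 retrieved chunks are in the labeled relevant set.
retrieved = ["c1", "c7", "c3", "c9", "c4"]
relevant = {"c1", "c3", "c4", "c8"}
print(precision_at_k(retrieved, relevant, 5))  # 0.6
print(recall_at_k(retrieved, relevant, 5))     # 0.75
```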
71 00:13:59.430 ⇒ 00:14:07.709 Pranjali Basmatkar: Followed by that, for generation, there’s a few different things you can do to make sure that, given the right information is being passed
72 00:14:07.720 ⇒ 00:14:30.919 Pranjali Basmatkar: through the retrieval, is the generation model still able to, you know, use it and effectively come up with the right answer? So, you can do that by just, you know, seeing how grounded your responses are in the retrieved data, just faithfulness, hallucination checks and stuff like that, and relevancy. So, relevancy is one really important thing where
73 00:14:30.920 ⇒ 00:14:34.800 Pranjali Basmatkar: Say, you know, you wanna ask a question about…
74 00:14:35.090 ⇒ 00:14:41.490 Pranjali Basmatkar: I don’t know, what was the stock price last month, or something like that for a finance firm? And, if…
75 00:14:41.490 ⇒ 00:14:51.950 Pranjali Basmatkar: If your retrieval does a good job of extracting what those information… what that information was for last month, your generation model should also be able to draw the context that last month actually meant.
76 00:14:51.950 ⇒ 00:15:06.539 Pranjali Basmatkar: you know, August, and not June or July or anything further back. So, there’s a lot of relevancy-related checks that you can do just by unit testing and stuff like that, and expanding the query. So, there’s several things you can do to evaluate that.
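[Editor’s note: a unit-test style sketch of the relevancy point above: checking that a relative phrase like “last month” resolves to the intended month before it is used to judge retrieved data. The helper and dates are illustrative.]

```python
from datetime import date

def resolve_last_month(today: date) -> tuple[int, int]:
    """Return (year, month) for the month before `today`."""
    if today.month == 1:
        return today.year - 1, 12
    return today.year, today.month - 1

# With a reference date in September, "last month" should mean August.
assert resolve_last_month(date(2025, 9, 18)) == (2025, 8)
assert resolve_last_month(date(2025, 1, 5)) == (2024, 12)
```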
77 00:15:07.280 ⇒ 00:15:19.980 Jake Nathan: Yeah, and that, you mentioned, like, a few ways to evaluate it on a technical perspective. Do you, like, in the… in your work, have you seen, like, how do businesses evaluate it from a business perspective?
78 00:15:20.150 ⇒ 00:15:44.159 Pranjali Basmatkar: Right. So, businesses really care about, like, the end-to-end outcome a lot more. So, accuracy is a given for them. They want the system to be accurate, from the get-go. But then, is the system fast enough? Is the latency low enough? What was the time that a, like, a human expert would generally take to solve this problem, versus how long is this RAG agent taking to solve this problem?
79 00:15:44.160 ⇒ 00:15:50.199 Pranjali Basmatkar: So, the time saved, over there is an important criteria for them to
80 00:15:50.200 ⇒ 00:16:01.690 Pranjali Basmatkar: do AI adoption, and then, and then, you know, enabling their in-house SMEs to use these tools. So, yeah, just, just basically time being saved.
81 00:16:01.690 ⇒ 00:16:06.189 Pranjali Basmatkar: how, how useful is it? If, if the…
82 00:16:06.190 ⇒ 00:16:23.329 Pranjali Basmatkar: agent does come up with a response, how much more work would an… would a human SME need to do on top of that to make it usable in the format they need it? You know, stuff like that. So they try to evaluate, like, an end-to-end system a lot more than just, like, accuracy.
83 00:16:24.340 ⇒ 00:16:40.419 Jake Nathan: Yeah, that makes sense, and have you noticed, have you seen in your work, have companies, used the wrong metrics, you think, to measure? Like, have you ever seen them measure something that you actually would advise against them measuring, or it doesn’t actually paint the picture of if RAG is working?
84 00:16:40.420 ⇒ 00:16:42.400 Pranjali Basmatkar: Yeah, yeah, so,
85 00:16:42.600 ⇒ 00:16:48.619 Pranjali Basmatkar: I mean, I think, when you start off a POC, you kind of have, like, a narrow view of the problem statement.
86 00:16:48.620 ⇒ 00:17:13.200 Pranjali Basmatkar: But the entire idea behind having, like, a pilot use case or having, like, a proof of concept is to have, data in the distribution of the production use case. So that’s where a lot of times, systems sort of, like, fail, because they anticipate that the use case will be in a certain distribution. They’ll see a certain set of queries, and then they sort of, like, over-index on that.
87 00:17:13.310 ⇒ 00:17:28.670 Pranjali Basmatkar: And during your POC phase, you’ll do really well on it, because you think this is what production will look like. And then, when you do actually hit production, you basically are sort of, like, in front of users and all kinds of, honestly,
88 00:17:28.670 ⇒ 00:17:45.020 Pranjali Basmatkar: queries that might or might not even make sense. And at that point, your agent can completely, like, fail, having never seen those kind of queries. So, yeah, yeah, there’s a lot of, gap that can come in, between
89 00:17:45.020 ⇒ 00:17:53.630 Pranjali Basmatkar: having, like, a good POC agent versus a production agent if you don’t anticipate, the distribution of the queries that you’ll be seeing.
90 00:17:54.300 ⇒ 00:18:04.659 Jake Nathan: Yeah, that, that completely makes sense to me, and one thing you had mentioned, a few minutes ago was, you know, like, one of the huge benefits of RAG is that,
91 00:18:04.660 ⇒ 00:18:17.050 Jake Nathan: you’re using current data. And so, over time, that data obviously changes, and so how do you make sure that your system has the most current data without breaking the system and causing issues?
92 00:18:17.500 ⇒ 00:18:35.759 Pranjali Basmatkar: Yeah, so, there’s actually a few ways to go about it, and the way RAG is generally set up, it can be really flexible. RAG is generally designed to fit into your organization’s way of data warehousing and managing. So,
93 00:18:35.760 ⇒ 00:18:43.990 Pranjali Basmatkar: there can be several integrations, like, even Contextual has several integrations to different, you know, databases, or…
94 00:18:43.990 ⇒ 00:19:05.269 Pranjali Basmatkar: different locations that can house these, this data. So, basically, continuous ingestion is a way to handle this. Every time a new data, a new set of data is available, or new metadata is available, you want to continuously, sort of, like, work on processing and ingesting it, maybe in a batch setting, or maybe in, like, a…
95 00:19:05.270 ⇒ 00:19:14.180 Pranjali Basmatkar: you know, like a scheduled way, so that you’re keeping your data store, you know, up to date. So that’s one way to sort of, like,
96 00:19:14.180 ⇒ 00:19:25.730 Pranjali Basmatkar: keep the data flowing. So this can be done just, like, with APIs or integrations to different databases, wherever you basically house your information.
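[Editor’s note: a sketch of the continuous-ingestion idea above: a scheduled loop that re-ingests only documents changed since the last sync. fetch_docs_modified_since and ingest_doc are hypothetical stand-ins for a connector and an ingestion API, not Contextual AI’s actual interfaces.]

```python
import time
from datetime import datetime, timezone

def fetch_docs_modified_since(since: datetime) -> list[str]:
    """Hypothetical connector: return IDs of documents changed after `since`.
    Stubbed here so the sketch runs."""
    return []

def ingest_doc(doc_id: str) -> None:
    """Hypothetical ingestion call: send one document through parsing/chunking."""
    print(f"re-ingesting {doc_id}")

def run_ingestion_loop(poll_seconds: int = 3600, max_cycles: int = 1) -> None:
    """Poll the source on a schedule and re-ingest only what changed."""
    last_sync = datetime.fromtimestamp(0, tz=timezone.utc)
    for _ in range(max_cycles):  # bounded here; in practice run forever or via cron
        now = datetime.now(timezone.utc)
        for doc_id in fetch_docs_modified_since(last_sync):
            ingest_doc(doc_id)
        last_sync = now
        time.sleep(poll_seconds)

run_ingestion_loop(poll_seconds=1)
```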
97 00:19:26.350 ⇒ 00:19:41.639 Jake Nathan: Totally, yeah, that totally makes sense. And, you’re… you’re obviously an expert when it comes to RAG. You keep… I’m sure you keep up with the latest news, like, how do you see RAG evolving this year and into next year, even?
98 00:19:41.810 ⇒ 00:20:05.650 Pranjali Basmatkar: Right, so I think RAG has already sort of, like, evolved a little bit, this year. So, we had, basically, we used to only have, like, static RAG agents that followed, like, a specific set of workflows, to reach a response, but now, the RAG system is more agentic. And, going forward, the entire context layer that houses RAG
99 00:20:05.650 ⇒ 00:20:30.630 Pranjali Basmatkar: is gonna get more and more agentic, and that’s how I see it sort of, like, progressing. So the agents are not just following a static query path or a static steps, static workflow steps, but they’re dynamic in thinking. So, given a particular problem, they’ll be sort of, like, coming up with a plan, doing tool calls on top of your retrievers, or on top of your
100 00:20:30.630 ⇒ 00:20:33.210 Pranjali Basmatkar: in-house artifacts, like,
101 00:20:33.380 ⇒ 00:20:40.570 Pranjali Basmatkar: your, you know, your tuned model, or your domain-specific models, and stuff like that. And, from there,
102 00:20:40.760 ⇒ 00:20:49.139 Pranjali Basmatkar: Yeah, from there, you’ll basically have information, that’s, that’s tool called and,
103 00:20:49.210 ⇒ 00:20:59.399 Pranjali Basmatkar: And it’s basically reaching a point where it can think a lot smarter than a static workflow. So that’s basically the agentic RAG that I can see happening in the future.
104 00:20:59.400 ⇒ 00:21:09.919 Pranjali Basmatkar: Other than that, I guess, multimodality is something that I see coming up a lot. So, for the longest period of time, we were really fixated on text and coming up with, like.
105 00:21:09.990 ⇒ 00:21:26.109 Pranjali Basmatkar: receiving textual information and coming up with textual responses. But, I think it’s… it’s about time, like, our input is going to be multimodal. It can be images, videos, anything, and the responses generated from these RAG systems will likely also be multimodal.
106 00:21:26.110 ⇒ 00:21:36.120 Pranjali Basmatkar: So, yeah, there’s, like, a few, few different ways things can go. Personalization and adapting to a particular user, is also,
107 00:21:36.360 ⇒ 00:21:39.539 Pranjali Basmatkar: a place where I see things going. So yeah, pretty exciting.
108 00:21:40.210 ⇒ 00:21:59.399 Jake Nathan: Definitely, and let’s say, you know, I’m, reading this article, I… I… I’m a business leader, and I’m… I’m excited. I want to start implementing, or at least, kind of start that journey. What’s, like, the very first step that I should take, like, even in the first hour, like, what should I be doing?
109 00:21:59.670 ⇒ 00:22:04.180 Pranjali Basmatkar: Right. I guess, I guess the…
110 00:22:04.230 ⇒ 00:22:16.289 Pranjali Basmatkar: I think we kind of, like, talked about this a little bit, but, setting up pilots and POCs is, I think, the best way to evaluate how ready you are to adopt an AI-based RAG system.
111 00:22:16.290 ⇒ 00:22:25.670 Pranjali Basmatkar: So, I guess the first thing that I would do is probably look at all the business units I have, and think about which are the ones that would benefit the most from having a RAG system.
112 00:22:25.670 ⇒ 00:22:27.250 Pranjali Basmatkar: And,
113 00:22:27.250 ⇒ 00:22:46.229 Pranjali Basmatkar: and kind of start evaluating the readiness to receive an AI system like this for each of them. So, do they have access to data already? What kind of data is it? Is it textual data? Is it structured data in a particular database?
114 00:22:46.230 ⇒ 00:22:53.819 Pranjali Basmatkar: Do we have security clearances to, enable this data to be used for RAG systems or LLMs?
115 00:22:53.820 ⇒ 00:23:07.470 Pranjali Basmatkar: And, that’s… that’s something that I would think about first. And once I’ve sort of, like, narrowed down on that, maybe sort of, like, come up with one or two, business units and their use cases as your pilots.
116 00:23:07.470 ⇒ 00:23:25.210 Pranjali Basmatkar: And from there, you kind of, like, start defining what success should look like. So, exactly what are you hoping to get out of having a RAG system? Are you hoping to save time? Are you hoping to save cost? What kind of accuracy is your baseline and non-negotiable? So, coming up with all of these.
117 00:23:25.210 ⇒ 00:23:27.919 Pranjali Basmatkar: metrics will sort of, like,
118 00:23:28.170 ⇒ 00:23:51.790 Pranjali Basmatkar: help you understand what you’re looking for, and then when you do build out a RAG system and start seeing results, if they look like they’re matching, then you’re good to go. You’re gold. So, I think, those things are really crucial. Security, I would want to emphasize, is really important, because a lot of enterprises have a lot of confidential data, which they don’t really have the liberty to send to, like, a different
119 00:23:51.790 ⇒ 00:24:02.199 Pranjali Basmatkar: third-party model or stuff like that. So in that case, they would also want to think about, you know, on-VPC deployment, where they’re, you know, running these models on-prem.
120 00:24:02.200 ⇒ 00:24:10.589 Pranjali Basmatkar: And, you know, keeping all of their data contained within their network. So, just different considerations based on what kind of enterprise you are.
121 00:24:11.090 ⇒ 00:24:22.640 Jake Nathan: Yeah, I appreciate you walking through all that, because I think it gives, you know, business leaders a more approachable way, if they feel overwhelmed, like, how am I going to do this for my entire enterprise? That gives them
122 00:24:22.640 ⇒ 00:24:38.009 Jake Nathan: a good first step. And I want to make sure we spend plenty of time, too, talking about Contextual AI. So, kind of my first question there is just, with everything that we’ve been talking about, how does Contextual AI fit, in… in what we… kind of the steps that we went through?
123 00:24:38.490 ⇒ 00:24:50.920 Pranjali Basmatkar: Right, so, like I said, Contextual sort of, like, fits very much in the context layer between the intelligence layer and the data layer. And, this is one thing that we have focused on since day one.
124 00:24:50.920 ⇒ 00:25:14.619 Pranjali Basmatkar: I guess, some other things that Contextual really indexes on a lot is specialization over generalization. So, all of your LLMs are really good at general tasks, but Contextual sort of, like, focuses on domain-specific, enterprise-specific tasks, where, most of the LLMs sort of, like, fall short on. So, there’s several tools that
125 00:25:14.620 ⇒ 00:25:25.920 Pranjali Basmatkar: Contextual has built internally that will help you specialize all of these artifacts to your domain and your distribution, so that you basically get the best results that you could ever get.
126 00:25:26.270 ⇒ 00:25:32.120 Pranjali Basmatkar: And, some other things, I guess, yeah, one of the things that… that…
127 00:25:32.120 ⇒ 00:25:49.509 Pranjali Basmatkar: most of the RAG use cases have seen is that they get stuck in the POC stage forever. They’re never production ready. So, the one thing that Contextual kind of, like, focuses on a lot is being production ready from day one. So, even when we sort of, like, start off with a use case, or a pilot, or a POC,
128 00:25:49.510 ⇒ 00:26:03.689 Pranjali Basmatkar: We are kind of, like, always thinking about what production would look like, and we’re indexing on, you know, how the data will be stored, and how, the data will be accessed by their users, or,
129 00:26:03.960 ⇒ 00:26:22.750 Pranjali Basmatkar: or… or basically just, like, working around the governance and compliance within that organization, so that, the RAG use case, once it’s, you know, there in terms of accuracy, can really fit in very easily into their ecosystem, so that all of the users within the organization can start using it.
130 00:26:23.030 ⇒ 00:26:27.080 Pranjali Basmatkar: So, that’s… that’s one thing that really, helps.
131 00:26:27.080 ⇒ 00:26:48.490 Pranjali Basmatkar: And, I guess just addressing all of the shortcomings of the LLMs, like hallucinations, groundedness, factuality. So, most of our artifacts, like the GLM or the re-rankers that we have internally, these are all really indexed on being extremely grounded. So, what that helps is that it really helps in,
132 00:26:48.490 ⇒ 00:27:07.680 Pranjali Basmatkar: reducing hallucinations, and makes sure the response is extremely grounded in the data that’s given to it. So, that’s something that we really care about. And, this really sort of, like, helps us gain the trust of most of the enterprises that want to consider moving into the AI space and, you know, want to try out RAG.
133 00:27:09.030 ⇒ 00:27:21.439 Jake Nathan: That totally makes sense, and one thing we wanted to ask about in particular is you have made, some recent announcements on the text-to-SQL side of things, so can you just talk more about that and what your thoughts are?
134 00:27:21.720 ⇒ 00:27:40.840 Pranjali Basmatkar: Right, I… maybe I can add a little, but then maybe Mike can, sort of, like, help me out here. But yeah, we’re, we’re kind of on the leaderboards for the text-to-SQL, and I think last month, maybe I’m a little shaky on this. Mike, maybe help me here.
135 00:27:40.840 ⇒ 00:27:43.500 Mike Klaczynski: We’re number one on FACTS for our GLM for grounding.
136 00:27:43.990 ⇒ 00:27:44.670 Pranjali Basmatkar: Yeah.
137 00:27:45.580 ⇒ 00:27:51.959 Mike Klaczynski: So that’s above OpenAI, Claude, everybody else. I’ll share a link, because I’ve been referencing this a lot.
138 00:27:52.620 ⇒ 00:27:53.430 Jake Nathan: Awesome.
139 00:27:54.460 ⇒ 00:28:11.130 Pranjali Basmatkar: Yeah, so internally, we have, like, a bunch of streams that are focusing on structured data as a whole, and some of our internal artifacts, like Mike mentioned, are already on the top of the leaderboard. So, yeah, we’re just excited to sort of, like, integrate into… integrate this into our product and make it available for everyone.
140 00:28:12.010 ⇒ 00:28:27.830 Jake Nathan: That’s awesome, yeah, and thanks for sending that link, Mike. And yeah, I just, was curious, you personally, like, are there other features that you’re particularly excited about that, you know, released recently, or features that are coming up that you can talk about, like, just as far as Contextual
141 00:28:27.830 ⇒ 00:28:33.630 Jake Nathan: goes, we’d love to just kind of hear a preview of what’s to come, or what you’re really excited about.
142 00:28:35.820 ⇒ 00:28:36.330 Pranjali Basmatkar: Mike, do you wanna…
143 00:28:36.330 ⇒ 00:28:37.040 Mike Klaczynski: do it.
144 00:28:37.410 ⇒ 00:28:41.850 Mike Klaczynski: Yeah, I’ve got a couple ideas. I think, everybody’s heading more towards the…
145 00:28:42.380 ⇒ 00:28:56.109 Mike Klaczynski: reasoning models. So now we have the ability to integrate with pretty much any reasoning model out there. So in addition to having our platform and our models, you can extend that and use, you know, whatever, ChatGPT, or Gemini, or whatever it might be.
146 00:28:58.370 ⇒ 00:29:12.240 Mike Klaczynski: I think the other part is all the context engineering work. So now, instead of just thinking of ourselves as a RAG platform, because we’re in the middle of that intelligence layer and that data layer, there’s a lot of different ways that users can now…
147 00:29:13.010 ⇒ 00:29:15.569 Mike Klaczynski: There’s different ways to say this, but, like.
148 00:29:15.710 ⇒ 00:29:31.200 Mike Klaczynski: Most AI systems are non-deterministic, and we’re trying to add a little bit more programmatic and deterministic ways into it. So, for example, one of those is entity extraction. So let’s say I’ve got 150 different documents, and I want to extract 20 fields across all of those.
149 00:29:31.600 ⇒ 00:29:40.759 Mike Klaczynski: Typically, if you’re in a V… You know, a VC or working in private equity, you’d have to sit down with an analyst or an intern to go through and scan every document and extract it.
150 00:29:40.920 ⇒ 00:29:52.910 Mike Klaczynski: Now, instead of just uploading these into an AI and hoping they work, you can actually put in a YAML file and define the fields you want, and say, this is the schema and the structure.
151 00:29:52.910 ⇒ 00:30:10.469 Mike Klaczynski: So, it’s more of a shared-responsibility model. It’s no longer just throwing things over to the AI and saying, figure it out, and hoping it works. It’s actually saying, I have intelligence, I have tribal knowledge, I understand what the data structure looks like, let me define that, and then feed that into Contextual.
152 00:30:10.470 ⇒ 00:30:14.019 Mike Klaczynski: And now we’ll actually be able to use that knowledge and that information
153 00:30:14.020 ⇒ 00:30:27.909 Mike Klaczynski: to make that AI model significantly more effective. So, I think, you know, that’s coming out. We’re already using it with some customers, but that’ll be generally available soon, depending upon when, you know, this gets published.
154 00:30:27.940 ⇒ 00:30:28.740 Mike Klaczynski: You know.
155 00:30:28.870 ⇒ 00:30:40.189 Mike Klaczynski: we may be able to reference that and say it’s now generally available. But anyway, I think that’s what I’m really excited for. Pranjali, anything else? I mean, there’s a couple different pods that we’re working in, but I think
156 00:30:40.310 ⇒ 00:30:41.719 Mike Klaczynski: These are kind of the big ones.
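[Editor’s note: an illustrative sketch of the schema-driven extraction Mike describes, assuming a simple YAML field list and a prompt-building step. The YAML layout, field names, and flow are hypothetical, not Contextual AI’s actual schema format or API.]

```python
import yaml  # pip install pyyaml

SCHEMA_YAML = """
fields:
  - name: company_name
    type: string
  - name: deal_size_usd
    type: number
  - name: closing_date
    type: date
"""

def build_extraction_prompt(document_text: str, schema_yaml: str) -> str:
    """Turn a user-defined YAML schema into an extraction prompt for one document."""
    schema = yaml.safe_load(schema_yaml)
    field_lines = "\n".join(
        f"- {f['name']} ({f['type']})" for f in schema["fields"]
    )
    return (
        "Extract the following fields from the document and return JSON.\n"
        f"Fields:\n{field_lines}\n\nDocument:\n{document_text}"
    )

# One prompt per document; loop this over all 150 documents in Mike's example.
print(build_extraction_prompt("Acme Corp raised $12M, closing 2025-06-30.", SCHEMA_YAML))
```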
157 00:30:41.910 ⇒ 00:31:00.099 Pranjali Basmatkar: Yeah, these are definitely the big ones. Yeah, I think being able to be LLM agnostic is, one of the bigger things. So, Contextual basically uses their own models and artifacts, but also has access to several other third-party models. So, basically, anytime,
158 00:31:00.160 ⇒ 00:31:10.790 Pranjali Basmatkar: other intelligence layer companies make massive progress, Contextual is basically hand-in-hand with them, with their RAG system, so I think that’s… that’s a really fun thing.
159 00:31:11.680 ⇒ 00:31:30.950 Jake Nathan: Yeah, I bet that gets you all excited, like you said, because you’re kind of in lockstep with them. And, one question I want to go back to is, how do you give, I guess, users confidence that the output of, what Contextual’s telling you, how do you give them confidence in that? Like, how are you evaluating and scoring that output?
160 00:31:31.230 ⇒ 00:31:51.570 Pranjali Basmatkar: Yeah. So, there’s, like, like, like we talked about in the evaluation section, there’s, there’s several metrics that can assess accuracy, but then, to assess generation, there’s, attribution, groundedness, hallucination checks. So, with contextual, you get attributions for every claim that’s made in the response. So, say.
161 00:31:51.570 ⇒ 00:32:12.080 Pranjali Basmatkar: for a response, there’s maybe, like, 5 claims. So you’ll have exact attribution links to where this claim basically originated from, and the user, while they’re sort of, like, looking at all of the responses, can go back to each of these documents and verify if they feel like it. So that really adds a lot of trust in making sure
162 00:32:12.080 ⇒ 00:32:20.160 Pranjali Basmatkar: the responses are grounded in the, you know, the retrieved documents. We also provide, like, a groundedness score, so, that basically
163 00:32:20.220 ⇒ 00:32:39.889 Pranjali Basmatkar: cross-references each claim against the retrieved data, makes sure that it’s actually present, and gives you a confidence, a groundedness score, to sort of, like, let you know that, you know, this is a well-grounded claim, this is maybe not that well-grounded, so maybe spend some time looking at this, or maybe confirm this. So, this tool can really be,
164 00:32:39.890 ⇒ 00:32:46.280 Pranjali Basmatkar: really helpful and can help people know what to do next and where to look next.
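[Editor’s note: a toy sketch of the claim-level groundedness check described above. The real system uses trained models for this; this version just scores term overlap between each claim and the retrieved chunks, to illustrate the idea.]

```python
import string

def _terms(text: str) -> set[str]:
    """Lowercase, strip punctuation, split into terms."""
    cleaned = text.lower().translate(str.maketrans("", "", string.punctuation))
    return set(cleaned.split())

def groundedness(claim: str, chunks: list[str]) -> float:
    """Fraction of the claim's terms found in the best-matching retrieved chunk."""
    claim_terms = _terms(claim)
    if not claim_terms:
        return 0.0
    return max((len(claim_terms & _terms(c)) / len(claim_terms) for c in chunks), default=0.0)

claims = ["Revenue grew 12% in Q2.", "Headcount doubled last year."]
chunks = ["Q2 revenue grew 12% year over year, driven by enterprise deals."]
for claim in claims:
    s = groundedness(claim, chunks)
    flag = "well grounded" if s >= 0.6 else "check this claim"
    print(f"{s:.2f}  {flag}  {claim}")
```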
165 00:32:47.260 ⇒ 00:32:55.729 Jake Nathan: Got it. That totally makes sense. And, one, one other question is, like… Do you,
166 00:32:55.840 ⇒ 00:33:02.230 Jake Nathan: For a user, do you expose, like, logs or explanations or metadata to understand, like, how the system works?
167 00:33:02.380 ⇒ 00:33:05.490 Jake Nathan: Or, like, what’s the observability of the platform?
168 00:33:05.490 ⇒ 00:33:30.240 Pranjali Basmatkar: Right. So, in terms of observability, you basically have access to all of the retrieved documents. So, once our retrieval system is done, sort of, like, processing all of the files, ingestion, extraction, and everything, you get to see exactly what top-n documents were retrieved that were relevant to a particular query, and you’ll be able to sort of, like, click through them and understand what each of them really means.
169 00:33:30.240 ⇒ 00:33:39.489 Pranjali Basmatkar: So, you can… you get really good observability when it comes to retrievals, and for groundedness and for, generation scores, you… you get…
170 00:33:39.490 ⇒ 00:33:57.619 Pranjali Basmatkar: scores at the end for each response. So you’re able to sort of, like, go through the entire loop of retrieval, and then generation, and then the response creation. And we also, sort of, like, have access to different evaluation tools that we enable enterprises to use, so they can sort of, like, take a look at,
171 00:33:57.690 ⇒ 00:34:09.330 Pranjali Basmatkar: accuracy, and the style of talking, and the style of writing, and different unit tests that they can create to just make sure the response quality is exactly what they need it to be. So you can do, like, a full loop.
172 00:34:10.190 ⇒ 00:34:25.840 Jake Nathan: Gotcha. Awesome. Well, I appreciate that. And this is kind of a question for both of y’all. We’ve kind of noticed at Brainforge, you know, a lot of times when clients start out, they might be using things like N8N, Zapier, Make, Clay…
173 00:34:25.989 ⇒ 00:34:39.440 Jake Nathan: How, I guess, how would you consider, or kind of give them advice on moving from kind of those starting tools to something like contextual, and, like, why should they consider contextual in that journey?
174 00:34:41.560 ⇒ 00:34:42.230 Pranjali Basmatkar: I do want to go.
175 00:34:42.239 ⇒ 00:34:42.929 Mike Klaczynski: question.
176 00:34:43.519 ⇒ 00:34:45.039 Mike Klaczynski: Yeah,
177 00:34:45.259 ⇒ 00:34:56.929 Mike Klaczynski: So, the interesting thing is we’re actually integrating with those technologies. So, we just launched our Crew AI integration yesterday. We’re working on our N8N integration. Utam, we were chatting about this.
178 00:34:56.959 ⇒ 00:35:12.689 Mike Klaczynski: So, it really comes down to what is the customer trying to accomplish. If they want more of a generalized framework that they’re putting together, then they can use those tools, but if they really are more focused on enterprise data that’s not generally available in LLMs, they’re going to need some sort of
179 00:35:13.169 ⇒ 00:35:28.339 Mike Klaczynski: context engineering or RAG pipeline, and that’s where we fit in. And so, again, we’re like one component or one tool call within that broader agentic framework, and so, you know, they can orchestrate whatever they want using those frameworks, and then they can call out to us.
180 00:35:28.939 ⇒ 00:35:30.719 Mike Klaczynski: And on the partnership side.
181 00:35:30.849 ⇒ 00:35:48.739 Mike Klaczynski: you know, the way I really frame what we do is we take documents and we give really accurate answers. We’re like a widget. That’s what we do. And you can plug that widget into many, many different systems, and now, obviously, we’re expanding based on customer demand on what additional things are, but again, at the core.
182 00:35:48.839 ⇒ 00:35:59.339 Mike Klaczynski: you feed us really accurate… you feed us really complex documents at really high scale with high modality, and, like, we’ll give you really, really accurate answers. And then what you want to do with those, it’s up to you how you want to orchestrate them.
183 00:35:59.359 ⇒ 00:36:18.629 Mike Klaczynski: Now, some of these enterprise customers don’t want a lot of different systems, they just want to consolidate into one platform, and they want to be able to add more determinism into it, right? That’s how you get that accuracy of, I want this answer to be reproducible and highly accurate. Well, with our system, you can steer a lot of that much better, because it’s confined.
184 00:36:18.709 ⇒ 00:36:22.199 Mike Klaczynski: But we don’t want to exclude anybody from using whatever tools they want, so…
185 00:36:23.269 ⇒ 00:36:36.719 Mike Klaczynski: I guess that’s… that’s a bunch of different thoughts. If you want to use our platform, we’re giving you all the capabilities, but if you do want to extend our capabilities with other systems, like agent-to-agent, like we’re doing that with Google, or N8N, or Crew AI, or whatever it might be.
186 00:36:36.859 ⇒ 00:36:40.549 Mike Klaczynski: We’re fully supportive of that, and we will continue to build those integrations for clients.
187 00:36:43.530 ⇒ 00:36:51.949 Jake Nathan: Awesome. Yeah, that makes sense to me. It seems like you can kind of meet anyone where they’re at on their journey, so that makes a ton of sense.
188 00:36:53.270 ⇒ 00:37:10.859 Jake Nathan: Cool. Well, those were, some great answers. I really appreciate you, making time to do this. I’m excited to listen through the interview again and put together, something awesome, but this has been great, and Uttam is listening in. I don’t know if he has the chance to…
189 00:37:10.860 ⇒ 00:37:33.350 Uttam Kumaran: Hey, sorry, I’m just at a conference here in Chicago, so it’s a bit loud, but this is amazing. Yeah, I was just chatting with Jake, like, let’s make sure to… because some of our clients, we find, that are just starting early, they just… all they know is these, like, random AI, do-it-yourself platforms, so I think it was great to kind of hear that, and yeah, this is… this is going to be great, so I appreciate the time today.
190 00:37:35.090 ⇒ 00:37:36.450 Jake Nathan: Awesome, thanks for your time.
191 00:37:36.450 ⇒ 00:38:00.450 Jake Nathan: Great. Well, yeah, so next step is, like I said, I’ll put a first draft together for y’all to kind of react to, and we can look over it together and make sure everything looks good, and we can kind of talk about next steps from there, but this is definitely something that we’re gonna promote on our socials, put it on the blog, and just share with our clients and prospects, so…
192 00:38:00.750 ⇒ 00:38:05.790 Jake Nathan: We’re really excited to do this, and thank you again for making time. It was great to talk.
193 00:38:06.800 ⇒ 00:38:07.960 Mike Klaczynski: Absolutely, thank you.
194 00:38:08.490 ⇒ 00:38:09.610 Pranjali Basmatkar: Thank you so much, Jake.
195 00:38:09.830 ⇒ 00:38:11.299 Jake Nathan: Okay, talk to y’all later.
196 00:38:11.680 ⇒ 00:38:12.250 Mike Klaczynski: Bye.