Meeting Title: Text to SQL sync
Date: 2025-10-24
Meeting participants: Demilade Agboola, Awaish Kumar, Samuel Roberts


WEBVTT

1 00:00:38.380 00:00:39.380 Awaish Kumar: I mean…

2 00:00:39.380 00:00:40.270 Samuel Roberts: Hey!

3 00:00:42.860 00:00:44.080 Samuel Roberts: How are you guys?

4 00:00:44.500 00:00:46.549 Demilade Agboola: Pretty good. How are you?

5 00:00:46.550 00:00:48.519 Samuel Roberts: Doing alright, doing alright.

6 00:00:48.930 00:00:49.710 Demilade Agboola: Okay, just…

7 00:00:49.980 00:00:52.799 Samuel Roberts: Nothing too crazy today. Long meeting before this?

8 00:00:53.400 00:00:55.720 Samuel Roberts: Yeah, just kind of went over.

9 00:00:56.330 00:00:57.310 Samuel Roberts: Yeah, no worries.

10 00:00:57.310 00:00:57.950 Demilade Agboola: And it’s…

11 00:00:59.170 00:01:14.880 Samuel Roberts: Cool. Yeah, so I just wanted to talk about this, like, text-to-SQL thing for, like, the… or specifically for Eden right now, but in general, trying to make it a system that could work for other things. And I think, Demilade, when I kind of showed off the kind of initial

12 00:01:14.880 00:01:19.919 Samuel Roberts: one-shot thing I did with Cursor earlier, you were commenting about how it, you know.

13 00:01:19.930 00:01:23.959 Samuel Roberts: it didn’t have a lot of the context for, like, what was it, fact transactions, I think?

14 00:01:24.990 00:01:29.979 Samuel Roberts: And so, what I’m kind of wondering here is…

15 00:01:30.100 00:01:32.789 Samuel Roberts: I’m not as familiar with the dbt…

16 00:01:33.170 00:01:39.640 Samuel Roberts: side of things here. I’ve kind of looked through it, and I’ve chatted with Cursor about it, and it can infer some things.

17 00:01:39.950 00:01:43.330 Samuel Roberts: I’m wondering… What kind of other…

18 00:01:43.950 00:01:50.780 Samuel Roberts: knowledge is missing from this that I should be able to include as, like, a context file.

19 00:01:50.920 00:02:05.210 Samuel Roberts: Perhaps. And so, what I’m kind of just looking to talk through is… through is, I’ve got this… oops, sorry, let’s rearrange things for a second. I’ve got, like, I can…

20 00:02:05.370 00:02:20.830 Samuel Roberts: well, I don’t know if it’s worth sharing yet, but basically, in that repo, I made a separate, like, text-to-SQL folder so that cursor would have all the context of the dbt project. And that got me part of the way there. And now I’m looking to add either more context per

21 00:02:21.360 00:02:27.280 Samuel Roberts: dbt file, or per table, or per mart, I’m not really sure where best to, like.

22 00:02:27.430 00:02:32.389 Samuel Roberts: to bring that in. And so I’m kind of just looking for a little bit of insight into, like, maybe this…

23 00:02:32.580 00:02:48.080 Samuel Roberts: projects specifically, or just more generally, like, how you guys go about it with the dbt marts, or projects, and then the marts, and the intermediate, and the raw. I’m seeing all this stuff, and I just don’t have a great sense of, like, where should I put a layer of a little more context, if that makes sense.

24 00:02:53.060 00:03:03.649 Demilade Agboola: I think there’s… apart from, like, maybe raw and staging, there’s probably context in, like, the intermediate and… and marts… marts models the most.

25 00:03:04.020 00:03:04.700 Samuel Roberts: Okay.

26 00:03:04.700 00:03:09.599 Demilade Agboola: Because raw and staging tends to just be, like, all in on the…

27 00:03:10.080 00:03:18.069 Demilade Agboola: tables as they are, you know, or maybe renaming or casting data types. There isn’t a lot going on there. In terms of, like, logic of

28 00:03:18.210 00:03:22.750 Demilade Agboola: we’re filtering out this, or we are excluding this.

29 00:03:23.140 00:03:25.860 Demilade Agboola: It tends to happen more in intermediate and…

30 00:03:25.860 00:03:27.429 Samuel Roberts: In the intermediate? Okay.

31 00:03:28.220 00:03:29.649 Demilade Agboola: the marts models.

32 00:03:30.270 00:03:31.060 Samuel Roberts: Okay.

33 00:03:31.290 00:03:39.100 Samuel Roberts: So that’s where I should kind of be looking most, I guess, for trying to pull some stuff out, into a context file, I guess?

34 00:03:39.820 00:03:40.190 Demilade Agboola: Yes.

35 00:03:40.660 00:03:50.520 Samuel Roberts: Okay, I guess, so, so sort of right now, I have it, it, it’s just kind of a general, you know, it understands a little bit. It’s using the, kind of.

36 00:03:50.710 00:03:57.870 Samuel Roberts: LLM’s knowledge of SQL, but then I’m passing in some, like, an AGENTS.md file, a rough, like,

37 00:03:58.680 00:04:01.509 Samuel Roberts: guide to BigQuery specifically.

38 00:04:01.810 00:04:08.790 Samuel Roberts: And then I was starting to do something that was going to extract some schema stuff out, and so pointing that to intermediate is probably the best bet.

39 00:04:08.980 00:04:14.919 Samuel Roberts: Are there other… things in here that I… that, you know.

40 00:04:15.490 00:04:20.510 Samuel Roberts: I guess my question is, is there any other context beyond this, from, like, a business…

41 00:04:20.660 00:04:24.000 Samuel Roberts: Level, or anything else that would be…

42 00:04:24.650 00:04:28.320 Samuel Roberts: like, not something that the LLM could discover in here?

43 00:04:34.570 00:04:42.400 Demilade Agboola: I mean, I think the major differences that I can think of are… I did mention it, like, fact transactions are where we filter out

44 00:04:43.370 00:04:50.850 Demilade Agboola: Things like consult orders, and… But again, it’s canceled and error orders.

45 00:04:51.870 00:04:53.699 Demilade Agboola: We filter those out.

46 00:04:54.190 00:04:56.439 Samuel Roberts: And that’s not something that I would see in here anywhere?

47 00:04:56.740 00:05:02.120 Demilade Agboola: No, there’s a WHERE filter that filters that out, so you can… Okay, so it does have that, okay. Yeah.

48 00:05:02.380 00:05:03.020 Samuel Roberts: Okay.

49 00:05:03.300 00:05:08.610 Demilade Agboola: But I think some of the context might be… For certain things.

50 00:05:09.610 00:05:17.499 Demilade Agboola: I know that certain models, like, downstream, use, like, the fact orders, Also, I’m trying to use…

51 00:05:17.650 00:05:20.549 Demilade Agboola: All orders for certain analysis.

52 00:05:21.400 00:05:25.129 Demilade Agboola: I can’t remember the exact examples right now, I have to look into them.

53 00:05:25.130 00:05:27.740 Samuel Roberts: But, like, such things means that if you ask.

54 00:05:27.740 00:05:33.820 Demilade Agboola: the… Yeah, because certain things means if you ask the LLM questions.

55 00:05:34.050 00:05:38.829 Demilade Agboola: It would need to just know whether you want the analysis on all orders, or just…

56 00:05:39.110 00:05:43.009 Demilade Agboola: A certain subset of the orders, so that’s kind of the only thing to, like, matter.

57 00:05:43.290 00:05:50.560 Samuel Roberts: Okay, that’s good to know. So I can tell it to maybe ask that question or something, or… or make sure that we tell it specifically that, but…

58 00:05:50.680 00:05:51.760 Samuel Roberts: Okay.

59 00:05:54.320 00:06:01.189 Samuel Roberts: Oh, I see you mentioned that Notion, Google Sheets, and database schema, and sample data. So are there other pieces of knowledge that I can pull in here?

60 00:06:02.150 00:06:10.309 Awaish Kumar: Like, I want to just understand, what… what are you planning to do? Like, you want to build a custom solution to generate queries, or…

61 00:06:10.710 00:06:15.449 Samuel Roberts: Yeah, so, Utam had kind of tasked me with putting something together that’s kind of…

62 00:06:15.770 00:06:18.789 Samuel Roberts: Somewhat similar to that Omni, you know.

63 00:06:18.950 00:06:27.830 Samuel Roberts: assistant that, you know, you can kind of chat over the data, and so I got something that kind of can generate some basic SQL.

64 00:06:27.830 00:06:33.170 Awaish Kumar: but I’m trying to add a little bit more context to it, the way Omni had…

65 00:06:33.170 00:06:33.920 Samuel Roberts: More stuff there.

66 00:06:33.920 00:06:39.770 Awaish Kumar: I got, I got your point. I have been working in Omni, so… Okay.

67 00:06:40.330 00:06:47.819 Awaish Kumar: The… thing is that, like, normally Cursor, like, what the, like.

68 00:06:47.950 00:06:52.480 Awaish Kumar: Whatever is in GitHub repo, Cursor is, like.

69 00:06:52.830 00:07:02.910 Awaish Kumar: almost able to get that context, what I have seen, right? So, if it is something in another file, or in the intermediate table,

70 00:07:03.770 00:07:20.260 Awaish Kumar: like, Cursor is able to do, and obviously you can also have a… access to the GitHub codebase, and you can access, like, from the file, if I… if I… in my text, if I mention two tables, you can look at those files, and

71 00:07:20.370 00:07:25.360 Awaish Kumar: Look, all the tables that are being referenced in that SQL query.
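
Note (illustrative sketch): one way to find the tables a dbt model references, as described above, is to scan its SQL for `ref()` calls. This assumes the models use the single-argument `{{ ref('model_name') }}` form; the function name `referenced_models` and the model names in the example are hypothetical, not taken from the Eden project.

```python
import re

# Matches {{ ref('model_name') }} and {{ ref("model_name") }} in dbt SQL.
# Assumption: only the single-argument form of ref() is handled here.
REF_PATTERN = re.compile(r"\{\{\s*ref\(\s*['\"]([^'\"]+)['\"]\s*\)\s*\}\}")


def referenced_models(sql_text: str) -> list[str]:
    """Return the distinct model names referenced via ref(), in first-seen order."""
    seen: dict[str, None] = {}
    for name in REF_PATTERN.findall(sql_text):
        seen.setdefault(name, None)
    return list(seen)


# Hypothetical model SQL, for illustration only.
example_sql = """
select o.order_id, t.amount
from {{ ref('int_orders') }} as o
join {{ ref("stg_transactions") }} as t using (order_id)
left join {{ ref('int_orders') }} as prev on prev.order_id = o.parent_id
"""

if __name__ == "__main__":
    print(referenced_models(example_sql))
```

The returned names could then be used to pull the matching intermediate or marts files into the context.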

72 00:07:25.560 00:07:41.019 Awaish Kumar: Right? So if that references tables from intermediate table more, intermediate layer, or if that references a table from source, or a raw layer, you can easily get to that point, and

73 00:07:41.180 00:07:45.810 Awaish Kumar: send in the con… send it in the context file. So what…

74 00:07:46.520 00:07:57.120 Awaish Kumar: LLM or Cursor cannot do right now is that I want it to, like, instead of, like, if I want to build a new query, which is

75 00:07:57.320 00:08:00.790 Awaish Kumar: Which is not in… in the…

76 00:08:01.790 00:08:09.920 Awaish Kumar: Cursor, and it might have some data which I haven’t explored. So, if I want Cursor to…

77 00:08:09.930 00:08:22.290 Awaish Kumar: help me generate a query for the data, which even I’m not… like, quite familiar with. Like, I have maybe looked at it a few times, but not, like, thoroughly.

78 00:08:22.290 00:08:30.700 Awaish Kumar: So, like, how can Cursor know that? I can’t, like, I’m not able to write everything down, because maybe I haven’t… I don’t have any…

79 00:08:30.780 00:08:32.320 Awaish Kumar: enough,

80 00:08:34.929 00:08:47.529 Awaish Kumar: knowledge, or I can say, like, even if I have, I don’t have enough, like, time to write down everything for all the columns, what are the accepted values, and all these things.

81 00:08:47.960 00:08:52.689 Awaish Kumar: So that’s the issue, and what… What would be the ideal…

82 00:08:52.830 00:08:58.559 Awaish Kumar: solution is that I can connect the… for example, there’s a…

83 00:08:58.970 00:09:10.940 Awaish Kumar: there’s a page, whatever, the agent, and I can connect it with some Google BigQuery project. It can read the information schema, so it gets the table and their fields.

84 00:09:11.210 00:09:13.950 Awaish Kumar: And then it can query those

85 00:09:14.080 00:09:17.159 Awaish Kumar: Tables to get some sample data, right?

86 00:09:17.280 00:09:21.749 Awaish Kumar: So… From my text, first of all, the…

87 00:09:21.990 00:09:31.390 Awaish Kumar: the step one is that AI agent can figure out what tables are… are related, or, like…

88 00:09:31.450 00:09:34.749 Samuel Roberts: should be included in the context, right?

89 00:09:34.800 00:09:43.869 Awaish Kumar: And then from those tables, it can find the schema directly from database, and some sample data to understand what it looks like.
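
Note (illustrative sketch): the two lookups described above, reading a table's columns from the information schema and fetching a few sample rows, could be expressed as BigQuery SQL built by small helpers like these. The helper names and the identifier check are assumptions for illustration; the sketch only builds query text and does not connect to BigQuery.

```python
import re

# Conservative identifier check so table names from free text cannot inject SQL.
# Assumption: project, dataset, and table names use only these characters.
_IDENT = re.compile(r"^[A-Za-z0-9_-]+$")


def _check(*parts: str) -> None:
    for part in parts:
        if not _IDENT.match(part):
            raise ValueError(f"unexpected identifier: {part!r}")


def schema_query(project: str, dataset: str, table: str) -> str:
    """SQL listing column names and types from BigQuery's INFORMATION_SCHEMA."""
    _check(project, dataset, table)
    return (
        "SELECT column_name, data_type\n"
        f"FROM `{project}.{dataset}.INFORMATION_SCHEMA.COLUMNS`\n"
        f"WHERE table_name = '{table}'\n"
        "ORDER BY ordinal_position"
    )


def sample_query(project: str, dataset: str, table: str, limit: int = 5) -> str:
    """SQL fetching a handful of rows so the agent can see what values look like."""
    _check(project, dataset, table)
    return f"SELECT *\nFROM `{project}.{dataset}.{table}`\nLIMIT {int(limit)}"


if __name__ == "__main__":
    # Hypothetical project, dataset, and table names.
    print(schema_query("my-project", "marts", "fact_transactions"))
    print(sample_query("my-project", "marts", "fact_transactions"))
```

The resulting strings would be run through whatever BigQuery client the agent uses, and the results added to the context.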

90 00:09:43.950 00:09:55.249 Awaish Kumar: Right? And then, notion Docs. Like, we have Google Sheet and Notion Doc, where we store business knowledge, as you were mentioning. Right.

91 00:09:55.250 00:10:05.669 Awaish Kumar: So, like, I have 5 different status of an order, in the table. You can get those 5 status, but you don’t know the description, or…

92 00:10:05.710 00:10:09.060 Awaish Kumar: AI agent won’t know either, right? So… Okay.

93 00:10:09.640 00:10:14.930 Awaish Kumar: What, what you have to do is now that maybe I have… I don’t know…

94 00:10:15.460 00:10:23.420 Awaish Kumar: it either, and I’ve asked someone, and I’ve put it in Notion Doc. So now that knowledge lives in the Notion Doc.

95 00:10:23.620 00:10:33.639 Awaish Kumar: But the AI agent doesn’t know that, or maybe some of that knowledge lives in the Google Sheet. But the AI agent, like, doesn’t know that, so…

96 00:10:33.830 00:10:41.840 Awaish Kumar: we can, like, have a few connectors, like, we are… for Brainforge, we have Notion, we have Google Sheets, we have,

97 00:10:42.200 00:10:52.789 Awaish Kumar: databases, like BigQuery, Redshift, Postgres at Supabase, and then we have,

98 00:10:54.050 00:11:03.390 Awaish Kumar: Yeah, like, the GitHub codebase. Like, these have 3-4 different connectors. If we have that, to integrate all of them together, and then…

99 00:11:03.750 00:11:06.199 Awaish Kumar: Generate some queries.
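
Note (illustrative sketch): combining what the connectors return (Notion, Google Sheets, database, GitHub) into one prompt could look roughly like this. `ContextPiece`, `build_prompt`, and the character budget are hypothetical names and choices, not an existing Brainforge API, and the example contents only paraphrase points from this meeting.

```python
from dataclasses import dataclass


@dataclass
class ContextPiece:
    source: str  # e.g. "notion", "google_sheets", "bigquery", "github"
    title: str
    body: str


def build_prompt(question: str, pieces: list[ContextPiece], max_chars: int = 4000) -> str:
    """Join context pieces in order, stopping once the character budget is used up."""
    sections: list[str] = []
    used = 0
    for piece in pieces:
        section = f"[{piece.source}] {piece.title}\n{piece.body.strip()}"
        if used + len(section) > max_chars:
            break
        sections.append(section)
        used += len(section)
    context = "\n\n".join(sections)
    return (
        f"Context:\n{context}\n\n"
        f"Question: {question}\n"
        "Write a BigQuery SQL query that answers the question."
    )


if __name__ == "__main__":
    pieces = [
        ContextPiece("notion", "Order rules", "Fact transactions filter out canceled and error orders."),
        ContextPiece("bigquery", "Schema (hypothetical)", "order_id STRING, status STRING"),
    ]
    print(build_prompt("How many orders were placed last week?", pieces))
```

Ordering the pieces by importance matters in this design, since anything past the budget is dropped.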

100 00:11:07.750 00:11:08.600 Samuel Roberts: Okay.

101 00:11:09.740 00:11:15.220 Samuel Roberts: So, I think maybe for at least this example that I’m trying to work on with Eden, I would love…

102 00:11:16.860 00:11:23.879 Samuel Roberts: maybe… well, I don’t know, I’m trying to think of how much of this information I need to pull into, like, an AGENTS.md file to pass into Cursor.

103 00:11:24.340 00:11:27.010 Samuel Roberts: Or if it’s an MCP thing. Okay.

104 00:11:27.010 00:11:40.320 Awaish Kumar: So right now, I kind of do a similar thing. Like, I… if I have to write some query using ChatGPT, I will go to ChatGPT, I will write down what I know, I will copy-paste some SQL files from…

105 00:11:41.130 00:11:46.709 Awaish Kumar: Codebase, and also some sample data,

106 00:11:46.870 00:11:51.620 Awaish Kumar: from the query, and then it will generate a query for me, so…

107 00:11:51.760 00:11:56.419 Awaish Kumar: Okay. Right now, I’m doing it manually, so that’s what I need. If I can…

108 00:11:57.260 00:12:04.190 Awaish Kumar: I don’t have to write it. I can just write down what I need, and it just collects everything automatically.

109 00:12:04.190 00:12:05.030 Samuel Roberts: Right.

110 00:12:05.370 00:12:06.330 Samuel Roberts: Okay.

111 00:12:06.510 00:12:10.750 Samuel Roberts: Alright, that’s… Good to know. I’m just trying to think…

112 00:12:10.870 00:12:15.390 Samuel Roberts: how best to, like, make this a one-stop shop kind of thing.

113 00:12:15.800 00:12:24.949 Samuel Roberts: And I guess I don’t have a ton of context into this client, at least, so I’m not sure how much I should be pulling in from other sources, like the Notion or the…

114 00:12:25.380 00:12:27.159 Samuel Roberts: Sheets or something.

115 00:12:27.610 00:12:38.100 Awaish Kumar: No, like, we have defined, like, we did similar exercise when we were building these agents, like Slack agents. We have bots, like.

116 00:12:38.450 00:12:43.880 Awaish Kumar: Casey and Miguel have worked on that. So while we were doing that, we…

117 00:12:43.990 00:12:52.519 Awaish Kumar: we… what we did, like, we have Notion docs, we have scripts, we have pipelines ready to get the data from Notion to Supabase.

118 00:12:52.640 00:12:55.350 Awaish Kumar: So now, you just have to read from Supabase.

119 00:12:56.330 00:12:57.670 Samuel Roberts: Oh, okay.

120 00:12:57.830 00:13:04.299 Awaish Kumar: And then we have tagged them all with their clients, so if it’s Eden… so it’s not like you have to…

121 00:13:04.620 00:13:20.049 Awaish Kumar: read full Notion, whatever is in Notion, right? We can… we can just point out, like, Eden has these three different Notion docs, which are… which provide knowledge base, and for Urban STEM, we have these three. For…

122 00:13:20.360 00:13:28.590 Awaish Kumar: other clients, we have these 1, 2, 3, whatever, Notion Docs, which provides business domain knowledge, and one Google Sheet. That’s all.
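
Note (illustrative sketch): since the synced docs are tagged by client, selecting only one client's knowledge base could be a simple filter over the rows read back from the store. The row shape below (`client`, `source`, `title` keys) is an assumption for illustration; the actual table layout is not described in this meeting.

```python
def docs_for_client(rows: list[dict], client: str) -> list[dict]:
    """Keep only rows whose client tag matches, ignoring case and surrounding spaces."""
    wanted = client.strip().lower()
    return [
        row for row in rows
        if str(row.get("client", "")).strip().lower() == wanted
    ]


# Hypothetical rows, standing in for whatever the sync pipelines write.
example_rows = [
    {"client": "Eden", "source": "notion", "title": "Order status definitions"},
    {"client": "Eden", "source": "google_sheets", "title": "Metric glossary"},
    {"client": "Other", "source": "notion", "title": "Unrelated doc"},
]

if __name__ == "__main__":
    for row in docs_for_client(example_rows, "eden"):
        print(row["source"], "-", row["title"])
```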

123 00:13:29.070 00:13:40.139 Awaish Kumar: Okay. Pipeline for Google Sheet as well. That’s being loaded in as a, I think. So, we have almost everything in there.

124 00:13:40.590 00:13:43.250 Samuel Roberts: So this stuff is already in Supabase, then? Is that what you’re saying? Like…

125 00:13:43.250 00:13:56.250 Awaish Kumar: Yeah, if… I don’t know if somebody, like, disabled my pipeline, but we have the pipelines ready. If it’s not there, we can just turn them on, and we will be… they will be migrated to Supabase.

126 00:13:57.460 00:13:58.280 Samuel Roberts: Okay.

127 00:13:58.430 00:13:59.960 Awaish Kumar: Yeah, I’d love to…

128 00:14:00.030 00:14:01.889 Samuel Roberts: Check that and figure that out.

129 00:14:02.410 00:14:06.299 Awaish Kumar: Yeah, we have them in Dagster, so we have pipelines to move.

130 00:14:06.300 00:14:06.710 Samuel Roberts: Okay.

131 00:14:06.710 00:14:13.010 Awaish Kumar: We have pipelines to move GitHub codebase, we have pipelines to move,

132 00:14:13.190 00:14:18.530 Awaish Kumar: Google Sheets, so all of them are being moved already to Supabase.

133 00:14:19.150 00:14:28.129 Samuel Roberts: Okay, that’s good to know. Okay, that is helpful then. I’ll try to take a look at that and see what I can find there then. Okay. So then, from this, I now know.

134 00:14:28.130 00:14:37.579 Awaish Kumar: Only thing missing is the connection to the database and actually getting some sample data. That’s not there.

135 00:14:38.620 00:14:45.509 Samuel Roberts: Okay, that’s… okay, great. That actually gives me… okay, that helps a little bit. So then I know the intermediate…

136 00:14:47.240 00:14:53.899 Samuel Roberts: parts are good for context there, and then Supabase has more business logic, or business information, I should say.

137 00:14:54.000 00:14:55.110 Samuel Roberts: And then…

138 00:14:56.040 00:15:03.089 Samuel Roberts: sample data is then. Okay, great. Alright, that actually helps a lot. I think… I think I have some place to run with this, then, for now.

139 00:15:04.690 00:15:08.710 Samuel Roberts: Okay, is there anything else I should know that I didn’t know before?

140 00:15:09.180 00:15:11.479 Awaish Kumar: Nope, I think that’s all.

141 00:15:11.990 00:15:18.640 Demilade Agboola: Also, another thing I can do is I can send you sample questions that we get asked by stakeholders.

142 00:15:18.640 00:15:25.690 Samuel Roberts: Perfect. That’d be great, yes, because I’m just… I’m going off just kind of generic ones now, but yeah, more specific ones would be very helpful.

143 00:15:26.650 00:15:35.770 Awaish Kumar: We have Notion knowledge sharing docs, and we can add them, all those questions, and even if we have the answers, we can just paste them there.

144 00:15:36.460 00:15:40.470 Samuel Roberts: Okay, yeah, that’d be great. Yeah, if you could do that, I would really appreciate that, and that will help me a lot.

145 00:15:40.730 00:15:41.770 Samuel Roberts: Working on this.

146 00:15:43.010 00:16:02.030 Samuel Roberts: All right. I think that’s all I’ve got right now, then. I’ll probably, if I need to, I’ll just ping you guys on Slack or something if I’m running into anything else, but I appreciate the time. Definitely, will keep an eye out for those questions, because that will help me test something more specific than just generic things, and then I’ll dig into some of that Supabase stuff and find what I can find.

147 00:16:03.180 00:16:03.800 Awaish Kumar: Cool.

148 00:16:03.800 00:16:04.320 Demilade Agboola: Okay.

149 00:16:04.320 00:16:05.560 Awaish Kumar: Alright.

150 00:16:05.870 00:16:07.010 Demilade Agboola: Okay, thank you.

151 00:16:07.010 00:16:08.550 Samuel Roberts: Thank you all, yep, bye.

152 00:16:08.550 00:16:09.230 Demilade Agboola: Bye.