Meeting Title: Text to SQL sync Date: 2025-10-24 Meeting participants: Demilade Agboola, Awaish Kumar, Samuel Roberts
WEBVTT
1 00:00:38.380 ⇒ 00:00:39.380 Awaish Kumar: I mean…
2 00:00:39.380 ⇒ 00:00:40.270 Samuel Roberts: Hey!
3 00:00:42.860 ⇒ 00:00:44.080 Samuel Roberts: How are you guys?
4 00:00:44.500 ⇒ 00:00:46.549 Demilade Agboola: Pretty good. How are you?
5 00:00:46.550 ⇒ 00:00:48.519 Samuel Roberts: Doing alright, doing alright.
6 00:00:48.930 ⇒ 00:00:49.710 Demilade Agboola: Okay, just…
7 00:00:49.980 ⇒ 00:00:52.799 Samuel Roberts: Nothing too crazy today. Long meeting before this?
8 00:00:53.400 ⇒ 00:00:55.720 Samuel Roberts: Yeah, just kind of went over.
9 00:00:56.330 ⇒ 00:00:57.310 Samuel Roberts: Yeah, no worries.
10 00:00:57.310 ⇒ 00:00:57.950 Demilade Agboola: And it’s…
11 00:00:59.170 ⇒ 00:01:14.880 Samuel Roberts: Cool. Yeah, so I just wanted to talk about this, like, text-to-SQL thing for, like, the… or specifically for Eden right now, but in general, trying to make it a system that could work for other things. And I think, Demilade, when I kind of showed off the kind of initial
12 00:01:14.880 ⇒ 00:01:19.919 Samuel Roberts: one-shot thing I did with Cursor earlier, you were commenting about how it, you know.
13 00:01:19.930 ⇒ 00:01:23.959 Samuel Roberts: it didn’t have a lot of the context for, like, what was it, fact transactions, I think?
14 00:01:24.990 ⇒ 00:01:29.979 Samuel Roberts: And so, what I’m kind of wondering here is…
15 00:01:30.100 ⇒ 00:01:32.789 Samuel Roberts: I’m not as familiar with the dbt…
16 00:01:33.170 ⇒ 00:01:39.640 Samuel Roberts: side of things here. I’ve kind of looked through it, and I’ve chatted with Cursor about it, and it can infer some things.
17 00:01:39.950 ⇒ 00:01:43.330 Samuel Roberts: I’m wondering… What kind of other…
18 00:01:43.950 ⇒ 00:01:50.780 Samuel Roberts: knowledge is missing from this that I should be able to include as, like, a context file.
19 00:01:50.920 ⇒ 00:02:05.210 Samuel Roberts: Perhaps. And so, what I’m kind of just looking to talk through is… through is, I’ve got this… oops, sorry, let’s rearrange things for a second. I’ve got, like, I can…
20 00:02:05.370 ⇒ 00:02:20.830 Samuel Roberts: well, I don’t know if it’s worth sharing yet, but basically, in that repo, I made a separate, like, text-to-SQL folder so that cursor would have all the context of the dbt project. And that got me part of the way there. And now I’m looking to add either more context per
21 00:02:21.360 ⇒ 00:02:27.280 Samuel Roberts: dbt file, or per table, or per mart, I’m not really sure where best to, like.
22 00:02:27.430 ⇒ 00:02:32.389 Samuel Roberts: to bring that in. And so I’m kind of just looking for a little bit of insight into, like, maybe this…
23 00:02:32.580 ⇒ 00:02:48.080 Samuel Roberts: projects specifically, or just more generally, like, how you guys go about it with the dbt marts, or projects, and then the marts, and the intermediate, and the raw. I’m seeing all this stuff, and I just don’t have a great sense of, like, where should I put a layer of a little more context, if that makes sense.
24 00:02:53.060 ⇒ 00:03:03.649 Demilade Agboola: I think there’s… apart from, like, maybe raw and staging, there’s probably context in, like, the intermediate and… and marts… marts hold the most.
25 00:03:04.020 ⇒ 00:03:04.700 Samuel Roberts: Okay.
26 00:03:04.700 ⇒ 00:03:09.599 Demilade Agboola: Because raw and staging tends to just be, like, all in on the…
27 00:03:10.080 ⇒ 00:03:18.069 Demilade Agboola: tables as they are, you know, or maybe renaming or casting data types. There isn’t a lot going on there. In terms of, like, logic of
28 00:03:18.210 ⇒ 00:03:22.750 Demilade Agboola: We are filtering out this, or we are excluding this.
29 00:03:23.140 ⇒ 00:03:25.860 Demilade Agboola: It tends to happen more in intermediate and…
30 00:03:25.860 ⇒ 00:03:27.429 Samuel Roberts: In the intermediate? Okay.
31 00:03:28.220 ⇒ 00:03:29.649 Demilade Agboola: the marts models.
32 00:03:30.270 ⇒ 00:03:31.060 Samuel Roberts: Okay.
33 00:03:31.290 ⇒ 00:03:39.100 Samuel Roberts: So that’s where I should kind of be looking most, I guess, for trying to pull some stuff out, into a context file, I guess?
34 00:03:39.820 ⇒ 00:03:40.190 Demilade Agboola: Yes.
35 00:03:40.660 ⇒ 00:03:50.520 Samuel Roberts: Okay, I guess, so, so sort of right now, I have it, it, it’s just kind of a general, you know, it understands a little bit. It’s using the, kind of.
36 00:03:50.710 ⇒ 00:03:57.870 Samuel Roberts: LLM’s knowledge of SQL, but then I’m passing in some, like, an AGENTS.md file, a rough, like,
37 00:03:58.680 ⇒ 00:04:01.509 Samuel Roberts: guide to BigQuery specifically.
38 00:04:01.810 ⇒ 00:04:08.790 Samuel Roberts: And then I was starting to do something that was going to extract some schema stuff out, and so pointing that to intermediate is probably the best bet.
39 00:04:08.980 ⇒ 00:04:14.919 Samuel Roberts: Are there other… things in here that I… that, you know.
40 00:04:15.490 ⇒ 00:04:20.510 Samuel Roberts: I guess my question is, is there any other context beyond this, from, like, a business…
41 00:04:20.660 ⇒ 00:04:24.000 Samuel Roberts: Level, or anything else that would be…
42 00:04:24.650 ⇒ 00:04:28.320 Samuel Roberts: like, not something that the LLM could discover in here?
43 00:04:34.570 ⇒ 00:04:42.400 Demilade Agboola: I mean, I think the major differences that I can think of are… I did mention it, like, fact transactions are where we filter out
44 00:04:43.370 ⇒ 00:04:50.850 Demilade Agboola: Things like consult orders, and… But again, it’s canceled and error ones.
45 00:04:51.870 ⇒ 00:04:53.699 Demilade Agboola: We filter those out.
46 00:04:54.190 ⇒ 00:04:56.439 Samuel Roberts: And that’s not something that I would see in here anywhere?
47 00:04:56.740 ⇒ 00:05:02.120 Demilade Agboola: No, there’s a WHERE filter that filters that out, so you can… Okay, so it does have that, okay. Yeah.
48 00:05:02.380 ⇒ 00:05:03.020 Samuel Roberts: Okay.
49 00:05:03.300 ⇒ 00:05:08.610 Demilade Agboola: But I think some of the context might be… For certain things.
50 00:05:09.610 ⇒ 00:05:17.499 Demilade Agboola: I know that certain models, like, downstream, use, like, the fact orders, Also, I’m trying to use…
51 00:05:17.650 ⇒ 00:05:20.549 Demilade Agboola: All orders for certain analysis.
52 00:05:21.400 ⇒ 00:05:25.129 Demilade Agboola: I can’t remember the exact examples right now, I have to look into them.
53 00:05:25.130 ⇒ 00:05:27.740 Samuel Roberts: But, like, such things means that if you ask.
54 00:05:27.740 ⇒ 00:05:33.820 Demilade Agboola: the… Yeah, because certain things means if you ask the LLM questions.
55 00:05:34.050 ⇒ 00:05:38.829 Demilade Agboola: It would need to just know whether you want the analysis on all orders, or just…
56 00:05:39.110 ⇒ 00:05:43.009 Demilade Agboola: A certain subset of the orders, so that’s kind of the only thing to, like, matter.
57 00:05:43.290 ⇒ 00:05:50.560 Samuel Roberts: Okay, that’s good to know. So I can tell it to maybe ask that question or something, or… or make sure that we tell it specifically that, but…
58 00:05:50.680 ⇒ 00:05:51.760 Samuel Roberts: Okay.
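The two points Demilade raises here (that fact transactions already filters out canceled and error orders, and that the model needs to know whether a question is about all orders or only a subset) are the kind of notes that could go into a context file. A minimal sketch of how such notes might be rendered, assuming the snake_case spelling `fact_transactions` and a hypothetical `ModelNote` structure; the layout and wording are illustrative and not taken from the project:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelNote:
    """One business-level note attached to a dbt model (hypothetical structure)."""
    model: str
    note: str


# Assumed spelling of the model discussed in the meeting.
NOTES = [
    ModelNote(
        model="fact_transactions",
        note="Canceled and error orders are already excluded by a WHERE filter.",
    ),
]

# Instruction asking the LLM to clarify scope before writing SQL.
CLARIFY_RULE = (
    "If the question does not say whether it covers all orders or only a "
    "filtered subset, ask the user before generating a query."
)


def render_context(notes: list[ModelNote], clarify_rule: str) -> str:
    """Render the notes and the rule as a small markdown context section."""
    lines = ["## Model notes"]
    for n in notes:
        lines.append(f"- {n.model}: {n.note}")
    lines.append("")
    lines.append("## Rules")
    lines.append(f"- {clarify_rule}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(render_context(NOTES, CLARIFY_RULE))
```

The rendered text could be appended to whatever context file is passed to the model; which file that is was still an open question in the meeting.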
59 00:05:54.320 ⇒ 00:06:01.189 Samuel Roberts: Oh, I see you mentioned that Notion, Google Sheets, and database schema, and sample data. So are there other pieces of knowledge that I can pull in here?
60 00:06:02.150 ⇒ 00:06:10.309 Awaish Kumar: Like, I want to just understand, what… what are you planning to do? Like, you want to build a custom solution to generate queries, or…
61 00:06:10.710 ⇒ 00:06:15.449 Samuel Roberts: Yeah, so, Utam had kind of tasked me with putting something together that’s kind of…
62 00:06:15.770 ⇒ 00:06:18.789 Samuel Roberts: Somewhat similar to that Omni, you know.
63 00:06:18.950 ⇒ 00:06:27.830 Samuel Roberts: assistant that, you know, you can kind of chat over the data, and so I got something that kind of can generate some basic SQL.
64 00:06:27.830 ⇒ 00:06:33.170 Awaish Kumar: but I’m trying to add a little bit more context to it, the way Omni had…
65 00:06:33.170 ⇒ 00:06:33.920 Samuel Roberts: More stuff there.
66 00:06:33.920 ⇒ 00:06:39.770 Awaish Kumar: I got, I got your point. I have been working in Omni, so… Okay.
67 00:06:40.330 ⇒ 00:06:47.819 Awaish Kumar: the… Thing is that, like, normally Cursor, like, what the, like.
68 00:06:47.950 ⇒ 00:06:52.480 Awaish Kumar: Whatever is in the GitHub repo, Cursor is, like.
69 00:06:52.830 ⇒ 00:07:02.910 Awaish Kumar: almost able to get that context, what I have seen, right? So, if it is something in another file, or in the intermediate table,
70 00:07:03.770 ⇒ 00:07:20.260 Awaish Kumar: like, Cursor is able to do, and obviously you can also have a… all access to the GitHub codebase, and you can access, like, from the file, if I… if I… in my text, if I mention two tables, you can look at those files, and
71 00:07:20.370 ⇒ 00:07:25.360 Awaish Kumar: look at all the tables that are being referenced in that SQL query.
72 00:07:25.560 ⇒ 00:07:41.019 Awaish Kumar: Right? So if that references tables from intermediate table more, intermediate layer, or if that references a table from source, or a raw layer, you can easily get to that point, and
73 00:07:41.180 ⇒ 00:07:45.810 Awaish Kumar: send in the con… send it in the context file. So what…
74 00:07:46.520 ⇒ 00:07:57.120 Awaish Kumar: LLM or Cursor cannot do right now is that I want him to, like, instead of, like, if I want to build a new query, which is
75 00:07:57.320 ⇒ 00:08:00.790 Awaish Kumar: Which is not in… in the…
76 00:08:01.790 ⇒ 00:08:09.920 Awaish Kumar: the Cursor, and it might have some data which I haven’t explored. So, if I want the Cursor to…
77 00:08:09.930 ⇒ 00:08:22.290 Awaish Kumar: help me generate a query, for the data, which even I’m not… I’m familiar… like, quite familiar with. Like, I have maybe looked at it for a few times, but not, like, thoroughly.
78 00:08:22.290 ⇒ 00:08:30.700 Awaish Kumar: So, like, how can Cursor know that? I can’t, like, I’m not able to write everything down, because maybe I haven’t… I don’t have any…
79 00:08:30.780 ⇒ 00:08:32.320 Awaish Kumar: enough,
80 00:08:34.929 ⇒ 00:08:47.529 Awaish Kumar: knowledge, or I can say, like, even if I have, I don’t have enough, like, time to write down everything for all the columns, what are the accepted values, and all these things.
81 00:08:47.960 ⇒ 00:08:52.689 Awaish Kumar: So that’s the issue, and what… What would be the ideal…
82 00:08:52.830 ⇒ 00:08:58.559 Awaish Kumar: solution is that I can connect the… for example, there’s a…
83 00:08:58.970 ⇒ 00:09:10.940 Awaish Kumar: there’s a page, whatever, the agent, and I can connect it with some Google BigQuery project. It can read the information schema, so it gets the table and their fields.
84 00:09:11.210 ⇒ 00:09:13.950 Awaish Kumar: And then it can query those
85 00:09:14.080 ⇒ 00:09:17.159 Awaish Kumar: Tables to get some sample data, right?
86 00:09:17.280 ⇒ 00:09:21.749 Awaish Kumar: So… From my text, first of all, the…
87 00:09:21.990 ⇒ 00:09:31.390 Awaish Kumar: the step one is that AI agent can, figure out what tables are… are related, or, like…
88 00:09:31.450 ⇒ 00:09:34.749 Samuel Roberts: should be included in the context, right?
89 00:09:34.800 ⇒ 00:09:43.869 Awaish Kumar: And then from those tables, it can find the schema directly from database, and some sample data to understand what it looks like.
90 00:09:43.950 ⇒ 00:09:55.249 Awaish Kumar: Right? And then, Notion docs. Like, we have Google Sheet and Notion Doc, where we store business knowledge, as you were mentioning. Right.
91 00:09:55.250 ⇒ 00:10:05.669 Awaish Kumar: So, like, I have 5 different statuses of an order, in the table. You can get those 5 statuses, but you don’t know the description, or…
92 00:10:05.710 ⇒ 00:10:09.060 Awaish Kumar: AI agent won’t know either, right? So… Okay.
93 00:10:09.640 ⇒ 00:10:14.930 Awaish Kumar: What, what you have to do is now that maybe I have… I don’t know…
94 00:10:15.460 ⇒ 00:10:23.420 Awaish Kumar: it either, and I’ve asked someone, and I’ve put it in Notion Doc. So now that knowledge lives in the Notion Doc.
95 00:10:23.620 ⇒ 00:10:33.639 Awaish Kumar: But the AI agent doesn’t know that, or maybe some of that knowledge lives in the Google Sheet. But the AI agent, like, doesn’t know that, so…
96 00:10:33.830 ⇒ 00:10:41.840 Awaish Kumar: we can, like, have a few connectors, like, we are… for Brainforge, we have Notion, we have Google Sheets, we have,
97 00:10:42.200 ⇒ 00:10:52.789 Awaish Kumar: databases, like BigQuery, Redshift, Postgres at Supabase, and then we have,
98 00:10:54.050 ⇒ 00:11:03.390 Awaish Kumar: Yeah, like, the GitHub codebase. Like, these have 3-4 different connectors. If we have that, to integrate all of them together, and then…
99 00:11:03.750 ⇒ 00:11:06.199 Awaish Kumar: Generate some, queries.
100 00:11:07.750 ⇒ 00:11:08.600 Samuel Roberts: Okay.
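Awaish's earlier point about starting from the tables a user mentions and following the tables their SQL references, back through the intermediate and source layers, could be sketched roughly as below. This is a minimal illustration, assuming model SQL uses the single-argument `{{ ref('model') }}` and two-argument `{{ source('src', 'table') }}` forms; the model names, the `shop` source, and the function names are hypothetical, and a real project would more reliably read this lineage from dbt's own compiled metadata.

```python
import re

# Matches {{ ref('model_name') }} with single or double quotes.
REF_PATTERN = re.compile(r"\{\{\s*ref\(\s*['\"]([^'\"]+)['\"]\s*\)\s*\}\}")
# Matches {{ source('source_name', 'table_name') }}.
SOURCE_PATTERN = re.compile(
    r"\{\{\s*source\(\s*['\"]([^'\"]+)['\"]\s*,\s*['\"]([^'\"]+)['\"]\s*\)\s*\}\}"
)


def referenced_models(sql: str) -> list[str]:
    """Return the model names referenced via ref(), in first-seen order."""
    seen: list[str] = []
    for name in REF_PATTERN.findall(sql):
        if name not in seen:
            seen.append(name)
    return seen


def referenced_sources(sql: str) -> list[tuple[str, str]]:
    """Return (source, table) pairs referenced via source()."""
    return SOURCE_PATTERN.findall(sql)


def upstream_closure(start: str, models: dict[str, str]) -> list[str]:
    """Collect `start` and every model it depends on, directly or indirectly.

    `models` maps model name to raw SQL text; unknown names are skipped.
    """
    ordered: list[str] = []
    stack = [start]
    while stack:
        name = stack.pop()
        if name in ordered or name not in models:
            continue
        ordered.append(name)
        # Reverse so the first reference is visited first.
        stack.extend(reversed(referenced_models(models[name])))
    return ordered


if __name__ == "__main__":
    # Hypothetical three-layer project: mart -> intermediate -> staging.
    demo = {
        "fact_transactions": "select * from {{ ref('int_orders') }}",
        "int_orders": "select * from {{ ref('stg_orders') }}",
        "stg_orders": "select * from {{ source('shop', 'orders') }}",
    }
    print(upstream_closure("fact_transactions", demo))
    print(referenced_sources(demo["stg_orders"]))
```

The SQL text of each model in the resulting list is what would be placed in the context, so the LLM sees the filters applied at every layer rather than only the final mart.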
101 00:11:09.740 ⇒ 00:11:15.220 Samuel Roberts: So, I think maybe for at least this example that I’m trying to work on with Eden, I would love…
102 00:11:16.860 ⇒ 00:11:23.879 Samuel Roberts: maybe… well, I don’t know, I’m trying to think of how much of this information I need to pull into, like, an AGENTS.md file to pass into Cursor.
103 00:11:24.340 ⇒ 00:11:27.010 Samuel Roberts: Or if it’s an MCP thing. Okay.
104 00:11:27.010 ⇒ 00:11:40.320 Awaish Kumar: So right now, I kind of do a similar thing. Like, I… if I have to write some query using ChatGPT, I will go to the ChatGPT, I will, write down what I know, I will copy-paste some SQL files from…
105 00:11:41.130 ⇒ 00:11:46.709 Awaish Kumar: Codebase, and also some sample data,
106 00:11:46.870 ⇒ 00:11:51.620 Awaish Kumar: from the query, and then it will generate a query for me, so…
107 00:11:51.760 ⇒ 00:11:56.419 Awaish Kumar: Okay. Right now, I’m doing it manually, so that’s what I need. If I can…
108 00:11:57.260 ⇒ 00:12:04.190 Awaish Kumar: I don’t have to write it. I can just, write down what I need, and it just collects everything automatically.
109 00:12:04.190 ⇒ 00:12:05.030 Samuel Roberts: Right.
110 00:12:05.370 ⇒ 00:12:06.330 Samuel Roberts: Okay.
111 00:12:06.510 ⇒ 00:12:10.750 Samuel Roberts: Alright, that’s… Good to know. I’m just trying to think…
112 00:12:10.870 ⇒ 00:12:15.390 Samuel Roberts: how best to, like, make this a one-stop shop kind of thing.
113 00:12:15.800 ⇒ 00:12:24.949 Samuel Roberts: And I guess I don’t have a ton of context into this client, at least, so I’m not sure how much I should be pulling in from other sources, like the Notion or the…
114 00:12:25.380 ⇒ 00:12:27.159 Samuel Roberts: Sheets or something.
115 00:12:27.610 ⇒ 00:12:38.100 Awaish Kumar: No, like, we have defined, like, we did similar exercise when we were building these agents, like Slack agents. We have bots, like.
116 00:12:38.450 ⇒ 00:12:43.880 Awaish Kumar: Casey and Miguel have worked on that. So while we were doing that, we…
117 00:12:43.990 ⇒ 00:12:52.519 Awaish Kumar: we… what we did, like, we have Notion docs, we have scripts, we have pipelines ready to get the data from Notion to Supabase.
118 00:12:52.640 ⇒ 00:12:55.350 Awaish Kumar: So now, you just have to read from Supabase.
119 00:12:56.330 ⇒ 00:12:57.670 Samuel Roberts: Oh, okay.
120 00:12:57.830 ⇒ 00:13:04.299 Awaish Kumar: And then we have tagged them all with their clients, so if it’s Eden… so it’s not like you have to…
121 00:13:04.620 ⇒ 00:13:20.049 Awaish Kumar: read full Notion, whatever is in Notion, right? We can… we can just point out, like, Eden has these three different Notion docs, which are… which provide knowledge base, and for Urban STEM, we have these three. For…
122 00:13:20.360 ⇒ 00:13:28.590 Awaish Kumar: other clients, we have these 1, 2, 3, whatever, Notion Docs, which provides business domain knowledge, and one Google Sheet. That’s all.
123 00:13:29.070 ⇒ 00:13:40.139 Awaish Kumar: Okay. Pipeline for Google Sheet as well. That’s being loaded in as a, I think. So, we have almost everything in there.
124 00:13:40.590 ⇒ 00:13:43.250 Samuel Roberts: So this stuff is already in Supabase, then? Is that what you’re saying? Like…
125 00:13:43.250 ⇒ 00:13:56.250 Awaish Kumar: Yeah, if… I don’t know if somebody, like, disabled my pipeline, but we have the pipelines ready. If it’s not there, we can just turn them on, and we will be… they will be migrated to Supabase.
126 00:13:57.460 ⇒ 00:13:58.280 Samuel Roberts: Okay.
127 00:13:58.430 ⇒ 00:13:59.960 Awaish Kumar: Yeah, I’d love to…
128 00:14:00.030 ⇒ 00:14:01.889 Samuel Roberts: Check that and figure that out.
129 00:14:02.410 ⇒ 00:14:06.299 Awaish Kumar: Yeah, we have them in Dagster, so we have pipelines to move.
130 00:14:06.300 ⇒ 00:14:06.710 Samuel Roberts: Okay.
131 00:14:06.710 ⇒ 00:14:13.010 Awaish Kumar: We have pipelines to move GitHub codebase, we have pipelines to move,
132 00:14:13.190 ⇒ 00:14:18.530 Awaish Kumar: Google Sheets, so all of them are being moved already to Supabase.
133 00:14:19.150 ⇒ 00:14:28.129 Samuel Roberts: Okay, that’s good to know. Okay, that is helpful then. I’ll try to take a look at that and see what I can find there then. Okay. So then, from this, I now know.
134 00:14:28.130 ⇒ 00:14:37.579 Awaish Kumar: Only thing missing is the connection to the database and actually getting some sample data. That’s not there.
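The missing piece Awaish names here, reading the schema from BigQuery's information schema and pulling a few sample rows per table, might start from two query builders like the ones below. This is a sketch under stated assumptions: the project ID `my-project`, the dataset `marts`, and the function names are hypothetical, and the code only builds SQL strings; running them would still need a BigQuery client and credentials.

```python
import re


def _check(name: str) -> str:
    """Allow only letters, digits, underscores and hyphens in identifiers."""
    if not re.fullmatch(r"[A-Za-z0-9_-]+", name):
        raise ValueError(f"unsafe identifier: {name!r}")
    return name


def columns_query(project: str, dataset: str, tables: list[str]) -> str:
    """Build a query listing column names and types for the given tables."""
    if not tables:
        raise ValueError("at least one table is required")
    for part in (project, dataset, *tables):
        _check(part)
    table_list = ", ".join(f"'{t}'" for t in tables)
    return (
        "SELECT table_name, column_name, data_type\n"
        f"FROM `{project}.{dataset}.INFORMATION_SCHEMA.COLUMNS`\n"
        f"WHERE table_name IN ({table_list})\n"
        "ORDER BY table_name, ordinal_position"
    )


def sample_query(project: str, dataset: str, table: str, limit: int = 5) -> str:
    """Build a query returning a handful of rows from one table."""
    for part in (project, dataset, table):
        _check(part)
    return f"SELECT * FROM `{project}.{dataset}.{table}` LIMIT {int(limit)}"


if __name__ == "__main__":
    # Hypothetical project and dataset names.
    print(columns_query("my-project", "marts", ["fact_transactions"]))
    print(sample_query("my-project", "marts", "fact_transactions"))
```

One caveat on cost: in BigQuery, `LIMIT` does not by itself reduce the bytes a `SELECT *` scans, so on large tables it may be cheaper to sample only the columns that the schema query shows are relevant.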
135 00:14:38.620 ⇒ 00:14:45.509 Samuel Roberts: Okay, that’s… okay, great. That actually gives me… okay, that helps a little bit. So then I know the intermediate…
136 00:14:47.240 ⇒ 00:14:53.899 Samuel Roberts: marts are good for context there, and then the Supabase has more business logic, or business information, I should say.
137 00:14:54.000 ⇒ 00:14:55.110 Samuel Roberts: And then…
138 00:14:56.040 ⇒ 00:15:03.089 Samuel Roberts: sample data is then. Okay, great. Alright, that actually helps a lot. I think… I think I have some place to run with this, then, for now.
139 00:15:04.690 ⇒ 00:15:08.710 Samuel Roberts: Okay, is there anything else I should know that I didn’t know before?
140 00:15:09.180 ⇒ 00:15:11.479 Awaish Kumar: Nope, I think that’s all.
141 00:15:11.990 ⇒ 00:15:18.640 Demilade Agboola: Also, another thing I can do is I can send you sample questions that we get asked by stakeholders.
142 00:15:18.640 ⇒ 00:15:25.690 Samuel Roberts: Perfect. That’d be great, yes, because I’m just… I’m going off just kind of generic ones now, but yeah, more specific ones would be very helpful.
143 00:15:26.650 ⇒ 00:15:35.770 Awaish Kumar: We have Notion knowledge sharing docs, and we can add them, all those questions, and even if we have the answers, we can just paste them there.
144 00:15:36.460 ⇒ 00:15:40.470 Samuel Roberts: Okay, yeah, that’d be great. Yeah, if you could do that, I would really appreciate that, and that will help me a lot.
145 00:15:40.730 ⇒ 00:15:41.770 Samuel Roberts: Working on this.
146 00:15:43.010 ⇒ 00:16:02.030 Samuel Roberts: All right. I think that’s all I’ve got right now, then. I’ll probably, if I need to, I’ll just ping you guys on Slack or something if I’m running into anything else, but I appreciate the time. Definitely, will keep an eye out for those questions, because that will help me test something more specific than just generic things, and then I’ll dig into some of that Supabase stuff and find what I can find.
147 00:16:03.180 ⇒ 00:16:03.800 Awaish Kumar: Cool.
148 00:16:03.800 ⇒ 00:16:04.320 Demilade Agboola: Okay.
149 00:16:04.320 ⇒ 00:16:05.560 Awaish Kumar: Alright.
150 00:16:05.870 ⇒ 00:16:07.010 Demilade Agboola: Okay, thank you.
151 00:16:07.010 ⇒ 00:16:08.550 Samuel Roberts: Thank you all, yep, bye.
152 00:16:08.550 ⇒ 00:16:09.230 Demilade Agboola: Bye.