Meeting Title: ABC Mastra and Evals Sync Date: 2026-01-26 Meeting participants: Samuel Roberts, Casie Aviles, Mustafa Raja, Pranav


WEBVTT

1 00:00:12.270 00:00:13.220 Samuel Roberts: Hey.

2 00:00:16.460 00:00:17.300 Casie Aviles: Hey, Sam.

3 00:00:17.610 00:00:18.760 Casie Aviles: Hey, Mustafa.

4 00:00:19.190 00:00:19.980 Mustafa Raja: A…

5 00:00:20.930 00:00:22.210 Samuel Roberts: How’s it going, guys?

6 00:00:23.180 00:00:24.219 Mustafa Raja: Good, how are you?

7 00:00:25.230 00:00:29.860 Samuel Roberts: Doing okay, fighting with Prettier over here, and… I’m a little below.

8 00:00:31.610 00:00:32.600 Samuel Roberts: What’s going on?

9 00:00:35.220 00:00:42.070 Samuel Roberts: I don’t know what’s happening, it’s just, it’s not doing it right. I don’t know what’s going on with the cursor, or prettier, or whatever, but…

10 00:00:42.680 00:00:44.360 Samuel Roberts: I spent too long on it at this point.

11 00:00:44.360 00:00:47.010 Mustafa Raja: for… Is it for TypeScript?

12 00:00:48.070 00:00:54.229 Samuel Roberts: Yeah, so it’s supposed to do it on save, but I’m still getting squiggles everywhere, little red lines.

13 00:00:55.270 00:00:57.399 Samuel Roberts: I don’t know why, but I’m not gonna worry about it right now.

14 00:00:58.040 00:01:00.510 Samuel Roberts: But I’m not gonna worry about it right now, I wanted to get something out.

15 00:01:01.450 00:01:05.620 Samuel Roberts: figure out why it’s working here, but not there, and it’s just… I don’t know what’s going on.

16 00:01:05.620 00:01:11.280 Mustafa Raja: I think, I could be wrong, but I think if it’s an error, then Prettier doesn’t work.

17 00:01:12.360 00:01:18.819 Samuel Roberts: It’s not an error, it’s… well, ESLint is telling me that Prettier thinks it needs to be fixed, but Prettier isn’t fixing it.

18 00:01:20.900 00:01:21.740 Mustafa Raja: Hmm.

19 00:01:22.140 00:01:27.449 Samuel Roberts: Yeah, so I don’t know, I’m not worried, the files are working, I just was trying to get it to clean up a little bit, and it’s not happening.

20 00:01:27.450 00:01:30.270 Mustafa Raja: ESLint does also mess up builds, right?

21 00:01:30.690 00:01:36.739 Samuel Roberts: Yeah, ESLint and Prettier, they sort of play nice together if you have it set up right, but they don’t if you don’t, it’s…

22 00:01:37.760 00:01:42.230 Mustafa Raja: I’ve had these situations, where I just turn off ESLint.

23 00:01:42.230 00:01:48.519 Samuel Roberts: Yeah, exactly. But I want to try to keep it till the code is formatted, at least, but we’ll get there.

24 00:01:48.980 00:01:49.780 Mustafa Raja: Yeah.

25 00:01:49.890 00:01:57.089 Mustafa Raja: In an ideal situation, I wouldn’t want a file to be more than 400 lines, I’d just segment the file.

26 00:01:57.590 00:02:01.800 Samuel Roberts: Yeah… I don’t know, with all the AI-generated code now, we have some such.

27 00:02:01.800 00:02:03.500 Mustafa Raja: Yeah, yeah.

28 00:02:03.500 00:02:04.170 Samuel Roberts: bookkeeping.

29 00:02:04.590 00:02:14.970 Mustafa Raja: I think we could, if we really want to, really, really want to enforce that, we could do that, in GitHub repo settings, right? We could, enforce,

30 00:02:14.970 00:02:16.929 Samuel Roberts: Yeah, I thought about that.

31 00:02:16.930 00:02:17.329 Mustafa Raja: sort of…

32 00:02:17.330 00:02:21.929 Samuel Roberts: But I don’t wanna… I don’t want to enforce that until I get the editor working with it, yeah.

33 00:02:21.930 00:02:22.560 Mustafa Raja: Yeah.

34 00:02:22.730 00:02:24.080 Mustafa Raja: Yeah, you’re right.

35 00:02:24.920 00:02:30.660 Samuel Roberts: Because I want to make sure that the code is good, but whatever, it’s not important right now. I already spent, like, half an hour trying to figure it out, so…

36 00:02:31.770 00:02:38.149 Samuel Roberts: Anyway… So, let’s talk… Mastra and evals.

37 00:02:40.590 00:02:42.230 Casie Aviles: Okay.

38 00:02:43.910 00:02:51.030 Casie Aviles: I can… I can go ahead and pull up Instagram and Linear as well.

39 00:02:51.250 00:02:53.750 Samuel Roberts: Yeah, I was just starting to do that, so that’d be great.

40 00:02:55.320 00:02:56.200 Casie Aviles: Okay.

41 00:02:56.670 00:03:01.690 Casie Aviles: So this is our instance, I’ve gone… Just slow down.

42 00:03:04.570 00:03:14.270 Casie Aviles: Right now, I put these as done since it’s already live. It’s just going to be maintenance or minor fixes moving forward.

43 00:03:15.080 00:03:18.789 Casie Aviles: Yeah, I think for me…

44 00:03:18.790 00:03:22.890 Mustafa Raja: Sorry, let’s mark this one as done also, the fixed one.

45 00:03:24.120 00:03:26.199 Samuel Roberts: Oh, yeah, we got okay from that, right?

46 00:03:26.890 00:03:27.330 Mustafa Raja: Yeah.

47 00:03:27.330 00:03:28.350 Casie Aviles: Oh, and this…

48 00:03:28.350 00:03:29.530 Mustafa Raja: This is good.

49 00:03:31.390 00:03:34.300 Mustafa Raja: I’m going to… it is, yeah.

50 00:03:34.810 00:03:42.779 Mustafa Raja: And then… this one… This one should also be good.

51 00:03:42.920 00:03:44.490 Mustafa Raja: Fixed performance.

52 00:03:46.250 00:03:51.350 Mustafa Raja: No, actually, was it this one? Did we con- .

53 00:03:52.310 00:03:52.910 Samuel Roberts: No, I…

54 00:03:52.910 00:03:57.030 Mustafa Raja: Let’s leave that. I think it was validated document relevance, right?

55 00:03:57.160 00:03:58.419 Mustafa Raja: There’s that one.

56 00:04:00.380 00:04:00.810 Casie Aviles: now.

57 00:04:00.810 00:04:03.979 Mustafa Raja: For the evals. Yeah, yeah, let’s leave this as is then.

58 00:04:04.600 00:04:05.430 Samuel Roberts: Okay.

59 00:04:10.580 00:04:13.989 Casie Aviles: Alright, so it… I think…

60 00:04:14.560 00:04:17.680 Casie Aviles: There might be a lot going on here, but…

61 00:04:17.680 00:04:18.250 Samuel Roberts: Yeah.

62 00:04:18.250 00:04:25.760 Casie Aviles: I think… I guess the… for me, the… The priority would be getting… the…

63 00:04:27.030 00:04:30.709 Casie Aviles: Mastra agent connected to Google, so that should.

64 00:04:30.710 00:04:31.040 Samuel Roberts: Right.

65 00:04:31.040 00:04:37.350 Casie Aviles: The one that they… they’re… You know, talking to,

66 00:04:38.680 00:04:43.710 Casie Aviles: I guess we’re, let’s see where that part is. I think it should be this one.

67 00:04:43.950 00:04:46.480 Casie Aviles: Implement Mastra entry endpoint.

68 00:04:47.140 00:04:48.280 Casie Aviles: So I think…

69 00:04:49.430 00:05:00.849 Casie Aviles: Oh, just one more thing, yeah. Is this done already, or are we still currently working on it, the validation of Mastra versus n8n?

70 00:05:02.710 00:05:05.000 Casie Aviles: I feel like we’re done, though, but…

71 00:05:05.000 00:05:09.230 Samuel Roberts: Yeah, I think we’re done. We’re just running the evals and, you know, trying to refine it more.

72 00:05:09.230 00:05:09.960 Mustafa Raja: Hmm.

73 00:05:10.750 00:05:15.559 Mustafa Raja: There’s a lot of eval stuff that’s just the same thing, right?

74 00:05:19.950 00:05:21.310 Mustafa Raja: Yeah, that’s been spotted.

75 00:05:21.420 00:05:25.340 Mustafa Raja: Yeah, for this, validation of, the…

76 00:05:26.410 00:05:32.690 Mustafa Raja: This one, we also did the same thing as we did with the evals that we’re doing right now, right?

77 00:05:33.200 00:05:34.299 Samuel Roberts: Yeah, I think so.

78 00:05:34.680 00:05:46.979 Casie Aviles: Yeah, because I think the goal here was to just understand if we could get, like, the Mastra agent working as well as the n8n one, right? And I think we concluded that the Mastra agent is fine.

79 00:05:47.230 00:05:48.100 Mustafa Raja: Yep.

80 00:05:49.200 00:05:51.010 Casie Aviles: So I’ll mark that as completed.

81 00:05:52.620 00:05:58.190 Casie Aviles: Cool, so I think for me, my only…

82 00:05:58.420 00:06:03.809 Casie Aviles: So, I guess what I want to do is to figure out, like,

83 00:06:04.520 00:06:10.079 Casie Aviles: what’s the best way we could move forward with this, like,

84 00:06:10.230 00:06:20.000 Casie Aviles: should we have it up on Google Cloud Run, or do we keep Heroku, or do we move to Railway? I think those are, like.

85 00:06:20.170 00:06:22.140 Casie Aviles: Some of the questions that I have.

86 00:06:22.660 00:06:23.810 Casie Aviles: In mind.

87 00:06:24.330 00:06:30.220 Casie Aviles: When it comes to, like, having the Mastra agent replace the n8n endpoint.

88 00:06:30.480 00:06:31.819 Casie Aviles: Right? Yeah.

89 00:06:31.820 00:06:36.089 Samuel Roberts: I think I mean, end goal is to get it on Google Cloud, right?

90 00:06:36.510 00:06:37.170 Casie Aviles: Yeah.

91 00:06:37.810 00:06:41.270 Samuel Roberts: So I think we probably want to… do that.

92 00:06:42.870 00:06:46.910 Samuel Roberts: I just… I don’t know if there’s gonna be any little weird nuances there.

93 00:06:47.970 00:06:50.210 Samuel Roberts: So it’s probably worth trying that.

94 00:06:50.630 00:06:51.890 Samuel Roberts: I think…

95 00:06:52.440 00:06:58.780 Samuel Roberts: like, I don’t necessarily want to flip the switch immediately, like, I want to get it running there, us.

96 00:06:58.780 00:06:59.260 Casie Aviles: Hoping it.

97 00:07:01.050 00:07:04.429 Samuel Roberts: But I think getting it running on GCP is probably the best thing.

98 00:07:05.660 00:07:09.699 Mustafa Raja: I’m also wondering if you’re going to do a staging mode for this?

99 00:07:09.930 00:07:10.600 Mustafa Raja: Where…

100 00:07:10.600 00:07:11.500 Casie Aviles: Yeah, yeah, we…

101 00:07:11.500 00:07:12.950 Mustafa Raja: could also test that?

102 00:07:13.320 00:07:13.870 Samuel Roberts: Yeah.

103 00:07:14.020 00:07:18.270 Casie Aviles: We… we actually do have multiple environments, and…

104 00:07:18.270 00:07:18.670 Samuel Roberts: Yes.

105 00:07:18.670 00:07:22.040 Casie Aviles: I was… I’m sorry.

106 00:07:23.440 00:07:26.409 Casie Aviles: We already have, like, an Andy test.

107 00:07:27.410 00:07:29.100 Casie Aviles: Let’s see why that means.

108 00:07:30.240 00:07:31.750 Mustafa Raja: Oh yeah, we do have that.

109 00:07:32.310 00:07:33.389 Casie Aviles: Yeah, I, I agree.

110 00:07:33.740 00:07:39.520 Casie Aviles: this, but it was not, like… I couldn’t get it working yet for, like.

111 00:07:40.250 00:07:43.099 Casie Aviles: the endpoint itself, so I just,

112 00:07:43.390 00:07:46.630 Casie Aviles: Used, like, our own environment for now.

113 00:07:47.100 00:07:47.980 Casie Aviles: Okay.

114 00:07:47.980 00:07:49.209 Mustafa Raja: We do, we do report.

115 00:07:50.480 00:07:51.990 Mustafa Raja: We do have one more, right?

116 00:07:52.330 00:07:55.760 Mustafa Raja: the… the Andy test tab, or something.

117 00:07:57.100 00:08:00.270 Casie Aviles: Yeah, for, that’s internally, that’s for the brain forge.

118 00:08:00.460 00:08:01.520 Mustafa Raja: Okay, okay.

119 00:08:02.440 00:08:04.659 Casie Aviles: Hmm… Yeah, this one.

120 00:08:06.440 00:08:07.609 Casie Aviles: Yeah, this one.

121 00:08:07.860 00:08:09.790 Mustafa Raja: This is the internal one.

122 00:08:10.500 00:08:13.320 Samuel Roberts: Okay, where’s… and where’s that running? Is that…

123 00:08:14.660 00:08:16.460 Mustafa Raja: It’s in n8n also.

124 00:08:16.700 00:08:17.900 Samuel Roberts: This is any… Oh, so we could…

125 00:08:17.900 00:08:18.790 Casie Aviles: convention.

126 00:08:18.790 00:08:21.869 Samuel Roberts: So we could at least get this going and flip it here.

127 00:08:22.330 00:08:23.960 Samuel Roberts: And test it out, right?

128 00:08:26.080 00:08:33.479 Casie Aviles: Yeah, okay, yeah, that’s… that’s a good, way forward. I think we can test it out first within our environment, then.

129 00:08:34.000 00:08:38.220 Samuel Roberts: Yeah, we can still try to get it running on their GCP if we want to, but…

130 00:08:38.900 00:08:42.810 Casie Aviles: I, yeah, I was… Yeah, go ahead, go ahead.

131 00:08:43.870 00:08:52.020 Casie Aviles: Yeah, I was trying it out, I think I ran into some access issues, so that’s why I was asking.

132 00:08:52.280 00:08:53.870 Casie Aviles: Tim?

133 00:08:55.640 00:08:56.300 Samuel Roberts: Right.

134 00:08:57.160 00:08:57.910 Casie Aviles: C.

135 00:09:02.200 00:09:05.129 Casie Aviles: And, like, let me just go to the thread.

136 00:09:05.230 00:09:08.990 Casie Aviles: So I can also… Show you guys…

137 00:09:15.990 00:09:26.920 Casie Aviles: Yeah, like, I wanted… yeah, this is the additional access that was showing up on… Google Cloud Run.

138 00:09:27.220 00:09:29.759 Casie Aviles: But I think I just… yeah.

139 00:09:30.680 00:09:32.339 Casie Aviles: Oh, what’s this one.

140 00:09:32.580 00:09:35.489 Casie Aviles: So I think he just needs to give me admin, but…

141 00:09:36.410 00:09:41.440 Casie Aviles: I guess I’ll have to follow up with him if he’s able to do that.

142 00:09:41.640 00:09:49.500 Casie Aviles: Since I think he was… There’s some… reservations.

143 00:09:50.290 00:09:50.820 Samuel Roberts: Yeah.

144 00:09:51.960 00:09:55.769 Casie Aviles: with… Granting broad administrative access.

145 00:09:59.510 00:10:00.200 Samuel Roberts: Huh.

146 00:10:04.740 00:10:07.889 Casie Aviles: Okay, I think, yeah, I’ll try to make it work.

147 00:10:07.990 00:10:11.720 Casie Aviles: Within our environment first, and then…

148 00:10:12.740 00:10:20.670 Casie Aviles: Yeah, if this ends up being, like, taking too long, but I will also follow up on… with Tim.

149 00:10:21.030 00:10:21.790 Samuel Roberts: Okay.

150 00:10:22.130 00:10:31.110 Casie Aviles: Just so we could… because, yeah, I don’t… I kind of don’t want to just deploy it on Heroku or Railway, and then we’re just gonna move again.

151 00:10:31.780 00:10:33.440 Samuel Roberts: Yeah, I agree.

152 00:10:34.900 00:10:42.410 Casie Aviles: So… yeah, that makes sense. Okay, so I think I get… I have a clearer sense of what are the next steps.

153 00:10:43.440 00:10:44.619 Casie Aviles: Okay.

154 00:10:46.360 00:10:49.860 Casie Aviles: Yeah, okay, that’s… that’s what I can do, and then…

155 00:10:52.100 00:11:05.629 Casie Aviles: Okay, yeah, I guess that’s it. I just need to spend some more time digging into it, because I haven’t tried it extensively, so I can do that for this week.

156 00:11:06.250 00:11:06.960 Samuel Roberts: Okay, yeah.

157 00:11:06.960 00:11:08.560 Casie Aviles: have it in your own environment.

158 00:11:09.120 00:11:16.469 Samuel Roberts: Yep, exactly. You can get it up and into our environment, and connect the chat to the GCP instance, I think is the goal.

159 00:11:17.320 00:11:19.189 Samuel Roberts: Our chat, at least.

160 00:11:20.090 00:11:24.520 Casie Aviles: Because I think there should be… Chew.

161 00:11:34.980 00:11:40.430 Casie Aviles: I think we could, point to a GitHub repository, if I’m not mistaken.

162 00:11:42.280 00:11:45.200 Casie Aviles: Yeah, we can connect our repository.

163 00:11:46.120 00:11:47.819 Casie Aviles: We even have these.

164 00:11:49.260 00:11:49.960 Samuel Roberts: Okay, cool.

165 00:11:52.040 00:11:54.510 Casie Aviles: So I think, yeah, I’ll just have to, like.

166 00:11:55.130 00:11:59.779 Casie Aviles: test this out, and I’ll let you guys know if I run into any blockers. I just haven’t…

167 00:12:00.310 00:12:03.889 Casie Aviles: checked this extensively, so I think that’s my next step.

168 00:12:04.340 00:12:05.390 Casie Aviles: for this.

169 00:12:08.660 00:12:12.880 Casie Aviles: Okay. Yeah, because I do think that do,

170 00:12:13.010 00:12:17.720 Casie Aviles: Doing the switch will also help resolve, like, some of the other…

171 00:12:18.020 00:12:25.500 Casie Aviles: errors that we’ve been seeing, like, what Mustafa surfaced, which is, like, the error thing, the Supabase error, so…

172 00:12:25.730 00:12:26.530 Samuel Roberts: Yeah.

173 00:12:26.650 00:12:27.180 Samuel Roberts: I didn’t.

174 00:12:27.180 00:12:29.440 Mustafa Raja: I think that that’s facing from anything, right?

175 00:12:30.100 00:12:30.920 Mustafa Raja: And I think.

176 00:12:30.920 00:12:31.540 Casie Aviles: That’s time.

177 00:12:31.540 00:12:35.479 Mustafa Raja: the project, if we start all, and it can… since that might go away.

178 00:12:36.400 00:12:38.499 Casie Aviles: We have seen that happen.

179 00:12:39.060 00:12:41.980 Mustafa Raja: We have some error… Yeah?

180 00:12:42.400 00:12:45.320 Casie Aviles: Yeah, sorry, I think that that’s just,

181 00:12:45.510 00:12:51.260 Casie Aviles: Something we can do, but I think it doesn’t address, like, the root, because it might happen again now.

182 00:12:51.260 00:12:51.780 Mustafa Raja: Yeah.

183 00:12:51.780 00:12:52.230 Samuel Roberts: Right.

184 00:12:52.230 00:12:53.290 Mustafa Raja: I agree on that.

185 00:12:54.180 00:12:55.760 Casie Aviles: Yeah, so…

186 00:12:56.050 00:13:05.820 Casie Aviles: it’s not happening, as much lately, but, you know, it might happen again, so I think that the point of, you know, being able to move it here is to

187 00:13:06.230 00:13:12.819 Casie Aviles: So we could hopefully, experience that less, or not at all, so…

188 00:13:12.820 00:13:13.390 Mustafa Raja: Yep.

189 00:13:14.210 00:13:19.030 Casie Aviles: Okay, yeah, I think that that will be my steps for this.

190 00:13:19.240 00:13:22.290 Casie Aviles: Let’s see…

191 00:13:22.290 00:13:23.020 Samuel Roberts: Okay.

192 00:13:23.020 00:13:24.890 Casie Aviles: I’m just gonna go back to Instagram.

193 00:13:25.610 00:13:26.420 Samuel Roberts: Yeah.

194 00:13:26.890 00:13:31.450 Samuel Roberts: the… What is the confirmed new central…

195 00:13:31.720 00:13:34.559 Samuel Roberts: Structure with client, that’s about Product Central Doc, right?

196 00:13:36.340 00:13:38.100 Samuel Roberts: Where is this? 24?

197 00:13:40.300 00:13:41.110 Casie Aviles: Yeah.

198 00:13:41.700 00:13:47.579 Samuel Roberts: What is the… Did we… did we make changes? I don’t remember exactly where we ended up with that.

199 00:13:48.940 00:13:51.010 Mustafa Raja: Yeah, we didn’t make any changes.

200 00:13:51.540 00:13:52.180 Samuel Roberts: Okay.

201 00:13:52.820 00:13:57.120 Samuel Roberts: So then I guess we don’t have to worry about that for now, until we need to make more changes at some point.

202 00:14:05.830 00:14:06.790 Samuel Roberts: Okay, cool.

203 00:14:16.820 00:14:21.789 Casie Aviles: Okay, I think, besides that.

204 00:14:22.820 00:14:26.320 Casie Aviles: It’s just going to be the evals mainly, right?

205 00:14:26.540 00:14:35.890 Casie Aviles: So I believe the next step that we need to do there is to actually… Start running evals.

206 00:14:36.850 00:14:40.900 Casie Aviles: But is that dependent on, like, having the…

207 00:14:41.820 00:14:45.380 Casie Aviles: Mastra app deployed, or… not yet?

208 00:14:45.380 00:14:52.459 Samuel Roberts: I mean, somewhat. I think we can probably set it up so that when it deploys, we’ll be able to log those evals.

209 00:14:55.790 00:15:01.729 Samuel Roberts: I don’t know exactly how that works, but my thought is that if we have it, you know, set up and it’s pointing to

210 00:15:02.620 00:15:07.370 Samuel Roberts: wherever we want to log them, I guess, eventually BigQuery, right?

211 00:15:09.420 00:15:11.370 Samuel Roberts: That, if we can get that running.

212 00:15:11.600 00:15:19.570 Samuel Roberts: in the Mastra code, then we can just push that once we get it all, like, once we get it running on GCP and redeploy or something.

213 00:15:21.940 00:15:23.060 Casie Aviles: Hmm, okay.

214 00:15:23.360 00:15:27.430 Samuel Roberts: I think… Yeah, the…

215 00:15:27.560 00:15:33.260 Samuel Roberts: the ongoing evals, we definitely need to log away like we are in Snowflake right now, right?

216 00:15:35.560 00:15:40.879 Casie Aviles: Yeah, we’re in Snowflake at the moment, so I think we also want to move that to their Google.

217 00:15:41.610 00:15:47.519 Samuel Roberts: Right, so I’m saying, I think… I think as long as what we have in Mastra is logging that properly.

218 00:15:47.680 00:15:50.519 Samuel Roberts: Maybe we need to figure that out, too.

219 00:15:50.800 00:15:55.390 Samuel Roberts: then once we deploy it, hopefully we can just have it keep logging to BigQuery.

220 00:15:57.690 00:16:00.650 Casie Aviles: Mmm, okay, yeah, that makes sense.

221 00:16:00.960 00:16:09.130 Samuel Roberts: And I think, yeah, we talked about adding another little, piece of information about what kind of request it was, and we can test against that later.

222 00:16:14.110 00:16:14.560 Casie Aviles: 2 grand.

223 00:16:14.560 00:16:20.730 Samuel Roberts: I don’t know, I don’t know the ins and outs of how Mastra will log that. Mustafa, you’ve run a few evals now, is there…

224 00:16:21.100 00:16:24.859 Samuel Roberts: like, an output from that that we can just log into BigQuery?

225 00:16:26.210 00:16:28.520 Mustafa Raja: Yeah, hmm…

226 00:16:31.860 00:16:43.859 Mustafa Raja: Yeah, so, I think, what we are doing is we are just, even in n8n, we are just, getting the, grabbing the output from, the agent, and just passing it over to

227 00:16:44.100 00:16:48.060 Mustafa Raja: snowflake, right? So, we can just do the same thing here.

228 00:16:48.370 00:16:52.230 Samuel Roberts: Okay, yeah, are there other pieces of information we need to capture there?

229 00:16:52.780 00:17:05.709 Mustafa Raja: Yeah, there is execution time and stuff like that, and, what I suggested earlier in the morning, that we should look into, where the response is from, whether from DB or whether from.

230 00:17:05.710 00:17:06.159 Samuel Roberts: I agree.

231 00:17:06.160 00:17:14.290 Mustafa Raja: Stuff like that. And even more granular, if we can go more granular, into departments, that would be nice.

232 00:17:14.470 00:17:18.800 Mustafa Raja: Yeah, so stuff like that, execution time and all.

233 00:17:19.490 00:17:25.530 Samuel Roberts: Yeah, I would say take a look in the Mastra docs and see if there’s a good way to just, like.

234 00:17:27.400 00:17:36.040 Samuel Roberts: log the evals, and we can have the scores and stuff as well. But also, I’m wondering about, like, storing that. Does it have…

235 00:17:37.690 00:17:40.459 Samuel Roberts: Like, an output that’s gonna be…

236 00:17:40.610 00:17:44.489 Samuel Roberts: You know, a dump of the run, what was used, all that sort of stuff.

237 00:17:48.010 00:17:51.020 Mustafa Raja: Like a… like a batch? A batch process?

238 00:17:51.710 00:17:56.119 Samuel Roberts: No, I’m like, so if… when a request is made to Mastra, and it goes through the workflow.

239 00:17:56.260 00:17:59.549 Samuel Roberts: Like, is there a JSON output of everything that happened?

240 00:17:59.830 00:18:01.790 Mustafa Raja: Oh, oh, okay, okay, okay.

241 00:18:01.790 00:18:08.639 Samuel Roberts: So that way, that way, that’ll help us debug later, the way we were tying things back in the end. I’m wondering if we can just log that right into the database.

242 00:18:09.140 00:18:10.290 Mustafa Raja: Okay, okay.

243 00:18:12.580 00:18:13.290 Mustafa Raja: Yeah, I’ll agree.

244 00:18:13.290 00:18:21.450 Samuel Roberts: Look at that. Okay, cool. Yeah, that would be something good to log in addition to, like, you know, the input, the output, the execution time, maybe some scores.

245 00:18:21.450 00:18:21.880 Casie Aviles: Recap out.

246 00:18:21.880 00:18:23.779 Samuel Roberts: The thumbs up, thumbs down, all that stuff.

247 00:18:24.660 00:18:25.170 Samuel Roberts: Mmm.

248 00:18:25.170 00:18:25.720 Mustafa Raja: Nope.

249 00:18:26.970 00:18:27.979 Casie Aviles: Yeah, I agree.

250 00:18:31.280 00:18:35.229 Samuel Roberts: Okay, yeah, try to take a look at that and see if there’s a way to, you know, at the end.

251 00:18:35.370 00:18:37.839 Samuel Roberts: route that stuff to BigQuery eventually.
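
A rough sketch of the per-request log row discussed above: input, output, execution time, scores, thumbs up or down, the kind of request, and a JSON dump of the run. Every name here (`EvalLogRecord`, `run_and_record`, `to_row`, `fake_agent`, and all field names) is a hypothetical placeholder, not Mastra's or BigQuery's API. The sketch only builds a plain dict; how that dict reaches Snowflake or BigQuery is left open.

```python
import json
import time
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone
from typing import Any, Callable, Optional


@dataclass
class EvalLogRecord:
    """One logged agent request; every field name here is hypothetical."""

    request_text: str
    response_text: str
    execution_ms: float
    logged_at: str  # ISO-8601 UTC timestamp
    request_type: Optional[str] = None  # e.g. answered from the DB or from documents
    feedback: Optional[str] = None  # "up", "down", or None if no thumbs were given
    scores: dict = field(default_factory=dict)  # score name -> value
    trace_json: str = "{}"  # JSON dump of everything the run did


def run_and_record(agent: Callable[[str], dict], request_text: str, **meta: Any) -> EvalLogRecord:
    """Call the agent, time the call, and package the result for logging."""
    start = time.perf_counter()
    result = agent(request_text)
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    return EvalLogRecord(
        request_text=request_text,
        response_text=str(result.get("text", "")),
        execution_ms=elapsed_ms,
        logged_at=datetime.now(timezone.utc).isoformat(),
        trace_json=json.dumps(result.get("trace", {})),
        **meta,
    )


def to_row(record: EvalLogRecord) -> dict:
    """Flatten the record into a plain dict, ready for a warehouse insert."""
    return asdict(record)


def fake_agent(text: str) -> dict:
    """Stand-in for the real agent call, used only to exercise the sketch."""
    return {"text": "echo: " + text, "trace": {"steps": ["route", "answer"]}}


row = to_row(run_and_record(fake_agent, "reset my password", request_type="docs"))
print(json.dumps(row, indent=2))
```

Keeping the run dump as a single JSON string column is one option among several; it keeps the table schema stable while the workflow steps change, at the cost of needing JSON functions to query inside it.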

252 00:18:39.140 00:18:39.700 Mustafa Raja: Yep.

253 00:18:40.620 00:18:41.290 Samuel Roberts: Cool.

254 00:18:41.540 00:18:42.870 Samuel Roberts: What else?

255 00:18:43.310 00:18:45.460 Samuel Roberts: What else? .

256 00:18:49.010 00:18:59.319 Casie Aviles: Okay, so I think we have… so we’re… we’ve covered the deployment for the Mastra app, and then the next, so the evals is… yeah, we also talked about that.

257 00:18:59.320 00:18:59.840 Samuel Roberts: Yep.

258 00:19:00.400 00:19:11.080 Casie Aviles: I think… I think we should also decide what scores to use, or is that…

259 00:19:12.660 00:19:13.470 Samuel Roberts: Yes.

260 00:19:14.140 00:19:14.800 Casie Aviles: Yeah.

261 00:19:17.880 00:19:19.939 Samuel Roberts: Yeah, so we have,

262 00:19:20.940 00:19:26.919 Samuel Roberts: Well, I guess when they’re running live, we won’t have the similarity to do, right, against the golden dataset like we’re doing?

263 00:19:27.530 00:19:32.580 Samuel Roberts: But there’s some other default scores that might be nice to have.

264 00:19:39.710 00:19:43.040 Casie Aviles: Yeah, so, hmm.

265 00:19:45.840 00:19:52.009 Casie Aviles: Where are we there? I think, I think Pranav also worked on that, right?

266 00:19:52.010 00:19:53.499 Samuel Roberts: He was able to…

267 00:19:56.790 00:20:02.209 Casie Aviles: suggest, like, I think I… I believe, like, how we will do our eval, so…

268 00:20:02.700 00:20:05.459 Samuel Roberts: Okay, yeah, I haven’t seen that yet. Can we… can we go over that?

269 00:20:08.420 00:20:09.180 Casie Aviles: Sure.

270 00:20:09.180 00:20:10.910 Pranav: It’s attached to my ticket.

271 00:20:11.810 00:20:12.530 Samuel Roberts: Perfect.

272 00:20:16.550 00:20:18.289 Casie Aviles: This one, right?

273 00:20:18.540 00:20:19.989 Samuel Roberts: Cool. Yeah, thank you.

274 00:20:21.700 00:20:24.390 Pranav: Yeah, Mustafa and I, went over this.

275 00:20:27.750 00:20:34.980 Samuel Roberts: Okay, yeah, LLM as a judge, let’s see… Rosa’s a…

276 00:20:36.570 00:20:39.179 Samuel Roberts: What’s at the top here? What were the first things I missed?

277 00:20:40.530 00:20:42.800 Samuel Roberts: Overall success, department-specific overall success.

278 00:20:45.850 00:20:56.969 Mustafa Raja: Yeah, department-specific wouldn’t work. It’s just not the… the logs do not support it correctly. I think agent is generated… generating them based on some…

279 00:20:57.320 00:20:58.130 Mustafa Raja: Prompt.

280 00:20:58.430 00:21:02.760 Mustafa Raja: So the departments we have in the logs aren’t the correct ones.

281 00:21:03.700 00:21:05.399 Samuel Roberts: Gotcha. Interesting. Okay.

282 00:21:07.240 00:21:25.420 Mustafa Raja: I need to check for the role. I didn’t check for the role, if those are… those are good. Casie, would you know if the role, that we have in Snowflake for the person testing, or for the person hitting the endpoint, if that’s the correct role?

283 00:21:25.590 00:21:27.509 Mustafa Raja: How are we getting the role?

284 00:21:29.880 00:21:35.070 Casie Aviles: Role, you mean… what do you mean, role? Like, if they’re from lawn department?

285 00:21:35.070 00:21:35.500 Mustafa Raja: Yeah, so…

286 00:21:35.500 00:21:36.520 Casie Aviles: or CSR…

287 00:21:36.680 00:21:41.920 Mustafa Raja: No, yeah, CSR, yeah. CSR and stuff like that. Let me open up the…

288 00:21:46.770 00:21:50.940 Mustafa Raja: Let me see… So…

289 00:21:54.610 00:21:57.749 Casie Aviles: Okay, hold on, let me get into Snowflake.

290 00:21:58.100 00:22:03.160 Mustafa Raja: Yeah, so we have roles like, CSR router, welcome admin.

291 00:22:03.360 00:22:08.749 Mustafa Raja: support manager, and these are in, Snowflake.

292 00:22:09.430 00:22:13.620 Mustafa Raja: I’m wondering if these are correct, and how are we getting these?

293 00:22:14.370 00:22:21.930 Casie Aviles: Oh, this… we got this from… well, we set this up for… primarily for the dashboard, and…

294 00:22:23.300 00:22:27.630 Casie Aviles: Let me recall how we were doing. Let me check.

295 00:22:34.210 00:22:40.920 Casie Aviles: Because I think it’s, A, B, C… Employees, I think, Laura.

296 00:22:42.850 00:22:44.359 Mustafa Raja: Oh, so we have that.

297 00:22:45.530 00:22:46.850 Samuel Roberts: Oh, interesting, okay.

298 00:22:55.140 00:22:58.089 Casie Aviles: Yeah, we tagged them here.

299 00:22:58.090 00:22:58.500 Samuel Roberts: Got it.

300 00:22:58.500 00:23:04.099 Casie Aviles: So if… if the name matches… You know.

301 00:23:04.660 00:23:08.390 Mustafa Raja: But I believe for the department, we do have some overlap now, right?

302 00:23:10.230 00:23:15.070 Casie Aviles: Yeah, this is… this is primarily for the dashboard. It’s… it’s… it shouldn’t affect the…

303 00:23:15.070 00:23:16.340 Mustafa Raja: Okay.

304 00:23:16.340 00:23:24.159 Casie Aviles: Andy, you know, how it’s logging Andy, but yeah, I get that now, since we have, like.

305 00:23:24.900 00:23:28.310 Casie Aviles: separate… ways of tagging.

306 00:23:28.480 00:23:35.120 Casie Aviles: It’s a separate one for… The actual, exchanges, the chat.

307 00:23:35.450 00:23:39.460 Casie Aviles: And then it’s a separate one for… the dashboard.

308 00:23:42.120 00:23:42.840 Mustafa Raja: Hmm.

309 00:23:44.870 00:23:45.710 Samuel Roberts: I’m sorry, I’m not sure.

310 00:23:45.710 00:23:51.889 Mustafa Raja: I guess the roles would be… roles would be the correct ones, the departments, I think, we are better off.

311 00:23:52.460 00:23:58.659 Mustafa Raja: Just taking a look at the output and what, which department did we refer to.

312 00:24:01.450 00:24:03.490 Casie Aviles: Oh, wait, sorry.

313 00:24:05.190 00:24:06.749 Samuel Roberts: Oh, weird. I mean, I do have…

314 00:24:07.290 00:24:12.240 Samuel Roberts: Once we know the person’s name or whatever identifying feature we have, We can match.

315 00:24:12.240 00:24:12.650 Casie Aviles: So.

316 00:24:12.650 00:24:13.540 Samuel Roberts: right?

317 00:24:14.290 00:24:18.900 Casie Aviles: Yeah, we’re… I believe it’s, like, a SQL query going on.

318 00:24:19.850 00:24:23.669 Casie Aviles: At the back, which lets us basically just see…

319 00:24:25.660 00:24:26.040 Mustafa Raja: Is it…

320 00:24:26.040 00:24:28.949 Casie Aviles: the, the role… Department in the dashboard.
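
A small sketch of the name-matching idea described here: log rows carry only the CSR's name, and role and department are looked up from a separately tagged employee list. In the meeting this is a SQL query behind the dashboard; the Python below only illustrates the join logic. The roster contents, the field names, and the helpers `normalize`, `build_index`, and `tag_row` are all hypothetical.

```python
from typing import Optional

# Hypothetical roster rows; the real list is the tagged employee table.
ROSTER = [
    {"name": "Alex Doe", "role": "CSR", "department": "Lawn"},
    {"name": "Sam Lee", "role": "Support Manager", "department": "Support"},
]


def normalize(name: str) -> str:
    """Lowercase and collapse whitespace so minor formatting differences still match."""
    return " ".join(name.lower().split())


def build_index(roster: list) -> dict:
    """Map each normalized employee name to its roster entry."""
    return {normalize(person["name"]): person for person in roster}


def tag_row(row: dict, index: dict) -> dict:
    """Return a copy of the log row with role and department added (None when unmatched)."""
    person: Optional[dict] = index.get(normalize(row.get("csr_name") or ""))
    return {
        **row,
        "role": person["role"] if person else None,
        "department": person["department"] if person else None,
    }


index = build_index(ROSTER)
tagged = [
    tag_row(r, index)
    for r in [
        {"csr_name": "  alex   DOE ", "request_text": "..."},
        {"csr_name": "Unknown Person", "request_text": "..."},
    ]
]
print(tagged)
```

Matching on names is fragile when two people share a name or a name is misspelled; a stable ID or email address, if one is available at log time, would make a more reliable join key.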

321 00:24:30.350 00:24:35.980 Mustafa Raja: Is a query running within, within, Snowflake, or is it, in n8n or somewhere?

322 00:24:35.980 00:24:40.230 Casie Aviles: It’s for real, it’s for real. It’s why I, I’m…

323 00:24:40.470 00:24:43.289 Casie Aviles: Mentioning that it’s separate, you know, like…

324 00:24:43.290 00:24:49.259 Samuel Roberts: I don’t think it’s important for us to worry about logging from Mastra, we just want to make sure we log the CSR name.

325 00:24:49.900 00:24:53.230 Samuel Roberts: Or, you know, ID or whatever, however we’re identifying them.

326 00:24:54.970 00:24:57.080 Casie Aviles: Yeah, so this is what we have at the moment.

327 00:24:57.360 00:25:01.070 Casie Aviles: we have the… I think we can also improve our logging, since…

328 00:25:01.070 00:25:02.480 Samuel Roberts: Definitely, yeah.

329 00:25:02.480 00:25:08.870 Casie Aviles: record ID contains the timestamp, and we don’t have, like, a dedicated timestamp.

330 00:25:09.560 00:25:11.009 Mustafa Raja: Yep. Column.

331 00:25:11.010 00:25:17.610 Casie Aviles: So that’s one thing we… moving forward, I think we can… Work on… What else?

332 00:25:19.290 00:25:26.180 Casie Aviles: So, execution time, that’s good. I think we can also have, probably, email addresses, if we can get.

333 00:25:26.180 00:25:26.510 Samuel Roberts: Yeah.

334 00:25:26.510 00:25:28.209 Casie Aviles: I believe we can, but…

335 00:25:28.590 00:25:30.409 Samuel Roberts: Yeah, I’m sure there’s something there.

336 00:25:31.050 00:25:37.169 Casie Aviles: Yeah, and then the quality score, this one is honestly… just using an LLM.

337 00:25:37.670 00:25:39.579 Casie Aviles: another LLM call.

338 00:25:39.810 00:25:43.779 Casie Aviles: And it’s not the most… accurate, right? Because the main…

339 00:25:44.920 00:25:46.970 Casie Aviles: Problem that we had was…

340 00:25:47.420 00:25:50.010 Casie Aviles: These aren’t very accurate, and they don’t reflect.

341 00:25:50.500 00:25:51.949 Samuel Roberts: the.

342 00:25:52.730 00:25:56.440 Casie Aviles: The, you know, the accuracy of the responses, so…

343 00:25:56.770 00:25:57.630 Samuel Roberts: Right.

344 00:25:59.380 00:26:02.689 Samuel Roberts: Okay. So yeah, I mean, we might need to either…

345 00:26:02.800 00:26:09.010 Samuel Roberts: Well, we can get rid of that one, I’m sure. The question is, what do we… substitute for…

346 00:26:10.580 00:26:11.580 Mustafa Raja: Quality score.

347 00:26:11.950 00:26:15.089 Samuel Roberts: Yeah, like, we can do another LLM as a judge, but…

348 00:26:17.180 00:26:24.769 Mustafa Raja: I think, I think earlier you suggested having a… having a… Live eval, you know?

349 00:26:25.230 00:26:27.490 Mustafa Raja: This could be replaced with that, no?

350 00:26:28.510 00:26:29.490 Samuel Roberts: Oh, yeah.

351 00:26:32.890 00:26:38.869 Mustafa Raja: But I guess we would want to talk about how we want to set up… set that up, since we wouldn’t have any ground truth.

352 00:26:39.380 00:26:41.219 Mustafa Raja: What would be the metrics?

353 00:26:41.370 00:26:42.380 Mustafa Raja: For judgment.

354 00:26:44.190 00:26:44.690 Samuel Roberts: Hmm.

355 00:26:44.690 00:26:49.320 Mustafa Raja: I think, for the database,

356 00:26:50.370 00:26:55.029 Mustafa Raja: We could just judge the query against the request, if it’s good.

357 00:26:57.280 00:27:01.510 Samuel Roberts: I’m not sure I followed. What do you… Judge it against the…

358 00:27:01.650 00:27:04.040 Samuel Roberts: Like, did it return a value, or…

359 00:27:04.690 00:27:11.459 Mustafa Raja: Yeah, I guess we could judge the query if it reflects the request.

360 00:27:11.880 00:27:13.390 Mustafa Raja: Or not, you know?

361 00:27:13.740 00:27:18.060 Samuel Roberts: Yes, there’s a couple things in Mastra, or at least one that does, like, relevancy, I think.

362 00:27:19.310 00:27:20.150 Mustafa Raja: Hmm…

363 00:27:22.810 00:27:25.680 Samuel Roberts: Okay, so I think… I think what we should do here is…

364 00:27:27.910 00:27:35.359 Samuel Roberts: Let’s look at the scorers that Mastra comes with, and figure out which ones are valuable for us here.

365 00:27:36.600 00:27:40.839 Samuel Roberts: And make a list of those ones, and then maybe we try…

366 00:27:41.670 00:27:49.850 Samuel Roberts: something like an LLM as a judge, where we can also… Maybe give it… No, I don’t know.

367 00:27:50.070 00:27:54.260 Samuel Roberts: I don’t know how best to do that against, like, changing data in the database and everything.

368 00:27:59.140 00:28:06.160 Casie Aviles: Yeah, that’s… that’s also… challenging, because, the source of truth will be changing, right? Yeah.

369 00:28:06.160 00:28:07.250 Samuel Roberts: Right.

370 00:28:10.240 00:28:14.379 Samuel Roberts: And maybe we just judge on, like, relevancy and,

371 00:28:15.010 00:28:21.519 Samuel Roberts: Yeah, we measure how long it took, and then we have to use the thumbs up, thumbs down to really dig into deeper ones, maybe?

372 00:28:23.980 00:28:27.279 Samuel Roberts: So that might be something where we have the evals running.

373 00:28:27.460 00:28:32.570 Samuel Roberts: Every time there’s a query, but then we do… if there’s a thumbs down, maybe something triggers to look into why.

374 00:28:33.780 00:28:36.950 Mustafa Raja: Hmm, yeah, I think this is what Trutham also suggested.

375 00:28:37.170 00:28:38.760 Samuel Roberts: I think so too, yeah.

376 00:28:39.080 00:28:40.780 Samuel Roberts: I think we’re getting to that point now.

377 00:28:43.850 00:28:48.290 Samuel Roberts: So I would say, for now, let’s focus on…

378 00:28:50.920 00:28:55.800 Samuel Roberts: Yeah, I’m just trying to think, is there any other way to do this besides that?

379 00:28:57.070 00:29:00.799 Samuel Roberts: Which I… I don’t know if it’s the data, I guess, a little bit changing.

380 00:29:01.570 00:29:02.800 Samuel Roberts: Yeah, here we go, yeah.

381 00:29:03.280 00:29:05.400 Samuel Roberts: Relevancy, I think, is a good one.

382 00:29:11.190 00:29:12.040 Samuel Roberts: I don’t know about hallucination.

383 00:29:12.040 00:29:17.480 Mustafa Raja: Relevancy for, this would be for, database, right?

384 00:29:18.590 00:29:20.970 Samuel Roberts: Just in general, I mean, I think that’s, like…

385 00:29:21.360 00:29:25.240 Samuel Roberts: Relevancy, like, if they ask a question and it comes back with something totally, you know.

386 00:29:25.940 00:29:27.380 Mustafa Raja: Not helpful.

387 00:29:30.800 00:29:34.620 Casie Aviles: I mean, you know, we could start in general and look at

388 00:29:34.780 00:29:37.579 Casie Aviles: Even just the central doc for now, like…

389 00:29:38.270 00:29:38.950 Samuel Roberts: Yeah.

390 00:29:39.590 00:29:49.439 Casie Aviles: We can… Yeah, we can have, like, a set of questions that… that we want to test.

391 00:29:49.630 00:29:52.970 Casie Aviles: Right, and then the ideal answers for them.

392 00:29:53.490 00:29:57.330 Casie Aviles: And then score based on that. So, I think that’s what we did in the past.

393 00:29:57.890 00:29:59.670 Mustafa Raja: Yeah, I think we have this one now.

394 00:30:06.060 00:30:15.680 Casie Aviles: Okay… So… We just need to build, like, another dataset then, in that case, right? Or…

395 00:30:16.670 00:30:19.930 Casie Aviles: Since we have, like, a… An older one.

396 00:30:20.910 00:30:21.840 Samuel Roberts: Right.

397 00:30:23.860 00:30:27.220 Casie Aviles: But I feel like it’s… it’s quite outdated now.

398 00:30:27.900 00:30:28.820 Samuel Roberts: Yeah.

399 00:30:30.990 00:30:32.880 Samuel Roberts: I mean, we have a somewhat…

400 00:30:33.200 00:30:37.810 Samuel Roberts: newer data set. What month is those ones from, Mustafa, that we’ve been testing against?

401 00:30:38.080 00:30:39.880 Mustafa Raja: Previous three.

402 00:30:39.880 00:30:41.020 Samuel Roberts: 3 months.

403 00:30:47.350 00:30:50.179 Samuel Roberts: Yeah, I think the… well, I don’t know.

404 00:30:53.040 00:31:00.090 Samuel Roberts: I think the end goal here is to get it… Running, logging.

405 00:31:01.240 00:31:05.719 Samuel Roberts: And then we can do some tests of it and see how those logs look.

406 00:31:05.970 00:31:13.430 Samuel Roberts: against what we… you know, because I don’t know how good these scorers are, even, in Mastra, you know, like, we’re kind of assuming they’re decent.

407 00:31:14.280 00:31:14.940 Casie Aviles: Yeah.

408 00:31:15.550 00:31:17.680 Samuel Roberts: So we probably want to test that a little bit, too.

409 00:31:20.310 00:31:22.819 Samuel Roberts: Against a golden data set, maybe.

410 00:31:24.140 00:31:24.980 Samuel Roberts: Which we…

411 00:31:24.980 00:31:27.439 Mustafa Raja: The built-in ones don’t really work with Workflow.

412 00:31:28.530 00:31:29.540 Samuel Roberts: Oh, really?

413 00:31:29.880 00:31:30.590 Mustafa Raja: Yeah.

414 00:31:31.380 00:31:31.760 Samuel Roberts: Oh, no.

415 00:31:31.760 00:31:41.109 Mustafa Raja: They expect the output to be in a specific structure in, that the agent outputs in.

416 00:31:43.250 00:31:46.579 Mustafa Raja: That’s not true. I guess we could match that with the workflow.

417 00:31:47.710 00:31:51.220 Mustafa Raja: But then we’ll… we limit ourselves to the…

418 00:31:51.810 00:31:55.129 Samuel Roberts: Yeah, I don’t think we want to do that. Okay, we may have to dig into that.

419 00:31:55.130 00:31:55.450 Mustafa Raja: So…

420 00:31:55.450 00:31:56.330 Samuel Roberts: bit more, then.

421 00:31:57.090 00:31:57.610 Mustafa Raja: Yep.

422 00:31:59.190 00:32:02.160 Mustafa Raja: I just built my own LLM as a judge.

423 00:32:02.410 00:32:02.930 Mustafa Raja: From…

424 00:32:02.930 00:32:03.670 Samuel Roberts: Right.

425 00:32:03.670 00:32:04.600 Mustafa Raja: and a thumbs up.

426 00:32:05.260 00:32:10.909 Samuel Roberts: Sure. But for the other ones, I thought… I was hoping we could just drop a few of those in, otherwise I don’t want to have to rebuild all those.

427 00:32:12.670 00:32:19.280 Mustafa Raja: Oh, yeah, definitely. I think we would have the source code and the… what’s it called? Prompts right in the…

428 00:32:19.880 00:32:21.170 Mustafa Raja: node_modules.

429 00:32:21.960 00:32:22.949 Samuel Roberts: That’s true.

430 00:32:25.610 00:32:28.569 Samuel Roberts: Yes, we might just be able to pull the prompts that they’re using.

431 00:32:29.010 00:32:29.610 Mustafa Raja: Yep.

432 00:32:30.200 00:32:31.529 Samuel Roberts: Okay, that’s not a bad idea.

433 00:32:33.900 00:32:40.450 Samuel Roberts: Okay, so I think… the plan of attack here is try to get it running on GCP,

434 00:32:40.780 00:32:47.840 Samuel Roberts: While that’s happening, also get, the logging working?

435 00:32:50.650 00:32:54.460 Samuel Roberts: And then some of these scorers working with that, right?

436 00:32:56.340 00:32:59.330 Mustafa Raja: Some of these scorers working, doing it live.

437 00:32:59.620 00:33:00.529 Mustafa Raja: Live scoring?

438 00:33:00.530 00:33:06.219 Samuel Roberts: Yeah, I mean, like, so that it’ll store in the database in Snowflake or BigQuery.

439 00:33:06.720 00:33:12.819 Mustafa Raja: Okay, so we, yeah, so we want to replace the quality score in the dataset, right?

440 00:33:13.600 00:33:13.990 Casie Aviles: Yeah.

441 00:33:13.990 00:33:18.240 Samuel Roberts: Yeah, with at least something else, if not multiple things, maybe even, so…

442 00:33:18.890 00:33:19.550 Mustafa Raja: Okay.

443 00:33:20.090 00:33:21.779 Mustafa Raja: Yeah, I just wanted to confirm that.

444 00:33:22.090 00:33:23.999 Samuel Roberts: Yeah, I think that makes the most sense.

445 00:33:25.350 00:33:26.360 Samuel Roberts: And…

446 00:33:26.360 00:33:30.469 Mustafa Raja: And this data set, are we going to run it on some cadence?

447 00:33:34.500 00:33:39.729 Samuel Roberts: Okay, yeah, so then there’s also that side of it. Yeah, so I was thinking just, like, live, as requests come in.

448 00:33:39.900 00:33:46.839 Samuel Roberts: you know, the results get scored, and we keep that, and we can compare that to the thumbs up, thumbs down over time. I think…

449 00:33:46.840 00:33:48.230 Mustafa Raja: Oh,

450 00:33:48.230 00:33:56.379 Samuel Roberts: What we also want to do is run something on a cadence where we pull the data and compare it, but I don’t know how best to set that up yet.

451 00:33:58.250 00:33:58.940 Mustafa Raja: Yeah.

452 00:33:59.110 00:34:11.460 Mustafa Raja: I guess for the thumbs down, it was really helpful to see how much we have, you know, how much thumbs down requests have we been covering, you know? Right, right. Based on the detailed feedback.

453 00:34:14.870 00:34:15.550 Samuel Roberts: Okay, so maybe that…

454 00:34:15.550 00:34:17.160 Mustafa Raja: I have that metric weekly.

455 00:34:17.770 00:34:24.849 Samuel Roberts: Yeah, exactly. Maybe weekly we run something and see how the thumbs-downs look, and have our scores match them, and if we can understand that over time.

456 00:34:25.360 00:34:29.270 Mustafa Raja: Might be… might be a good thing to send them as weekly update.

457 00:34:29.409 00:34:33.750 Mustafa Raja: We got this many thumbs down, we resolved this many thumbs down.
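
A minimal sketch of the weekly thumbs-down numbers described above ("we got this many, we resolved this many"). The `FeedbackRow` shape and its field names are assumptions made for illustration, not the team's actual schema.

```typescript
// Hypothetical feedback row; field names are illustrative only.
interface FeedbackRow {
  requestId: string;
  thumbs: "up" | "down";
  resolved: boolean; // true once a thumbs-down has been triaged and fixed
}

interface ThumbsDownSummary {
  received: number; // thumbs-down rows in the period
  resolved: number; // thumbs-down rows already resolved
  open: number;     // thumbs-down rows still waiting on triage
}

// Count thumbs-down rows for one reporting period (e.g. one week of rows).
function summarizeThumbsDown(rows: FeedbackRow[]): ThumbsDownSummary {
  let received = 0;
  let resolved = 0;
  for (let i = 0; i < rows.length; i++) {
    if (rows[i].thumbs !== "down") continue;
    received++;
    if (rows[i].resolved) resolved++;
  }
  return { received: received, resolved: resolved, open: received - resolved };
}
```

Filtering the rows to a single week before calling `summarizeThumbsDown` would give the figures for one weekly update.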

458 00:34:35.159 00:34:37.079 Samuel Roberts: Mmm, yes, I see that.

459 00:34:40.529 00:34:41.469 Samuel Roberts: Okay.

460 00:34:46.620 00:34:50.980 Mustafa Raja: I don’t… I just don’t know if, weekly is a little too ambitious.

461 00:34:51.429 00:34:54.599 Samuel Roberts: It might be, but we can… we can start, at least, and see how it goes.

462 00:34:56.040 00:35:00.740 Mustafa Raja: Yeah, because… We’ll have to focus on triage then.

463 00:35:01.970 00:35:03.230 Samuel Roberts: Yeah, true.

464 00:35:08.780 00:35:12.420 Samuel Roberts: Well, I mean, I think the important thing is we can get this into…

465 00:35:13.340 00:35:20.990 Samuel Roberts: GCP, we can test it, we can get it out to some CSRs to test it, even, and see how it compares in real world.

466 00:35:22.890 00:35:23.470 Mustafa Raja: Yep.

467 00:35:23.470 00:35:25.649 Samuel Roberts: While we’re figuring out how to log the data.

468 00:35:26.170 00:35:27.720 Samuel Roberts: And what data to log.

469 00:35:29.690 00:35:37.790 Mustafa Raja: Yeah. Casie, did we need a Cloud Run admin permission on our… Okay, login.

470 00:35:38.780 00:35:41.579 Casie Aviles: Yeah, that was what I was, asking Tim for.

471 00:35:42.030 00:35:42.410 Samuel Roberts: Yeah.

472 00:35:42.410 00:35:42.760 Mustafa Raja: Oh, yeah.

473 00:35:42.760 00:35:43.199 Samuel Roberts: I wonder if they’.

474 00:35:43.200 00:35:44.240 Mustafa Raja: We have that now.

475 00:35:45.900 00:35:47.700 Casie Aviles: Oh, okay, nice.

476 00:35:47.700 00:35:56.809 Mustafa Raja: Yeah, I was in the GCP, I checked the role, and I see that Brainforge.co and Twitter now has the role, Cloud Run Admin.

477 00:35:57.830 00:36:03.389 Casie Aviles: Okay, cool. So, I should be able to test within here then, then. Yeah. Okay. Great.

478 00:36:03.570 00:36:04.490 Mustafa Raja: That’s nice.

479 00:36:05.950 00:36:06.590 Casie Aviles: Hmm…

480 00:36:06.590 00:36:07.170 Samuel Roberts: Great.

481 00:36:08.870 00:36:11.080 Casie Aviles: Yeah, okay, let me just double check.

482 00:36:13.870 00:36:15.329 Casie Aviles: But, okay, I think it’s…

483 00:36:15.500 00:36:22.129 Casie Aviles: Yeah, it’s clear what we’ll be doing, so for evals now, we just have to…

484 00:36:23.410 00:36:28.720 Casie Aviles: get the logging working as well, right? And then we’ll test out, like.

485 00:36:29.440 00:36:33.949 Casie Aviles: a bunch of the different scores… scorers, I mean, that were built in.

486 00:36:35.730 00:36:36.500 Samuel Roberts: Yeah.

487 00:36:36.770 00:36:42.279 Casie Aviles: But I think we can also use these, like, we can also check latency, right?

488 00:36:42.280 00:36:42.810 Samuel Roberts: Yes.

489 00:36:42.810 00:36:50.089 Casie Aviles: Also, be able to… I guess it’s just about, like, which scorers would fit these best, I guess, or…

490 00:36:50.090 00:36:57.640 Samuel Roberts: Yeah, well, I think error rate, yeah, if there’s an error, we’ll… we’ll need to log that. We’ll have to log the execution time.

491 00:36:57.760 00:37:00.499 Samuel Roberts: We’ll have to log… yeah, so in order to see those over time…

492 00:37:01.470 00:37:05.049 Samuel Roberts: which I think would, you know, and then execute… we also probably want to log…

493 00:37:05.420 00:37:09.080 Samuel Roberts: Yeah, a better ID with the timestamp and everything.
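
A minimal sketch of a log row covering the fields discussed above: a dedicated timestamp instead of one embedded in the record ID, the CSR identifier, execution time, any error, and scorer results. Every name here (`EvalLogRecord`, `buildLogRecord`, the field names) is an assumption for illustration; the real table layout in Snowflake or BigQuery is not specified in the meeting.

```typescript
// Hypothetical log row; field names are illustrative only.
interface EvalLogRecord {
  recordId: string;
  loggedAt: string;            // dedicated ISO-8601 timestamp column
  csrId: string;               // however the CSR ends up being identified
  executionMs: number;         // wall-clock time for the request
  error: string | null;        // null when the request succeeded
  scores: { [scorer: string]: number }; // e.g. relevancy
}

// Assemble one row from the request's start and finish times.
function buildLogRecord(
  recordId: string,
  csrId: string,
  startedAt: Date,
  finishedAt: Date,
  scores: { [scorer: string]: number },
  error: string | null = null
): EvalLogRecord {
  return {
    recordId: recordId,
    loggedAt: finishedAt.toISOString(),
    csrId: csrId,
    executionMs: finishedAt.getTime() - startedAt.getTime(),
    error: error,
    scores: scores,
  };
}
```

Keeping `loggedAt` and `executionMs` as their own columns is what would let latency and error rate be charted over time without parsing the record ID.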

494 00:37:11.950 00:37:12.790 Casie Aviles: Okay, cool.

495 00:37:19.230 00:37:19.770 Samuel Roberts: What up?

496 00:37:19.770 00:37:23.550 Casie Aviles: Okay, you’re… yeah, we’re good here. We’re not… we’re not blocked here anymore.

497 00:37:23.550 00:37:24.590 Samuel Roberts: Cool. Okay, great.

498 00:37:24.590 00:37:26.830 Casie Aviles: unless I find something else, but.

499 00:37:26.830 00:37:30.920 Samuel Roberts: Yeah, who knows? There’s always something in Google’s hustle.

500 00:37:31.750 00:37:32.950 Casie Aviles: Yeah, thank you.

501 00:37:33.630 00:37:34.270 Samuel Roberts: Okay.

502 00:37:34.840 00:37:35.640 Samuel Roberts: Great.

503 00:37:35.990 00:37:37.059 Samuel Roberts: What else?

504 00:37:40.070 00:37:40.660 Casie Aviles: Right.

505 00:37:49.880 00:37:53.720 Mustafa Raja: Yeah, let’s mark 31 as done also, right?

506 00:37:54.640 00:37:57.760 Samuel Roberts: Yep, 31’s done, we’re working on 32…

507 00:37:58.290 00:38:00.599 Samuel Roberts: So we’ll bump that down a little bit, I guess.

508 00:38:02.780 00:38:04.620 Casie Aviles: Oh, you mean, like, later in the week?

509 00:38:04.940 00:38:06.539 Samuel Roberts: Just bump into this week, yeah.

510 00:38:08.820 00:38:10.699 Samuel Roberts: I think it’s fine there, and then…

511 00:38:15.350 00:38:17.949 Samuel Roberts: Yeah, same with Implement Mastra, that’s fine too.

512 00:38:19.990 00:38:20.890 Samuel Roberts: Cool.

513 00:38:21.660 00:38:24.220 Casie Aviles: So this should be moved as well, I think.

514 00:38:25.810 00:38:28.120 Samuel Roberts: Yeah, because it allows me testing against that, yep.

515 00:38:30.970 00:38:32.239 Casie Aviles: Build this one.

516 00:38:33.930 00:38:35.510 Casie Aviles: Let’s show what these are.

517 00:38:36.280 00:38:41.290 Mustafa Raja: So… For this one, actually, for the load testing one.

518 00:38:42.110 00:38:47.110 Mustafa Raja: that, when I run the evals, I get, I get, what’s it called?

519 00:38:47.320 00:38:57.410 Mustafa Raja: Rate-limited a lot, but this also happened when I was also doing the Mastra versus n8n one.

520 00:38:57.630 00:38:58.200 Mustafa Raja: So…

521 00:38:58.200 00:38:58.860 Samuel Roberts: Hmm.

522 00:38:59.550 00:39:09.609 Mustafa Raja: And we have seen, with the n8n instance, that sometimes CSRs also get rate limited. Do we want to do something about that?

523 00:39:10.020 00:39:14.060 Samuel Roberts: Yeah, are we able to… so, do we know what the rate that we’re hitting is?

524 00:39:15.930 00:39:20.219 Mustafa Raja: Like, like, what’s the default rate limit allowed to us?

525 00:39:20.710 00:39:23.559 Samuel Roberts: Yeah, is there… one, is there a way to turn that up, and then…

526 00:39:23.790 00:39:26.890 Samuel Roberts: Two, is there a way to do, like, a retry kind of thing?

527 00:39:27.390 00:39:28.190 Mustafa Raja: Okay.

528 00:39:28.440 00:39:29.470 Mustafa Raja: Yeah, I mean…

529 00:39:29.650 00:39:31.699 Casie Aviles: Azure model that we have.

530 00:39:31.990 00:39:32.920 Samuel Roberts: Yeah. Yep.

531 00:39:34.010 00:39:34.820 Casie Aviles: Okay.

532 00:39:39.020 00:39:45.520 Casie Aviles: I know that I implemented some retries already, and also I’ve set the

533 00:39:46.430 00:39:50.570 Casie Aviles: I’ve set the limit to the max, I believe, in…

534 00:39:50.570 00:39:56.799 Samuel Roberts: Okay. Okay, well that’s… okay. So if that’s good, then yeah, we have to make sure we do retries in Mastra, or make sure Mastra’s handling that.

535 00:39:57.960 00:39:58.830 Samuel Roberts: Gracefully.

536 00:39:59.630 00:40:00.120 Casie Aviles: Yeah.

537 00:40:00.120 00:40:01.250 Mustafa Raja: I mean…

538 00:40:01.350 00:40:02.360 Casie Aviles: Here.

539 00:40:02.360 00:40:03.690 Samuel Roberts: Next, we trust 3, okay, yeah.

540 00:40:03.690 00:40:07.020 Mustafa Raja: Is this our instance, or their instance?

541 00:40:09.680 00:40:12.250 Casie Aviles: You mean if this is their…

542 00:40:12.580 00:40:14.990 Mustafa Raja: Yeah, the Azure, yeah, yeah.

543 00:40:14.990 00:40:17.570 Casie Aviles: Yeah, this is ours, you know, this is Brainforge.

544 00:40:20.110 00:40:20.810 Samuel Roberts: Okay.

545 00:40:22.480 00:40:27.029 Casie Aviles: That’s something they haven’t really spun up yet, the models.

546 00:40:27.470 00:40:28.220 Mustafa Raja: Got it. Okay.

547 00:40:28.220 00:40:28.910 Samuel Roberts: Okay.

548 00:40:29.370 00:40:31.219 Samuel Roberts: We may have to sort that out anyway.

549 00:40:33.450 00:40:35.010 Mustafa Raja: Yeah, I was wondering the other day.

550 00:40:37.680 00:40:40.290 Samuel Roberts: Yeah, I hadn’t thought about that. Okay.

551 00:40:45.310 00:40:50.610 Casie Aviles: Yeah, I was… yeah, it’s fine. I was just trying to show that I… that it’s maxed out already.

552 00:40:50.610 00:40:54.560 Samuel Roberts: Okay, yeah, if it’s maxed out, that’s fine, let’s.

553 00:40:54.580 00:40:56.329 Mustafa Raja: I think, I think with the…

554 00:40:56.790 00:41:02.309 Samuel Roberts: With the evals now, one, we are hitting a lot of requests.

555 00:41:04.260 00:41:09.479 Mustafa Raja: One for the routing agent, then the routing agent goes to the… what’s it called?

556 00:41:10.630 00:41:20.280 Mustafa Raja: Either of the two, agents, and then they make requests, and this all is happening

557 00:41:20.700 00:41:22.290 Mustafa Raja: a lot quicker.

558 00:41:22.640 00:41:29.949 Mustafa Raja: In a very short time, so… so… I think we are making a lot of requests, that’s why it’s happening.

559 00:41:30.130 00:41:32.999 Samuel Roberts: Yeah, a bunch of them at once, okay.

560 00:41:34.070 00:41:38.540 Samuel Roberts: I’m trying to think…

561 00:41:40.190 00:41:45.780 Samuel Roberts: if we need to, like, what is the plan? I don’t really know. Are they expecting us to be…

562 00:41:46.080 00:41:47.870 Samuel Roberts: Hosting these models still.

563 00:41:48.160 00:41:49.200 Samuel Roberts: Goodbye.

564 00:41:49.820 00:41:51.100 Samuel Roberts: We talked about…

565 00:41:52.090 00:41:53.210 Casie Aviles: Yeah, it’s okay.

566 00:41:53.210 00:41:54.230 Mustafa Raja: Talked about this.

567 00:41:54.630 00:41:59.770 Casie Aviles: Yeah, I think I mentioned to them… Whether it should be…

568 00:42:00.040 00:42:04.179 Casie Aviles: Something that they, they would, you know, host themselves.

569 00:42:05.200 00:42:09.360 Casie Aviles: I think Tim was saying that, yeah, that’s something they could do.

570 00:42:09.890 00:42:12.810 Casie Aviles: But we didn’t, like, dive into, like, specifics on…

571 00:42:13.300 00:42:14.390 Samuel Roberts: Okay, well, let’s fine-tune.

572 00:42:14.390 00:42:18.399 Casie Aviles: Yeah, exactly, that’s why I want to make sure that that’s all.

573 00:42:19.380 00:42:22.700 Samuel Roberts: Okay, maybe we need to put together something with that for them.

574 00:42:23.220 00:42:24.150 Casie Aviles: Hmm.

575 00:42:24.800 00:42:27.519 Samuel Roberts: Eventually, but let’s get this working, at least, first.

576 00:42:28.850 00:42:31.050 Casie Aviles: Yeah, but they’re open to it, I believe.

577 00:42:31.050 00:42:32.759 Samuel Roberts: Okay, good, that’s good to know.

578 00:42:33.620 00:42:34.470 Samuel Roberts: Okay.

579 00:42:36.040 00:42:38.010 Samuel Roberts: Yeah, I would say…

580 00:42:38.170 00:42:43.849 Samuel Roberts: If we’re getting timeouts in Mastra, let’s make sure to check the retries or something, and see what that’s set to.

581 00:42:44.210 00:42:49.229 Samuel Roberts: Otherwise, let’s see if it happens in real-world scenarios or not, because the evals are different.

582 00:42:50.550 00:42:54.070 Mustafa Raja: Yeah, yeah, yeah, evals are just… Requests…

583 00:42:54.130 00:42:58.879 Samuel Roberts: Again and again, right? Yeah, exactly. So hopefully, if… if that’s…

584 00:42:59.070 00:43:03.860 Samuel Roberts: Not a problem, we won’t have to deal with it too quickly, but… Okay.

585 00:43:04.700 00:43:05.620 Samuel Roberts: What else?

586 00:43:05.620 00:43:08.890 Mustafa Raja: And I have… and I don’t have any retries logic.

587 00:43:09.680 00:43:10.030 Samuel Roberts: Okay.

588 00:43:10.030 00:43:10.990 Mustafa Raja: implemented.

589 00:43:11.480 00:43:15.529 Samuel Roberts: Yeah, we may want to add a little bit of that. I don’t know if Mastra has that as a setting or not, but…

590 00:43:16.000 00:43:16.640 Mustafa Raja: Okay.

591 00:43:18.900 00:43:19.640 Casie Aviles: Alright.

592 00:43:19.970 00:43:20.500 Samuel Roberts: Aaron?

593 00:43:21.000 00:43:24.330 Samuel Roberts: Yeah, it has a retry mechanism for workflows… okay.

594 00:43:27.840 00:43:30.999 Samuel Roberts: Yeah, there’s retries, you can do attempts and delays, okay.

595 00:43:31.720 00:43:32.090 Mustafa Raja: Okay.

596 00:43:32.090 00:43:34.449 Samuel Roberts: Then I just want to be ready to set those if we need to.
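
A minimal sketch of the "attempts and delays" idea for rate-limited calls, written as a standalone helper rather than as Mastra's own setting. The names `RetryPolicy` and `nextDelayMs` are hypothetical, and the doubling backoff and the choice of HTTP 429 and 5xx as retryable are assumptions, not something decided in the meeting.

```typescript
// Hypothetical retry settings: how many retries to allow and the first wait.
interface RetryPolicy {
  attempts: number; // maximum number of retries after the first failure
  delayMs: number;  // wait before the first retry
}

// Returns how long to wait before the next try, or null to give up.
// failedAttempts is the number of failures seen so far (1 after the first).
function nextDelayMs(
  policy: RetryPolicy,
  failedAttempts: number,
  statusCode: number
): number | null {
  // Assumption: only rate limits (429) and server errors (5xx) are retried.
  const retryable = statusCode === 429 || statusCode >= 500;
  if (!retryable || failedAttempts > policy.attempts) return null;
  // Assumption: the wait doubles after each failure.
  return policy.delayMs * Math.pow(2, failedAttempts - 1);
}
```

With `{ attempts: 3, delayMs: 500 }`, a caller would wait 500 ms, 1000 ms, and 2000 ms after the first three rate-limit failures, then stop retrying.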

597 00:43:44.650 00:43:46.000 Samuel Roberts: Okay, cool.

598 00:43:46.170 00:43:47.149 Samuel Roberts: What else?

599 00:43:48.700 00:43:54.649 Casie Aviles: I think there’s also this item for me. Yeah, this is what I think Mustafa was talking about.

600 00:43:54.650 00:43:55.000 Mustafa Raja: Yup.

601 00:43:55.000 00:43:56.720 Casie Aviles: This is the tricky one for me.

602 00:43:56.720 00:43:57.809 Samuel Roberts: Yes, it is.

603 00:44:00.720 00:44:06.519 Casie Aviles: Hmm… I think, yeah, I probably need to do a spike first, because we…

604 00:44:06.520 00:44:07.050 Mustafa Raja: I’m…

605 00:44:07.050 00:44:07.849 Casie Aviles: No idea.

606 00:44:07.850 00:44:08.350 Samuel Roberts: Yeah.

607 00:44:08.350 00:44:08.730 Casie Aviles: this year.

608 00:44:08.730 00:44:16.170 Samuel Roberts: My thought is maybe… What we should do… Is mock the data.

609 00:44:19.610 00:44:19.960 Samuel Roberts: So…

610 00:44:19.960 00:44:21.570 Casie Aviles: The one we have on Supabase.

611 00:44:24.370 00:44:27.150 Samuel Roberts: Yeah, we just want to know that the text-to-SQL, like.

612 00:44:27.560 00:44:32.290 Samuel Roberts: is working properly. So if we have it just point to a database that we know is good.

613 00:44:32.800 00:44:34.969 Samuel Roberts: And then do a few tests against that.

614 00:44:35.810 00:44:42.480 Samuel Roberts: we’ll know what we’re looking for, right? So rather than testing against the live database, maybe we test against that dev one you had spun up.

615 00:44:44.460 00:44:45.340 Casie Aviles: Hmm, okay.

616 00:44:45.570 00:44:52.250 Samuel Roberts: then we can test to make sure the text-to-SQL is doing what we expect every time on consistent data.

617 00:44:54.190 00:44:55.800 Mustafa Raja: Yeah, one thing…

618 00:44:56.240 00:44:56.980 Samuel Roberts: Yeah, go ahead.

619 00:44:57.190 00:45:07.779 Mustafa Raja: Yeah, one thing that I’m wondering is if we have an updated_at column or not. If we do have that, and we see that, okay, this has been recently updated.

620 00:45:08.290 00:45:22.230 Mustafa Raja: After this row was created, then the judge would know that, yeah, this data is different, but this is also recently updated.

621 00:45:22.900 00:45:27.250 Samuel Roberts: That’s a good idea. Do we have updated_at on the…

622 00:45:28.020 00:45:29.240 Casie Aviles: the columns.

623 00:45:29.240 00:45:29.900 Samuel Roberts: Yeah.

624 00:45:29.900 00:45:30.550 Mustafa Raja: Yeah.

625 00:45:31.690 00:45:33.639 Casie Aviles: I don’t think we do.

626 00:45:33.960 00:45:36.270 Samuel Roberts: Okay, that should be pretty good with a…

627 00:45:36.390 00:45:39.839 Samuel Roberts: Yeah, even a Postgres extension, I think, can do that automatically.

628 00:45:39.840 00:45:42.930 Mustafa Raja: Yeah, and then we don’t even need to manage it, right?

629 00:45:43.270 00:45:44.970 Mustafa Raja: Postgres manages that.

630 00:45:45.190 00:45:46.679 Samuel Roberts: Yep, yep, exactly.

631 00:45:46.980 00:45:50.810 Samuel Roberts: So I think… I think that’s a good idea in general, but I think also…

632 00:45:50.950 00:45:56.440 Samuel Roberts: The idea of, like, just validating it with, like, a test DB.

633 00:45:56.660 00:46:00.330 Samuel Roberts: That is the same format and structure is not a bad idea.

634 00:46:02.340 00:46:03.200 Casie Aviles: Okay.

635 00:46:06.040 00:46:10.899 Casie Aviles: So that’s just, like, adding a new column, right? Yeah. For each table.

636 00:46:11.500 00:46:13.369 Samuel Roberts: For the updated_at?

637 00:46:13.810 00:46:14.280 Mustafa Raja: Yep.

638 00:46:14.280 00:46:14.970 Casie Aviles: Yeah.

639 00:46:15.200 00:46:16.590 Casie Aviles: Yeah, there’s a…

640 00:46:16.590 00:46:19.569 Samuel Roberts: There’s definitely an extension that does that.

641 00:46:19.710 00:46:21.139 Samuel Roberts: I don’t remember…

642 00:46:22.790 00:46:25.720 Casie Aviles: So, for example, we’ll just add another column.

643 00:46:27.660 00:46:32.339 Mustafa Raja: Yeah, try that, and then, Postgres has its own functions.

644 00:46:32.560 00:46:33.090 Mustafa Raja: You could use.

645 00:46:33.090 00:46:37.830 Samuel Roberts: Yeah, maybe it’s that. Maybe that’s what I’m thinking of, the updating… maybe it’s already built in… yeah.

646 00:46:39.070 00:46:45.299 Casie Aviles: Okay, yeah, because I didn’t… I did… I do remember that we have, like, a history table.

647 00:46:45.960 00:46:46.780 Mustafa Raja: Right.

648 00:46:46.780 00:46:47.310 Samuel Roberts: It was good.

649 00:46:47.770 00:46:48.990 Samuel Roberts: But it might be harder.

650 00:46:48.990 00:46:49.420 Mustafa Raja: Put it on.

651 00:46:49.420 00:46:50.450 Samuel Roberts: validate that.

652 00:46:50.590 00:46:52.410 Mustafa Raja: Let’s test it on appendix.

653 00:46:56.150 00:47:00.659 Mustafa Raja: Let’s see if we add here… An updated one.

654 00:47:01.060 00:47:05.580 Mustafa Raja: I think, Postgres would also already suggest us.

655 00:47:05.960 00:47:07.230 Mustafa Raja: What we should do.

656 00:47:07.480 00:47:09.619 Samuel Roberts: Yeah, there’s definitely a way to do it.

657 00:47:16.360 00:47:24.070 Mustafa Raja: I think the default value, let me see what default value should… We have updated that.

658 00:47:26.350 00:47:29.619 Samuel Roberts: We may need a trigger for it, too, though, I don’t remember how this works.

659 00:47:35.800 00:47:37.669 Mustafa Raja: Yeah, it is a trigger.

660 00:47:42.660 00:47:43.240 Mustafa Raja: It is.

661 00:47:43.240 00:47:45.699 Samuel Roberts: So it might be a little more work than just right now, that’s fine, we can…

662 00:47:45.700 00:47:46.390 Mustafa Raja: Yeah.

663 00:47:46.390 00:47:48.600 Samuel Roberts: We can definitely plan to add that.

664 00:47:49.030 00:47:51.280 Samuel Roberts: I would make a ticket for that even, because that might be a little more.

665 00:47:51.280 00:47:51.990 Mustafa Raja: Yeah.

666 00:47:51.990 00:47:53.000 Samuel Roberts: involved. Yeah.

667 00:47:53.310 00:47:54.020 Mustafa Raja: Yep.

668 00:48:01.820 00:48:04.909 Samuel Roberts: Yeah, and updated_at… yeah, perfect.

669 00:48:13.040 00:48:13.730 Samuel Roberts: Perfect.

670 00:48:15.710 00:48:29.099 Samuel Roberts: Yeah, I think the audit trail that we have is good in case someone accidentally changes something, and we need to revert, or something like that. The updated_at will be good to compare and see, like, oh, this has been changed already, so…

671 00:48:29.320 00:48:32.150 Samuel Roberts: I think that’s good.

672 00:48:34.650 00:48:37.339 Casie Aviles: Turn this into a spike instead.

673 00:48:38.270 00:48:43.810 Samuel Roberts: Yeah, and mention, potentially testing against the dev, like, static DB or something, because…

674 00:48:44.060 00:48:45.529 Samuel Roberts: And so we don’t forget that.

675 00:48:51.780 00:48:52.530 Samuel Roberts: Cool.

676 00:48:53.980 00:48:56.009 Mustafa Raja: I was also thinking that we could just, you know.

677 00:48:56.330 00:49:02.890 Mustafa Raja: Judge the query, if the query ran was good or not, but that could be complex to judge.

678 00:49:03.610 00:49:06.850 Samuel Roberts: Yeah, probably a little more than we want to do right now.

679 00:49:08.220 00:49:13.750 Casie Aviles: Because, so, there’s, like, two things that I update.

680 00:49:14.360 00:49:26.930 Casie Aviles: For, for, like, you know, zip code-related, answer, so… It’s either the instructions… of the SQL generator.

681 00:49:27.770 00:49:33.819 Casie Aviles: Or it’s… it’s the database itself that’s not updated, right? So those are the two things that…

682 00:49:34.160 00:49:35.979 Casie Aviles: that I, update.

683 00:49:36.090 00:49:40.890 Casie Aviles: So right now, we have the admin UI that lets Janice

684 00:49:41.030 00:49:43.890 Casie Aviles: Updated, so that should be fine.

685 00:49:44.610 00:49:50.730 Casie Aviles: Makes also… makes it easier for us to update that. So there’s… the only thing left there is also the…

686 00:49:51.670 00:49:59.170 Casie Aviles: you know, the instructions. And it’s just a matter of giving it, like, the right schema, the right…

687 00:49:59.430 00:50:01.300 Casie Aviles: rules, I guess.

688 00:50:01.680 00:50:09.670 Casie Aviles: But also… The models could also be a factor, since it’s… we can switch out models there.

689 00:50:10.370 00:50:11.429 Samuel Roberts: Right, right.

690 00:50:12.360 00:50:14.810 Casie Aviles: But yeah, I was just, mentioning that.

691 00:50:15.540 00:50:16.190 Samuel Roberts: Okay.

692 00:50:16.590 00:50:17.700 Samuel Roberts: That’s good to know.

693 00:50:21.220 00:50:27.830 Casie Aviles: No, no, I’ll just move these things, and I’ll just tag Amber, I believe, so she’s also in the loop.

694 00:50:29.380 00:50:30.519 Samuel Roberts: Okay, great, thank you.

695 00:50:30.520 00:50:31.060 Casie Aviles: Pick.

696 00:50:31.470 00:50:37.130 Casie Aviles: I think we’re pretty… we’re clear now, right, with what we’re going to do for ABC.

697 00:50:38.660 00:50:39.110 Samuel Roberts: I think so.

698 00:50:39.400 00:50:42.110 Casie Aviles: Not enough.

699 00:50:44.530 00:50:50.729 Casie Aviles: Yeah, okay. I think that’s, that’s, that’s good for ABC.

700 00:50:51.210 00:51:01.580 Mustafa Raja: Yeah, one more thing, I forgot. For the conversation dataset, I haven’t run it, so I’ll run it tonight, leave it running tonight.

701 00:51:02.420 00:51:02.930 Samuel Roberts: Oh, right.

702 00:51:03.560 00:51:05.250 Mustafa Raja: Yeah, I forgot that.

703 00:51:06.240 00:51:06.559 Samuel Roberts: You know.

704 00:51:06.560 00:51:07.669 Mustafa Raja: And I’ll share.

705 00:51:10.290 00:51:11.180 Mustafa Raja: Okay, yeah, I’ll.

706 00:51:11.180 00:51:12.230 Samuel Roberts: Yeah, that’ll be an interesting one.

707 00:51:12.230 00:51:17.189 Mustafa Raja: Yeah, yeah. Because I just want to see the behavior of memory with that.

708 00:51:20.560 00:51:21.330 Samuel Roberts: Okay.

709 00:51:24.860 00:51:34.170 Casie Aviles: I might just need some… I guess I’ll just… I might just need some help as well. I’m not sure if I can do all the tickets.

710 00:51:34.170 00:51:39.360 Samuel Roberts: Yeah, I would say the eval logging, maybe we can… move since…

711 00:51:41.850 00:51:43.869 Samuel Roberts: I don’t know, I’m gonna talk about Now that we’ve been…

712 00:51:43.870 00:51:44.290 Mustafa Raja: Yay.

713 00:51:44.290 00:51:45.370 Samuel Roberts: that already.

714 00:51:45.370 00:51:51.320 Mustafa Raja: Yeah, we could pass that over. Since I’ll also be implementing evals.

715 00:51:51.820 00:51:52.620 Samuel Roberts: That’s true.

716 00:51:52.620 00:52:00.620 Mustafa Raja: I think this is… might as well just… So, by logging, we mean just put that in whatever warehouse we selected, right? So, for now.

717 00:52:00.620 00:52:01.640 Samuel Roberts: Yeah, I…

718 00:52:01.890 00:52:05.899 Samuel Roberts: Yeah, we’re… yeah, the goal is BigQuery, but I really just want to make sure that it’s set up to, like.

719 00:52:06.090 00:52:07.740 Samuel Roberts: Spit that out somewhere.

720 00:52:07.900 00:52:14.230 Samuel Roberts: And with all the fields we want, so even if it’s just logging it while you’re testing locally, it’s probably fine, but…

721 00:52:14.450 00:52:17.320 Samuel Roberts: Eventually, into BigQuery.

722 00:52:17.920 00:52:19.569 Mustafa Raja: Yeah, yeah, yeah, of course.

723 00:52:20.060 00:52:20.930 Samuel Roberts: Okay, cool.

724 00:52:22.320 00:52:29.079 Mustafa Raja: Casie, let me know if there’s anything else. We could also pass something on to Pranav.

725 00:52:29.660 00:52:30.250 Samuel Roberts: Yeah.

726 00:52:32.330 00:52:37.670 Casie Aviles: Hmm, okay. I’m just not sure, like, which, which ticket you’re…

727 00:52:37.670 00:52:38.300 Samuel Roberts: I know.

728 00:52:38.300 00:52:41.829 Casie Aviles: with working on, because it takes… it has a lot of context.

729 00:52:42.190 00:52:44.420 Samuel Roberts: A lot of them are, yeah, a lot of them are context you have.

730 00:52:44.690 00:52:45.820 Samuel Roberts: Unfortunately.

731 00:52:46.300 00:52:55.070 Pranav: Maybe… I feel like that’s kind of been the theme for a few of these. I think, if we can maybe… if you’re ever working on a few of these, if I can just, like, pair with you.

732 00:52:55.220 00:52:55.810 Pranav: That way…

733 00:52:55.810 00:52:56.240 Casie Aviles: in this country.

734 00:52:56.240 00:52:57.249 Mustafa Raja: Kind of like…

735 00:52:57.560 00:53:02.070 Pranav: Ramp up on, like, the project a little bit more, understand, like.

736 00:53:02.520 00:53:04.639 Pranav: The ins and outs of a few different things.

737 00:53:04.760 00:53:05.630 Pranav: Sure.

738 00:53:05.630 00:53:06.360 Casie Aviles: or…

739 00:53:06.360 00:53:11.540 Pranav: Yeah, just feel free to just ping me, and then we can just, like, hop in a huddle or a Zoom, and we can just, like, pair.

740 00:53:12.760 00:53:13.360 Casie Aviles: Okay, yeah.

741 00:53:13.360 00:53:13.980 Mustafa Raja: That’s ridiculous.

742 00:53:14.760 00:53:15.510 Casie Aviles: Hmm.

743 00:53:16.330 00:53:18.720 Casie Aviles: Alright, yeah, I’ll let you know.

744 00:53:21.570 00:53:26.409 Casie Aviles: Okay. Yeah, I think that’s… that’s all we have for ABC now. I think we’re… Alright.

745 00:53:26.860 00:53:30.829 Casie Aviles: going to talk about Halo Nix, right?

746 00:53:31.540 00:53:35.680 Samuel Roberts: Yeah, let me, grab some coffee, and then I’ll be good to go.

747 00:53:36.190 00:53:40.439 Mustafa Raja: Yeah, also, let me know, Casie, when you’re working on deploying.

748 00:53:40.700 00:53:43.960 Mustafa Raja: We will be deploying migration progress

749 00:53:44.820 00:53:55.869 Mustafa Raja: branch. I haven’t really updated, or pushed the latest code. I think you’ve been working on, KC1330?

750 00:53:56.100 00:54:00.509 Mustafa Raja: And you might want to move your updates to this new branch.

751 00:54:00.630 00:54:02.210 Mustafa Raja: migration progress.

752 00:54:05.470 00:54:07.940 Casie Aviles: Oh, let me actually go to Heroku.

753 00:54:08.510 00:54:12.439 Casie Aviles: I might… So I can see much further.

754 00:54:18.170 00:54:21.849 Samuel Roberts: Okay, yeah, I’m gonna let you guys sort that out real quick while I can make some coffee before the next meeting.

755 00:54:21.850 00:54:22.590 Mustafa Raja: Yeah, yeah, yeah.

756 00:54:22.590 00:54:26.369 Samuel Roberts: Sure. Let me know if I can help with that, but I’ll be… I’ll be back, or I’ll be on the other call in a minute.

757 00:54:26.630 00:54:31.390 Pranav: I was gonna ask, next meeting, are you guys cool if we move it, like, maybe 30 minutes?

758 00:54:33.080 00:54:34.850 Casie Aviles: Yeah, I don’t mind.

759 00:54:35.130 00:54:36.260 Samuel Roberts: That’s fine with me.

760 00:54:36.590 00:54:37.670 Pranav: Okay, perfect.

761 00:54:38.010 00:54:39.779 Samuel Roberts: Alright, even better. Okay, thank you guys.

762 00:54:40.080 00:54:40.760 Pranav: Yep, thanks.

763 00:54:40.760 00:54:41.479 Mustafa Raja: I ain’t…

764 00:54:41.480 00:54:42.050 Samuel Roberts: Yep.

765 00:54:48.700 00:54:51.610 Mustafa Raja: Yeah, I don’t like this verification code.

766 00:54:52.030 00:54:54.610 Casie Aviles: Yeah, I don’t need… I don’t either.

767 00:54:55.570 00:54:56.510 Casie Aviles: Here.

768 00:54:57.590 00:55:01.709 Mustafa Raja: Hopefully, once we move to Railway, we can get rid of this.

769 00:55:07.290 00:55:09.160 Casie Aviles: Okay, yes.

770 00:55:16.040 00:55:22.039 Mustafa Raja: The other thing is that I don’t have this code saved in 1Password, I have it on my Microsoft

771 00:55:22.430 00:55:24.489 Mustafa Raja: Password manager, something.

772 00:55:24.940 00:55:25.619 Casie Aviles: Oh, you’re, you’re.

773 00:55:25.620 00:55:30.670 Mustafa Raja: Yeah, yeah, so I have really… Open that up.

774 00:55:31.220 00:55:32.480 Mustafa Raja: Yeah, it might, might.

775 00:55:32.620 00:55:35.269 Casie Aviles: Mine’s on my phone.

776 00:55:35.270 00:55:41.070 Mustafa Raja: Yeah, same. It’s on my phone. The app really… the app, Microsoft app really doesn’t have a Mac app.

777 00:55:41.180 00:55:46.689 Mustafa Raja: So I cannot have it on my laptop also. So I… whatever the case, I’m stuck with it.

778 00:55:47.520 00:55:48.070 Mustafa Raja: I.

779 00:55:48.070 00:55:48.580 Casie Aviles: I have.

780 00:55:48.580 00:55:50.209 Mustafa Raja: I have to open my phone to get it out.

781 00:55:51.110 00:55:54.519 Casie Aviles: Yeah, that kinda sucks, because if you have it on…

782 00:55:54.760 00:56:00.190 Casie Aviles: I think you can have your verification code in 1Password, and it will just autofill for you, everything.

783 00:56:00.190 00:56:05.159 Mustafa Raja: Yeah, yeah, that’s, like, super convenient. I think, what’s it called?

784 00:56:05.740 00:56:12.949 Mustafa Raja: I think Sam has it. Sam has it in his 1Password, and it’s super convenient for him to, you know, get into this.

785 00:56:13.360 00:56:17.289 Casie Aviles: Yeah, I don’t know… I didn’t set it up before, so…

786 00:56:17.760 00:56:18.760 Mustafa Raja: Yeah, me too.

787 00:56:19.020 00:56:26.979 Mustafa Raja: It’s just, when I was onboarding, I just set it up in my… in my Microsoft password manager.

788 00:56:27.990 00:56:28.880 Casie Aviles: Okay.

789 00:56:29.700 00:56:35.369 Mustafa Raja: Yeah, so we just want to make sure that, we would be deploying, not…

790 00:56:35.960 00:56:41.529 Mustafa Raja: KC1330, but Migration Progress Branch. So, if you have made any changes.

791 00:56:41.670 00:56:47.309 Mustafa Raja: Make sure that those changes are also in, Migration Progress Branch.

792 00:56:47.840 00:56:51.219 Mustafa Raja: So we don’t lose any progress.

793 00:56:52.140 00:56:53.680 Casie Aviles: Okay, so I’ll just merge…

794 00:56:53.820 00:56:57.759 Mustafa Raja: No, no, don’t merge, because,

795 00:57:00.060 00:57:03.080 Mustafa Raja: The issue with the merge would be…

796 00:57:05.610 00:57:14.430 Mustafa Raja: Yeah, so the issue with the merge would be that we moved the version from Mastra stable to Mastra beta.

797 00:57:14.810 00:57:20.239 Mustafa Raja: In this migration progress branch, so that’ll create some conflicts.

798 00:57:21.000 00:57:27.120 Mustafa Raja: Or we might have to, you know, select the versions again manually.

799 00:57:27.330 00:57:28.639 Mustafa Raja: It’s painful.

800 00:57:33.520 00:57:34.670 Casie Aviles: Okay, yeah.

801 00:57:34.670 00:57:42.289 Mustafa Raja: So if you… if you would know the files that you need, most likely, the database ones, right? Just move those.

802 00:57:44.030 00:57:49.960 Casie Aviles: Oh, okay. So, so, for example, my changes are just mainly…

803 00:57:50.200 00:57:57.140 Casie Aviles: Well, the only changes that I’m concerned with are, let’s see, admin UI, this folder.

804 00:57:57.740 00:58:03.519 Casie Aviles: And then… with the Mastra app, I just updated, like, the tool.

805 00:58:03.780 00:58:05.910 Casie Aviles: The instructions for the tool.

806 00:58:08.770 00:58:14.980 Mustafa Raja: Actually, have you updated the instructions in Langfuse?

807 00:58:18.310 00:58:28.540 Casie Aviles: Oh, no, no, oh, sorry, wrong. Yeah, it’s not the prompt, actually, sorry. It’s the… Ignored tables.

808 00:58:29.440 00:58:30.139 Casie Aviles: Where is it?

809 00:58:30.140 00:58:34.290 Mustafa Raja: But that, that would be in the other, other thing, no? This is vector query, no?

810 00:58:35.190 00:58:39.210 Casie Aviles: Oh, okay, yeah, vector query, my bad.

811 00:58:39.460 00:58:40.329 Mustafa Raja: Yeah, you can just…

812 00:58:40.330 00:58:40.960 Casie Aviles: through the table.

813 00:58:40.960 00:58:41.540 Mustafa Raja: nurse?

814 00:58:41.970 00:58:46.550 Mustafa Raja: Yeah, yeah, this is… so it’s only this? Did you make any other changes?

815 00:58:47.160 00:58:48.450 Casie Aviles: Yeah, it’s just this.

816 00:58:49.430 00:59:01.660 Mustafa Raja: Okay, yeah, that’s super… yeah, that’s super easy. Even I could do it. I’ll move it, and then commit the changes in migration progress, and then we can just move ahead with that, right?

817 00:59:03.750 00:59:07.969 Casie Aviles: Yeah, it’s, it’s a bit messy because of, like, the whole Heroku thing.

818 00:59:08.400 00:59:09.160 Mustafa Raja: Yeah, yeah.

819 00:59:10.720 00:59:15.970 Casie Aviles: So… Okay, so I… hmm, let me think.

820 00:59:16.210 00:59:23.969 Casie Aviles: Okay, so I’ll just do that. And then… will you be using any of these, apps, or will you be creating a new one?

821 00:59:25.100 00:59:25.840 Mustafa Raja: Sorry?

822 00:59:27.140 00:59:34.760 Casie Aviles: Here on Heroku, will you be creating a new app for the migration progress branch, or will you just use one of these?

823 00:59:35.650 00:59:43.420 Mustafa Raja: I guess we could use one of these, because, I guess production one is being used in NHN, right? So I wouldn’t touch that.

824 00:59:44.720 00:59:52.770 Mustafa Raja: I guess, I’d update one of these staging ones. Let me know if any of these are being used in production, the staging ones.

825 00:59:53.490 00:59:59.969 Casie Aviles: Yeah… It’s… it’s messy, because I… I set this up already in…

826 01:00:00.480 01:00:09.010 Mustafa Raja: We’re going to deploy it on GCP, right? So, would we need to deploy this over on Heroku?

827 01:00:10.910 01:00:14.060 Casie Aviles: Yeah, we don’t need to, I was, I was just, no.

828 01:00:14.170 01:00:18.490 Casie Aviles: So I just need to make sure that it’s updated, okay, the migration progress, okay.

829 01:00:18.870 01:00:23.439 Mustafa Raja: Okay, yeah. Yeah, I’ll text you, I’ll DM you once I update that.

830 01:00:25.090 01:00:26.160 Casie Aviles: Okay, okay.

831 01:00:26.680 01:00:29.049 Mustafa Raja: Okay, thank you. Yeah. Yeah, bye.

832 01:00:29.840 01:00:31.049 Casie Aviles: Thank you, bye-bye.