Meeting Title: Readme - Brainforge
Date: 2025-10-20
Meeting participants: Bill Mill, Henry Zhao
WEBVTT
1 00:00:12.360 ⇒ 00:00:13.789 Henry Zhao: Hey Bill, how’s it going?
2 00:00:13.950 ⇒ 00:00:15.059 Bill Mill: Good, how you doing?
3 00:00:15.350 ⇒ 00:00:19.229 Henry Zhao: Good, thank you. Thanks for your time. I won’t take up too much of your time.
4 00:00:19.410 ⇒ 00:00:23.559 Henry Zhao: And I’m not even sure if we can go through AWS, I know there’s an outage going on right now.
5 00:00:23.690 ⇒ 00:00:29.250 Henry Zhao: Fortunately, we’re all in U.S. West, too, so it hasn’t hit us as far as I’m aware.
6 00:00:29.630 ⇒ 00:00:30.490 Henry Zhao: Okay, cool.
7 00:00:30.600 ⇒ 00:00:39.940 Henry Zhao: So basically, I’m onboarding right now to kind of figure out some of the pricing analytics and stuff like that, so looking at what features people are using,
8 00:00:40.000 ⇒ 00:00:57.880 Henry Zhao: who are the people that are self-serving, you know, upgrading, and things like that. And so, right now, I’ve been given access to MongoDB, and I’ve been given access to Amplitude. The issue is, the Amplitude is, like, a free plan, so there’s a lot of limitations in what we can do in terms of calculations, like, looking at downgrades.
9 00:00:58.320 ⇒ 00:01:00.380 Henry Zhao: Looking at, like…
10 00:01:00.650 ⇒ 00:01:10.230 Henry Zhao: more freely, like, defining timestamps and things like that, and MongoDB I’ve never worked with before, so I’m, like, trying to understand how to query things in the…
11 00:01:10.230 ⇒ 00:01:24.479 Henry Zhao: data in MongoDB, and we just want to know if we can, like… is there anything that we can find in AWS that would be helpful for the analysis that we’re trying to do, and/or if there’s anything in AWS that can make our job a little bit easier in terms of these queries that we want to run.
12 00:01:25.800 ⇒ 00:01:33.420 Bill Mill: Yeah, I don’t think so, because in AWS, the only thing that we keep directly… so all of our stuff is in AWS through service providers.
13 00:01:34.030 ⇒ 00:01:39.670 Bill Mill: But the only thing we do directly in AWS is serve the Git repositories.
14 00:01:40.350 ⇒ 00:01:42.500 Bill Mill: through our product called Gitto.
15 00:01:42.800 ⇒ 00:01:53.179 Bill Mill: So we don’t really store any, like, useful customer data in there. And, like, even that data is mostly reflected in Mongo in the Gitto organization in there.
16 00:01:53.360 ⇒ 00:01:55.960 Bill Mill: Another organization, database.
17 00:01:56.800 ⇒ 00:02:02.109 Henry Zhao: Okay, do you work with… so whenever you were, like, somebody would look for data, they should be going into MongoDB for that?
18 00:02:02.380 ⇒ 00:02:07.560 Bill Mill: Yeah, pretty much. I mean, unless it’s the raw Git repository that you need access to.
19 00:02:08.180 ⇒ 00:02:12.989 Henry Zhao: Okay. Do you guys have any other AWS services you guys use, like, in terms of, like…
20 00:02:13.160 ⇒ 00:02:18.460 Henry Zhao: Redshift, or, like, any query tools, or things like that, that you guys use in AWS?
21 00:02:18.910 ⇒ 00:02:29.869 Bill Mill: No, we use, the only thing that we have in AWS other than that is we have, I mean, we have a Redis cache of Gitto data, but that’s just a cache of the Git data.
22 00:02:30.140 ⇒ 00:02:34.959 Bill Mill: And we have OpenSearch, which is our logs.
23 00:02:35.460 ⇒ 00:02:46.630 Bill Mill: Okay. So that’s accessible without having to go into AWS, but we do host it in AWS. So if you wanted to look at logs, you can query logs through OpenSearch.
24 00:02:47.370 ⇒ 00:02:55.419 Bill Mill: I don’t know how much, like, I don’t know if, like, customer downgrade data would be in there, I have no idea, to be honest.
25 00:02:55.420 ⇒ 00:03:05.969 Henry Zhao: Okay, then I’ll have to try and figure out how to do it in MongoDB, and then I’ll ask Alicia for, you know, alternate solutions, if there’s anything that we need to do, but we can’t do with what we have now.
26 00:03:07.010 ⇒ 00:03:12.659 Bill Mill: And if you have, like, specific… do you have specific things that you, like, are trying to query in particular?
27 00:03:15.250 ⇒ 00:03:22.610 Henry Zhao: So yeah, I have the stuff that I need in MongoDB, but yeah, if there’s nothing else additional that could be of value in AWS, I think…
28 00:03:22.730 ⇒ 00:03:25.780 Henry Zhao: I won’t… won’t prod more in terms of that, yeah.
29 00:03:25.780 ⇒ 00:03:31.090 Bill Mill: Cool. I don’t think there is. I don’t know exactly what you’re looking for, but I don’t think there is.
30 00:03:31.590 ⇒ 00:03:32.220 Henry Zhao: Okay.
31 00:03:33.450 ⇒ 00:03:45.169 Henry Zhao: Okay, yeah, as long as you don’t think there’s any, like, user usage data, or, like, what features people are using, anything like that, then yeah, I think for this analysis that I’m trying to do now, it’s probably not relevant, whatever you guys have in AWS.
32 00:03:45.440 ⇒ 00:03:46.230 Bill Mill: Cool.
33 00:03:46.230 ⇒ 00:03:53.260 Henry Zhao: And I’ve never queried raw logs, so unless there’s, like, a need for that, I won’t even… touch that.
34 00:03:53.570 ⇒ 00:04:06.269 Bill Mill: Yeah, I’m always available on Slack, and if there’s a particular piece of data that you’re trying to get at, I usually am working at a much lower level than, like, what a customer feature is using, but I am comfortable with the system
35 00:04:06.680 ⇒ 00:04:15.159 Bill Mill: broadly speaking. So, if you need help, like, being pointed at something, or trying to figure out where something lives, feel free to hit me up on Slack.
36 00:04:15.500 ⇒ 00:04:18.069 Henry Zhao: But does it pretty much all live in MongoDB? Like, any, like…
37 00:04:18.070 ⇒ 00:04:20.190 Bill Mill: Pretty much all lives in MongoDB.
38 00:04:21.240 ⇒ 00:04:25.209 Henry Zhao: Okay, and you guys… do you guys store anything in S3?
39 00:04:26.040 ⇒ 00:04:33.129 Bill Mill: We do have some S3 buckets, like customer uploads. If you upload an image to your document site, that goes in S3.
40 00:04:33.550 ⇒ 00:04:34.680 Henry Zhao: I never lost that, yeah.
41 00:04:34.680 ⇒ 00:04:41.939 Bill Mill: Yeah, yeah, and then we back up our logs there, so we also have a backup of our logs in S3, but that’s about it.
42 00:04:42.440 ⇒ 00:04:45.689 Henry Zhao: Do we have user lifecycle data in MongoDB? So…
43 00:04:45.960 ⇒ 00:04:53.939 Henry Zhao: kind of, like, just from the time they sign up all the way to, like, what kind of projects they’re creating, what are they doing in there? Do you know if we have all that data in MongoDB?
44 00:04:54.200 ⇒ 00:05:00.470 Bill Mill: So, here, I will, little share my screen here.
45 00:05:02.190 ⇒ 00:05:11.839 Bill Mill: Alright, so… I don’t know how much you’ve been in here. README36 is the production database.
46 00:05:12.380 ⇒ 00:05:14.440 Bill Mill: I’m gonna use README Backup.
47 00:05:14.840 ⇒ 00:05:22.650 Bill Mill: Which is… it backs up every morning. I tend to just browse on README Backup, because that way, if I messed something up, it would be overwritten tomorrow.
48 00:05:22.650 ⇒ 00:05:23.110 Henry Zhao: Okay, cool.
49 00:05:23.110 ⇒ 00:05:26.599 Bill Mill: I don’t do that, mess anything up, but it’s an exact copy, so I…
50 00:05:26.600 ⇒ 00:05:27.000 Henry Zhao: Yeah, perfect.
51 00:05:27.000 ⇒ 00:05:28.840 Bill Mill: I tend to just browse those.
52 00:05:29.760 ⇒ 00:05:30.600 Henry Zhao: Okay.
53 00:05:30.750 ⇒ 00:05:34.459 Bill Mill: And… oh my god, they just moved stuff around, I think.
54 00:05:35.200 ⇒ 00:05:36.619 Bill Mill: What am I looking for?
55 00:05:37.380 ⇒ 00:05:39.650 Bill Mill: I want… collections.
56 00:05:40.050 ⇒ 00:05:42.959 Bill Mill: Yeah, they actually just completely redid their UI.
57 00:05:43.210 ⇒ 00:05:44.900 Bill Mill: And now I’m useless.
58 00:05:46.450 ⇒ 00:05:51.579 Bill Mill: Database, clusters, readme backup, browse Collections, that’s what I wanted.
59 00:05:52.630 ⇒ 00:05:53.400 Bill Mill: Alright.
60 00:05:54.100 ⇒ 00:05:59.469 Bill Mill: So, in README Backup, we’ve got README. This is the main database.
61 00:06:00.090 ⇒ 00:06:04.070 Bill Mill: And… so there’s two main…
62 00:06:04.260 ⇒ 00:06:11.770 Bill Mill: collections is the term in Mongo. They’d be tables in another database, but in Mongo they’re called collections.
63 00:06:12.270 ⇒ 00:06:22.250 Bill Mill: Projects is the… is the main one, and projects has a ton of information. Like, for example, what’s important to you a lot is going to be this plan.
64 00:06:22.530 ⇒ 00:06:27.809 Bill Mill: Which is a very weird selection of possible,
65 00:06:30.290 ⇒ 00:06:36.300 Bill Mill: like, plans is a string, and there are a couple of random… like, there’s a 2018 business.
66 00:06:36.630 ⇒ 00:06:43.089 Bill Mill: There’s a bunch of different… I don’t honestly know what they mean, so you’d have to ask somebody else for exactly what plans there are.
67 00:06:43.090 ⇒ 00:06:45.700 Henry Zhao: I think Mark told me the breakdown, yeah.
68 00:06:45.700 ⇒ 00:06:48.809 Bill Mill: Good, yeah, yeah, he’s a good source for that.
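[Editor's note: a minimal sketch, not from the call, of how the `plan` field Bill points to on `projects` could be summarized. The `$group`, `$sum`, and `$sort` stages are standard MongoDB aggregation operators. The sample plan values and the local `count_by_plan` helper are hypothetical, included only so the example runs without a database connection.]

```python
from collections import Counter

# Aggregation pipeline that counts documents per distinct `plan` value.
# Assumes `plan` is a top-level string field on `projects`, as described in the call.
PLAN_COUNTS_PIPELINE = [
    {"$group": {"_id": "$plan", "count": {"$sum": 1}}},
    {"$sort": {"count": -1}},
]


def count_by_plan(documents):
    """Local stand-in for the pipeline above: count documents per plan value."""
    counts = Counter(doc.get("plan") for doc in documents)
    # most_common() orders by descending count, mirroring the $sort stage.
    return [{"_id": plan, "count": n} for plan, n in counts.most_common()]


# Hypothetical sample documents; the real plan strings will differ.
sample_projects = [
    {"name": "a", "plan": "free"},
    {"name": "b", "plan": "business"},
    {"name": "c", "plan": "free"},
]

print(count_by_plan(sample_projects))
```

[With a driver such as pymongo, the pipeline could be passed to the collection's `aggregate` method, ideally against the backup database rather than production.]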
69 00:06:49.000 ⇒ 00:06:55.570 Bill Mill: So, one thing in terms of customer lifecycle is here in the logs table, and I don’t know if Mark told you about this at all.
70 00:06:55.740 ⇒ 00:06:57.840 Henry Zhao: We store…
71 00:06:57.840 ⇒ 00:07:09.220 Bill Mill: a lot of the changes on projects. I don’t exactly know whether we store, like, customer… you know, like, I assume that in the audit log is customer change…
72 00:07:09.420 ⇒ 00:07:14.700 Bill Mill: project to, you know, from free to business, or whatever.
73 00:07:16.430 ⇒ 00:07:25.940 Bill Mill: So, I just wanted to show you this as a table where that information might live, or collection, excuse me. I assume that in here is some sort of,
74 00:07:26.150 ⇒ 00:07:30.469 Bill Mill: log message that says, you know, customer upgraded or customer downgraded.
75 00:07:32.080 ⇒ 00:07:32.590 Henry Zhao: Okay.
76 00:07:32.590 ⇒ 00:07:33.300 Bill Mill: So…
77 00:07:33.300 ⇒ 00:07:34.320 Henry Zhao: Look into that, yeah.
78 00:07:34.560 ⇒ 00:07:35.320 Bill Mill: Yeah.
79 00:07:35.380 ⇒ 00:07:38.430 Henry Zhao: This is helpful. Because there’s so many tables, I was like, where do I even begin?
80 00:07:38.430 ⇒ 00:07:42.799 Bill Mill: Yes. Yeah, so logs, projects, and I think users is helpful too, right?
81 00:07:42.970 ⇒ 00:07:43.990 Bill Mill: Yes.
82 00:07:43.990 ⇒ 00:07:44.799 Henry Zhao: You understand?
83 00:07:44.800 ⇒ 00:07:49.580 Bill Mill: And you can ignore Hub 2 users, which is a different type of users, which is very confusing.
84 00:07:51.410 ⇒ 00:07:58.629 Bill Mill: For your… for your relevance, I think, really, projects, users, and logs are the three tables that I can think of that would be…
85 00:07:58.630 ⇒ 00:07:58.950 Henry Zhao: Huh?
86 00:07:59.270 ⇒ 00:08:06.699 Bill Mill: useful. Here we’ve got Stripe updates. I don’t know anything about our Stripe integration, so I don’t know if there’s anything useful in this table.
87 00:08:07.270 ⇒ 00:08:08.240 Bill Mill: Collection.
88 00:08:08.890 ⇒ 00:08:12.969 Henry Zhao: I think Alicia did want something in terms of Stripe updates, so I’ll look into that.
89 00:08:15.360 ⇒ 00:08:30.730 Bill Mill: And that’s all I can really think of for showing you around. I could take a look… I don’t know offhand, and I wouldn’t be able to find it just right now on the call, but I could take a look at what it would look like if a customer changed their plan. I’m sure it gets in here somewhere.
90 00:08:31.010 ⇒ 00:08:47.310 Henry Zhao: Okay. And then my last question is, I’m gonna look into how to query things in MongoDB. Like, I know how to do the basic stuff, like, just query distinct values, but would it be okay if we end up deciding to do, like, a setup like DuckDB or something, to make it easier to query the data that we have in MongoDB?
91 00:08:48.670 ⇒ 00:08:51.549 Bill Mill: I mean, that’s… that’s on you. If that helps you, go for it.
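[Editor's note: a minimal sketch, not from the call, of one way to stage MongoDB documents for DuckDB: flatten each document and write newline-delimited JSON, which DuckDB can load with `read_json_auto`. The field names, sample documents, and the `flatten` and `write_ndjson` helpers are hypothetical. In practice the documents would come from the backup database rather than an in-memory list.]

```python
import json
import os
import tempfile


def flatten(doc, prefix=""):
    """Flatten nested dicts into underscore-joined column names,
    e.g. {"a": {"b": 1}} -> {"a_b": 1}."""
    flat = {}
    for key, value in doc.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            flat.update(flatten(value, prefix=f"{name}_"))
        else:
            flat[name] = value
    return flat


def write_ndjson(documents, path):
    """Write one flattened JSON object per line.
    default=str stringifies values json cannot encode (e.g. ObjectId, datetime)."""
    with open(path, "w", encoding="utf-8") as handle:
        for doc in documents:
            handle.write(json.dumps(flatten(doc), default=str) + "\n")


# Hypothetical documents standing in for an export of the `projects` collection.
sample_projects = [
    {"_id": "p1", "plan": "free", "owner": {"email": "a@example.com"}},
    {"_id": "p2", "plan": "business", "owner": {"email": "b@example.com"}},
]

out_path = os.path.join(tempfile.mkdtemp(), "projects.ndjson")
write_ndjson(sample_projects, out_path)

# Read the file back to confirm each line is a standalone JSON row.
with open(out_path, encoding="utf-8") as handle:
    rows = [json.loads(line) for line in handle]
print(rows[0])
```

[From there, a DuckDB query along the lines of `SELECT plan, count(*) FROM read_json_auto('projects.ndjson') GROUP BY plan` would work on the exported file without touching MongoDB.]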
92 00:08:51.550 ⇒ 00:08:57.030 Henry Zhao: Okay. Yeah, I just wanted to make sure there wasn’t, like, I wasn’t gonna break anything, or, like, isn’t against…
93 00:08:57.770 ⇒ 00:09:15.699 Bill Mill: I mean, I especially recommend doing it against this README backup database, which really, you cannot break. You can… I mean, go ahead, go to town, break it, and I can restore it. It’s not a problem. This generate query thing, AI can be hit or miss, but this generate query thing is super handy, so we can say, like, documents where
94 00:09:15.790 ⇒ 00:09:19.399 Bill Mill: plan where the phrase…
95 00:09:19.840 ⇒ 00:09:20.420 Henry Zhao: Yeah.
96 00:09:20.560 ⇒ 00:09:25.820 Henry Zhao: Well, I guess it’s good for, like, simple stuff, but I think as we want to do some more complex stuff, we might need to…
97 00:09:25.820 ⇒ 00:09:32.229 Bill Mill: Oh, for sure, for sure. I just, in terms of, like, designing that stuff, it is… I’ve found it really helpful for…
98 00:09:32.230 ⇒ 00:09:32.990 Henry Zhao: Exactly.
99 00:09:32.990 ⇒ 00:09:40.719 Bill Mill: coming up with this crap. Especially, you know, I’ve been on SQL databases my whole life, so it’s been an adventure for me to learn.
100 00:09:41.350 ⇒ 00:09:42.489 Henry Zhao: Yeah, but it’s been fun.
101 00:09:43.550 ⇒ 00:09:44.290 Henry Zhao: Yeah.
102 00:09:45.070 ⇒ 00:09:45.710 Bill Mill: Cool.
103 00:09:46.070 ⇒ 00:09:48.700 Henry Zhao: Cool, I think that’s pretty much it. This has been really, really helpful.
104 00:09:48.820 ⇒ 00:09:56.960 Henry Zhao: Because I was kind of just, like, lost figuring out, like, what data is in MongoDB, what else do we have, so this was… it was great to be able to just kind of pick your brain and ask you about…
105 00:09:57.170 ⇒ 00:09:59.140 Henry Zhao: This overall stuff, so thank you.
106 00:09:59.460 ⇒ 00:10:03.280 Bill Mill: Yeah, no problem. And, like I said, I’m always available on Slack, so feel free to hit me up.
107 00:10:03.830 ⇒ 00:10:04.990 Henry Zhao: Alright, thanks, Bill.
108 00:10:04.990 ⇒ 00:10:05.600 Bill Mill: R.
109 00:10:05.740 ⇒ 00:10:06.589 Henry Zhao: Have a good one.
110 00:10:06.590 ⇒ 00:10:07.140 Bill Mill: Peace.