Meeting Title: Readme - Brainforge Date: 2025-10-20 Meeting participants: Bill Mill, Henry Zhao


WEBVTT

1 00:00:12.360 00:00:13.789 Henry Zhao: Hey Bill, how’s it going?

2 00:00:13.950 00:00:15.059 Bill Mill: Good, how you doing?

3 00:00:15.350 00:00:19.229 Henry Zhao: Good, thank you. Thanks for your time. I won’t take up too much of your time.

4 00:00:19.410 00:00:23.559 Henry Zhao: And I’m not even sure if we can go through AWS, I know there’s an outage going on right now.

5 00:00:23.690 00:00:29.250 Henry Zhao: Fortunately, we’re all in us-west-2, so it hasn’t hit us as far as I’m aware.

6 00:00:29.630 00:00:30.490 Henry Zhao: Okay, cool.

7 00:00:30.600 00:00:39.940 Henry Zhao: So basically, I’m onboarding right now to kind of figure out some of the pricing analytics and stuff like that, so looking at what features people are using,

8 00:00:40.000 00:00:57.880 Henry Zhao: who are the people that are self-serving, you know, upgrading, and things like that. And so, right now, I’ve been given access to MongoDB, and I’ve been given access to Amplitude. The issue is, the Amplitude is, like, a free plan, so there’s a lot of limitations in what we can do in terms of calculations, like, looking at downgrades.

9 00:00:58.320 00:01:00.380 Henry Zhao: Looking at, like.

10 00:01:00.650 00:01:10.230 Henry Zhao: more freely, like, defining timestamps and things like that, and MongoDB I’ve never worked with before, so I’m, like, trying to understand how to query things in the…

11 00:01:10.230 00:01:24.479 Henry Zhao: data in MongoDB, and we just want to know if we can, like… is there anything that we can find in AWS that would be helpful for the analysis that we’re trying to do, and/or if there’s anything in AWS that can make our job a little bit easier in terms of these queries that we want to run.

12 00:01:25.800 00:01:33.420 Bill Mill: Yeah, I don’t think so, because in AWS, the only thing that we keep directly… so all of our stuff is in AWS through service providers.

13 00:01:34.030 00:01:39.670 Bill Mill: But the only thing we do directly in AWS is serve the Git repositories.

14 00:01:40.350 00:01:42.500 Bill Mill: through our product called Gitto.

15 00:01:42.800 00:01:53.179 Bill Mill: So we don’t really store any, like, useful customer data in there. And, like, even that data is mostly reflected in Mongo in the Gitto organization in there.

16 00:01:53.360 00:01:55.960 Bill Mill: Not organization, database.

17 00:01:56.800 00:02:02.109 Henry Zhao: Okay, do you work with… so whenever you were, like, somebody would look for data, they should be going into MongoDB for that?

18 00:02:02.380 00:02:07.560 Bill Mill: Yeah, pretty much. I mean, unless it’s the raw Git repository that you need access to.

19 00:02:08.180 00:02:12.989 Henry Zhao: Okay. Do you guys have any other AWS services you guys use, like, in terms of, like.

20 00:02:13.160 00:02:18.460 Henry Zhao: Redshift, or, like, any query tools, or things like that, that you guys use in AWS?

21 00:02:18.910 00:02:29.869 Bill Mill: No, the only thing that we have in AWS other than that is we have, I mean, we have a Redis cache of Gitto data, but that’s just a cache of the Git data.

22 00:02:30.140 00:02:34.959 Bill Mill: And we have OpenSearch, which is our logs.

23 00:02:35.460 00:02:46.630 Bill Mill: Okay. So that’s accessible without having to go into AWS, but we do host it in AWS. So if you wanted to look at logs, you can query logs through OpenSearch.

24 00:02:47.370 00:02:55.419 Bill Mill: I don’t know how much, like, I don’t know if, like, customer downgrade data would be in there, I have no idea, to be honest.

25 00:02:55.420 00:03:05.969 Henry Zhao: Okay, then I’ll have to try and figure out how to do it in MongoDB, and then I’ll ask Alicia for, you know, alternate solutions, if there’s anything that we need to do, but we can’t do with what we have now.

26 00:03:07.010 00:03:12.659 Bill Mill: And if you have, like, specific… do you have specific things that you, like, are trying to query in particular?

27 00:03:15.250 00:03:22.610 Henry Zhao: So yeah, I have the stuff that I need in MongoDB, but yeah, if there’s nothing else additional that could be of value in AWS, I think…

28 00:03:22.730 00:03:25.780 Henry Zhao: I won’t… won’t prod more in terms of that, yeah.

29 00:03:25.780 00:03:31.090 Bill Mill: Cool. I don’t think there is. I don’t know exactly what you’re looking for, but I don’t think there is.

30 00:03:31.590 00:03:32.220 Henry Zhao: Okay.

31 00:03:33.450 00:03:45.169 Henry Zhao: Okay, yeah, as long as you don’t think there’s any, like, user usage data, or, like, what features people are using, anything like that, then yeah, I think for this analysis that I’m trying to do now, it’s probably not relevant, whatever you guys have in AWS.

32 00:03:45.440 00:03:46.230 Bill Mill: Cool.

33 00:03:46.230 00:03:53.260 Henry Zhao: And I’ve never queried raw logs, so unless there’s, like, a need for that, I won’t even… touch that.

34 00:03:53.570 00:04:06.269 Bill Mill: Yeah, I’m always available on Slack, and if there’s a particular piece of data that you’re trying to get at, I usually am working at a much lower level than, like, what feature a customer is using, but I am comfortable with the system,

35 00:04:06.680 00:04:15.159 Bill Mill: broadly speaking. So, if you need help, like, being pointed at something, or trying to figure out where something lives, feel free to hit me up on Slack.

36 00:04:15.500 00:04:18.069 Henry Zhao: But does it pretty much all live in MongoDB? Like, any, like…

37 00:04:18.070 00:04:20.190 Bill Mill: pretty much all lives in MongoDB.

38 00:04:21.240 00:04:25.209 Henry Zhao: Okay, and you guys… do you guys store anything in S3?

39 00:04:26.040 00:04:33.129 Bill Mill: We do have some S3 buckets, like customer uploads. If you upload an image to your document site, that goes in S3.

40 00:04:33.550 00:04:34.680 Henry Zhao: I never lost that, yeah.

41 00:04:34.680 00:04:41.939 Bill Mill: Yeah, yeah, and then we back up our logs there, so we also have a backup of our logs in S3, but that’s about it.

42 00:04:42.440 00:04:45.689 Henry Zhao: Do we have user lifecycle data in MongoDB? So…

43 00:04:45.960 00:04:53.939 Henry Zhao: kind of, like, just from the time they sign up all the way to, like, what kind of projects they’re creating, what are they doing in there? Do you know if we have all that data in MongoDB?

44 00:04:54.200 00:05:00.470 Bill Mill: So, here, I will share my screen here.

45 00:05:02.190 00:05:11.839 Bill Mill: Alright, so… I don’t know how much you’ve been in here. README36 is the production database.

46 00:05:12.380 00:05:14.440 Bill Mill: I’m gonna use README Backup.

47 00:05:14.840 00:05:22.650 Bill Mill: Which is… it backs up every morning. I tend to just browse on README Backup, because that way, if I messed something up, it would be overwritten tomorrow.

48 00:05:22.650 00:05:23.110 Henry Zhao: Okay, cool.

49 00:05:23.110 00:05:26.599 Bill Mill: I don’t do that, mess anything up, but it’s an exact copy, so I…

50 00:05:26.600 00:05:27.000 Henry Zhao: Yeah, perfect.

51 00:05:27.000 00:05:28.840 Bill Mill: I tend to just browse those.

52 00:05:29.760 00:05:30.600 Henry Zhao: Okay.

53 00:05:30.750 00:05:34.459 Bill Mill: And… oh my god, they just moved stuff around, I think.

54 00:05:35.200 00:05:36.619 Bill Mill: What am I looking for?

55 00:05:37.380 00:05:39.650 Bill Mill: I want… collections.

56 00:05:40.050 00:05:42.959 Bill Mill: Yeah, they actually just completely redid their UI.

57 00:05:43.210 00:05:44.900 Bill Mill: And now I’m useless.

58 00:05:46.450 00:05:51.579 Bill Mill: Database, clusters, readme backup, browse Collections, that’s what I wanted.

59 00:05:52.630 00:05:53.400 Bill Mill: Alright.

60 00:05:54.100 00:05:59.469 Bill Mill: So, in README Backup, we’ve got README. This is the main database.

61 00:06:00.090 00:06:04.070 Bill Mill: And… so there’s two main…

62 00:06:04.260 00:06:11.770 Bill Mill: collections is the term in Mongo. They’d be tables in another database, but in Mongo they’re called collections.

63 00:06:12.270 00:06:22.250 Bill Mill: Projects is the… is the main one, and projects has a ton of information. Like, for example, what’s important to you a lot is going to be this plan.

64 00:06:22.530 00:06:27.809 Bill Mill: Which is a very weird selection of possible,

65 00:06:30.290 00:06:36.300 Bill Mill: like, plans is a string, and there are a couple of random… like, there’s a 2018 business.

66 00:06:36.630 00:06:43.089 Bill Mill: there’s a bunch of different… I don’t honestly know what they mean, so you’d have to ask somebody else for exactly what plans there are.

67 00:06:43.090 00:06:45.700 Henry Zhao: I think Mark told me the breakdown, yeah.

68 00:06:45.700 00:06:48.809 Bill Mill: Good, yeah, yeah, he’s a good source for that.
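
A minimal sketch of how the `plan` values Bill points at could be tallied, assuming the collection is named `projects` and each document carries a `plan` string as described on the call. The sample documents and plan strings below are hypothetical, not taken from the real database. The pipeline uses MongoDB's standard `$group` and `$sort` stages; the pure-Python function does the same count on a small in-memory sample so the result can be checked without a database connection.

```python
from collections import Counter

# Aggregation pipeline: group documents by their `plan` string and count them.
# The field name `plan` comes from the call; nothing else about the schema is assumed.
PLAN_COUNTS_PIPELINE = [
    {"$group": {"_id": "$plan", "count": {"$sum": 1}}},
    {"$sort": {"count": -1}},
]


def count_plans(projects):
    """Pure-Python equivalent of the pipeline, for checking a small sample."""
    return Counter(doc.get("plan") for doc in projects)


# Hypothetical sample documents; the real plan strings may differ.
sample_projects = [
    {"name": "a", "plan": "free"},
    {"name": "b", "plan": "business"},
    {"name": "c", "plan": "free"},
]

plan_counts = count_plans(sample_projects)
print(plan_counts.most_common())
```

With pymongo, the pipeline would be passed to something like `db["projects"].aggregate(PLAN_COUNTS_PIPELINE)`, where `db` is a handle to the backup database; the exact collection name and casing should be confirmed in the Atlas UI first.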

69 00:06:49.000 00:06:55.570 Bill Mill: So, one thing in terms of customer lifecycle is here in the logs table, and I don’t know if Mark told you about this at all.

70 00:06:55.740 00:06:57.840 Bill Mill: We store…

71 00:06:57.840 00:07:09.220 Bill Mill: a lot of the changes on projects. I don’t exactly know whether we store, like, customer… you know, like, I assume that in the audit log is customer changed…

72 00:07:09.420 00:07:14.700 Bill Mill: project to, you know, from free to business, or whatever.

73 00:07:16.430 00:07:25.940 Bill Mill: So, I just wanted to show you this as a table where that information might live, or collection, excuse me. I assume that in here is some sort of,

74 00:07:26.150 00:07:30.469 Bill Mill: Log message that says, you know, customer upgraded or customer downgraded.
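
Since nobody on the call knows the exact shape of the logs collection, a reasonable first step is to profile a sample of documents before writing any upgrade or downgrade query. The sketch below is one way to do that in plain Python: it counts which field paths appear and picks out documents whose string values mention a keyword. The sample documents and every field name in them (`project`, `action`, `data`) are hypothetical and stand in for whatever the real collection contains.

```python
from collections import Counter


def field_paths(doc, prefix=""):
    """Yield dotted paths for every field in a (possibly nested) document."""
    for key, value in doc.items():
        path = f"{prefix}{key}"
        yield path
        if isinstance(value, dict):
            yield from field_paths(value, prefix=f"{path}.")


def field_frequencies(docs):
    """Count how many documents contain each field path."""
    counts = Counter()
    for doc in docs:
        counts.update(set(field_paths(doc)))
    return counts


def mentions(doc, keyword):
    """True if any string value in the document (nested dicts included) contains the keyword.

    Lists are not searched; this is only a first-pass filter.
    """
    for value in doc.values():
        if isinstance(value, dict) and mentions(value, keyword):
            return True
        if isinstance(value, str) and keyword.lower() in value.lower():
            return True
    return False


# Hypothetical log documents; the real schema of the logs collection is unknown here.
sample_logs = [
    {"project": "p1", "action": "plan changed", "data": {"from": "free", "to": "business"}},
    {"project": "p2", "action": "page updated"},
]

frequencies = field_frequencies(sample_logs)
plan_related = [doc for doc in sample_logs if mentions(doc, "plan")]
```

Once the real field names are known, the keyword scan can be replaced by a proper filter on the relevant field so the database does the work instead of the client.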

75 00:07:32.080 00:07:32.590 Henry Zhao: Okay.

76 00:07:32.590 00:07:33.300 Bill Mill: So…

77 00:07:33.300 00:07:34.320 Henry Zhao: Look into that, yeah.

78 00:07:34.560 00:07:35.320 Bill Mill: Yeah.

79 00:07:35.380 00:07:38.430 Henry Zhao: This is helpful. Because there’s so many tables, I was like, where do I even begin?

80 00:07:38.430 00:07:42.799 Bill Mill: Yes. Yeah, so logs, projects, and I think users is helpful too, right?

81 00:07:42.970 00:07:43.990 Bill Mill: Yes.

82 00:07:43.990 00:07:44.799 Henry Zhao: You understand?

83 00:07:44.800 00:07:49.580 Bill Mill: And you can ignore Hub 2 users, which is a different type of users, which is very confusing.

84 00:07:51.410 00:07:58.629 Bill Mill: for your… for your relevance, I think, really, projects, users, and logs are the three tables that I can think of that would be…

85 00:07:58.630 00:07:58.950 Henry Zhao: Huh?

86 00:07:59.270 00:08:06.699 Bill Mill: useful. Here we’ve got Stripe updates. I don’t know anything about our Stripe integration, so I don’t know if there’s anything useful in this table.

87 00:08:07.270 00:08:08.240 Bill Mill: Collection.

88 00:08:08.890 00:08:12.969 Henry Zhao: I think Alicia did want something in terms of Stripe updates, so I’ll look into that.

89 00:08:15.360 00:08:30.730 Bill Mill: And that’s all I can really think of for showing you around. I could take a look… I don’t know offhand, and I wouldn’t be able to find it just right now on the call, but I could take a look at what it would look like if a customer changed their plan. I’m sure it gets in here somewhere.

90 00:08:31.010 00:08:47.310 Henry Zhao: Okay. And then my last question is, I’m gonna look into how to query things in MongoDB. Like, I know how to do the basic stuff, like, just query distinct values, but would it be okay if we end up deciding to do, like, a setup like DuckDB or something, to make it easier to query the data that we have in MongoDB?

91 00:08:48.670 00:08:51.549 Bill Mill: I mean, that’s… that’s on you. If that helps you, go for it.
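
One possible shape for the DuckDB idea Henry raises, sketched under assumptions: documents are pulled from MongoDB (ideally the backup cluster), flattened, and written as newline-delimited JSON, which DuckDB can load with its `read_json_auto` function. The helper names, sample documents, and field names below are hypothetical; only the standard library is used, so the flattening logic can be checked without either database.

```python
import io
import json


def flatten(doc, prefix=""):
    """Flatten nested dicts into one level with underscore-joined keys."""
    flat = {}
    for key, value in doc.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            flat.update(flatten(value, prefix=f"{name}_"))
        else:
            flat[name] = value
    return flat


def write_ndjson(docs, handle):
    """Write one flattened JSON object per line; return the number of rows written."""
    rows = 0
    for doc in docs:
        # default=str converts values json cannot encode (e.g. ObjectId, datetime) to strings.
        handle.write(json.dumps(flatten(doc), default=str) + "\n")
        rows += 1
    return rows


# Hypothetical documents standing in for a cursor over a MongoDB collection.
sample_docs = [
    {"_id": 1, "plan": "free", "owner": {"email": "a@example.com"}},
    {"_id": 2, "plan": "business"},
]

buffer = io.StringIO()
rows_written = write_ndjson(sample_docs, buffer)
ndjson_lines = buffer.getvalue().splitlines()
```

In practice the handle would be a file opened for writing rather than an in-memory buffer, and a query such as `SELECT plan, count(*) FROM read_json_auto('projects.ndjson') GROUP BY plan` could then run in DuckDB; the file name is an assumption.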

92 00:08:51.550 00:08:57.030 Henry Zhao: Okay. Yeah, I just wanted to make sure there wasn’t, like, I wasn’t gonna break anything, or, like, isn’t against…

93 00:08:57.770 00:09:15.699 Bill Mill: I mean, I especially recommend doing it against this README backup database, which really, you cannot break. You can… I mean, go ahead, go to town, break it, and I can restore it. It’s not a problem. This generate query thing, AI can be hit or miss, but this generate query thing is super handy, so we can say, like, documents where

94 00:09:15.790 00:09:19.399 Bill Mill: Plan where the phrase…

95 00:09:19.840 00:09:20.420 Henry Zhao: Yeah.

96 00:09:20.560 00:09:25.820 Henry Zhao: Well, I guess it’s good for, like, simple stuff, but I think as we want to do some more complex stuff, we might need to…

97 00:09:25.820 00:09:32.229 Bill Mill: Oh, for sure, for sure. I just, in terms of, like, designing that stuff, it is… I’ve found it really helpful for.

98 00:09:32.230 00:09:32.990 Henry Zhao: Exactly.

99 00:09:32.990 00:09:40.719 Bill Mill: coming up with this crap. Especially, you know, I’ve been on SQL databases my whole life, so it’s been an adventure for me to learn.

100 00:09:41.350 00:09:42.489 Henry Zhao: Yeah, but it’s been fun.

101 00:09:43.550 00:09:44.290 Henry Zhao: Yeah.

102 00:09:45.070 00:09:45.710 Bill Mill: Cool.

103 00:09:46.070 00:09:48.700 Henry Zhao: Cool, I think that’s pretty much it. This has been really, really helpful.

104 00:09:48.820 00:09:56.960 Henry Zhao: Because I was kind of just, like, lost figuring out, like, what data is in MongoDB, what else do we have, so this was… it was great to be able to just kind of pick your brain and ask you about…

105 00:09:57.170 00:09:59.140 Henry Zhao: This overall stuff, so thank you.

106 00:09:59.460 00:10:03.280 Bill Mill: Yeah, no problem. And, like I said, I’m always available on Slack, so feel free to hit me up.

107 00:10:03.830 00:10:04.990 Henry Zhao: Alright, thanks, Bill.

108 00:10:04.990 00:10:05.600 Bill Mill: Alright.

109 00:10:05.740 00:10:06.589 Henry Zhao: Have a good one.

110 00:10:06.590 00:10:07.140 Bill Mill: Peace.