Meeting Title: Brainforge CDP Implementation Intro
Meeting Date: 2025-07-11
Meeting participants: Awaish Kumar, Henry Zhao


WEBVTT

1 00:00:38.570 00:00:40.120 Henry Zhao: Hello! How are you doing.

2 00:00:48.230 00:00:49.639 Awaish Kumar: We’re studying. Hi!

3 00:00:50.060 00:00:51.120 Awaish Kumar: How are you doing.

4 00:00:51.960 00:00:53.010 Henry Zhao: I’m good. How are you?

5 00:00:54.230 00:00:55.999 Awaish Kumar: I’m good. How about you?

6 00:00:56.600 00:00:58.020 Henry Zhao: Good thanks.

7 00:00:58.590 00:01:04.190 Henry Zhao: So I just started yesterday, so I thought I would like to meet you, and just kind of get a little bit of understanding of what you work with.

8 00:01:05.141 00:01:07.249 Henry Zhao: So I can figure out, you know

9 00:01:07.490 00:01:11.509 Henry Zhao: where, when I might be able to use you and ask you questions about certain things.

10 00:01:13.660 00:01:14.380 Awaish Kumar: Okay.

11 00:01:14.760 00:01:17.740 Henry Zhao: You wanted to give an introduction first? I guess I’ll start with my introduction.

12 00:01:17.940 00:01:24.980 Henry Zhao: Basically, I’ve been brought on to help with the CDP implementation.

13 00:01:25.509 00:01:31.300 Henry Zhao: So right now we use Segment. I don’t know if you’ve worked with Segment, but I guess I’ll wait for your introduction to kind of get into that.

14 00:01:33.160 00:01:40.629 Awaish Kumar: Oh, yeah, my name is Awaish. I have been working

15 00:01:40.800 00:01:49.120 Awaish Kumar: as a lead data engineer in the past few years. And here I’m kind of managing people and also

16 00:01:50.040 00:01:52.400 Awaish Kumar: performing data engineering work.

17 00:01:52.984 00:01:57.979 Awaish Kumar: Yeah, on the Eden side, mostly I’m managing data engineering

18 00:01:58.300 00:02:02.979 Awaish Kumar: work. And I’m supporting the CDP work that Robert was doing.

19 00:02:03.200 00:02:06.300 Awaish Kumar: Like, support means if there’s any

20 00:02:06.450 00:02:10.780 Awaish Kumar: modeling requirements, like if for your investigation you need any models, any

21 00:02:11.190 00:02:16.990 Awaish Kumar: kind of data ingestions or things like that. So I have been helping with that.

22 00:02:18.410 00:02:24.820 Henry Zhao: Okay, so were you the one that set up the webhooks, or helped set up Segment? Or was that somebody else?

23 00:02:26.570 00:02:34.710 Awaish Kumar: Yeah, setting up Segment was not part of my responsibilities. That was done by someone on the Eden side.

24 00:02:34.890 00:02:41.740 Awaish Kumar: We are just reading data which is coming from Segment into BigQuery and

25 00:02:43.640 00:02:46.110 Awaish Kumar: performing modeling on top of it, like

26 00:02:46.702 00:02:52.450 Awaish Kumar: whatever data comes through Segment connectors and also on

27 00:02:52.970 00:03:00.210 Awaish Kumar: yeah, that’s mostly it. We are mostly working on whatever is coming out of Segment

28 00:03:00.330 00:03:02.290 Awaish Kumar: and then building the models.

29 00:03:02.460 00:03:18.949 Awaish Kumar: But then some of the people in the Eden team are doing reverse ETL. So they are pulling data from BigQuery, which may be from our models or something like that, back to Customer.io or different

30 00:03:19.610 00:03:22.090 Awaish Kumar: platforms. But that’s not

31 00:03:22.766 00:03:32.039 Awaish Kumar: what I do normally. We are not managing that. There are different people in Eden who are doing that.

32 00:03:33.795 00:03:38.669 Henry Zhao: Okay, so what part are you currently managing, or continuing to manage?

33 00:03:39.880 00:03:48.979 Awaish Kumar: No, no, that’s what I’m saying. Segment is being set up by someone in Eden engineering, right?

34 00:03:49.470 00:03:52.590 Awaish Kumar: What we are doing is getting the data from Segment,

35 00:03:53.190 00:04:07.609 Awaish Kumar: right? And then any type of data analytics work on top of that is being done by the Brainforge team, which includes me, Damilade, Robert, and Annie.

36 00:04:08.760 00:04:09.510 Awaish Kumar: So I’m.

37 00:04:09.510 00:04:09.830 Henry Zhao: You know.

38 00:04:09.830 00:04:14.890 Awaish Kumar: a data analytics and data engineering person. So we have some data

39 00:04:15.040 00:04:18.099 Awaish Kumar: where we are not using Segment. So we have some

40 00:04:18.625 00:04:28.179 Awaish Kumar: different connectors, like Zendesk, which are not set up in Segment. So we get data using different tools,

41 00:04:28.726 00:04:41.159 Awaish Kumar: and then ingest the data into BigQuery and then model it, and leave the models in a format that the other teams, like engineering, marketing, and the Eden team, can use.

42 00:04:41.730 00:04:55.559 Awaish Kumar: But then, for example, if somebody is using a tool for reverse ETL, that’s again someone in the Eden team. Brainforge is not managing that part right now.

43 00:04:57.670 00:05:00.890 Henry Zhao: Okay, so what languages do you work with?

44 00:05:02.760 00:05:04.270 Awaish Kumar: Mostly we are doing

45 00:05:04.450 00:05:13.969 Awaish Kumar: so mostly, all the ingestion and the modeling work we are doing is in dbt. And to access some of the data and

46 00:05:14.940 00:05:21.720 Awaish Kumar: reading data from Google Sheets and things like that, I use some Python.

47 00:05:21.960 00:05:26.209 Awaish Kumar: So I’m using Dagster as a tool for our Python pipelines.

48 00:05:26.430 00:05:29.790 Awaish Kumar: And we have been using dbt and SQL

49 00:05:29.920 00:05:32.349 Awaish Kumar: for all the data transformation work.

50 00:05:33.830 00:05:39.549 Henry Zhao: Okay, very interesting. I think those are really the only questions I had. I guess if I have any questions I’ll

51 00:05:39.710 00:05:44.370 Henry Zhao: I’ll reach out to you. But in the meantime, is there anything that you recommend that I should look at on my onboarding.

52 00:05:44.370 00:05:50.569 Awaish Kumar: So yeah, for the CDP work, I think Robert must have shared a document with you.

53 00:05:50.730 00:05:58.780 Awaish Kumar: He has built a very long document of what we

54 00:06:00.000 00:06:08.110 Awaish Kumar: need to investigate on the CDP work. He has shared this Notion doc, and in that there are different objectives he has set.

55 00:06:08.270 00:06:23.395 Awaish Kumar: So one of the objectives is to get the user profiles from Segment and join them with the existing warehouse customer data, which is coming from our

56 00:06:24.380 00:06:30.640 Awaish Kumar: sales platform, like boss. And so we have 2 different types of customer table:

57 00:06:30.940 00:06:46.260 Awaish Kumar: one, the user profiles coming from Segment, which is based on all the different platforms which are connected to Segment. Maybe some anonymous users, some users who never made a purchase, but they still visited boss, or something like that,

58 00:06:46.864 00:06:52.509 Awaish Kumar: visited the Eden website. And then on the other side, we have an internal one.

59 00:06:53.778 00:06:58.371 Awaish Kumar: We have built the customer table. It’s called dim customer,

60 00:06:59.490 00:07:13.629 Awaish Kumar: for the Brainforge team. So we have built another dim customer table, but it is only dependent on the users who are real customers. We don’t have anyone who just visited the website and never made a purchase,

61 00:07:13.780 00:07:29.580 Awaish Kumar: things like that. So now we have the dim customer table, and I have been building the user profiles table. For the user profiles table, we went through an exercise. Everything I’m telling you is mentioned in Robert’s

62 00:07:30.005 00:07:46.579 Awaish Kumar: document; I’m just summarizing it. In that document there’s an objective to create an audit table. So the user profile table is called profile traits updates in BigQuery. It is coming from Segment. It has more than 300 columns,

63 00:07:47.250 00:07:53.709 Awaish Kumar: and each column we call a trait of a user, like an email address or

64 00:07:54.420 00:07:58.300 Awaish Kumar: a mobile number or some other information, like

65 00:07:58.410 00:08:27.299 Awaish Kumar: UTM source or things like that. So there are different traits for a user which basically can help us build a user profile. There are more than 300 columns, and we have created a way to identify the meaningful traits which are useful to have in user profiles. If we keep all 300, it’s really very hard to get any real information out of that.

66 00:08:27.980 00:08:36.620 Awaish Kumar: So we have a strategy. First of all, we try to measure those.

67 00:08:37.970 00:08:51.839 Awaish Kumar: We found a way to calculate some metrics based on those traits. So there is a table in BigQuery called int user traits, int for intermediate

68 00:08:52.060 00:09:04.809 Awaish Kumar: and user traits. So there’s one table which basically has all these traits. And then for each trait, we are checking different metrics, like completeness

69 00:09:05.110 00:09:20.480 Awaish Kumar: and the null percentages and the variance, if there is variance in the data, and things like that. Then whether it is being updated regularly or not. So a short list of metrics to figure out

70 00:09:20.580 00:09:23.909 Awaish Kumar: some meaningful traits.

71 00:09:24.418 00:09:36.161 Awaish Kumar: I have calculated all those metrics for those 300 traits. And then I use a filter query, which you can also find in GitHub,

72 00:09:37.050 00:10:00.859 Awaish Kumar: in the code base. It’s a dbt project, so the SQL file is what we call a model. So that model is again for meaningful traits. In that I have a query, and I’m just using 4 filters to identify which traits are meaningful. And I get around 30 to 40

73 00:10:00.970 00:10:05.201 Awaish Kumar: columns only, which really have some non-null,

74 00:10:06.160 00:10:10.954 Awaish Kumar: varied data with larger variance and

75 00:10:13.047 00:10:18.480 Awaish Kumar: some distinct values for multiple users. So that gives us

76 00:10:18.690 00:10:44.839 Awaish Kumar: an indication that these can be useful, and then I qualify them as part of the user profile table. So then again, there’s one more table, and that’s called user profiles. And in that user profiles table, if you go and look, it’s one row per user. And then again, we get the maximum,

77 00:10:45.900 00:10:49.700 Awaish Kumar: whatever is the maximum value for that user,

78 00:10:50.328 00:10:56.760 Awaish Kumar: for some trait. It’s kind of a sub-table of that bigger Segment table.

79 00:10:57.090 00:11:09.679 Awaish Kumar: So out of 300, we now have a table with only 40 columns which we identified as meaningful. And also, there’s only one row per user. So we are not having multiple

80 00:11:10.548 00:11:15.139 Awaish Kumar: rows per user, because the original table will have multiple rows per user as well.
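
The trait audit described in this part of the conversation (per-trait metrics, a filter for meaningful traits, then one row per user) can be sketched roughly as below. This is only an illustration in plain Python, not the dbt models mentioned in the meeting: the sample rows, the column names, and the two thresholds are assumptions, and the 4 filters used in the real model are not spelled out in the discussion.

```python
from collections import defaultdict

# Hypothetical sample of trait-update rows: several rows per user, sparse columns.
rows = [
    {"user_id": "u1", "email": "a@x.com", "utm_source": "google", "legacy_flag": None},
    {"user_id": "u1", "email": "a@x.com", "utm_source": None, "legacy_flag": None},
    {"user_id": "u2", "email": "b@x.com", "utm_source": "meta", "legacy_flag": None},
    {"user_id": "u3", "email": None, "utm_source": "google", "legacy_flag": None},
]


def trait_metrics(rows, key="user_id"):
    """Compute completeness, null percentage and distinct-value count per trait."""
    traits = sorted({col for row in rows for col in row if col != key})
    total = len(rows)
    metrics = {}
    for trait in traits:
        values = [row.get(trait) for row in rows if row.get(trait) is not None]
        completeness = len(values) / total if total else 0.0
        metrics[trait] = {
            "completeness": completeness,
            "null_pct": 1.0 - completeness,
            "distinct_values": len(set(values)),
        }
    return metrics


def meaningful_traits(metrics, min_completeness=0.5, min_distinct=2):
    """Keep traits that are mostly populated and actually vary (assumed thresholds)."""
    return sorted(
        trait
        for trait, m in metrics.items()
        if m["completeness"] >= min_completeness
        and m["distinct_values"] >= min_distinct
    )


def user_profiles(rows, traits, key="user_id"):
    """Collapse to one row per user, keeping the maximum non-null value per trait."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[row[key]].append(row)
    profiles = []
    for user in sorted(grouped):
        profile = {key: user}
        for trait in traits:
            values = [r[trait] for r in grouped[user] if r.get(trait) is not None]
            profile[trait] = max(values) if values else None
        profiles.append(profile)
    return profiles


metrics = trait_metrics(rows)
keep = meaningful_traits(metrics)
profiles = user_profiles(rows, keep)
```

In this toy data the always-null `legacy_flag` column is dropped, and the four input rows collapse to three profiles, one per user.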

81 00:11:16.040 00:11:25.280 Awaish Kumar: And then, after that, the next task I’m working on. It’s not finished yet, but I’m targeting it to be done this week,

82 00:11:25.752 00:11:34.320 Awaish Kumar: like today. So at the end of the day, we’ll have one more table. It will be called customer enriched model.

83 00:11:34.450 00:11:43.520 Awaish Kumar: So, as I mentioned, we have the user profile table. And then we have a dim customer table which is already there in BigQuery.

84 00:11:43.830 00:11:45.820 Henry Zhao: Where is the table?

85 00:11:46.270 00:11:50.560 Awaish Kumar: Dim customer. Do you have access to BigQuery, the Eden warehouse?

86 00:11:51.140 00:11:56.126 Henry Zhao: Yeah. Which folder, like which repository is it in?

87 00:11:57.051 00:12:03.020 Awaish Kumar: So in GitHub, it is in analytics.

88 00:12:04.542 00:12:09.369 Awaish Kumar: It’s not in the Brainforge AI organization.

89 00:12:09.950 00:12:15.120 Awaish Kumar: It is in the Eden organization and it’s called the analytics

90 00:12:15.220 00:12:19.390 Awaish Kumar: repository. If you don’t have access, maybe ask Robert.

91 00:12:19.890 00:12:23.649 Henry Zhao: But I mean in BigQuery. In BigQuery, which repository is it in?

92 00:12:24.990 00:12:30.680 Awaish Kumar: BigQuery? There’s no repository in BigQuery. We have projects, schemas, and tables.

93 00:12:30.680 00:12:33.359 Henry Zhao: Yeah, sorry, I meant which schema. Which schema is it in?

94 00:12:34.160 00:12:38.729 Awaish Kumar: It’s in prod dbt marts.

95 00:12:39.790 00:12:43.280 Henry Zhao: Prod dbt. I don’t have that one. Maybe that’s why.

96 00:12:43.560 00:12:45.660 Awaish Kumar: Sorry. Can you share your screen? I can.

97 00:12:46.050 00:12:46.540 Henry Zhao: Yeah.

98 00:12:46.540 00:12:47.390 Awaish Kumar: Sure.

99 00:12:56.335 00:13:00.610 Awaish Kumar: If you search dim customers on top, we can.

100 00:13:04.750 00:13:10.030 Awaish Kumar: Yeah, this one, prod dbt marts and dim customers.

101 00:13:11.030 00:13:13.420 Henry Zhao: Okay. If I start, maybe it’ll show up.

102 00:13:16.620 00:13:18.440 Awaish Kumar: Yeah, but the.

103 00:13:20.620 00:13:24.809 Henry Zhao: Okay. So here we have the dim customers. And then you were also talking about.

104 00:13:24.810 00:13:26.140 Awaish Kumar: And that’s not.

105 00:13:26.647 00:13:30.020 Awaish Kumar: Yeah, that’s that should be here as well. User profiles.

106 00:13:31.960 00:13:34.140 Awaish Kumar: You can search for it in the top.

107 00:13:37.290 00:13:38.440 Henry Zhao: It’s right here. Okay.

108 00:13:40.120 00:13:51.840 Awaish Kumar: So, yeah, now I am working on creating a third table, which will basically join these 2. But it will not have everything from dim customer,

109 00:13:51.990 00:14:04.829 Awaish Kumar: but it will have everything from user profile plus a few metrics from dim customer. And maybe there are some metrics which might not even be available in dim customer, like maybe

110 00:14:05.080 00:14:11.719 Awaish Kumar: total order value or total revenue, things like that, lifetime value,

111 00:14:12.400 00:14:18.479 Awaish Kumar: so I will just calculate them on the fly, and we’ll make them

112 00:14:19.314 00:14:22.229 Awaish Kumar: part of maybe dim customer and then

113 00:14:22.740 00:14:27.470 Awaish Kumar: build a final table called customer enriched model that

114 00:14:27.620 00:14:33.539 Awaish Kumar: will have everything from user profile plus a few more

115 00:14:33.720 00:14:36.700 Awaish Kumar: traits from dim customer.

116 00:14:37.560 00:14:46.990 Awaish Kumar: That’s what the objective is in this exercise. After that,

117 00:14:47.220 00:14:52.879 Awaish Kumar: Robert wanted to use this table to basically

118 00:14:53.070 00:14:56.960 Awaish Kumar: push it to Customer.io and use it for

119 00:14:57.430 00:15:03.609 Awaish Kumar: better campaigning, I think, better segmenting of customers and campaigning.

120 00:15:03.970 00:15:10.850 Awaish Kumar: That was the objective. So I’m helping him to find out those customers, and he’s going to push

121 00:15:11.060 00:15:13.289 Awaish Kumar: that to Customer.io and work on that.
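
The planned customer enriched model, as described above, keeps every user profile, adds a few fields from dim customer, and computes order metrics such as lifetime value on the fly. A minimal sketch under stated assumptions follows: the join key `user_id`, the `orders` input, and all column names are hypothetical, since the meeting does not say how Segment profiles are matched to warehouse customers.

```python
from collections import defaultdict

# Hypothetical inputs; every column name here is illustrative only.
user_profiles = [
    {"user_id": "u1", "email": "a@x.com"},
    {"user_id": "u2", "email": "b@x.com"},  # visited the site, never purchased
]
dim_customers = [{"user_id": "u1", "first_order_date": "2025-01-05"}]
orders = [
    {"user_id": "u1", "order_value": 120.0},
    {"user_id": "u1", "order_value": 80.0},
]


def customer_enriched(profiles, customers, orders, key="user_id"):
    """Left-join profiles to customers and add order metrics computed on the fly."""
    customers_by_id = {c[key]: c for c in customers}

    # Aggregate order count and total value per user.
    totals = defaultdict(lambda: {"total_orders": 0, "lifetime_value": 0.0})
    for order in orders:
        totals[order[key]]["total_orders"] += 1
        totals[order[key]]["lifetime_value"] += order["order_value"]

    enriched = []
    for profile in profiles:
        row = dict(profile)  # keep everything from the user profile
        customer = customers_by_id.get(profile[key])
        row["is_customer"] = customer is not None
        row["first_order_date"] = customer["first_order_date"] if customer else None
        agg = totals.get(profile[key], {"total_orders": 0, "lifetime_value": 0.0})
        row["total_orders"] = agg["total_orders"]
        row["lifetime_value"] = agg["lifetime_value"]
        enriched.append(row)
    return enriched


enriched = customer_enriched(user_profiles, dim_customers, orders)
```

Profiles with no matching customer are kept with empty purchase fields, which matches the point made earlier that Segment profiles include visitors who never made a purchase.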

122 00:15:14.020 00:15:19.420 Awaish Kumar: I’m not sure now what your role in it is. You know,

123 00:15:19.530 00:15:21.670 Awaish Kumar: you must have discussed it with Robert.

124 00:15:24.310 00:15:25.236 Henry Zhao: Yeah, yeah.

125 00:15:27.160 00:15:34.000 Henry Zhao: Yeah. So the first task for me is just to evaluate the CDP, whether we want to stay with Segment or maybe move to RudderStack.

126 00:15:35.760 00:15:43.050 Henry Zhao: Okay, well, that’s great. That’s really all I had to talk about today.

127 00:15:43.820 00:15:48.390 Henry Zhao: Thank you for the good intro. And then, if I ever need any help, I’ll probably reach out to you or Robert,

128 00:15:48.823 00:15:49.639 Henry Zhao: but thank you so much.

129 00:15:49.640 00:15:50.889 Awaish Kumar: Thank you. Yeah.

130 00:15:51.420 00:15:53.739 Awaish Kumar: Okay, have a good day. You, too. Bye?