Meeting Title: Javy-Data-Engineering-Weekly Date: 2024-10-15 Meeting participants: Nicolas Sucari, Aman Nagpal, Brian Pei, Payas Parab
WEBVTT
1 00:04:41.230 ⇒ 00:04:43.860 Brian Pei: Recording in progress. Hello.
2 00:04:44.430 ⇒ 00:04:45.440 Aman Nagpal: Hey? How’s it going.
3 00:04:46.130 ⇒ 00:04:48.570 Brian Pei: It’s going. Well, how was your week last week?
4 00:04:49.410 ⇒ 00:04:54.469 Aman Nagpal: Good man. We were just at a retreat in Vegas.
5 00:04:55.432 ⇒ 00:05:01.370 Aman Nagpal: So I just was, you know, MIA the whole week. But back in action now. How about you guys?
6 00:05:02.580 ⇒ 00:05:09.799 Brian Pei: I was actually, wait, last week. Last week is when I got back from Vegas. My parents retired there. I think I mentioned that.
7 00:05:10.510 ⇒ 00:05:11.230 Aman Nagpal: Oh.
8 00:05:11.740 ⇒ 00:05:14.069 Aman Nagpal: I can’t recall. Yeah, that’s awesome.
9 00:05:14.450 ⇒ 00:05:20.760 Brian Pei: At the start of this project I was just in Vegas with my parents, and now I'm back in Brooklyn.
10 00:05:21.450 ⇒ 00:05:24.210 Aman Nagpal: Oh, nice, nice. Yeah. Vegas was a
11 00:05:24.710 ⇒ 00:05:26.950 Aman Nagpal: good and dangerous time.
12 00:05:28.060 ⇒ 00:05:35.319 Brian Pei: Super dangerous this time. I'm usually down. This last time I was up, for like the first time in like 3 years.
13 00:05:35.870 ⇒ 00:05:43.540 Aman Nagpal: That’s good to hear that. My problem is that once I’m down, I’m just like trying to claw back. But if I’m up like alright, you know I can walk away now.
14 00:05:47.100 ⇒ 00:05:51.060 Brian Pei: Yeah, if you’re there for oh, more than 2 days like.
15 00:05:51.070 ⇒ 00:05:54.310 Brian Pei: it's one of the only things to do. So
16 00:05:55.170 ⇒ 00:05:59.269 Brian Pei: you can be down one day, and the next day you'll be like, you know what? Why not
17 00:06:00.060 ⇒ 00:06:01.550 Brian Pei: roll the dice one more time.
18 00:06:01.820 ⇒ 00:06:08.619 Aman Nagpal: Yeah, might as well. And even if you stay in a hotel, anything you want to get to, you gotta walk through the casino anyway, right?
19 00:06:11.920 ⇒ 00:06:13.070 Brian Pei: It’s a gift and a curse.
20 00:06:14.770 ⇒ 00:06:16.165 Brian Pei: sweet, I think.
21 00:06:18.070 ⇒ 00:06:19.450 Brian Pei: Let’s see.
22 00:06:21.270 ⇒ 00:06:23.449 Brian Pei: Nico's here. Payas is here.
23 00:06:24.060 ⇒ 00:06:24.810 Nicolas Sucari: Hi guys.
24 00:06:25.450 ⇒ 00:06:26.890 Brian Pei: What’s up, hey?
25 00:06:27.520 ⇒ 00:06:29.987 Brian Pei: I was just kicking us off because
26 00:06:31.212 ⇒ 00:06:34.659 Brian Pei: I didn't realize it was moved up 15.
27 00:06:34.670 ⇒ 00:06:39.980 Brian Pei: I have to drop at 2:30. But it's fine. I was gonna go through my update and then pass it off to
28 00:06:40.361 ⇒ 00:06:46.509 Brian Pei: Nico and Payas to show off their updates in Rill. But I'm gonna quickly talk about
29 00:06:48.310 ⇒ 00:06:50.500 Brian Pei: dbt, super quick!
30 00:06:50.740 ⇒ 00:06:51.700 Brian Pei: Let me
31 00:06:53.170 ⇒ 00:06:53.950 Brian Pei: and
32 00:06:54.830 ⇒ 00:06:58.009 Brian Pei: not gonna bore everyone by going
33 00:06:58.130 ⇒ 00:07:01.060 Brian Pei: through code live. But
34 00:07:01.140 ⇒ 00:07:03.581 Brian Pei: it is more so. Just
35 00:07:04.200 ⇒ 00:07:05.330 Brian Pei: a
36 00:07:06.890 ⇒ 00:07:10.640 Brian Pei: look at what we’re going to do. So
37 00:07:11.150 ⇒ 00:07:14.440 Brian Pei: dbt is going to be the
38 00:07:14.520 ⇒ 00:07:16.250 Brian Pei: transformation
39 00:07:16.730 ⇒ 00:07:18.980 Brian Pei: tool and orchestrator.
40 00:07:19.160 ⇒ 00:07:20.230 Brian Pei: for
41 00:07:20.270 ⇒ 00:07:23.150 Brian Pei: The tables, such as orders.
42 00:07:23.613 ⇒ 00:07:29.610 Brian Pei: That will be used in reporting. So it'll be important as a code base to keep SQL logic
43 00:07:29.750 ⇒ 00:07:31.460 Brian Pei: and review and alter
44 00:07:32.007 ⇒ 00:07:34.710 Brian Pei: and then this will be spun up on
45 00:07:35.040 ⇒ 00:07:40.529 Brian Pei: dbt Cloud, which for one enterprise user is free. So another plus.
46 00:07:40.580 ⇒ 00:07:42.949 Brian Pei: dbt Cloud would be the
47 00:07:43.010 ⇒ 00:07:48.820 Brian Pei: orchestration engine using their own servers that would trigger our SQL code to
48 00:07:48.870 ⇒ 00:07:52.040 Brian Pei: update tables in Snowflake or create new tables in Snowflake.
49 00:07:52.860 ⇒ 00:08:01.229 Brian Pei: I again don’t want to bore everyone. So I’m going to quickly go just through what we have in
50 00:08:01.300 ⇒ 00:08:10.160 Brian Pei: the repo. So far. It will be open to everybody once this is merged, for the first, like the first phase of what we're gonna call this.
51 00:08:11.052 ⇒ 00:08:13.579 Brian Pei: there are these
52 00:08:13.740 ⇒ 00:08:16.989 Brian Pei: folders, but you don't really have to worry about anything besides models.
53 00:08:17.080 ⇒ 00:08:23.629 Brian Pei: There are. There are a couple of macros. Macros are kind of just, it's Jinja. It's
54 00:08:24.180 ⇒ 00:08:25.510 Brian Pei: things that
55 00:08:26.321 ⇒ 00:08:28.830 Brian Pei: for instance. I can add.
56 00:08:29.726 ⇒ 00:08:42.169 Brian Pei: I can add things to this config at the top that tells it that, for this specific set of SQL, I want it to go to the Amazon schema in the intermediate database
57 00:08:42.508 ⇒ 00:08:53.699 Brian Pei: or the prod analytics database, whatever that is. And there’s a lot of things you can do with configs. But I’m not gonna bore you with that right now. That’s what Macros would be doing. But the meat and bones would be in models.
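The config block Brian describes sits at the top of a dbt model file. A minimal sketch; the `amazon` schema and intermediate database come from the discussion, but the model name, column names, and source definition are illustrative assumptions:

```sql
-- Hypothetical dbt model: models/intermediate/amazon/int_amazon_orders.sql
-- The config block tells dbt Cloud where to build this model in Snowflake.
{{ config(
    materialized = 'table',         -- rebuild as a table on each run
    database     = 'intermediate',  -- intermediate database, per the call
    schema       = 'amazon'         -- per-source schema, as described
) }}

select
    order_id,
    order_date,
    total_price
from {{ source('amazon', 'orders') }}  -- raw table loaded by Fivetran
```

Changing only the config values redirects where the same SQL lands, which is the point being made about configs.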
58 00:08:54.416 ⇒ 00:08:57.960 Brian Pei: We organized it in a way that
59 00:08:58.130 ⇒ 00:09:04.221 Brian Pei: there will be, like, an intermediate or staging layer, you can think of it that way, which
60 00:09:04.610 ⇒ 00:09:08.000 Brian Pei: cleans up and has business logic
61 00:09:08.110 ⇒ 00:09:10.439 Brian Pei: for a specific
62 00:09:11.590 ⇒ 00:09:21.470 Brian Pei: table object for Shopify and Amazon. And then, as we move forward, different sources will have their own folder in here, so that it can create, for example, a clean version of
63 00:09:21.780 ⇒ 00:09:33.440 Brian Pei: intermediate, or you can think of it as staging again, Shopify orders specifically. So it'll have all the code here. And then the config will tell dbt Cloud what to do.
64 00:09:33.710 ⇒ 00:09:35.389 Brian Pei: and then
65 00:09:35.450 ⇒ 00:09:38.579 Brian Pei: for marts it'll be more for
66 00:09:39.398 ⇒ 00:09:48.389 Brian Pei: this is the first pass at having Amazon orders and Shopify orders unioned together so that you can query or
67 00:09:48.430 ⇒ 00:09:51.627 Brian Pei: visualize a table called Fact Orders,
68 00:09:52.270 ⇒ 00:10:01.789 Brian Pei: Or, like, Shopify orders would be, obviously, all Shopify orders. You can query these tables too, specifically, if you would like
69 00:10:01.910 ⇒ 00:10:04.070 Brian Pei: to see that broken out. But
70 00:10:04.080 ⇒ 00:10:05.329 Brian Pei: because we have
71 00:10:05.370 ⇒ 00:10:11.017 Brian Pei: Snowflake, and we have marts, it makes sense to try to consolidate as much as possible,
72 00:10:11.470 ⇒ 00:10:13.600 Brian Pei: so that there’s like one single source of truth
73 00:10:13.770 ⇒ 00:10:18.819 Brian Pei: per vendor or application for a specific business object.
74 00:10:19.420 ⇒ 00:10:24.680 Brian Pei: And we still got a couple more tables to go here. But orders being one of the most important.
75 00:10:25.063 ⇒ 00:10:29.300 Brian Pei: If you look at fact orders right now, you'll have.
76 00:10:29.822 ⇒ 00:10:40.480 Brian Pei: It's just a string, but with the app source you'll be able to flip-flop between Shopify and Amazon, or if you take that out, you can see it all in totality.
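The fact orders mart being described might be sketched like this; the column names are assumptions, but the shape, two cleaned per-source models unioned with an app-source string you can filter on, follows the discussion:

```sql
-- Hypothetical mart model: models/marts/fact_orders.sql
-- Unions the cleaned Shopify and Amazon order models; the app_source
-- string column is what lets you flip between sources or see totals.
select order_id, order_date, total_price, 'shopify' as app_source
from {{ ref('int_shopify_orders') }}

union all

select order_id, order_date, total_price, 'amazon' as app_source
from {{ ref('int_amazon_orders') }}
```

The `{{ ref() }}` calls are also how dbt knows both intermediate models must finish before this one runs.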
77 00:10:41.195 ⇒ 00:10:46.590 Brian Pei: This right now is what I've been working on the past couple of
78 00:10:46.650 ⇒ 00:10:49.109 Brian Pei: business days in terms of
79 00:10:49.170 ⇒ 00:10:51.160 Brian Pei: testing and
80 00:10:51.710 ⇒ 00:10:58.249 Brian Pei: obviously, like, code review stuff. So it is an open PR. Once it's merged, it'll start
81 00:10:58.440 ⇒ 00:11:14.630 Brian Pei: creating these prod tables so that we don't have to keep looking at the dev tables. They'll be the same, except the prod tables are just the actual table name it's supposed to be, and those prod tables will actually refresh automatically every morning, as opposed to me running it in Snowflake.
82 00:11:15.100 ⇒ 00:11:16.090 Brian Pei: Yeah. So
83 00:11:16.420 ⇒ 00:11:26.782 Brian Pei: I will give a pretty major update when these table names are live and in prod and send you links to where you can see the run logs and
84 00:11:28.040 ⇒ 00:11:35.190 Brian Pei: like all that stuff for the orchestration piece of this. But it’s more so just to show
85 00:11:36.128 ⇒ 00:11:39.320 Brian Pei: the dbt progress, and
86 00:11:39.730 ⇒ 00:11:43.380 Brian Pei: when it is merged, I’ll probably do a deeper dive into it. But for now
87 00:11:45.070 ⇒ 00:11:47.250 Brian Pei: I will be reviewing this
88 00:11:47.670 ⇒ 00:11:50.129 Brian Pei: with the team and merge it
89 00:11:50.170 ⇒ 00:11:53.281 Brian Pei: this week a hundred percent. So
90 00:11:54.570 ⇒ 00:12:10.400 Brian Pei: yeah, once that's merged, we'll get the new tables. For the development being done on the BI side, it'll be a simple table name swap. The names of the columns aren't gonna change, so that's not too big of a deal. For the purposes of this demo, I think we're still using the dev tables. But
91 00:12:10.759 ⇒ 00:12:17.339 Brian Pei: I’ll let everyone know when prod runs start running so we can actually have prod tables to look at and query.
92 00:12:19.300 ⇒ 00:12:21.399 Brian Pei: and yeah, that’s my!
93 00:12:21.450 ⇒ 00:12:26.869 Brian Pei: I thought Payas had his hand over his head like he was like, I hate all of this and that,
94 00:12:26.970 ⇒ 00:12:29.682 Brian Pei: but I hope that wasn’t the case.
95 00:12:30.450 ⇒ 00:12:35.089 Brian Pei: but yeah, dbt coming along nicely. I'm pretty excited about it.
96 00:12:36.580 ⇒ 00:12:43.169 Brian Pei: For now I'll throw it to Nico for any of the other updates, and I might have to drop in 5 min, but just wanted to
97 00:12:43.420 ⇒ 00:12:47.790 Brian Pei: share, and hopefully not, like, I don't want to overcomplicate things, but.
98 00:12:47.790 ⇒ 00:13:16.850 Aman Nagpal: No, no, not at all. And thanks for going through that. I guess, before you hop, I know we've gone a bit into, I've asked before, you know, kind of what dbt is doing here. So I know you said it's running the transformations, and, you know, we're going to run it in dbt Cloud. What exactly would be the difference if we weren't using dbt, right? So, I guess everyone's kind of using it now. Would we just be running that SQL directly in Snowflake? Is that what you're mentioning? Or, I guess, what's kind of like the 15-second breakdown of that?
99 00:13:17.160 ⇒ 00:13:20.834 Brian Pei: Sure. You can get away with
100 00:13:22.720 ⇒ 00:13:51.750 Brian Pei: Yeah. Well, I think there's all these features; I'm going to just talk about one of them dbt has. Instead of referencing static table names, it references Jinja. So it's SQL plus a little bit of dynamic Python, for when tables need to run sequentially. So like, I need Amazon orders and Shopify orders both to finish updating before I run all orders, right? dbt automatically will make
101 00:13:52.230 ⇒ 00:14:05.569 Brian Pei: a DAG, and it will run things and trigger things in the order of operation, so that you will never miss data in between. Because these models will all get more complex, though, like, some tables will have
102 00:14:05.810 ⇒ 00:14:34.099 Brian Pei: 10 staging layers and a prod, and then the reporting. And so that's probably the biggest feature that you would miss if we weren't using dbt Cloud. There's something called Snowflake tasks, which is basically: you save a SQL query, you put in a cron schedule, and it'll run that. Snowflake wouldn't have that reference thing, so we would guesstimate and time out the cron schedules for those refreshes. So you could.
103 00:14:34.462 ⇒ 00:14:38.649 Brian Pei: You know, moving forward, when you start learning about dbt, and you might.
104 00:14:38.720 ⇒ 00:14:47.659 Brian Pei: You know, there were clients who were like, you know what, we might not need this, we can just use Snowflake tasks. I would be happy to break down the pros and cons, but
105 00:14:47.830 ⇒ 00:14:59.410 Brian Pei: technically you could do all of that in Snowflake using something called Snowflake tasks. It just wouldn't be as dynamic. Kind of just, yeah, at the mercy of, like, a cron scheduler.
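The Snowflake tasks alternative Brian mentions would look roughly like this; the task name, warehouse, and table names are illustrative. The cron time has to be guessed so that upstream loads finish first, which is the "at the mercy of a cron scheduler" drawback:

```sql
-- Hypothetical Snowflake task: refresh fact_orders on a fixed cron schedule.
-- No dependency graph here; 6:00 UTC is a guess that upstream loads are done.
create or replace task refresh_fact_orders
  warehouse = transform_wh
  schedule  = 'USING CRON 0 6 * * * UTC'
as
  create or replace table analytics.prod.fact_orders as
    select * from intermediate.shopify.orders
    union all
    select * from intermediate.amazon.orders;

-- Tasks are created suspended; resume to start the schedule.
alter task refresh_fact_orders resume;
```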
106 00:14:59.810 ⇒ 00:15:10.989 Aman Nagpal: That’s exactly the word I was. Gonna use is this is just way more dynamic. And it’s you know where we’re getting the exact points where everything’s updated and refreshed as opposed to just guessing. Which I know with Cron triggers it
107 00:15:11.010 ⇒ 00:15:21.450 Aman Nagpal: it can be tough to kind of deal with. So I think this is definitely the move. I'll just try to dive a little deeper into dbt and see what else is going on there.
108 00:15:21.690 ⇒ 00:15:22.510 Aman Nagpal: Sure.
109 00:15:22.510 ⇒ 00:15:27.740 Brian Pei: And I'm happy, outside of this meeting, we can do like a dbt deep dive
110 00:15:27.760 ⇒ 00:15:29.510 Brian Pei: once things are running so.
111 00:15:29.820 ⇒ 00:15:31.050 Aman Nagpal: Thank you, appreciate it.
112 00:15:31.340 ⇒ 00:15:32.020 Brian Pei: Sure, sure.
113 00:15:34.640 ⇒ 00:15:44.620 Nicolas Sucari: Cool. Excellent! Thanks, Brian. Then, have you been able to check the Rill dashboard, Aman? Were you able to access it?
114 00:15:45.080 ⇒ 00:15:52.979 Aman Nagpal: I was able to open it and access it. I didn’t do a deep look into it, but I did have it open. Let me see if I still do.
115 00:15:53.190 ⇒ 00:16:06.860 Nicolas Sucari: Excellent. Okay, as Brian says, we're still using the dev tables until that pull request is merged, and the table names will change, and we will need to update that a little bit in Rill.
116 00:16:07.249 ⇒ 00:16:23.230 Nicolas Sucari: But I already pushed the other Rill dashboard, Brian, the one on all orders. So that's also coming from a dev table, but just in order to start taking a look or poking around on the data, we can
117 00:16:23.230 ⇒ 00:16:27.210 Nicolas Sucari: you can start doing that, because it's already on the
118 00:16:27.990 ⇒ 00:16:28.740 Nicolas Sucari: Rill
119 00:16:28.760 ⇒ 00:16:32.510 Nicolas Sucari: Cloud, deployed. So you should be able to access that.
120 00:16:33.036 ⇒ 00:16:35.019 Nicolas Sucari: So, coming into Rill.
121 00:16:35.020 ⇒ 00:16:51.520 Aman Nagpal: When you say it's all using the same data, but just the dev tables, that's our initial kind of draft organization of the data, and now we're going to have the new organization, right? And then, can you just give me another breakdown of what we're using Rill for?
122 00:16:52.880 ⇒ 00:16:54.990 Nicolas Sucari: We're using Rill
123 00:16:55.060 ⇒ 00:16:58.420 Nicolas Sucari: as a visualization tool of all of the data. Let me show you
124 00:16:58.490 ⇒ 00:16:59.550 Nicolas Sucari: real quick
125 00:16:59.620 ⇒ 00:17:01.390 Nicolas Sucari: how I’m looking at.
126 00:17:02.360 ⇒ 00:17:06.100 Aman Nagpal: Basically, all the tables, it's just, here's a visual look of it.
127 00:17:06.109 ⇒ 00:17:07.079 Nicolas Sucari: Exactly. Yeah.
128 00:17:07.289 ⇒ 00:17:08.329 Nicolas Sucari: So
129 00:17:09.279 ⇒ 00:17:25.549 Nicolas Sucari: We created a project, an organization, called Javy Coffee, obviously. And we have in this Rill organization, like, the dashboards that we are creating from those dev tables that Brian was showing on Snowflake.
130 00:17:25.889 ⇒ 00:17:33.889 Nicolas Sucari: In the repo. Rill is pulling the data from those tables, and it's all coming here and
131 00:17:33.969 ⇒ 00:17:54.049 Nicolas Sucari: gathering the measures and dimensions. And we are getting this kind of dashboard where, actually, this has all the data already loaded, and you can, like, poke around and try to do some analysis, or filter down anything you want and try to look into specific things. For example, I don't know if you wanna see
132 00:17:54.467 ⇒ 00:18:20.559 Nicolas Sucari: this, these are all orders, shipping address province code from Texas. You can just click Texas, and all of the measures and all the dimensions will filter down on Texas. It's kind of a visualization tool for exploration, for doing some analysis. And if you want to answer quick questions regarding the data, you can come here into Rill and just start to work with that.
133 00:18:21.065 ⇒ 00:18:25.749 Nicolas Sucari: And this is for Shopify order metrics. But we have also
134 00:18:26.517 ⇒ 00:18:39.679 Nicolas Sucari: the other one that is coming from Shopify and Amazon. We're trying to create this all-orders view that has Shopify and Amazon as a source, and you can check all the different orders. Okay?
135 00:18:40.299 ⇒ 00:18:45.479 Nicolas Sucari: And all the metrics, like, the views are always the same. You can.
136 00:18:45.659 ⇒ 00:18:50.219 Nicolas Sucari: You're gonna see measures on the left. And with these kind of
137 00:18:50.549 ⇒ 00:19:12.919 Nicolas Sucari: graphs and all of the dimensions as tables here. But then, if you want to create like specific pivot tables, or you want to create your own table and export it. You can come here into pivot and select the measures and dimensions, and you can start, for example, I don’t know total orders. You can start dragging, and you want to see. I don’t know the order name. You can add here into rows.
138 00:19:12.929 ⇒ 00:19:17.149 Nicolas Sucari: and you can start adding different columns and then export like this data set.
139 00:19:18.019 ⇒ 00:19:18.709 Nicolas Sucari: Okay.
140 00:19:18.710 ⇒ 00:19:28.239 Aman Nagpal: Very cool. So is this something that, 2 questions. One, does it have a cost associated with it? And 2, is it something that we need for the long term?
141 00:19:29.380 ⇒ 00:19:39.670 Nicolas Sucari: I mean, ideally, this is something that we can discuss with Payas. So I don't know, Payas, if you want to jump in, and we can get into discussing the tool for visualization that we want to use.
142 00:19:39.970 ⇒ 00:20:01.640 Payas Parab: Yeah. So I think, at the end of the day, right, like, we're doing all this back-end work, and now we need to get it into an environment that's friendly for you guys. And like, you know, Aman, as you guys are in that process with a data analyst, your future data analyst, to be able to go in, or like Jared or Justin, or whoever, to go in and interact with the data, Rill serves as a great, like, self-serve tool.
143 00:20:01.905 ⇒ 00:20:29.799 Payas Parab: It's not as, like, if you want, like, a bespoke dashboard, Rill right now doesn't have that capability. So what we're suggesting is Rill as, like, a self-serve area where, once these models are defined, anyone can go in. Like you can see, you could play around with it, you could filter, right, like for certain time periods. You can filter based on, like, order attributes, you can compare. This is like a really easy environment to
144 00:20:29.820 ⇒ 00:20:54.500 Payas Parab: play around with the key metrics that you need, as Nico is showing you right now. So I think this is something that you'll want for the long run, that anyone can go in and play around with this data. As far as custom dashboards that come from the data, we're gonna need to implement a different type of tool. Unfortunately, Rill was made as, like, a self-serve tool, and so we have some other options that we can discuss with you.
145 00:20:54.810 ⇒ 00:21:24.579 Payas Parab: But yeah, Rill is something we would recommend you just have handy for any of your data analysts or any key stakeholders, where they can just go in and play around with data. Once we set up, like, orders, once we set up, like, fulfillment dashboards, whatever relevant dashboards are there, they'll be able to interact and play around with the data super easily. Now, in terms of custom dashboarding and custom, like, a place to view the data, we have a couple of options. They again do have somewhat of an ongoing cost. But they're not like
146 00:21:24.960 ⇒ 00:21:53.109 Payas Parab: a lot of these tools aren't, like, outrageously expensive. They're, like, pretty reasonable. And they're, like, a front end to be able to deploy. And I can share my screen and show, like, there's a few options. But what we're recommending is, Rill is like your self-serve environment, and then you'll have an additional place where you can get whatever SQL queries you guys write, or whatever data you want to view quickly, using, like, a data analyst, at that one.
147 00:21:53.359 ⇒ 00:21:58.399 Payas Parab: We have a couple of options which I can walk through. If that if you’re like, want to know about that.
148 00:21:58.800 ⇒ 00:22:00.730 Aman Nagpal: So would this be something like.
149 00:22:01.040 ⇒ 00:22:03.070 Aman Nagpal: you know? Oh, we wanna know
150 00:22:03.110 ⇒ 00:22:05.530 Aman Nagpal: how many Amazon orders we got
151 00:22:05.790 ⇒ 00:22:14.370 Aman Nagpal: in the last 2 years. So instead of going to Amplitude, let's say, or whatever visualizations we end up with at the end alongside,
152 00:22:14.370 ⇒ 00:22:14.859 Payas Parab: Yeah, but.
153 00:22:14.860 ⇒ 00:22:16.490 Aman Nagpal: We would use Rill for something like that.
154 00:22:16.490 ⇒ 00:22:33.330 Payas Parab: So like, yeah, Nico is walking through right now, like, this is exactly the type of question you can answer. It's like, all right, last 2 years, let's check Amazon orders. And you can find all of them right here, and now you filter for Amazon. You can check additional attributes too, you can see, like,
155 00:22:33.580 ⇒ 00:22:38.440 Payas Parab: whatever refunds, the financial status of them, the AOV, etc.
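The question being answered here could equally be expressed in SQL against the fact orders mart; a sketch, with assumed table and column names:

```sql
-- Hypothetical ad-hoc query: Amazon orders over the last 2 years, by month,
-- with order counts and average order value (AOV).
select
    date_trunc('month', order_date) as order_month,
    count(*)                        as total_orders,
    avg(total_price)                as aov
from analytics.prod.fact_orders
where app_source = 'amazon'
  and order_date >= dateadd('year', -2, current_date())
group by 1
order by 1;
```

Rill just makes the same slicing clickable, without anyone writing the query.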
156 00:22:39.120 ⇒ 00:22:42.039 Aman Nagpal: That’s awesome. Okay, so, and how much does this cost?
157 00:22:43.680 ⇒ 00:22:56.080 Nicolas Sucari: So the pricing, I think, yeah, it will depend on the amount of data that you're, like, moving into here and trying to load. But I think it's something around 200, 250 a month, something like that.
158 00:22:56.710 ⇒ 00:22:57.140 Aman Nagpal: Okay.
159 00:22:57.140 ⇒ 00:23:08.090 Nicolas Sucari: I can get the details with the team and then let you know. But yeah, maybe we can set up a meeting with Rill, yeah, with the Rill people, and see exactly how much it's gonna cost.
160 00:23:08.710 ⇒ 00:23:23.472 Aman Nagpal: Yeah, if you can send over just a Slack of, you know, the exact details you do know right now, and then we can always get on a call with them later. And then, as long as there's safeguards in place that, you know, hey, we're not gonna end up with, like, you know, a thousand-per-month bill.
161 00:23:23.700 ⇒ 00:23:24.629 Nicolas Sucari: Yeah, yeah, exactly.
162 00:23:24.630 ⇒ 00:23:33.009 Aman Nagpal: Anything like that. But I think this sounds good, and once we’re done with the Dev tables and everything’s pushed once it’s refreshed here.
163 00:23:33.265 ⇒ 00:23:39.569 Aman Nagpal: I think it’d be a good time to send this to Justin and Jared, and let them play around with it and see you know what’s happening there.
164 00:23:40.060 ⇒ 00:23:49.460 Aman Nagpal: But yeah, that sounds good. So, you know, we have Fivetran getting the data into Snowflake. I'm just going through this again. Then we have dbt on top, helping us transform the data,
165 00:23:49.965 ⇒ 00:23:58.159 Aman Nagpal: and then we have the visualization tools, one of which is the basic one, which is Rill. And then, I guess, alongside it would be Amplitude.
166 00:23:59.382 ⇒ 00:24:03.920 Nicolas Sucari: Yeah, I don't know. We have different options. Maybe we can go with what Aman has.
167 00:24:03.920 ⇒ 00:24:11.969 Payas Parab: Yeah. So when we come, now that all the data is in the data warehouse, and then we're moving from the data warehouse, Amplitude kind of is its own,
168 00:24:12.040 ⇒ 00:24:32.160 Payas Parab: like, it's kind of its own thing. It's, like, event tracking, as opposed to, like, your actual business data, right, which is what you're looking at here. Right? The event tracking is, like, known to have reliability issues. So now that we have the data here in, like, your warehouse, we can put it somewhere that's, like, easy to visualize. So, like, Nico, do you mind if I share my screen?
169 00:24:32.160 ⇒ 00:24:32.890 Nicolas Sucari: Yeah, yeah.
170 00:24:33.030 ⇒ 00:24:40.880 Payas Parab: So there'd basically be another tool. We are trying to avoid having, like, 2 tools. But Rill's self-serve capabilities are, like,
171 00:24:40.970 ⇒ 00:24:54.910 Payas Parab: really nice, and we think you should have those, like, we think those are valuable to you. Again, what software you keep, choose to keep, which ones you try out, we're here to present you the options. And you know, it's up to you. If you're like, okay, fuck it, we don't like that,
172 00:24:55.330 ⇒ 00:25:03.349 Payas Parab: We’re gonna close it down like it is what it is. But we’re here to present you options. So what I’m gonna show you is today, for example, I sent Jared.
173 00:25:03.770 ⇒ 00:25:10.656 Payas Parab: I sent Jared. You know we’re working on some of this dashboarding stuff, tying out the numbers to what he’s seeing.
174 00:25:10.970 ⇒ 00:25:12.240 Aman Nagpal: How’d that call go? By the way.
175 00:25:12.570 ⇒ 00:25:13.420 Payas Parab: What’s up?
176 00:25:13.420 ⇒ 00:25:15.900 Aman Nagpal: How’d the call go with him? Everything good and everything. After that.
177 00:25:16.146 ⇒ 00:25:31.933 Payas Parab: It’s okay. I think I think it’s I mean he. He has a fair point, which is like he’s still just waiting for those deliverables to be like, ready to go, you know, like, like, it’s like, I can use this today. And we’re just debugging a bunch of data issues, making sure we understand things. So I’m iterating with him a little bit.
178 00:25:32.200 ⇒ 00:25:34.289 Payas Parab: yeah, there’s kind of like.
179 00:25:34.660 ⇒ 00:25:36.360 Payas Parab: yeah, we’re trying to get it
180 00:25:36.740 ⇒ 00:25:45.529 Payas Parab: into that dashboard format you like. So, can you see my screen, by the way? You're seeing Metabase. So what I shared with him today, right, was like,
181 00:25:45.570 ⇒ 00:25:54.780 Payas Parab: if you needed, like, data that refreshed really, really quickly, and has all the latest data, and you have, like, a data analyst who can write SQL to pull this for you,
182 00:25:54.880 ⇒ 00:25:56.870 Payas Parab: This is a pretty good option.
183 00:25:57.192 ⇒ 00:26:01.880 Payas Parab: is just doing it in Snowflake, which you're already paying for, which is great, right?
184 00:26:04.352 ⇒ 00:26:05.990 Payas Parab: You with me so far, Aman?
185 00:26:05.990 ⇒ 00:26:06.979 Aman Nagpal: Yeah, yeah, so, yeah.
186 00:26:06.980 ⇒ 00:26:27.190 Payas Parab: Yeah, this is like option one, which is like, hey, okay, I can get the SQL query written here, you can share this with other members of the organization, and all the data is here. Now, is this the most aesthetic thing in the world? No, right? So there are ways to do some basic charting, which you see here. This data table isn't really made for charting that well, because it's like a giant data table. But
187 00:26:27.290 ⇒ 00:26:47.119 Payas Parab: you can do some charting. The visualizations are not as good, right? This isn't super aesthetic to look at. This is just like, okay, great, I can get that data, it's refreshed and clean. That's one option. The other option is, we have this thing, Hex. I think it was Brandon in the chat who asked about this one. This is another tool you can use, which is like,
188 00:26:47.140 ⇒ 00:26:49.860 Payas Parab: the way I think about it is like it’s like a
189 00:26:50.140 ⇒ 00:27:03.630 Payas Parab: like a dashboard enriched document. So you can like write documents that are like, Hey, here’s our stuff. Here’s all our key dashboards. You can like format it so that, like you have it like split halfway. Here you can add like notes at the top. Right?
190 00:27:03.640 ⇒ 00:27:23.410 Payas Parab: This is one option you have as well, and this tool is like, really nice. And what you’ll see is you can download the data here. Anytime you want. And this data actually refreshes. So it’s not like it’s like this was a copy and pasted table. This is a query that will be refreshed every time you open the page, which is a really nice kind of feature.
191 00:27:25.380 ⇒ 00:27:31.029 Payas Parab: Got it? This is one of your options. I can double check on cost on this one. This one’s known to be like.
192 00:27:31.310 ⇒ 00:27:46.799 Payas Parab: It's again, it's just gonna add a couple of $100 a month. But it's to make these really nicely formatted, much more aesthetic kind of things, that Snowflake, I don't think, is as aesthetic, right? Like, it's kind of dirty. But you can easily get the info you need through your data analyst.
193 00:27:46.960 ⇒ 00:27:47.960 Payas Parab: Yeah.
194 00:27:48.370 ⇒ 00:27:55.609 Payas Parab: that's another option. There's another option as well that, if you're worried about costs, we wanted to present you. It's a tool called Streamlit.
195 00:27:55.750 ⇒ 00:28:10.776 Payas Parab: And Streamlit is like a custom scripting tool. This is a sample I built for another client here. And they were like an e-commerce store, more like large-scale e-commerce, like a distributor via e-commerce.
196 00:28:11.220 ⇒ 00:28:24.279 Payas Parab: And so like this is a scripting language where you can make custom dashboards, you can add like sliders. You can customize all of these things. You can have tables. In here. You don’t see that, but you can make like charts like bar charts, pie, charts, etc, etc.
197 00:28:24.540 ⇒ 00:28:30.440 Payas Parab: This is included in Snowflake right since you’re already paying for Snowflake. This is something you’ll get already with it.
198 00:28:30.919 ⇒ 00:28:40.349 Payas Parab: The one downside to this is that it requires scripting. Like, at the back end of this is code, like Python code. And if I were to show you, like, I could quickly
199 00:28:40.360 ⇒ 00:28:44.570 Payas Parab: go to my github and show you like. It’s not like trivial code, you know.
200 00:28:45.390 ⇒ 00:28:53.000 Payas Parab: analytics. Dash, Sam, like, it’s like, it’s not crazy hard to write if I’m being honest. But it’s not like.
201 00:28:53.590 ⇒ 00:29:06.960 Payas Parab: it’s like, yeah, it’s it’s like, okay, cool, like I’d made like the headers. And I like had the sliders right. And then I can display things as tables, charts. You do have to write code. So there’s like an uplift there
202 00:29:07.060 ⇒ 00:29:14.690 Payas Parab: And then another option, just want to again present: this is known to be one of the cheaper options for quickly having data visualization, is Metabase.
203 00:29:14.790 ⇒ 00:29:33.556 Payas Parab: It's your standard, like, visualizations. It's not really good at self-serve, which is why we still recommend you have Rill. The self-serve capabilities, we know, are not good on here. It is cheap, and your data analyst can write you a SQL query, get you a dashboard, and share it really easily and seamlessly. So we have a couple of options, right?
204 00:29:34.000 ⇒ 00:29:40.390 Payas Parab: They all involve kind of, like, someone actually writing the SQL query and whipping up these dashboards. But to what extent, like,
205 00:29:40.570 ⇒ 00:29:47.810 Payas Parab: you know, you can get the free one with Streamlin. But they someone would have to go through and like script that app right to make this type of visualization
206 00:29:48.780 ⇒ 00:30:06.709 Payas Parab: snowflake, you can get like data tables really easily. And you can share them among team members. And then this is like probably the most aesthetic for, like business users to like share information among each other. This will be higher cost than metabase from if I recall off top my head.
207 00:30:06.940 ⇒ 00:30:18.439 Aman Nagpal: So the difference between these few options that you’re giving us and Rill is that Rill is just what you see is what you get? With these, you can create the dashboards that we want with SQL, Python, whatever it is, right?
208 00:30:18.440 ⇒ 00:30:19.670 Payas Parab: Exactly, exactly.
209 00:30:19.670 ⇒ 00:30:43.800 Nicolas Sucari: Exactly. So, I mean, in Rill you can write your queries, but the actual view, I mean, they don’t have different options for charts. It is just what they have there: those line charts and those tables as dimensions. And you can poke around, you can compare data, you can filter down dates, you can do everything real quick, but you don’t have the ability to add a pie chart or a bar chart.
210 00:30:44.301 ⇒ 00:30:54.320 Nicolas Sucari: It’s limited there. You can write different queries, and you can create a lot of dashboards, but it doesn’t give that aesthetic view of different things.
211 00:30:54.830 ⇒ 00:30:58.270 Aman Nagpal: No, that makes sense. And I appreciate all the options. I think we definitely
212 00:30:58.560 ⇒ 00:31:07.529 Aman Nagpal: should, you know, internally sit down and say, hey, look, these are the different options, this is what they look like. I think a lot of it will probably come down to a user experience standpoint, what they like.
213 00:31:07.807 ⇒ 00:31:32.709 Aman Nagpal: I guess the main thing that comes to mind, though, is, you know, initially, when we spoke, it was kind of like: we’ll keep most of our stuff in Amplitude, which will grab data from, you know, the data warehouse, or, you know, the Snowflake integration. And then, for, I guess, financial data, quote unquote, we’ll use a different visualization tool, which might be something like these. So if you can relate what we were talking about earlier to these, and how that fits. So, you know, for things like
214 00:31:33.390 ⇒ 00:31:48.529 Aman Nagpal: how many users cancel their subscriptions, and we got that information from the Recharge API, and what’s their churn over the last 6 months, things like that? Would we continue to use Amplitude for that? And if so, how does that fit into all this, or are you suggesting we go with something else.
215 00:31:49.960 ⇒ 00:32:06.060 Payas Parab: On that front, I can chime in here. Basically, what we want to do, and this is something that the Brainforge team and I have been discussing, is we want to get that Amplitude data, to whatever extent we can (we’re still trying to scope all that out), into Snowflake as well.
216 00:32:06.346 ⇒ 00:32:29.003 Payas Parab: Robert has worked with this before. There’s something we can do that basically moves some of that event data, so then your SQL analyst is able to pull your actual Amazon data, then pull the Amplitude events that are tied to that, merge those tables, and quickly whip up, you know, something. Versus in Amplitude, if you can’t tie everything to an event, you can’t make that
217 00:32:29.360 ⇒ 00:32:39.120 Payas Parab: table join, essentially. So we’re going to try and move events from Amplitude into your Snowflake warehouse as well, so there are data tables that they can play with.
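[Editor’s note] The table join described here can be sketched roughly like this, with `sqlite3` standing in for Snowflake; every table name, column name, and value below is invented purely for illustration, not the actual warehouse schema:

```python
import sqlite3

# Hypothetical, simplified stand-ins for warehouse tables: the real
# Snowflake tables and columns will differ.
con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE orders (order_id TEXT, channel TEXT, revenue REAL);
CREATE TABLE amplitude_events (event_id INTEGER, order_id TEXT, event_type TEXT);
INSERT INTO orders VALUES ('o1', 'amazon', 42.00), ('o2', 'shopify', 19.50);
INSERT INTO amplitude_events VALUES (1, 'o1', 'checkout_completed'),
                                    (2, 'o2', 'subscription_started');
""")

# Once both sources land in the warehouse, an analyst can join event data
# to order data directly -- the merge Amplitude alone can't do when an
# order isn't tied to an event.
rows = con.execute("""
    SELECT o.order_id, o.channel, o.revenue, e.event_type
    FROM orders o
    JOIN amplitude_events e ON e.order_id = o.order_id
    ORDER BY o.order_id
""").fetchall()
for row in rows:
    print(row)
```

In Snowflake the same query would run against the synced Amplitude and order tables instead of these toy ones.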
218 00:32:39.428 ⇒ 00:32:40.920 Payas Parab: When it comes to, like,
219 00:32:41.040 ⇒ 00:33:00.099 Payas Parab: custom charting and playing around, especially when it comes to those marketing-type funnel events and things like that, Amplitude is just better for it. But whatever data we think would be really valuable to connect to the other data sources, we’re going to pipeline into your Snowflake as well, so it’s accessible to your data analysts. So there’s kind of...
220 00:33:00.650 ⇒ 00:33:08.019 Payas Parab: I know it’s not great having so many different tools. It’s kind of a pain: oh, I have a question, and where do I go? Which one of these tools?
221 00:33:08.330 ⇒ 00:33:15.670 Payas Parab: But Amplitude, when it comes to tracking user journeys, they just are superior. And that stuff doesn’t translate well into
222 00:33:15.720 ⇒ 00:33:21.180 Payas Parab: raw data as easily as financial and order data does. So.
223 00:33:21.180 ⇒ 00:33:22.169 Aman Nagpal: Still use it. It’s.
224 00:33:22.170 ⇒ 00:33:23.040 Payas Parab: Where the.
225 00:33:23.240 ⇒ 00:33:40.159 Aman Nagpal: You know, right now we’re feeding everything into Amplitude directly. We’d backfill all that Amplitude data into the data warehouse, and then we would eventually reroute everything sent to Amplitude to send into the data warehouse first, and then Amplitude could pull that data, and we could continue to use it for a lot of the things we’re using.
226 00:33:40.160 ⇒ 00:33:40.600 Payas Parab: Exactly.
227 00:33:40.600 ⇒ 00:33:41.430 Aman Nagpal: Not
228 00:33:41.760 ⇒ 00:33:46.080 Aman Nagpal: inventory levels, not financial data, but a lot of stuff we would continue to use.
229 00:33:46.080 ⇒ 00:34:11.540 Payas Parab: Yeah, and what I’m also thinking, this raises a great point. Right now we’re still trying to figure out all this data stuff and get it all sorted out. Towards the end of all these deliverables, what I think might be nice as a takeaway from the Pongo and Brainforge team is a document that’s like your guide to your analytics stack, and what you use what for. We can put that together, Nico and I; I think that would be a good thing we could work on together, Nico, just like,
230 00:34:11.540 ⇒ 00:34:36.149 Payas Parab: get that all sorted out in order, and you’d have links. And it’s like: okay, well, what do I go to Rill for? What do I go to Snowflake for? What the fuck are all these tools and words you’re throwing at me? The same questions you have, Aman, I assume other people in the business will too; your future data folks, your technical engineering teams will have these same questions. So we can do that; maybe right now we’re just a little busy with trying to sort out all this data stuff.
231 00:34:36.250 ⇒ 00:34:58.940 Payas Parab: But maybe in a week or two, Nico, we’ll add that as just an item that we’ll give as a deliverable: the Pongo and Brainforge guide to your analytics stack, with all the tools we’ve done, what they do, and kind of where to access them. So then it’s an easy guide of, oh, I need this data, this is where that’s at, because I know my guide says that’s where... yeah.
232 00:34:59.190 ⇒ 00:35:19.589 Aman Nagpal: Appreciate it. Yeah, that would be super helpful, definitely, you know, in the next coming weeks as we wrap up a lot of this other stuff. But cool, I think, yeah, my main concern was, you know, we also just upgraded our contract with Amplitude too. We just keep running out of events, which, you know, will hopefully go down with the data warehouse solution. But you know we’re locked in for like
233 00:35:19.640 ⇒ 00:35:26.250 Aman Nagpal: almost 2 more years now. And we’re just comfortable with it. So I just wanted to make sure that plan was still in play, that.
234 00:35:26.840 ⇒ 00:35:44.750 Aman Nagpal: So even if all the data is going through the data warehouse first, a lot of our performance marketing visualizations and things, you know... and maybe some things will move out of Amplitude if you think there’s a better way. But things like churn and cohorts, you know, comparing different landing pages, all of that data,
235 00:35:44.830 ⇒ 00:35:48.050 Aman Nagpal: it seems like the plan is still to do that in Amplitude, even if the.
236 00:35:48.050 ⇒ 00:35:55.499 Payas Parab: Yeah, yeah. The event tracking, like when it’s a user journey, Amplitude is just so much better at that. It’s just,
237 00:35:55.850 ⇒ 00:36:16.260 Payas Parab: when it’s stitching together data, that’s what Amplitude is not very good at, right? We were relying on the event firing, and then that event is tied to an order, and we can’t really find any of the information about the order. All we’re relying on is whatever is in that Amplitude event, which turned out to be not right most of the time.
238 00:36:16.260 ⇒ 00:36:27.100 Payas Parab: That’s the issue. Whereas we know the ground truth lies in Shopify, right? Shopify actually fulfills the order, and that’s where we find all those data points. So I think that’s where
239 00:36:27.290 ⇒ 00:36:40.070 Payas Parab: these 2 separate tools come in. We’ll figure out what bridges them, so that there is some interplay between them. But when it comes to that customer journey, you can’t get better than Amplitude. That’s just the best way. Yeah.
240 00:36:40.070 ⇒ 00:36:47.649 Aman Nagpal: I’ll give you an example of something I’m working on now. Justin’s not happy with Gorgias data. He wants to see what
241 00:36:48.100 ⇒ 00:37:06.009 Aman Nagpal: agent, what CX agent (because they’re all abroad) is doing what, using what macros, and which one is, you know... basically he wants to see who’s canceling immediately versus trying to do something else before the user cancels their subscription, things like that. So if I wanted to use Gorgias webhooks, I could
242 00:37:06.304 ⇒ 00:37:21.929 Aman Nagpal: fire one every time the ticket is updated, but they don’t have anything natively, without using their API, to fire it every time a ticket is closed. So if I wanted to get that into Amplitude with just a webhook, I would be firing 10 to 30 plus events to Amplitude
243 00:37:22.410 ⇒ 00:37:31.809 Aman Nagpal: per ticket, where instead it could just go to the data warehouse, the data warehouse records everything, and once the ticket is closed we can send that to Amplitude. For example, something like that.
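[Editor’s note] The routing described here, record every ticket update in the warehouse but only forward the closure, can be sketched like this in plain Python, with in-memory lists standing in for Snowflake and Amplitude; the event shapes and field names are made up for illustration and are not the actual Gorgias webhook payload:

```python
# Sketch only: `warehouse_rows` and `forwarded` are stand-ins for a
# Snowflake landing table and an Amplitude event stream.
warehouse_rows = []
forwarded = []

def handle_ticket_webhook(event: dict) -> None:
    """Store every ticket-update event; forward only closures."""
    warehouse_rows.append(event)           # the warehouse records everything
    if event.get("status") == "closed":    # the one event worth forwarding
        forwarded.append(event)

# 10-30 updates may fire per ticket; only the final closure goes onward.
for update in [
    {"ticket_id": 7, "status": "open"},
    {"ticket_id": 7, "status": "pending"},
    {"ticket_id": 7, "status": "closed"},
]:
    handle_ticket_webhook(update)

print(len(warehouse_rows), len(forwarded))  # prints: 3 1
```

In practice the filtering would live in whatever ingests the webhook (or in a downstream warehouse job), but the shape of the logic is the same.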
244 00:37:32.430 ⇒ 00:37:40.771 Payas Parab: Yes, yeah. I think the sending it back to Amplitude might be harder; I may need some of Robert’s input on that, frankly.
245 00:37:41.240 ⇒ 00:37:50.390 Payas Parab: Let me write that down as a thing to follow up with him on. But, like, yeah, I think if you’re collecting data and you just want to store it all down somewhere like
246 00:37:50.480 ⇒ 00:37:56.410 Payas Parab: that, especially collecting it from a webhook, then Snowflake’s kind of the place to dump that in. Yeah.
247 00:37:56.900 ⇒ 00:38:13.190 Aman Nagpal: Okay, that makes sense. Yeah, like I said, that’s the main thing, making sure that’s still in play. I know I mentioned before that Jared has been working with Iris AI to look at a lot of our financial data. I feel like that would be a good example to look at and say,
248 00:38:13.710 ⇒ 00:38:24.000 Aman Nagpal: what visualization tool can maybe replace that for financial data, of the options you’ve given or any other option. If you get access to that, I’m sure Jared can get us access, and you can kind of just quietly.
249 00:38:24.000 ⇒ 00:38:26.469 Payas Parab: It’s called Iris? Like, I-R-I...
250 00:38:26.830 ⇒ 00:38:27.540 Payas Parab: yeah, it’s.
251 00:38:27.540 ⇒ 00:38:29.630 Aman Nagpal: I-R-I-S, A-I.
252 00:38:29.920 ⇒ 00:38:30.610 Payas Parab: Okay.
253 00:38:31.260 ⇒ 00:38:33.910 Aman Nagpal: And they’ve been doing a sort of build out for us. But
254 00:38:34.207 ⇒ 00:38:43.370 Aman Nagpal: that might be a good kind of behind-the-scenes look, just to take a look at what’s there, what’s happening, and maybe we pick a visualization tool that, you know, works, that works
255 00:38:44.000 ⇒ 00:38:44.620 Aman Nagpal: better.
256 00:38:44.770 ⇒ 00:38:45.450 Payas Parab: Yep.
257 00:38:47.460 ⇒ 00:38:52.039 Nicolas Sucari: I think... yeah, we need to understand who’s the
258 00:38:52.200 ⇒ 00:39:01.550 Nicolas Sucari: target of each dashboard, or who’s using each of the dashboards that you want. That’s the way we’re gonna identify which tool is better
259 00:39:01.630 ⇒ 00:39:19.169 Nicolas Sucari: for each of the different things we want to use. If Iris is showing a dashboard with more kind of tiles, more of a structured dashboard that you look at once a month, I think, yeah, that shouldn’t be in Rill; it shouldn’t be a Rill dashboard, it should be more on X Tech or something different,
260 00:39:19.461 ⇒ 00:39:36.080 Nicolas Sucari: or Metabase, I don’t know. But if we are trying to answer everyday questions about the operation of the orders, or you wanna go in and check how many orders were on Amazon this past week, that’s the kind of thing you can easily go to Rill and check in a minute.
261 00:39:36.570 ⇒ 00:39:46.830 Nicolas Sucari: Identifying who’s consuming each of the dashboards will let us know exactly what kind of tool we need to lean more into.
262 00:39:47.580 ⇒ 00:39:55.290 Aman Nagpal: Yeah, yeah, no, I think that makes sense. Financial data would be mostly Jared and Justin, maybe me once in a while. I mean, all of them... I’ll put myself.
263 00:39:55.290 ⇒ 00:39:55.840 Nicolas Sucari: It’s out there.
264 00:39:55.840 ⇒ 00:39:59.170 Aman Nagpal: But performance marketing data.
265 00:39:59.540 ⇒ 00:40:08.890 Aman Nagpal: you know, mostly Justin and myself. And then, you know, we have certain reports and dashboards in Amplitude where, you know, people that don’t really use Amplitude
266 00:40:08.980 ⇒ 00:40:21.309 Aman Nagpal: will just look at those reports. So, for example, a media buyer will look at our spend that’s being sent from Northbeam, synced into Amplitude, and we compare that against our
267 00:40:21.310 ⇒ 00:40:43.490 Aman Nagpal: orders, which are also in Amplitude, and kind of look at the CAC that way on a daily basis. So that report, for example, does it make sense for it to continue to live in Amplitude? Is there an easy, simplified link we can send to another visualization tool for the media buyer, where they can just look every day and get that data? These are kind of the questions that I know you guys will continue to have, and hopefully can, you know, answer.
268 00:40:45.100 ⇒ 00:40:45.720 Payas Parab: Yep.
269 00:40:46.890 ⇒ 00:40:48.030 Payas Parab: Sounds great.
270 00:40:49.300 ⇒ 00:41:10.510 Payas Parab: Alright, yeah. So I think what I can do: I can send a recap of the tools we talked about, just so you have that documented somewhere. We’ll keep working on... I’m working on the workstream with Jared, trying to reconcile some issues I’m seeing in the data, working with Brainforge on, you know, what are those additional data sources. I think, Nico, did we discuss the additional data sources that we wanted to kind of talk about
271 00:41:10.810 ⇒ 00:41:11.400 Payas Parab: making.
272 00:41:11.400 ⇒ 00:41:37.490 Nicolas Sucari: So we had a first... yeah, maybe we can ask here, Aman. We’re working with Shopify and Amazon right now. I know you talked about TikTok Shop, but I think all of the TikTok orders are in Shopify, and we can identify that from there. So ideally, we need to start working, or start thinking, on which of the other data sources we should start trying to integrate into Snowflake.
273 00:41:37.880 ⇒ 00:41:41.680 Nicolas Sucari: So I don’t know. Maybe you can guide us there.
274 00:41:42.470 ⇒ 00:41:43.189 Aman Nagpal: Yeah,
275 00:41:44.160 ⇒ 00:41:46.660 Aman Nagpal: I guess I’m trying to think what would be
276 00:41:46.680 ⇒ 00:41:48.053 Aman Nagpal: most important.
277 00:41:48.880 ⇒ 00:41:50.880 Aman Nagpal: I mean smaller ones.
278 00:41:51.210 ⇒ 00:41:58.329 Aman Nagpal: Okendo’s really basic data. It’s just review data that we’ve been running a Cloudflare Worker to get into Amplitude. But
279 00:41:58.640 ⇒ 00:42:04.979 Aman Nagpal: I mean, if that can just do a sync from their API to Snowflake, I think that should be a simpler one.
280 00:42:05.292 ⇒ 00:42:08.319 Aman Nagpal: And then, you know, of course, there’s bigger ones. But
281 00:42:08.728 ⇒ 00:42:12.310 Aman Nagpal: TikTok Shop, like you said, it’s in Shopify, so that’s good.
282 00:42:12.320 ⇒ 00:42:14.619 Aman Nagpal: Maybe all of the
283 00:42:14.770 ⇒ 00:42:22.969 Aman Nagpal: Recharge data and events: cancellations, frequency changes, all the user actions. I don’t know
284 00:42:23.310 ⇒ 00:42:24.060 Aman Nagpal: If.
285 00:42:24.260 ⇒ 00:42:29.509 Aman Nagpal: do we want to start moving those into the data warehouse, or should those stick to going into
286 00:42:29.860 ⇒ 00:42:31.889 Aman Nagpal: Amplitude? I’m not sure.
287 00:42:32.100 ⇒ 00:42:50.959 Nicolas Sucari: I mean, our goal is still trying to get to that gross profit margin dashboard, right? As for now, we have Shopify and Amazon orders, and kind of whether they are being fulfilled by Shopify or Amazon; we got that from those data sources. But if we
288 00:42:51.000 ⇒ 00:43:02.309 Nicolas Sucari: want to include other costs, for example marketing costs, or, yeah, other stuff, we need to start thinking on those other data sources where we have the data, so that we can start bringing that in.
289 00:43:02.621 ⇒ 00:43:11.299 Nicolas Sucari: So yeah, maybe we can... I don’t know which are the data sources, but start to focus on: what is the other stuff we need to get to that gross margin,
290 00:43:11.370 ⇒ 00:43:13.660 Nicolas Sucari: gross profit margin, sorry.
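[Editor’s note] The margin figure being discussed is just arithmetic once the cost lines exist; a toy sketch with entirely made-up numbers, folding in marketing and fulfillment costs the way the conversation suggests (the real inputs would come from the Shopify/Amazon, Northbeam, and ops sources):

```python
# Hypothetical monthly figures; real numbers would come from Shopify/Amazon
# (revenue, COGS), Northbeam (marketing spend), and ops assumptions
# (pick-and-pack fees, label costs).
revenue = 100_000.00
cogs = 38_000.00
marketing_spend = 22_000.00   # e.g. synced from Northbeam
fulfillment_fees = 9_000.00   # pick-and-pack, labels, etc.

gross_profit = revenue - cogs - marketing_spend - fulfillment_fees
gross_margin_pct = 100 * gross_profit / revenue
print(f"gross profit ${gross_profit:,.2f} -> {gross_margin_pct:.1f}% margin")
# prints: gross profit $31,000.00 -> 31.0% margin
```

The point of the extra data sources is exactly the middle two lines: without marketing and ops costs in the warehouse, the subtraction can’t be done.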
291 00:43:14.454 ⇒ 00:43:24.310 Aman Nagpal: Northbeam ad spend, I feel like, would be a good one that we can start getting in there. We’ve been syncing from their API, again with a Cloudflare Worker. But, you know, you can get it whatever way you think is best.
292 00:43:26.040 ⇒ 00:43:29.917 Aman Nagpal: And then I know we have some assumptions in regards to
293 00:43:30.320 ⇒ 00:43:36.365 Aman Nagpal: operations, you know: label costs, pick-and-pack fees, all that stuff.
294 00:43:36.990 ⇒ 00:44:01.100 Aman Nagpal: I can get you in touch... you know, just message me, I can get you in touch with Jonathan and Jared, and they can kind of tell you what it is now, and, you know, we can decide what’s the best way to update that going forward. Whenever it updates, would that update past data? Would that only update as of the date of the change? Things like that you guys can help us with. Maybe it’s a spreadsheet, maybe there’s a better way to do it. You know, you guys let us know, but I think that would be good as well.
295 00:44:02.480 ⇒ 00:44:08.709 Nicolas Sucari: Excellent. Okay, yeah. So let’s go with Northbeam and these other data sources, so that we can start,
296 00:44:09.197 ⇒ 00:44:13.970 Nicolas Sucari: yeah, ingesting that data into Snowflake and creating more reports with it.
297 00:44:14.440 ⇒ 00:44:22.269 Nicolas Sucari: Okay, also, Payas, I don’t know if you were aware of any other data source, but I think we can start with Northbeam, and then.
298 00:44:22.270 ⇒ 00:44:29.380 Payas Parab: Because Northbeam should have a bunch of that. That’s an MTA provider, right? So they should be aggregating across
299 00:44:29.820 ⇒ 00:44:33.856 Payas Parab: the different ad spend vehicles that you guys have already, right?
300 00:44:34.260 ⇒ 00:44:40.789 Aman Nagpal: Yup, they’re constantly syncing. I think our latest deal is maybe 4 times a day they sync, and then it’s all within Northbeam.
301 00:44:40.950 ⇒ 00:44:49.849 Payas Parab: Got it. Okay, excellent. Yeah, then Northbeam should be fine. I thought we needed the Facebook data and the TikTok data and all that, but if it’s all coming through Northbeam, that should be okay.
302 00:44:49.850 ⇒ 00:44:53.879 Aman Nagpal: Is it better to go direct to the platform? Is there...
303 00:44:55.370 ⇒ 00:44:56.259 Aman Nagpal: What do you think?
304 00:44:56.880 ⇒ 00:45:00.429 Payas Parab: It’s a good question. I mean, I usually skew towards,
305 00:45:01.940 ⇒ 00:45:07.929 Payas Parab: like, setting up your own connections to the original data source; that’s typically my thing. It’s just,
306 00:45:07.980 ⇒ 00:45:11.020 Payas Parab: in this particular case, if Northbeam
307 00:45:11.150 ⇒ 00:45:29.820 Payas Parab: is already doing that, and you guys are already paying them a bunch to sync all that data, then adding that to your Snowflake and dbt load is raising up costs there. And simultaneously, if Northbeam enriches that data in any way, then maybe it’s just better to pull it from Northbeam. But my honest take is typically getting it from the original source.
308 00:45:30.212 ⇒ 00:45:38.379 Payas Parab: But if you already are getting it from the original source into somewhere else, and then they’re enriching it and making it more usable, then,
309 00:45:38.760 ⇒ 00:45:42.800 Payas Parab: like, let’s just roll with that enriched source, you know.
310 00:45:43.160 ⇒ 00:45:52.769 Aman Nagpal: Yeah, I’m with you there. I usually go for the original source too. In this case, Northbeam: A, yes, we use it to combine all the data easily. But B,
311 00:45:52.840 ⇒ 00:46:05.059 Aman Nagpal: these media buyers, they really trust Northbeam’s attribution. That’s kind of a big focus of it. You know, we’ll compare: what is TikTok saying, how many transactions, versus what is Northbeam saying, how many transactions from TikTok, right?
312 00:46:05.060 ⇒ 00:46:05.620 Payas Parab: Right.
313 00:46:05.839 ⇒ 00:46:16.170 Aman Nagpal: Which, I don’t know if that data will get synced right now or down the road, but I feel like it’s fine. We can do Northbeam for now for the ad spend and all that, and then down the road, if we see any issues.
314 00:46:16.170 ⇒ 00:46:16.860 Payas Parab: Yeah.
315 00:46:16.860 ⇒ 00:46:17.190 Aman Nagpal: Just.
316 00:46:17.190 ⇒ 00:46:40.309 Payas Parab: I think Northbeam data moving into Snowflake, that will be good, because it just puts it in a queryable environment. So then you can add that to the analysis: okay, get me sales from last month, orders from last month, and get me the attribution, right, the total attributed-order percentages for last month, and get them all in one table and send it to me. You can get that done in like 10 to 15 minutes, you know.
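[Editor’s note] A minimal illustration of the kind of one-table pull described here, again using `sqlite3` in place of Snowflake; the schema, channel names, and numbers are all invented for the sketch:

```python
import sqlite3

# Stand-ins for warehouse tables: order totals plus Northbeam's
# attributed-order counts per channel (hypothetical schema and data).
con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE monthly_orders (month TEXT, channel TEXT, orders INTEGER);
CREATE TABLE northbeam_attribution (month TEXT, channel TEXT, attributed_orders INTEGER);
INSERT INTO monthly_orders VALUES ('2024-09', 'tiktok', 200), ('2024-09', 'facebook', 300);
INSERT INTO northbeam_attribution VALUES ('2024-09', 'tiktok', 150), ('2024-09', 'facebook', 240);
""")

# One query, one table out: last month's orders next to the share
# Northbeam attributes to each channel.
rows = con.execute("""
    SELECT o.month, o.channel, o.orders, a.attributed_orders,
           ROUND(100.0 * a.attributed_orders / o.orders, 1) AS attributed_pct
    FROM monthly_orders o
    JOIN northbeam_attribution a
      ON a.month = o.month AND a.channel = o.channel
    ORDER BY o.channel
""").fetchall()
for row in rows:
    print(row)
```

Once the Northbeam sync lands in Snowflake, an analyst could hand back exactly this shape of table in the 10 to 15 minutes mentioned.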
317 00:46:40.310 ⇒ 00:46:41.279 Aman Nagpal: That sounds great.
318 00:46:41.610 ⇒ 00:46:42.980 Payas Parab: Yep, alright
319 00:46:43.350 ⇒ 00:46:48.280 Payas Parab: Awesome. Just wanted to check on that one last time. But so, yeah, I think we have some action items. We’re meeting again next week to keep
320 00:46:48.310 ⇒ 00:46:51.060 Payas Parab: kind of the momentum moving on this, right?
321 00:46:53.370 ⇒ 00:46:54.120 Payas Parab: Yes.
322 00:46:54.120 ⇒ 00:46:54.720 Aman Nagpal: Yeah.
323 00:46:56.600 ⇒ 00:46:57.150 Aman Nagpal: cool.
324 00:46:57.150 ⇒ 00:46:57.960 Nicolas Sucari: Excellent.
325 00:46:58.130 ⇒ 00:47:00.859 Nicolas Sucari: Okay? Well, if there is
326 00:47:01.210 ⇒ 00:47:24.220 Nicolas Sucari: nothing else, I think we’re done here. Yeah, we’re gonna keep working with Brian on having all those tables moved to production and stop using the environment, as he said. It’s gonna be just changing the names; the columns are gonna be the same. But we’re gonna review all of those tables and see if we need to change anything else. And then, yeah, we’ll keep talking with
327 00:47:24.230 ⇒ 00:47:25.870 Nicolas Sucari: Payas to
328 00:47:26.644 ⇒ 00:47:41.729 Nicolas Sucari: work on that other tool, or, yeah, the decision... or prepare something so that you can make the decision on which way to go. Yeah, and continue to build up Rill so that you can take a look there at all of the data. Okay.
329 00:47:42.140 ⇒ 00:47:55.219 Aman Nagpal: That sounds great. And three last-second notes I don’t wanna forget. One, I don’t know if it matters, but in the coming weeks we are changing our name and domains from single V to double V. I don’t know if I mentioned that before.
330 00:47:55.220 ⇒ 00:47:55.870 Nicolas Sucari: So it’s gonna.
331 00:47:55.870 ⇒ 00:47:57.259 Aman Nagpal: Avi with 2 V’s
332 00:47:57.888 ⇒ 00:48:26.630 Aman Nagpal: It may not matter, but wanted to throw that out there. Also, NetSuite actually went live on Friday, which I was not aware was happening so soon. So instead of Extensiv now... Extensiv is still pulling orders, but we’re no longer using it; it’s just sitting there. Now NetSuite, which, I actually haven’t even been involved in that process, is pulling all the orders and sending them to our warehouses, or 3PLs, etc. So heads up there. I know we haven’t quite gotten there yet, but important to note. And then, do you think we’ll need
333 00:48:26.910 ⇒ 00:48:32.329 Aman Nagpal: the data analyst to know Jinja, for the dbt stuff, or whatever it was?
334 00:48:36.080 ⇒ 00:48:36.950 Aman Nagpal: Okay.
335 00:48:36.950 ⇒ 00:48:55.500 Payas Parab: I might need to confirm with Brian. Unless, Nico, you know if there’s anything we need to do? If, Nico, you’re not familiar with the process: we’re helping them basically hire a data analyst, right? Someone full-time to use this data warehouse and stuff. I put together some qualifications, but maybe I will just have you guys glance at them, just to ensure
336 00:48:55.500 ⇒ 00:48:56.130 Nicolas Sucari: Yeah, wherever?
337 00:48:56.130 ⇒ 00:49:00.242 Payas Parab: that whoever they hire has the capabilities
338 00:49:01.410 ⇒ 00:49:04.390 Payas Parab: that are needed to maintain dbt as needed.
339 00:49:05.160 ⇒ 00:49:07.730 Nicolas Sucari: Excellent. Yeah, okay, send that
340 00:49:08.380 ⇒ 00:49:28.140 Nicolas Sucari: to Brian and the team. And also, I think that if we create that kind of tool manual to share with Aman and the team, that’s gonna be pretty straightforward for understanding if that person is gonna be able to look at those tools and maintain everything. But yeah, excellent. We’ll take a look.
341 00:49:28.770 ⇒ 00:49:30.569 Aman Nagpal: Thank you guys so much, really appreciate it.
342 00:49:31.180 ⇒ 00:49:33.369 Payas Parab: Alright, awesome man. Thank you for the time.
343 00:49:33.820 ⇒ 00:49:34.940 Aman Nagpal: Take care! Bye-bye.
344 00:49:34.940 ⇒ 00:49:36.150 Nicolas Sucari: Guys. Bye-bye.