Meeting Title: Brainforge x CTA: Weekly! Date: 2025-12-12 Meeting participants: Katherine Bayless, Ashwini Sharma, Uttam Kumaran
WEBVTT
1 00:01:07.190 ⇒ 00:01:08.459 Ashwini Sharma: Hi, Katherine.
2 00:01:08.740 ⇒ 00:01:12.230 Katherine Bayless: Hey, good morning. Hang on one second, come back on camera.
3 00:01:12.230 ⇒ 00:01:12.780 Ashwini Sharma: running.
4 00:01:14.440 ⇒ 00:01:18.630 Katherine Bayless: Let me see… Sorry, sorry. Coming back.
5 00:01:21.570 ⇒ 00:01:23.129 Katherine Bayless: Good morning. How are you?
6 00:01:23.590 ⇒ 00:01:25.489 Ashwini Sharma: I’m good. How are you?
7 00:01:25.490 ⇒ 00:01:27.210 Katherine Bayless: Alright, alright.
8 00:01:27.660 ⇒ 00:01:37.000 Ashwini Sharma: Yeah, I’m still working on that report. I’m sorry, I got pulled into something else, and I could not complete it on time, but I’ll make sure that I have it ready by Monday.
9 00:01:37.120 ⇒ 00:01:40.780 Ashwini Sharma: But I’m still working on it, and yeah.
10 00:01:41.230 ⇒ 00:01:47.660 Ashwini Sharma: And today, I think we’re going to discuss, Thing… The FTP thing, right?
11 00:01:48.150 ⇒ 00:01:50.000 Katherine Bayless: Oh, yeah, yeah, yeah, yeah, I think…
12 00:01:50.000 ⇒ 00:01:50.450 Ashwini Sharma: Yeah, yeah, yeah.
13 00:01:50.450 ⇒ 00:01:52.690 Katherine Bayless: conversation. Yeah, I…
14 00:01:53.380 ⇒ 00:01:59.160 Katherine Bayless: Yeah, I feel like there’s other things we were going to talk about too, but in this moment, my brain is just…
15 00:01:59.160 ⇒ 00:02:02.190 Ashwini Sharma: Let me just see if Uttam is around.
16 00:02:02.190 ⇒ 00:02:06.339 Katherine Bayless: Yeah, no worries. I’m gonna send a report over to a colleague real quick, so take your time.
17 00:02:46.740 ⇒ 00:02:51.590 Uttam Kumaran: Hello, sorry, I just, like, got distracted, like, right before the meeting.
18 00:02:51.590 ⇒ 00:03:00.630 Katherine Bayless: No worries, I actually was kind of in the same boat. I was just about to, like, send a report over to a colleague, and I was like, oh, I gotta get on this call, and so now I am still sending it.
19 00:03:00.980 ⇒ 00:03:13.259 Uttam Kumaran: Cool. Yeah, I was just… I’ve been doing a lot in Cursor this week, and, like, we’ve been doing a lot of… using it a lot for writing, and so I’m, like, exploring a couple different things in there, so yeah, just got distracted, but.
20 00:03:13.260 ⇒ 00:03:16.560 Katherine Bayless: I would still love to do a Cursor, like, demo at that point.
21 00:03:17.170 ⇒ 00:03:19.729 Uttam Kumaran: I think we have a couple of, like.
22 00:03:19.910 ⇒ 00:03:24.699 Uttam Kumaran: basically want to get a plan for… focus of this meeting is really just get a plan for this month.
23 00:03:24.880 ⇒ 00:03:32.669 Uttam Kumaran: And then talk a little bit about, you know, the next two. I think a couple things today I wanted to go over is, one.
24 00:03:32.710 ⇒ 00:03:50.550 Uttam Kumaran: We’ve started to do some, you know, discovery internally, like, putting together a discovery plan for those two work streams we talked, with you and Jay about. We’re also still continuing to work through, you know, our initial, like, sort of data work stream.
25 00:03:50.550 ⇒ 00:03:53.330 Uttam Kumaran: Which Ashwini is leading.
26 00:03:53.690 ⇒ 00:04:08.830 Uttam Kumaran: And so maybe we could even just talk there. I guess, Katherine, like, how do you see, like, the priorities? I think for us, I feel really comfortable with the existing, you know, workstream.
27 00:04:08.830 ⇒ 00:04:24.890 Uttam Kumaran: In terms of, like, what we’re doing so far, I think, Ashwini, we can talk a little bit about even some of the short-term things, like moving some of those scripts to dbt and things like that, but I feel like we’re on track. Polytomic should come back to us
28 00:04:24.930 ⇒ 00:04:29.709 Uttam Kumaran: You know, early next week, with some insight into what they can land, and then we can start moving on that.
29 00:04:31.600 ⇒ 00:04:41.399 Uttam Kumaran: So, and then we also are gonna work probably by mid-next week. We should have, like, a plan for discovery around both of those two…
30 00:04:41.510 ⇒ 00:04:55.169 Uttam Kumaran: you know, issues, the Shopify issue and the Okta issue. Like, how do you think about prioritizing across those three? Is it dependent on, sort of, like, what we find out? Or, like, what do you think about those three in terms of priorities?
31 00:04:55.520 ⇒ 00:05:02.599 Katherine Bayless: So, I think… Hmm. Let’s see. Thoughts.
32 00:05:02.830 ⇒ 00:05:11.149 Katherine Bayless: To buy time, well, I think I will say, I learned that, apparently we do actually use Shopify to sell sponsorships for CES?
33 00:05:11.520 ⇒ 00:05:12.130 Uttam Kumaran: Okay.
34 00:05:12.260 ⇒ 00:05:20.710 Katherine Bayless: Fascinating. Not really sure why, or how, or what that’s about, but I literally had an email from, like, September that was like, can we get a report on this? And I was like.
35 00:05:20.840 ⇒ 00:05:39.009 Katherine Bayless: And I, like, I punted it, I think, over to Kyle, and I just never thought about it again, but now I’m like, oh, weird, it’s like our sponsorship data is in there. I don’t know what to do about that. Which did kind of increase the interest in Shopify. I think, I mean, truthfully, what’s hard for me to speak to the prioritization there a little bit is, like.
36 00:05:40.040 ⇒ 00:05:54.019 Katherine Bayless: it does kind of depend on Jay wanting to take on the work, so I would say they feel very prioritize-worthy to me, but if he’s, like, not on board, then it’s kind of a non-starter, because at the end of the day, I’m not in Okta.
37 00:05:54.230 ⇒ 00:05:59.719 Katherine Bayless: But I would hope that he would be motivated to get that work through.
38 00:05:59.930 ⇒ 00:06:05.380 Katherine Bayless: Speaking of Okta and authentication,
39 00:06:06.550 ⇒ 00:06:21.870 Katherine Bayless: well, maybe I’ll parking lot it, because it could derail other conversations, but I might have over-committed us to, like, solving a problem that, probably would have been better to, like, just let it stay a problem, but I can’t resist helping. But yeah, it’s a different one.
40 00:06:21.870 ⇒ 00:06:39.740 Uttam Kumaran: So maybe a good place to start is just to look at, again, like, our Gantt chart for the existing project, and kind of share, like, where we’re at now. So, we’re sort of through kickoff, we’re through, you know, setting up Snowflake. We’re also, you know, nicely, I think, are…
41 00:06:39.820 ⇒ 00:06:52.520 Uttam Kumaran: basically set up Ashwini on, like, dbt and GitHub too, right? So, I think right now we’re lev- we’re leveraging the remembers data as, like, the through line to just, like.
42 00:06:52.520 ⇒ 00:07:05.080 Uttam Kumaran: get things set up, which is great. So I feel like we’re actually parallel pathing a little bit more than I expected initially. I do feel like we’re in the sort of tool evaluation period for
43 00:07:05.100 ⇒ 00:07:10.829 Uttam Kumaran: data ingestion, so I probably will… Extend this just to…
44 00:07:11.000 ⇒ 00:07:20.150 Uttam Kumaran: like, January, but again, I’m hopeful that, you know, Polytomic becomes the, you know, they end up working out for us, and we can sort of do that.
45 00:07:20.340 ⇒ 00:07:25.579 Uttam Kumaran: After which, like, I think soon after, we’ll just try to ingest
46 00:07:25.800 ⇒ 00:07:32.449 Uttam Kumaran: You know, we’ll try to ingest as much data internally as possible, so…
47 00:07:32.920 ⇒ 00:07:35.199 Uttam Kumaran: I think this is… this is the rough…
48 00:07:35.420 ⇒ 00:07:47.729 Uttam Kumaran: like, cadence of where we’re at. On our side, as we start to, as we’re starting to get a sense of, like, CTA, the org, and your needs, one of the things that we always start to do is basically develop, like.
49 00:07:48.050 ⇒ 00:08:06.210 Uttam Kumaran: our architecture diagrams, like, our data platform spreadsheet. The one thing that’s not clear yet now, and I think maybe will be more clear as we meet more stakeholders, is, like, what are the metrics, and, like, how do we maybe do some, like, KPI standardization?
50 00:08:06.300 ⇒ 00:08:20.690 Uttam Kumaran: You know, so I think that’ll… that’s basically what this phase is here. So we have… we have a typical structure where we put in, like, here are the metrics we found that people are reporting at, here’s, like, the variety of definitions, and, like, can we get
51 00:08:20.790 ⇒ 00:08:22.839 Uttam Kumaran: Alignment on what we want.
52 00:08:22.990 ⇒ 00:08:28.930 Uttam Kumaran: these metrics to be sourced from, like, how they’re defined, and, like, sort of, like, some type of sign-off. So that’s, like, what that…
53 00:08:29.350 ⇒ 00:08:32.020 Uttam Kumaran: Metrics, glossary, dictionaries.
54 00:08:32.020 ⇒ 00:08:34.490 Katherine Bayless: So for that, actually, we can probably…
55 00:08:35.549 ⇒ 00:08:49.449 Katherine Bayless: for the… for this current stage in our evolution, like, we still are doing the goals process kind of the old way, where it’s a two-page to-do list that they laminate and hand out, but we have just received our 2026 to-do list.
56 00:08:49.450 ⇒ 00:08:56.209 Katherine Bayless: So, like, I mean, as much as I chafe at the way we currently set goals, they are the things people are going to want to track.
57 00:08:56.210 ⇒ 00:09:13.120 Katherine Bayless: Because those are tied to bonuses. And so, like, eventually, yes, building out additional metrics that are really quality, I think we can do, but, for, like, a place to start, those numbers in those 2026 goals are going to be what everybody’s interested in initially.
58 00:09:13.250 ⇒ 00:09:17.129 Katherine Bayless: Actually, let me just send that right now.
59 00:09:17.130 ⇒ 00:09:21.190 Uttam Kumaran: Yeah, so maybe we’ll just use the… we’ll just… so, yeah, we’ll just start with that as a scope.
60 00:09:21.190 ⇒ 00:09:21.850 Katherine Bayless: Yeah.
61 00:09:24.040 ⇒ 00:09:31.539 Katherine Bayless: Let’s see… I actually haven’t even looked at them yet myself, so, let’s.
62 00:09:31.540 ⇒ 00:09:33.630 Uttam Kumaran: You still have time, it’s, you know, it’s only December 12th.
63 00:09:35.420 ⇒ 00:09:42.649 Katherine Bayless: Exactly, exactly. Okay, DataOps, Brainforge, CTA, alright, let’s see…
64 00:10:04.220 ⇒ 00:10:04.990 Katherine Bayless: Yup.
65 00:10:06.280 ⇒ 00:10:07.950 Katherine Bayless: Okay,
66 00:10:07.950 ⇒ 00:10:26.620 Katherine Bayless: Actually, the one question I had too, which does kind of tie into the problem I’ve caused for us, is the, the entity resolution stuff. Obviously dependent on having data to do entity resolution with, but curious kind of where we think we might be able to start working on some of that.
67 00:10:26.800 ⇒ 00:10:38.240 Uttam Kumaran: Yeah, so I think that’ll… that will be as soon as we have things landed. So I think that… that is maybe something that… as soon as we have things landed, we can scope
68 00:10:38.460 ⇒ 00:10:39.809 Uttam Kumaran: Like, what’s there?
69 00:10:40.450 ⇒ 00:10:45.660 Uttam Kumaran: And work towards that. So I’m gonna put… I’ll just put that in as an individual thing.
70 00:10:46.680 ⇒ 00:10:53.999 Katherine Bayless: Yeah, I mean, if we wanted to be able to make some headway on it without waiting for, like, data to be landed
71 00:10:54.000 ⇒ 00:11:09.109 Katherine Bayless: like, via Polytomic. I mean, we do have the S3 integration set up in Snowflake, and all of the old data is in S3. So, like, I haven’t brought it into Snowflake, but, like, it would be totally possible to start bringing in some of those, like.
72 00:11:09.110 ⇒ 00:11:13.840 Katherine Bayless: historical data sets and doing the entity resolution work around them, because, I mean.
73 00:11:14.070 ⇒ 00:11:30.219 Katherine Bayless: okay, admittedly, I have an idea of how this will work in my mind, but it may or may not be the way that is, like, best practice to approach it. But, like, my thought was, first step is, like, coming through all of the data and just figuring out, like, okay, what are all of the different identifiers in different systems?
74 00:11:30.220 ⇒ 00:11:39.970 Katherine Bayless: how many people can we already kind of, like, flatten and figure out? And then figuring out what that looks like on an ongoing ingestion basis eventually, but…
75 00:11:39.970 ⇒ 00:11:44.349 Katherine Bayless: Like, tackling the mess that we do have is potentially useful?
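A minimal sketch of the first step Katherine describes (collecting the identifiers each system holds and seeing how many records can already be flattened into one person), assuming records are grouped whenever they share any identifier value. The record shape, system names, and identifier types are hypothetical and not taken from CTA's data; a real pass would also need fuzzy matching and review of false merges.

```python
from collections import defaultdict


def resolve_entities(records):
    """Group records that share any (identifier type, value) pair.

    records: list of dicts such as
        {"system": "crm", "ids": {"email": "a@example.org", "crm_id": "42"}}
    Returns a sorted list of clusters, each a list of record indexes.
    """
    parent = list(range(len(records)))

    def find(i):
        # Walk up to the cluster root, shortening the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra != rb:
            # Keep the lowest index as the root so output is deterministic.
            parent[max(ra, rb)] = min(ra, rb)

    first_seen = {}  # (id_type, normalized value) -> first record index
    for idx, rec in enumerate(records):
        for id_type, value in rec["ids"].items():
            key = (id_type, str(value).strip().lower())
            if key in first_seen:
                union(idx, first_seen[key])
            else:
                first_seen[key] = idx

    clusters = defaultdict(list)
    for idx in range(len(records)):
        clusters[find(idx)].append(idx)
    return sorted(clusters.values())


# Hypothetical records from three systems.
sample = [
    {"system": "registration", "ids": {"email": "k@example.org", "reg_id": "R1"}},
    {"system": "crm", "ids": {"email": "K@Example.org", "crm_id": "C9"}},
    {"system": "shop", "ids": {"crm_id": "C9", "shop_id": "S3"}},
    {"system": "crm", "ids": {"email": "other@example.org"}},
]
clusters = resolve_entities(sample)
```

Here the shop record has no email but still joins the first person through the shared CRM identifier, which is the kind of transitive link that an email-only match would miss.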
76 00:11:44.560 ⇒ 00:11:49.920 Uttam Kumaran: Okay, so then why don’t we, Ashwini, we can add that to, like, sources to bring in?
77 00:11:50.140 ⇒ 00:11:57.909 Uttam Kumaran: And then let’s… let’s work together on, like, what, like, an initial plan would be, like, based on… we’ll just profile what data’s in there.
78 00:11:58.250 ⇒ 00:11:58.750 Ashwini Sharma: Sure.
79 00:11:58.750 ⇒ 00:12:02.200 Katherine Bayless: I can, yeah, and I can, I can share…
80 00:12:02.670 ⇒ 00:12:15.829 Katherine Bayless: a bunch of stuff. So I have, like, a full inventory of all the, like, you know, tables and columns and stuff like that, and then I also have some kind of shadows on the cave wall documentation from the former data team about, like, what data is in what tables.
81 00:12:15.830 ⇒ 00:12:24.910 Katherine Bayless: There are, like, 300 or 400 of them, but there’s really only a handful that we would need to do the entity resolution work with.
82 00:12:24.910 ⇒ 00:12:25.480 Uttam Kumaran: Okay.
83 00:12:25.860 ⇒ 00:12:38.210 Katherine Bayless: And actually, the other thing, too, is I could go through that CTA systems inventory spreadsheet and identify, like, which systems do contain people or companies where they would have an identifier in that system, because there’s some of them, obviously, not…
84 00:12:38.710 ⇒ 00:12:41.229 Katherine Bayless: But, yeah, yeah, yeah, yeah, yeah, yeah, yeah, okay.
85 00:12:42.430 ⇒ 00:12:44.030 Katherine Bayless: Sorry, that was a whole…
86 00:12:44.030 ⇒ 00:12:49.530 Ashwini Sharma: What kind of data is there in S3? Is it a CSV file, Parquet file? What kind of data are we…
87 00:12:49.700 ⇒ 00:12:59.459 Katherine Bayless: Both! So I actually, yeah, so I took the old SQL server and I dumped it into CSV and Parquet files in S3, so you could choose your own adventure, depending on which one you like better.
88 00:12:59.460 ⇒ 00:13:00.050 Ashwini Sharma: Fuck.
89 00:13:02.470 ⇒ 00:13:03.869 Katherine Bayless: It’s a lot of data.
90 00:13:04.590 ⇒ 00:13:07.959 Uttam Kumaran: So let’s land that, actually, into Snowflake.
91 00:13:07.960 ⇒ 00:13:12.460 Katherine Bayless: And is that data dynamic, or is it just a historical snapshot?
92 00:13:13.000 ⇒ 00:13:30.549 Katherine Bayless: Yeah, so it’s not dynamic, it’s just a dump out from the old SQL server. I… I mean, talk about suspenders and belts. I, like, I took a backup, I migrated it, I took a snapshot of it in RDS, then I exported all the data to CSV and Parquet files, and did some referential integrity checks afterwards. I’m very sure!
93 00:13:30.550 ⇒ 00:13:34.580 Katherine Bayless: that I’ve backed it up, and yet, I still have not had the confidence to go close the old Azure account.
94 00:13:34.670 ⇒ 00:13:39.099 Katherine Bayless: That’ll be, like, December 31st before the ball drops. I’ll finally close my eyes and click the button.
95 00:13:39.600 ⇒ 00:13:57.739 Katherine Bayless: But yeah, so it’s just, it’s all of the historical data, and frankly, I mean, it is actually a data source we should put a little build around. I mean, it won’t change, but there will be questions against it. Like, as we continue to try and land, like, new data sources, people will still want to go back into the archives, probably. So, yeah.
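A small sketch of the kind of referential integrity check mentioned above, run against CSV exports, assuming the check is "every foreign key in a child table exists as a primary key in the parent table". The table names, column names, and rows are hypothetical; the transcript does not say how the actual checks were done.

```python
import csv
import io


def read_csv(text):
    """Parse CSV text into a list of row dicts."""
    return list(csv.DictReader(io.StringIO(text)))


def orphaned_keys(child_rows, child_fk, parent_rows, parent_pk):
    """Return sorted foreign-key values in the child with no matching parent row."""
    parent_keys = {row[parent_pk] for row in parent_rows}
    return sorted(
        {
            row[child_fk]
            for row in child_rows
            if row[child_fk] and row[child_fk] not in parent_keys
        }
    )


# Hypothetical exports of two tables from the archived database.
companies = read_csv("company_id,name\n10,Acme\n20,Globex\n")
contacts = read_csv("contact_id,company_id\n1,10\n2,20\n3,30\n4,\n")

# Contact 3 points at a company that is not in the export;
# contact 4 has an empty key and is skipped rather than flagged.
orphans = orphaned_keys(contacts, "company_id", companies, "company_id")
```

The same comparison could be run once on the CSV export and once on the Parquet export to confirm both copies agree before the original account is closed.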
96 00:13:58.150 ⇒ 00:13:59.090 Katherine Bayless: Yeah.
97 00:13:59.090 ⇒ 00:13:59.820 Uttam Kumaran: Okay.
98 00:13:59.820 ⇒ 00:14:01.639 Katherine Bayless: Yeah, there’s a lot of it. There’s a lot of it.
99 00:14:01.640 ⇒ 00:14:04.769 Ashwini Sharma: And this is just a one-time ingestion, that’s what I understand.
100 00:14:04.770 ⇒ 00:14:07.280 Katherine Bayless: Yes, yeah. Yeah, exactly.
101 00:14:09.620 ⇒ 00:14:21.690 Uttam Kumaran: Okay, cool, so then let’s drive… we could drive towards that as well, in addition to, like, we have the remembers data model. I guess, Katherine, like, is, like, getting a BI tool, like, then less relevant?
102 00:14:21.880 ⇒ 00:14:23.920 Uttam Kumaran: And like, should we push…
103 00:14:24.490 ⇒ 00:14:32.789 Uttam Kumaran: that out. Getting or, like, you know, just evaluating what to do there. I don’t know what the timing is on, like, the…
104 00:14:32.960 ⇒ 00:14:35.520 Uttam Kumaran: Power BI stuff, But…
105 00:14:36.760 ⇒ 00:14:46.120 Katherine Bayless: No, actually, that’s a good point. Yeah, to be totally honest, I feel like… Yeah. Meaning, I guess…
106 00:14:46.120 ⇒ 00:14:51.980 Uttam Kumaran: To give you the, like, the trade-offs, it’s like, we could drive towards just getting, like.
107 00:14:52.210 ⇒ 00:14:54.849 Uttam Kumaran: Marts ready for remembers.
108 00:14:55.060 ⇒ 00:15:03.420 Uttam Kumaran: Data for, like, and driving towards entity resolution in Snowflake with whatever we have.
109 00:15:03.620 ⇒ 00:15:09.610 Uttam Kumaran: In addition, like, and I guess, like, if we just talk about the data work stream.
110 00:15:09.730 ⇒ 00:15:19.340 Uttam Kumaran: we would just drive towards, like, accomplishing that. I don’t know yet how much time that’s gonna take, so we were initially gonna just, you know, as soon as we landed data.
111 00:15:20.280 ⇒ 00:15:30.009 Uttam Kumaran: you know, some stuff modeled, drive towards, like, a BI decision. If people are gonna… are okay with just getting flat files or going directly to Snowflake, then
112 00:15:30.260 ⇒ 00:15:34.900 Uttam Kumaran: We can push that off and instead just prioritize, like, nailing that and, like, other…
113 00:15:35.400 ⇒ 00:15:48.219 Katherine Bayless: Yeah, yeah, I think that was actually a really savvy pivot, because for all intents and purposes, we can still push data from Snowflake to Power BI, so we do have a BI tool, and yeah, I think actually…
114 00:15:48.510 ⇒ 00:15:56.719 Katherine Bayless: Yeah, yeah, yeah, yeah. Putting some more focus on the marts and the entity resolution work, rather than BI platform selection makes a lot of sense to me.
115 00:15:56.720 ⇒ 00:16:03.509 Uttam Kumaran: Yeah, because we can just get into, like, yeah, just spending on… it’ll be another… Couple-week process, at least.
116 00:16:03.690 ⇒ 00:16:20.710 Uttam Kumaran: To look through BI tools, and so maybe we kick that for a bit. And then, like, if… if… I don’t know, if we do have access to Power BI, but that’s something that we can… if we can get access to that, then we can just… if needed, we can just build stuff there, like, that’s fine.
117 00:16:21.200 ⇒ 00:16:36.129 Katherine Bayless: Yeah, I mean, so we can definitely get you access, I could talk to Jay about that. I… I feel like I want to met, you know, I mean, it’s like a balance beam that doesn’t exist, but it’s like, I… I acknowledge that we have Power BI, it does make sense to use it for things that, like…
118 00:16:36.130 ⇒ 00:16:39.280 Uttam Kumaran: We also drive just people to go direct in Snowflake.
119 00:16:39.470 ⇒ 00:16:58.820 Katherine Bayless: Yeah, well, it’s also just, like, I do want to keep people in the idea of, like, oh, we are going to get a new tool, versus, like, entrenching further in Power BI, but the reality is Kyle and Kai are putting out Power BI reports because we need to put something out, so I think light Power BI usage is fine, just don’t want to get people, like, confused and think that that’s where we’re going to stay.
120 00:16:58.820 ⇒ 00:16:59.830 Uttam Kumaran: Okay.
121 00:16:59.880 ⇒ 00:17:17.139 Katherine Bayless: mostly thinking ahead to whenever I tell finance how much it’ll cost to buy Sigma. I don’t want them to be like, but you’re doing it all in Power BI now! Yeah, yeah. But yeah, no, I think working on marts and the entity resolution stuff, that’s a better use of the time, honestly.
122 00:17:17.140 ⇒ 00:17:25.320 Katherine Bayless: Because at this point, my hope is just, like, probably most folks are going to go into, like, either dark or panic mode until, you know, mid-January, but when they come back.
123 00:17:25.319 ⇒ 00:17:31.250 Katherine Bayless: I want there to be, like, all sorts of beautiful stuff for them to play with in Snowflake. So, yeah. Yeah, yeah.
124 00:17:31.250 ⇒ 00:17:42.150 Uttam Kumaran: So yeah, I really think that the next, you know, month or so is going to be just all modeling work, and for us to just work… do working sessions on marts as we arrive there, so…
125 00:17:42.150 ⇒ 00:17:42.980 Katherine Bayless: Yeah.
126 00:17:43.120 ⇒ 00:17:44.060 Uttam Kumaran: Okay. I like it.
127 00:17:44.060 ⇒ 00:17:44.870 Katherine Bayless: I like it.
128 00:17:45.590 ⇒ 00:17:51.190 Uttam Kumaran: And then, yeah, probably, you know, by mid-next week, we’ll have, like.
129 00:17:51.560 ⇒ 00:18:02.299 Uttam Kumaran: the scope of, like, discovery, like, what we… like, basic questions and what we hypothesize could be possible for both of those other work streams, the Okta and the Shopify.
130 00:18:02.410 ⇒ 00:18:07.169 Uttam Kumaran: And then, you know, we can have a conversation, and then you can let me know
131 00:18:07.990 ⇒ 00:18:13.920 Uttam Kumaran: Based on, like, what the… what, like, the lift looks like, or how… timing, and we could talk about that by that point.
132 00:18:14.210 ⇒ 00:18:15.770 Katherine Bayless: Okay, yeah, yeah, that sounds perfect.
133 00:18:15.770 ⇒ 00:18:17.420 Uttam Kumaran: what I told Sam is, like.
134 00:18:17.700 ⇒ 00:18:24.310 Uttam Kumaran: I just want to put in, like, I’ll… because he’s like, hey, let’s just start poking around. I’m like, let’s just put together, like, what the lift
135 00:18:24.430 ⇒ 00:18:27.720 Uttam Kumaran: on Discovery would be, so I’d give Katherine some…
136 00:18:28.250 ⇒ 00:18:37.360 Uttam Kumaran: Like, room to see, like, what we would need access to, or questions we would ask, and then be like, okay, this is not worth it right now, or okay, let’s go after this, you know.
137 00:18:37.360 ⇒ 00:18:39.970 Katherine Bayless: Yeah, yeah, yeah, yeah.
138 00:18:40.640 ⇒ 00:18:41.500 Katherine Bayless: Yeah.
139 00:18:42.590 ⇒ 00:18:43.760 Uttam Kumaran: I mean, yeah.
140 00:18:43.760 ⇒ 00:18:47.190 Katherine Bayless: Yeah, I, like… I just want it fixed.
141 00:18:47.190 ⇒ 00:18:48.050 Uttam Kumaran: Yeah.
142 00:18:48.910 ⇒ 00:18:50.210 Katherine Bayless: Yeah.
143 00:18:50.280 ⇒ 00:18:58.170 Uttam Kumaran: Okay, great. I think the other item also, Katherine, is just talk about contract. We’re coming up on…
144 00:18:58.340 ⇒ 00:19:03.239 Uttam Kumaran: Like, our initial contract is up in, like, end of this month.
145 00:19:03.240 ⇒ 00:19:03.650 Katherine Bayless: Yep.
146 00:19:03.810 ⇒ 00:19:10.749 Uttam Kumaran: Do you want to wait till next week to sort of scope out, like, what Jan would look like, or what do you think is best?
147 00:19:11.420 ⇒ 00:19:14.480 Katherine Bayless: I mean, honestly, like, I… so…
148 00:19:15.680 ⇒ 00:19:17.799 Katherine Bayless: I don’t want to, like, make more work.
149 00:19:18.160 ⇒ 00:19:21.239 Katherine Bayless: But at the same time, I’m kind of like, I think…
150 00:19:22.700 ⇒ 00:19:30.650 Katherine Bayless: realistically, knowing how long it takes to get contracts through and all, you know, blah blah blah. I think if we put together, like, just a, like.
151 00:19:30.650 ⇒ 00:19:46.349 Katherine Bayless: Q1 scope, then, like, I can kind of slide that through the process, hopefully faster. Now that the MSA is approved, like, the scopes should go through smoother, but I do want, at the end of the new year, I think, to come up with kind of, like, a full year’s plan.
152 00:19:46.770 ⇒ 00:19:59.069 Katherine Bayless: like, I don’t know if we can pull that off right now with the poverty spec brain cells I’ve got at this point, but I think we could confidently plan for Q1, and then when we get back in January, start planning the rest of the year scope.
153 00:19:59.070 ⇒ 00:19:59.610 Uttam Kumaran: Okay.
154 00:19:59.610 ⇒ 00:20:03.249 Katherine Bayless: But I realize that’s asking you guys to go through the exercise kind of twice, so…
155 00:20:03.500 ⇒ 00:20:13.109 Uttam Kumaran: That’s fine. So I’ll put in just something for Q1. I will sort of… I’ll have it drafted and ready, then based on where we arrive next week.
156 00:20:13.110 ⇒ 00:20:13.630 Katherine Bayless: Yeah.
157 00:20:13.630 ⇒ 00:20:23.939 Uttam Kumaran: new things, we can add those scopes and then send it for, you know, for you to review. And then, yeah, I mean, again, we’re… I feel like we’ve moved as fast as I thought we would
158 00:20:24.050 ⇒ 00:20:31.390 Uttam Kumaran: move, and I think we’ve been doing well. It’s like, I know that some things are shifting, and so we’re able to move stuff around, which is good.
159 00:20:31.810 ⇒ 00:20:32.340 Katherine Bayless: Yeah.
160 00:20:32.340 ⇒ 00:20:36.830 Uttam Kumaran: So, I’m happy, you know, to focus more on just
161 00:20:37.210 ⇒ 00:20:40.160 Uttam Kumaran: You know, modeling. And also, like,
162 00:20:40.310 ⇒ 00:20:59.950 Uttam Kumaran: Kind of the way we work, of course, is just, like, in pods, so as we… as I get to see, like, where the bulk of the work is, like, if there’s gonna be a 4-6 week of just a lot of modeling work, then we’ll loop in someone else from our team to come in and help… help there as well. And so, we sort of need experts in different areas, like functional folks.
163 00:21:00.030 ⇒ 00:21:10.859 Uttam Kumaran: Usually, it’s just me and someone sort of, like, shine the flashlight in and sort of, like, look around, and then we kind of understand, okay, like, how much are we taking on, who do we need? You know, so…
164 00:21:11.710 ⇒ 00:21:12.870 Uttam Kumaran: pretty good.
165 00:21:12.870 ⇒ 00:21:37.849 Katherine Bayless: I mean, you guys have been great, honestly. Like, I’m like, how much more of your time can I have? Right? I mean, yeah, like, I… seriously, I’ll take everything. But yeah, I think that for Q1, too, to that point, like, it is probably going to be a lot of modeling. Like, as we start getting the data sources, just building mart after mart after mart, yeah, yeah, yeah. And actually, I mean, maybe… maybe that actually kind of becomes a natural, like, way to
166 00:21:37.850 ⇒ 00:21:46.319 Katherine Bayless: do it, right, is if we spend Q1 landing data and modeling data, and then there’s, you know, maybe towards the end of it is the work stream around the BI selection tool.
167 00:21:46.320 ⇒ 00:21:49.009 Uttam Kumaran: You would hope it’s that clean, yeah.
168 00:21:49.340 ⇒ 00:21:50.150 Katherine Bayless: Yeah.
169 00:21:50.150 ⇒ 00:22:09.500 Uttam Kumaran: You hope it’s… but I… again, my hope is that, like, we start to empower some of the people that are pulling from this, and then that’s where we start to meet with… with them directly, right? Yes. So as we start to land things, we’ll run it by you, and then we’ll start to build the relationships with Kyle and different folks to start to, you know, serve them
170 00:22:09.550 ⇒ 00:22:10.930 Uttam Kumaran: you know, directly.
171 00:22:10.930 ⇒ 00:22:12.190 Katherine Bayless: Yeah. And…
172 00:22:12.190 ⇒ 00:22:23.300 Uttam Kumaran: basically start to build out, and then that’ll start to allow us to start to build a little bit of mini roadmaps on what their needs are analytics-wise, and then as a platform team, like, we can then
173 00:22:23.600 ⇒ 00:22:36.499 Uttam Kumaran: it’ll make the… whether we need the BI decision, you know, it’ll make that clear. Also, like, it’ll make it… because we… I need to know what their expectations are in order to make that, like, if everybody’s comfortable in SQL, then there’s… there’s other tools.
174 00:22:36.640 ⇒ 00:22:41.230 Uttam Kumaran: And so that’ll be, like, that’ll be helpful for that decision. So…
175 00:22:41.230 ⇒ 00:22:45.670 Katherine Bayless: No, we’re assuming nobody wants to do SQL, yeah.
176 00:22:45.670 ⇒ 00:22:46.230 Uttam Kumaran: Okay.
177 00:22:46.230 ⇒ 00:22:57.090 Katherine Bayless: Yeah, I mean, we might find the handful of people who are willing, but yeah, I think the overwhelming majority of stakeholders across the organization are going to want an Excel spreadsheet.
178 00:22:57.090 ⇒ 00:22:57.530 Uttam Kumaran: I’m good.
179 00:22:57.530 ⇒ 00:23:12.570 Katherine Bayless: That’s the level they’re at. Which is, I mean, hey, not nothing. And some of them have some pretty impressive VLOOKUP things rigged up, so, you know, cautiously optimistic for their abilities there. But yeah, yeah, yeah. Conversational interfaces, I think, are gonna be…
180 00:23:12.570 ⇒ 00:23:18.090 Uttam Kumaran: Yeah, so that’s what… I mean, that’s gonna be… for us to even go do, like, a proof of concept.
181 00:23:18.260 ⇒ 00:23:23.629 Uttam Kumaran: I want to… the way we’ve done in the past is we just literally take questions that we have been asked.
182 00:23:24.300 ⇒ 00:23:27.220 Uttam Kumaran: And that’s how we test the tooling, you know?
183 00:23:27.220 ⇒ 00:23:27.720 Katherine Bayless: Yeah.
184 00:23:27.720 ⇒ 00:23:30.119 Uttam Kumaran: If it can satisfy, versus, like, making up
185 00:23:30.340 ⇒ 00:23:34.860 Uttam Kumaran: questions, or, you know, thinking of some, like, nice things at work. Yeah.
186 00:23:34.860 ⇒ 00:23:42.499 Katherine Bayless: Oh, actually, to that end, so I… I didn’t get as far as I wanted to yesterday, but I was starting to go through my inbox.
187 00:23:42.500 ⇒ 00:24:05.609 Katherine Bayless: email is my Achilles heel. But there were… there’s a lot of recent stuff I just need to respond to, but there are a lot of, like, older things that I just kind of left there, because we didn’t really have a board, and I was like, I don’t know, but this is a question that I can’t solve right now, but I’ll leave it, and so I’ve been starting to finally dump those into Asana board, and so I should ask Jay about getting you guys Asana access, I’m guessing that makes the most sense? Yeah.
188 00:24:05.710 ⇒ 00:24:15.249 Katherine Bayless: And that way you could see, because, like, there being, you know, nothing in there is, like, super exciting per se, but it’s definitely a lot of, things that we will want to tackle next year.
189 00:24:15.250 ⇒ 00:24:15.760 Uttam Kumaran: Okay.
190 00:24:15.760 ⇒ 00:24:16.530 Katherine Bayless: Yeah.
191 00:24:17.500 ⇒ 00:24:18.370 Katherine Bayless: Yeah.
192 00:24:22.480 ⇒ 00:24:24.259 Uttam Kumaran: Okay,
193 00:24:24.920 ⇒ 00:24:30.129 Uttam Kumaran: So I think maybe we spend the rest of the time, Ashwini, if you want to, we can talk through the SFTP…
194 00:24:30.130 ⇒ 00:24:30.940 Katherine Bayless: Yeah.
195 00:24:30.940 ⇒ 00:24:35.899 Uttam Kumaran: stream, and then I know, Katherine, in the beginning of the call, you had another Okta thing.
196 00:24:35.900 ⇒ 00:24:47.819 Katherine Bayless: Yeah, I can… while Ashwini pulls up stuff, I can… I can talk through that one. So basically, and I… some of this might be repetitive, I might have shared some of this already, but we have in that
197 00:24:47.820 ⇒ 00:25:00.960 Katherine Bayless: ecosystem of CES vendors, where they’re all kind of daisy-chain integrated. One of the… so they all use email address as the unique identifier across systems, which is why the entity resolution work will solve for that eventually.
198 00:25:01.010 ⇒ 00:25:13.550 Katherine Bayless: But one of the vendors uses an endpoint for, like, attendee match and exhibitor recommendations that the query parameter in the call is the email address. And so if you authenticate into the mobile app.
199 00:25:13.550 ⇒ 00:25:23.449 Katherine Bayless: you can then, you know, intercept and change the email address, and so if you want to get Katherine’s recommendations instead of your own, all you need to do is drop the other email address into the query string, right?
200 00:25:23.450 ⇒ 00:25:24.100 Uttam Kumaran: Okay.
201 00:25:24.100 ⇒ 00:25:25.690 Katherine Bayless: Not great. Not… Yeah.
202 00:25:25.690 ⇒ 00:25:27.360 Ashwini Sharma: Not great.
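The problem described here is an authorization gap: the endpoint trusts whatever email address arrives in the query string. A minimal Python sketch of the difference follows. The `RECOMMENDATIONS` store, the function names, and the sample data are all hypothetical and are not the vendor's actual API, which was not shown on the call.

```python
# Hypothetical in-memory data standing in for the vendor's recommendation store.
RECOMMENDATIONS = {
    "alice@example.com": ["Booth 101", "Booth 202"],
    "bob@example.com": ["Booth 303"],
}


def get_recommendations_unsafe(session_email: str, query_email: str) -> list[str]:
    """Trusts the query parameter, so any logged-in user can read anyone's data."""
    return RECOMMENDATIONS.get(query_email, [])


def get_recommendations_checked(session_email: str, query_email: str) -> list[str]:
    """Rejects the request unless the query parameter matches the logged-in user."""
    session_key = session_email.strip().lower()
    if query_email.strip().lower() != session_key:
        raise PermissionError("requested identity does not match the session")
    return RECOMMENDATIONS.get(session_key, [])
```

Swapping the email address for an opaque ID makes the parameter harder to guess, but a check like the second function is what would actually tie the lookup to the authenticated user.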
203 00:25:27.360 ⇒ 00:25:41.330 Katherine Bayless: Yeah, not great at all. So, they were… the question amongst the team was, do we just figure it was this way last year and the year before? We’ll go one more year and then never again, or do we try to do better this year?
204 00:25:41.530 ⇒ 00:25:47.770 Katherine Bayless: Then the question was, well, we don’t have any system or any unique identifier that’s in all of the systems, and I was like, well.
205 00:25:47.990 ⇒ 00:26:08.360 Katherine Bayless: In my optimism earlier this year, when I was starting down the entity resolution path, I had created the DataOps ID in our data, with the idea that it would eventually become a canonical ID, even though right now it’s just one-to-one with email. But I’ve been pushing it everywhere I touch, and the registration vendor is storing it for any records that I’ve sent.
206 00:26:08.360 ⇒ 00:26:13.600 Katherine Bayless: Registrations come from a few places, so my stream has the DataOps ID in their system.
207 00:26:13.740 ⇒ 00:26:19.760 Katherine Bayless: And I could certainly push the ops IDs for all the others, because I have them, I just don’t have a way to push them normally.
208 00:26:19.760 ⇒ 00:26:20.200 Uttam Kumaran: Okay.
209 00:26:20.200 ⇒ 00:26:34.990 Katherine Bayless: Long story short, I think they might actually be willing to make this API change and use my DataOps IDs instead of the email address, which is awesome, but now I need to figure out, like, oh shit, how do I get these to actually be in all of the systems?
210 00:26:34.990 ⇒ 00:26:35.620 Uttam Kumaran: Yeah.
211 00:26:35.620 ⇒ 00:26:53.599 Katherine Bayless: Which is, I mean, easy to imagine now, like, I can do a bulk upload of the ones that we need, but then it’s like, okay, on-site, as registrations come in, how do I make sure that I’m keeping up with that? So, this is as of yesterday afternoon, I’m still wrapping my head around it, but I’m excited to have caused this problem.
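The two needs mentioned here, a one-time bulk upload and then keeping up with new registrations on-site, can be sketched as a small registry that hands out one ID per email and remembers what each vendor has already received. This is only an illustration under assumptions: the class name, the vendor label, and the use of random UUIDs are hypothetical, and the actual DataOps ID format was not described on the call.

```python
import uuid


class DataOpsRegistry:
    """Assigns one opaque ID per normalized email and tracks what each vendor has received."""

    def __init__(self) -> None:
        self._id_by_email: dict[str, str] = {}
        self._pushed: dict[str, set[str]] = {}  # vendor name -> IDs already sent

    def id_for(self, email: str) -> str:
        """Return the existing ID for this email, or mint a new one."""
        key = email.strip().lower()
        if key not in self._id_by_email:
            # Random UUID (an assumption) so the ID cannot be derived from the email.
            self._id_by_email[key] = str(uuid.uuid4())
        return self._id_by_email[key]

    def pending_for(self, vendor: str) -> dict[str, str]:
        """Return {email: id} pairs this vendor has not received yet."""
        sent = self._pushed.get(vendor, set())
        return {e: i for e, i in self._id_by_email.items() if i not in sent}

    def mark_pushed(self, vendor: str, ids) -> None:
        """Record that these IDs were delivered to the vendor."""
        self._pushed.setdefault(vendor, set()).update(ids)
```

The first call to `pending_for` covers the bulk upload; later calls return only registrations added since the last `mark_pushed`, which is the on-site case.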
212 00:26:53.830 ⇒ 00:26:59.669 Uttam Kumaran: Okay, cool. Well, I don’t know, this sounds like a perfect role for some type of transform job to execute.
213 00:26:59.670 ⇒ 00:27:03.639 Katherine Bayless: Yeah. Offload, and then we push things out, you know?
214 00:27:03.770 ⇒ 00:27:17.880 Katherine Bayless: Yeah, and so in… it’s a nice segue back to what we’ll talk about now, because this job that Ashwini’s gonna pull up is also the one that is assigning those IDs, and so this is kind of, like, step zero in getting that data together.
215 00:27:19.550 ⇒ 00:27:20.800 Uttam Kumaran: Okay, okay, perfect.
216 00:27:20.800 ⇒ 00:27:21.480 Katherine Bayless: Damn.
217 00:27:22.810 ⇒ 00:27:37.080 Katherine Bayless: Yeah, it was like, I don’t know, when I was new and had to figure out, like, okay, of all the old things the marketing team used to do, which one do I want to hold on to? And perhaps I was… perhaps I guess, well, because I decided that this invite process for CES was the one
218 00:27:37.400 ⇒ 00:27:45.390 Katherine Bayless: super manual, obnoxious thing that I was gonna keep supporting, and it has definitely… it’s definitely gotten me into some interesting conversations, so…
219 00:27:45.390 ⇒ 00:27:46.100 Uttam Kumaran: Great.
220 00:27:46.100 ⇒ 00:27:46.720 Katherine Bayless: Yeah.
221 00:27:49.630 ⇒ 00:27:54.549 Katherine Bayless: Okay, do you want me to kind of talk through the file here, or what’s most useful?
222 00:27:55.260 ⇒ 00:28:07.279 Ashwini Sharma: I mean, I understand what’s there in the file. What I’m trying to understand is, where are these tables? You’re saying it runs on Postgres right now? I’m assuming that they are there in Postgres,
223 00:28:07.970 ⇒ 00:28:15.259 Ashwini Sharma: to bring them into Snowflake, so you also mentioned you have already created an integration between S3 and Snowflake, or something like that?
224 00:28:15.260 ⇒ 00:28:20.509 Katherine Bayless: Yeah, so if you look in Snowflake, actually, I created that webhooks database.
225 00:28:20.890 ⇒ 00:28:21.920 Ashwini Sharma: Okay.
226 00:28:21.920 ⇒ 00:28:30.149 Katherine Bayless: And so this is our market research team, who has been, like, soliciting interest in these show floor tours, and so I built out…
227 00:28:30.150 ⇒ 00:28:42.340 Katherine Bayless: I had already set up the webhook from Formstack to S3, and so now I just took the last mile, S3 to Snowflake, to create, like, a little view that he could go in and download when we get new submissions to that form.
228 00:28:43.820 ⇒ 00:28:47.920 Ashwini Sharma: So everything is there in this stage, or where exactly?
229 00:28:47.920 ⇒ 00:28:52.240 Katherine Bayless: If you click on Views, that’s where I created the thing.
230 00:28:52.410 ⇒ 00:28:53.390 Katherine Bayless: Yeah.
231 00:29:07.420 ⇒ 00:29:11.309 Katherine Bayless: Yeah, so this was what I put together for him, and he was overjoyed.
232 00:29:12.120 ⇒ 00:29:12.760 Uttam Kumaran: Oh, great.
233 00:29:12.760 ⇒ 00:29:16.519 Katherine Bayless: But it is proof that the S3 integration works.
234 00:29:16.770 ⇒ 00:29:17.530 Ashwini Sharma: Okay.
235 00:29:17.530 ⇒ 00:29:26.540 Katherine Bayless: Because every time the Formstack form is submitted, the webhook sends it to the S3 bucket, and then this connected it out to Snowflake. It’s the only data that I’ve actually
236 00:29:26.540 ⇒ 00:29:29.870 Katherine Bayless: connected with the stage. I just, in the
237 00:29:29.870 ⇒ 00:29:47.819 Katherine Bayless: IAM role scope, only have it accessing two buckets. One is the webhooks one, and then the other one was, like, a demo I had used when I first set it up, just to make sure it worked. But I can modify the role to have permissions to access any other bucket, for that matter, including the one that contains all of that old archived data.
238 00:29:50.480 ⇒ 00:29:56.279 Katherine Bayless: Ultimately, the webhooks database will just get deleted after CES. It was just quick and dirty, but…
239 00:29:56.630 ⇒ 00:29:57.210 Uttam Kumaran: Okay.
240 00:29:58.200 ⇒ 00:30:20.560 Katherine Bayless: But anyway, so yeah, so all of those tables, they currently live in Postgres, but they’re essentially one-to-one with the flat file I’m grabbing from the different systems. Like, none of it’s actually integrated. I go and I grab, you know, 8 or 10 flat files, and then I import them into Postgres, run the code, and send it on. So instead of importing into Postgres, I could just park those flat files in an S3 bucket, and then they’d be available to Snowflake.
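Parking the daily flat files in S3 could look roughly like the Python sketch below. It assumes an object that behaves like a boto3 S3 client (only its `upload_file(Filename, Bucket, Key)` method is used); the function name, the bucket, and the date-partitioned `invites/` prefix are hypothetical choices, not anything agreed on the call.

```python
from datetime import date
from pathlib import Path


def upload_daily_files(s3_client, bucket: str, files: list[Path], run_date: date) -> list[str]:
    """Upload each flat file under a date-partitioned prefix and return the keys written.

    `s3_client` is expected to behave like a boto3 S3 client.
    """
    keys = []
    for path in files:
        # Hypothetical layout: invites/YYYY/MM/DD/<file name>
        key = f"invites/{run_date:%Y/%m/%d}/{path.name}"
        s3_client.upload_file(str(path), bucket, key)
        keys.append(key)
    return keys
```

Passing the client in as an argument keeps the function easy to exercise with a stand-in object before any AWS credentials are involved.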
241 00:30:21.540 ⇒ 00:30:28.240 Ashwini Sharma: Oh, sorry, I didn’t follow something here. This is coming directly from this table, right? How did you load data in this table?
242 00:30:28.240 ⇒ 00:30:28.890 Katherine Bayless: Something…
243 00:30:28.890 ⇒ 00:30:29.949 Ashwini Sharma: from the stage.
244 00:30:30.280 ⇒ 00:30:47.079 Katherine Bayless: Yeah, so I… I mean, I don’t have the commands in front of me anymore. I mean, since I’m new to Snowflake, I mean, maybe I didn’t do it right, but yeah, so I did the, like, copy data from stage command to create the table, and the table just had one row, or one column with the nested JSON in it.
245 00:30:47.080 ⇒ 00:30:50.300 Katherine Bayless: And then I created the view to unnest the JSON.
246 00:30:50.300 ⇒ 00:30:53.359 Ashwini Sharma: Okay, okay, okay, got it, got it.
247 00:30:53.560 ⇒ 00:31:00.410 Katherine Bayless: Yeah, I didn’t set up, like, Snowpipe or anything, so, like, in order to get the latest data in, I’ll have to run the copy from command again.
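The load pattern described above (copy staged JSON into a one-column table, then a view that unnests it) can be sketched as two Snowflake statements. The Python helpers below only build the SQL text; every table, stage, view, column, and field name passed in is a placeholder, since the actual commands were not shown on the call. Names are interpolated directly, so this is not suitable for untrusted input.

```python
def copy_into_sql(table: str, stage: str) -> str:
    """COPY INTO statement that loads staged JSON files into a one-column table."""
    return f"COPY INTO {table} FROM @{stage} FILE_FORMAT = (TYPE = 'JSON')"


def unnest_view_sql(view: str, table: str, column: str, fields: list[str]) -> str:
    """CREATE VIEW statement that pulls top-level JSON fields out of a VARIANT column."""
    selects = ",\n    ".join(f"{column}:{field}::string AS {field}" for field in fields)
    return f"CREATE OR REPLACE VIEW {view} AS\nSELECT\n    {selects}\nFROM {table}"
```

Without Snowpipe, the `COPY INTO` statement is the part that has to be re-run (manually or on a schedule) to pick up new files; the view then reflects the new rows on its own.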
248 00:31:00.820 ⇒ 00:31:04.020 Ashwini Sharma: And this table is one of these, somewhere?
249 00:31:04.020 ⇒ 00:31:15.230 Katherine Bayless: No, no, it’s irrelevant to this process, it’s just, I would say, proof that it does work, and so all of the tables in this process, I could put in an S3 bucket and add to that integration.
250 00:31:15.900 ⇒ 00:31:22.850 Ashwini Sharma: Awesome, yeah. Okay, and then we can take it forward from there. But what I wanted to mention was, once
251 00:31:22.980 ⇒ 00:31:26.869 Ashwini Sharma: We have, transformations up to this point.
252 00:31:26.930 ⇒ 00:31:27.950 Katherine Bayless: Right.
253 00:31:28.010 ⇒ 00:31:32.510 Ashwini Sharma: We can put it back into S3 in a different stage.
254 00:31:32.810 ⇒ 00:31:36.659 Ashwini Sharma: Right? And then from S3 to FTP,
255 00:31:36.990 ⇒ 00:31:39.839 Ashwini Sharma: We’ll need a different approach, right?
256 00:31:40.030 ⇒ 00:31:40.620 Katherine Bayless: Yeah.
257 00:31:41.390 ⇒ 00:31:44.570 Ashwini Sharma: I’m not sure if we can use the Polytomic solution.
258 00:31:44.720 ⇒ 00:31:47.319 Ashwini Sharma: From Snowflake to S3, we’ll have to explore that.
259 00:31:47.820 ⇒ 00:31:52.590 Ashwini Sharma: But, like, from Snowflake to S3, that is something that is doable.
260 00:31:52.880 ⇒ 00:32:12.269 Katherine Bayless: Okay, okay. Yeah, I mean, I… it’s funny, I have… have written Glue jobs to put data up on FTP servers before, so I… I could… I could do it in Glue, I just haven’t, because, well, Glue’s kind of a pain. And then I saw that Polytomic had the SFTP connector, and I was like, oh, maybe we could just use that to take it from either S3 or Snowflake.
261 00:32:12.270 ⇒ 00:32:14.960 Katherine Bayless: Onto the Marketing Cloud FTP.
262 00:32:18.000 ⇒ 00:32:18.620 Uttam Kumaran: Yeah…
263 00:32:19.790 ⇒ 00:32:38.020 Katherine Bayless: Because basically, what happens at the end of this process, there’s a bunch of… and I should probably give you guys the DDL for the views, but there are views that I export. Those are the files that I take to Marketing Cloud, and so there’s 5 views for Marketing Cloud that I put on their FTP.
264 00:32:38.260 ⇒ 00:32:49.200 Katherine Bayless: And then once they’re on the FTP, everything else is automated to ingest them into Marketing Cloud, process them, populate all the data extensions, etc, etc. So, like, once they hit that FTP, my hands are done.
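If the final hop is scripted rather than handled by a connector, the upload step could be as small as the Python sketch below. It assumes an object that behaves like paramiko's `SFTPClient` (only its `put(localpath, remotepath)` method is used) and that the exported views already exist as CSV files in a local directory; the function name, the CSV assumption, and the remote directory are hypothetical.

```python
from pathlib import Path


def push_exports(sftp, export_dir: Path, remote_dir: str) -> list[str]:
    """Upload every CSV in export_dir and return the remote paths written.

    `sftp` is expected to behave like paramiko's SFTPClient.
    """
    written = []
    for path in sorted(export_dir.glob("*.csv")):
        remote_path = f"{remote_dir.rstrip('/')}/{path.name}"
        sftp.put(str(path), remote_path)
        written.append(remote_path)
    return written
```

Connection setup, credentials, and retries are left out on purpose; those depend on how the Marketing Cloud FTP account is configured.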
265 00:32:49.200 ⇒ 00:33:01.630 Katherine Bayless: And then there’s a few other views that I send over to the registration vendor’s FTP, which I think I will continue doing manually, because they just seem to be, like.
266 00:33:02.040 ⇒ 00:33:11.319 Katherine Bayless: Yeah, it’s the… one of them doesn’t necessarily update that often, the other one should, but people are constantly doing the, like, oh, before you send it, before you send it! And I’m like, okay.
267 00:33:11.480 ⇒ 00:33:16.110 Katherine Bayless: Next year, we won’t have that, but at this point, it’s too late for me to be like, you know…
268 00:33:16.420 ⇒ 00:33:17.210 Uttam Kumaran: Yeah.
269 00:33:17.500 ⇒ 00:33:21.840 Katherine Bayless: Although, I don’t know, we’ll see, maybe I’ll give in. It’s just kind of annoying.
270 00:33:36.750 ⇒ 00:33:38.720 Uttam Kumaran: Any other questions, Ashwini?
271 00:33:38.950 ⇒ 00:33:43.329 Ashwini Sharma: I think, I’m good right now.
272 00:33:44.170 ⇒ 00:33:49.650 Ashwini Sharma: So, the current scope is do the transformations that are there in this file.
273 00:33:49.760 ⇒ 00:33:56.749 Ashwini Sharma: And export the data in S3 out of these tables that we have updated, right? Updated or inserted.
274 00:33:57.350 ⇒ 00:34:08.779 Katherine Bayless: Yeah, I’ll give you… yeah, because that is kind of the missing piece in here, is the views that wind up being what I export. So I can give you the DDL for those views. [Ashwini Sharma: Yeah, sure, sure.] That’s kind of, yeah, the final shape of the data.
275 00:34:08.820 ⇒ 00:34:24.879 Katherine Bayless: And then, I guess what I can do, too, is I’ll create an S3 bucket, and I’ll drop in… I mean, I have to do this today at 5 o’clock. So, like, I can drop in today’s files into the S3 bucket, and then you would have exactly the same sort of data that I would have starting this.
276 00:34:25.820 ⇒ 00:34:26.840 Ashwini Sharma: charcoal.
277 00:34:28.130 ⇒ 00:34:31.890 Uttam Kumaran: You wanna… are you gonna end up moving this, Ashwini, to dbt?
278 00:34:32.630 ⇒ 00:34:33.290 Ashwini Sharma: Yes.
279 00:34:33.659 ⇒ 00:34:34.269 Uttam Kumaran: Okay.
280 00:34:40.549 ⇒ 00:34:41.419 Uttam Kumaran: Okay.
281 00:34:44.219 ⇒ 00:34:44.969 Katherine Bayless: Yeah.
282 00:34:45.359 ⇒ 00:34:55.369 Katherine Bayless: It means, yeah, the invite process, generally, there’s a lot of appetite to overhaul it, but for the moment, this is… this is what we got.
283 00:34:57.210 ⇒ 00:35:04.060 Uttam Kumaran: So maybe, Ashwini, once we have that there, we can also discuss, like, how we want to run the dbt jobs.
284 00:35:04.250 ⇒ 00:35:14.519 Uttam Kumaran: On a schedule, like, that’s the next decision for us to make, whether we want to do dbt Cloud, or we want to execute, like, within Snowflake as dbt.
285 00:35:14.650 ⇒ 00:35:20.179 Uttam Kumaran: like, deep… deeper functionality, so I think this is a good use case, because we… the other…
286 00:35:20.560 ⇒ 00:35:22.820 Uttam Kumaran: Stuff we’re modeling hasn’t had, like, a…
287 00:35:23.110 ⇒ 00:35:25.539 Uttam Kumaran: Time, like, orchestration requirement yet, so…
288 00:35:25.860 ⇒ 00:35:26.410 Katherine Bayless: It is.
289 00:35:26.410 ⇒ 00:35:28.339 Uttam Kumaran: That’s a decision we can also make next week.
290 00:35:29.910 ⇒ 00:35:30.479 Uttam Kumaran: We can…
291 00:35:30.480 ⇒ 00:35:31.050 Ashwini Sharma: Michelle, yeah.
292 00:35:31.050 ⇒ 00:35:37.339 Uttam Kumaran: Like, dbt is open source, we can run this in many ways, like, we don’t have to go through cloud, but it would be good to talk through the options.
293 00:35:38.500 ⇒ 00:35:39.150 Katherine Bayless: Yeah.
294 00:35:40.140 ⇒ 00:35:44.950 Katherine Bayless: Yeah, I mean, Zia, I would say… Yeah, like…
295 00:35:46.640 ⇒ 00:36:02.349 Katherine Bayless: this thing runs every day, so not too difficult, probably, to set up a pipeline for it. Like, there’s not a ton of, like, nuance. Like, it doesn’t need to be event-triggered or anything like that, it’s just a cron job, essentially, right? I’m just a human cron job right now. But
296 00:36:02.350 ⇒ 00:36:11.259 Katherine Bayless: But yeah, I think in terms of orchestrating, like, other data refreshes, and that’ll be… that’ll be an interesting kind of thing to figure out, because we have so many systems that are…
297 00:36:11.280 ⇒ 00:36:21.110 Katherine Bayless: important, but also only really in use for, like, part of the year, versus then there’ll be some things that might be, like, yeah, it makes sense to ingest every hour or every day.
298 00:36:21.110 ⇒ 00:36:25.170 Uttam Kumaran: Yeah, so if there’s stuff that… if there’s… if there’s,
299 00:36:25.680 ⇒ 00:36:28.280 Uttam Kumaran: You know, jobs that are,
300 00:36:29.550 ⇒ 00:36:41.870 Uttam Kumaran: like, outside of just dbt, like, we need to trigger Python workloads or webhooks, then we should consider actually doing this, like, in Glue or… or something where we can orchestrate multiple, multiple things.
301 00:36:42.450 ⇒ 00:36:45.019 Uttam Kumaran: You know.
302 00:36:45.500 ⇒ 00:36:46.969 Katherine Bayless: Yeah, right now it’s…
303 00:36:47.090 ⇒ 00:36:54.330 Katherine Bayless: Right now, it’s all very cron-y type, but I think as we get our feet under us, we’ll be able to find better patterns.
304 00:36:55.980 ⇒ 00:36:57.030 Uttam Kumaran: Okay, okay.
305 00:37:00.010 ⇒ 00:37:00.500 Uttam Kumaran: Cool.
306 00:37:00.500 ⇒ 00:37:05.849 Katherine Bayless: And if this is, like, too overwhelming to, like, pull together, like, I mean, at the end.
307 00:37:05.850 ⇒ 00:37:12.789 Uttam Kumaran: No, I actually think it’s helpful, because we’re gonna… it’s just gonna front-load us making infra decisions to accomplish it, so…
308 00:37:12.790 ⇒ 00:37:13.280 Katherine Bayless: True.
309 00:37:13.280 ⇒ 00:37:20.550 Uttam Kumaran: I… that’s what I… I know, it’s just… depending on, you know, for us, like, we go to some clients where it’s really just, like.
310 00:37:20.670 ⇒ 00:37:27.189 Uttam Kumaran: okay, plan like a whole, everything gets a requirement, everything gets planned out. We’re also in situations where
311 00:37:27.560 ⇒ 00:37:34.290 Uttam Kumaran: okay, let’s… we have to do both. Like, we’re both planning for the future, but there are immediate wins that we can get.
312 00:37:34.430 ⇒ 00:37:38.850 Uttam Kumaran: And so, that’s… that’s just the balance for us, you know? Yeah.
313 00:37:39.270 ⇒ 00:37:43.819 Katherine Bayless: Yeah. Yeah, I think we’re kind of in that spot, yeah.
314 00:37:44.250 ⇒ 00:37:48.579 Uttam Kumaran: Which is good. I mean, I prefer that. Some people really press us on…
315 00:37:48.740 ⇒ 00:38:04.580 Uttam Kumaran: tons and tons and tons of planning, and we end up, like, building a lot of documentation, but, I think some people, especially the folks that just don’t have… haven’t worked with the data team or built it out, they… they really need a ton of hand-holding, and they’re like.
316 00:38:04.750 ⇒ 00:38:10.179 Uttam Kumaran: don’t even touch anything until we, like, approve everything. I’m like, okay, we can go, we can do that. We can go.
317 00:38:10.180 ⇒ 00:38:17.510 Katherine Bayless: I mean, it’s funny, it is probably, like, I am probably the bridge, between… Yes? Yeah, because there’s a lot of that, there’s a lot of that internally.
318 00:38:17.510 ⇒ 00:38:22.680 Uttam Kumaran: Which is great, because, like, I don’t know, it takes trust to, like, do both, you know, so…
319 00:38:23.020 ⇒ 00:38:23.650 Katherine Bayless: Yeah.
320 00:38:23.960 ⇒ 00:38:32.139 Katherine Bayless: Yeah, I mean, people get so excited that, like, the handful of things that I have made better, and so it’s like, yeah, the quick wins, they definitely build political capital.
321 00:38:32.440 ⇒ 00:38:33.050 Uttam Kumaran: Okay.
322 00:38:33.890 ⇒ 00:38:34.630 Katherine Bayless: Yeah.
323 00:38:37.050 ⇒ 00:38:38.420 Uttam Kumaran: Okay, perfect.
324 00:38:39.000 ⇒ 00:38:57.370 Uttam Kumaran: All right, so then I’ll follow up, probably Tuesday with some notes on the next discovery phase. I’m gonna update the Gantt chart as well today. I think, Ashwini, maybe if you want to follow up on this work stream and just thinking about dbt orchestration.
325 00:38:57.560 ⇒ 00:39:02.080 Uttam Kumaran: And then, Katherine, I’ll send this also in Slack, but if we can get
326 00:39:02.200 ⇒ 00:39:04.359 Uttam Kumaran: access to Power BI and Asana, that would be
327 00:39:04.360 ⇒ 00:39:05.470 Katherine Bayless: Yep.
328 00:39:05.470 ⇒ 00:39:07.000 Uttam Kumaran: helpful.
329 00:39:07.000 ⇒ 00:39:07.520 Katherine Bayless: Thank you.
330 00:39:08.710 ⇒ 00:39:12.309 Ashwini Sharma: How about we start with normal GitHub actions?
331 00:39:13.650 ⇒ 00:39:20.639 Uttam Kumaran: Yeah, and later… Check out the Snowflake dbt stuff, because it came out this year.
332 00:39:20.640 ⇒ 00:39:21.360 Katherine Bayless: Yeah, yeah.
333 00:39:21.360 ⇒ 00:39:24.820 Uttam Kumaran: I want to see… because GitHub Actions is just, like, a little finicky.
334 00:39:24.820 ⇒ 00:39:25.440 Katherine Bayless: It is.
335 00:39:25.470 ⇒ 00:39:34.340 Uttam Kumaran: So, if the dbt… if we can run the dbt jobs in Snowflake, it could be, like, a really big win, so…
336 00:39:34.650 ⇒ 00:39:35.390 Katherine Bayless: Yeah.
337 00:39:35.800 ⇒ 00:39:55.139 Katherine Bayless: Yeah, I have light experience with GitHub Actions. I mean, like, for example, on commit, like, this repo pushes everything up to an S3 bucket and, like, that kind of stuff. Actually, I think maybe I’ve got a few other magic tricks in there. I think it zips my Python notebook files or something like that, but yeah, so I don’t mind doing things in GitHub Actions, but fiddly is a good word.
338 00:39:56.020 ⇒ 00:39:56.670 Uttam Kumaran: Yeah.
339 00:39:57.530 ⇒ 00:40:14.899 Katherine Bayless: Yeah, the more we can lean on a very, like, frugal tech stack, initially, the better, because, I mean, it’s just… yeah, the sticker shock is definitely real, and I know eventually we will continue to change that, but yeah, the more we can, like, squeeze value out of the tools we have, the better.
340 00:40:16.130 ⇒ 00:40:16.960 Uttam Kumaran: Perfect.
341 00:40:16.960 ⇒ 00:40:17.720 Katherine Bayless: Yeah.
342 00:40:17.720 ⇒ 00:40:21.279 Uttam Kumaran: Yeah, also, that’s why I want to see how much we could stay within the ecosystem.
343 00:40:21.730 ⇒ 00:40:22.350 Uttam Kumaran: Yeah.
344 00:40:22.350 ⇒ 00:40:38.799 Katherine Bayless: Yeah. Yeah, actually, also, not for nothing, but this particular job will give you a decent amount, or a decent sense of the volume stuff that Polytomic is looking for, because these are going to be some of our bigger data sources that we’re processing on a, like, regular basis.
345 00:40:39.920 ⇒ 00:40:40.590 Uttam Kumaran: Okay.
346 00:40:42.200 ⇒ 00:40:43.230 Uttam Kumaran: Perfect.
347 00:40:45.300 ⇒ 00:40:54.659 Katherine Bayless: Okay, so I’ll get Asana, Power BI, I’ll get the DDL documentation, I will make a note of which systems have entities in them.
348 00:40:54.930 ⇒ 00:41:00.120 Katherine Bayless: and… then we should be good.
349 00:41:01.340 ⇒ 00:41:03.679 Ashwini Sharma: I have one more question.
350 00:41:04.160 ⇒ 00:41:08.230 Ashwini Sharma: This data that the script is transforming.
351 00:41:08.800 ⇒ 00:41:15.579 Ashwini Sharma: This is dynamic data, right? You’re running the script every day to generate the output, and…
352 00:41:15.870 ⇒ 00:41:22.010 Ashwini Sharma: What do you have in place that moves the data from S3 to Snowflake on a daily basis?
353 00:41:23.220 ⇒ 00:41:25.790 Katherine Bayless: Oh, so right now, it’s just me, the human.
354 00:41:27.290 ⇒ 00:41:28.050 Ashwini Sharma: Okay.
355 00:41:28.270 ⇒ 00:41:31.099 Katherine Bayless: Yeah, that’s why I was saying, like, oh, I think we could actually automate this.
356 00:41:31.100 ⇒ 00:41:34.420 Uttam Kumaran: Yeah, that’s why we’re gonna put it in here. It’s not been executed.
357 00:41:35.140 ⇒ 00:41:36.050 Ashwini Sharma: Yeah.
358 00:41:36.050 ⇒ 00:41:48.899 Katherine Bayless: Yeah. Yeah, yeah, so right now, I go to a bunch of different places, and I download flat files, and then I import them to Postgres, run this script, export the views, and put that data up on the FTP servers, so…
359 00:41:53.760 ⇒ 00:41:54.480 Ashwini Sharma: Alright.
360 00:41:56.610 ⇒ 00:42:03.600 Uttam Kumaran: Cool. Okay, so let me… I’ll go ahead, and then I’ll probably find time, Katherine, for maybe Tuesday afternoon.
361 00:42:03.600 ⇒ 00:42:04.400 Katherine Bayless: Okay.
362 00:42:04.640 ⇒ 00:42:07.239 Uttam Kumaran: So I’ll just put a… I’ll just put a meeting there.
363 00:42:07.240 ⇒ 00:42:20.120 Katherine Bayless: Actually, Tuesday might be dicey. Okay. I think I was looking at my calendar, somebody else wanted a Tuesday spot. Actually, yeah, after 3pm, I’m fine. But yeah, Tuesday before 3pm, I’m, like, double-booked all over the place.
364 00:42:20.120 ⇒ 00:42:25.640 Uttam Kumaran: Okay, I will aim for… for that, and then, yeah, me and Sam will be there, and we can talk through…
365 00:42:25.640 ⇒ 00:42:26.210 Katherine Bayless: Okay.
366 00:42:30.560 ⇒ 00:42:34.609 Uttam Kumaran: Okay, awesome. Well, have a great weekend. Enjoy.
367 00:42:34.930 ⇒ 00:42:39.229 Katherine Bayless: Yeah, you guys, too. Yeah, I know, right? Yeah.
368 00:42:39.400 ⇒ 00:42:42.680 Katherine Bayless: And if you have any questions at all, you know where to find me. Don’t be shy.
369 00:42:43.030 ⇒ 00:42:43.750 Ashwini Sharma: So…
370 00:42:44.730 ⇒ 00:42:46.020 Katherine Bayless: Thank you both. Thanks.
371 00:42:46.130 ⇒ 00:42:46.960 Katherine Bayless: Cheers.