Meeting Title: Brainforge dbt Training and Demo Date: 2025-12-19 Meeting participants: Ashwini Sharma, Kyle Wandel, Chi Quinn, Katherine Bayless


WEBVTT

1 00:00:05.890 00:00:07.140 Ashwini Sharma: Hey, Kyle.

2 00:00:07.460 00:00:08.719 Kyle Wandel: Hey, how’s it going?

3 00:00:09.210 00:00:10.339 Ashwini Sharma: Good, how are you?

4 00:00:10.620 00:00:11.699 Kyle Wandel: Not too bad.

5 00:00:14.650 00:00:19.240 Ashwini Sharma: Hi, how do you pronounce your name? G, or is it Kai?

6 00:00:19.390 00:00:20.700 Chi Quinn: It’s Kai.

7 00:00:20.700 00:00:21.910 Ashwini Sharma: Yeah.

8 00:00:22.430 00:00:23.490 Ashwini Sharma: Okay.

9 00:00:24.290 00:00:25.000 Chi Quinn: Hi!

10 00:00:25.420 00:00:28.639 Ashwini Sharma: Is Catherine going to join?

11 00:00:30.880 00:00:36.929 Kyle Wandel: I would imagine so. I haven’t heard that she’s not, but she’s not online right now. I can ping her and see if she responds.

12 00:00:53.580 00:00:57.150 Kyle Wandel: Just got online, so yeah, she’s typing me now.

13 00:01:00.260 00:01:03.060 Ashwini Sharma: Let me separate the windows.

14 00:01:03.660 00:01:06.500 Kyle Wandel: Yeah, she said she’ll be here in, like, 5 to 10 minutes, so…

15 00:01:06.500 00:01:07.250 Ashwini Sharma: Okay.

16 00:01:07.700 00:01:13.520 Kyle Wandel: It’s one of those things, I think, like she said, we’re always… we keep putting up, fires, basically.

17 00:01:16.440 00:01:22.230 Kyle Wandel: I think that the Texas thing, I don’t know if you were in the call earlier, but the Hill, Texas,

18 00:01:23.790 00:01:29.019 Kyle Wandel: Not confidentiality, but data protection laws, are throwing a wrench in things right now.

19 00:01:57.890 00:02:00.740 Ashwini Sharma: So, Kai, are you familiar with SQL?

20 00:02:01.720 00:02:02.360 Kyle Wandel: Yeah, I’m pretty…

21 00:02:02.360 00:02:03.200 Chi Quinn: SQL.

22 00:02:03.200 00:02:03.610 Ashwini Sharma: -Oh.

23 00:02:03.610 00:02:04.880 Chi Quinn: Kyle.

24 00:02:04.880 00:02:06.639 Ashwini Sharma: Oh, no, that’s good.

25 00:02:06.640 00:02:07.640 Kyle Wandel: I think we’re both pretty…

26 00:02:07.640 00:02:08.919 Ashwini Sharma: Oh, okay.

27 00:02:09.340 00:02:16.959 Kyle Wandel: I think we’re both pretty familiar. I mean, I’ve used it since I was in school, so I’ve used it for over 15 years now, so…

28 00:02:16.960 00:02:17.720 Ashwini Sharma: Good, good.

29 00:02:17.720 00:02:32.819 Kyle Wandel: I’m pretty good with SQL. I think what I’m not as good at is, like, the setting up of the environments, like, the actual, like, engineering background of it. But I know R really well. I don’t know Python as well, but I know R and SQL are… I know them quite well.

30 00:02:33.010 00:02:34.300 Ashwini Sharma: Okay, cool.

31 00:02:37.070 00:02:41.260 Ashwini Sharma: Yeah, we’ll need a little bit of Python knowledge to work with dbt.

32 00:02:41.380 00:02:46.279 Ashwini Sharma: Minimal, not that expensive. I mean…

33 00:02:46.280 00:02:46.600 Kyle Wandel: I’m…

34 00:02:46.600 00:02:47.669 Ashwini Sharma: That’s right, yeah.

35 00:02:48.520 00:03:03.860 Kyle Wandel: Yeah, and I… since because I think, you know, we both know some programming languages, it’s not too difficult using, like, AI and other tools that can help do that. That cursor tool is interesting. I was just taking a look at it after you guys were talking about it.

36 00:03:04.260 00:03:05.060 Ashwini Sharma: Oh, okay.

37 00:03:05.280 00:03:09.350 Ashwini Sharma: Yeah, we’ll try to utilize that in the demo as well.

38 00:03:09.600 00:03:12.169 Kyle Wandel: Do you, so do you use the cursor?

39 00:03:12.170 00:03:13.319 Ashwini Sharma: Oh, a lot.

40 00:03:13.570 00:03:19.809 Kyle Wandel: Nice, okay. Yeah, it’s interesting how it, like, will, like, auto, auto-completes based on, other things.

41 00:03:20.480 00:03:26.800 Ashwini Sharma: it’s not just autocomplete, but sometimes it gets irritating, right? I mean… it kind of.

42 00:03:26.800 00:03:27.160 Kyle Wandel: Yeah, you know.

43 00:03:27.160 00:03:36.309 Ashwini Sharma: I’m sort of using… used to use tabs a lot, and then suddenly I push a tab, and then it adds this much of code, right?

44 00:03:36.830 00:03:42.780 Ashwini Sharma: That’s the frustrating part of using it, but other than that, it’s very helpful.

45 00:03:44.140 00:03:46.579 Kyle Wandel: Yeah, it’ll be good to know more about it.

46 00:03:47.980 00:03:55.760 Ashwini Sharma: Utam is going to do a session on Cursor someday. In today’s session, maybe I’ll utilize cursor to, you know.

47 00:03:55.940 00:03:59.519 Ashwini Sharma: Populate some of the files that we’ll be using.

48 00:04:00.240 00:04:08.690 Kyle Wandel: No, I think, yeah, this is a perfect meeting in general, just learning more about the dbt model and how, like, basically to not just create it, but how to, like, edit it and stuff, modify it.

49 00:04:09.080 00:04:09.680 Ashwini Sharma: Show.

50 00:04:22.670 00:04:27.639 Ashwini Sharma: Wait a second, I don’t know if this is a recording? Oh, it is already recording, yeah, good.

51 00:04:27.640 00:04:29.179 Kyle Wandel: Yeah, I think it is, yeah.

52 00:04:34.260 00:04:37.330 Kyle Wandel: Have you had a fun time looking at the data?

53 00:04:37.910 00:04:40.540 Ashwini Sharma: Oh, yes, oh yes.

54 00:04:40.810 00:04:43.030 Kyle Wandel: How un… how unstructured it is?

55 00:04:43.420 00:04:47.590 Ashwini Sharma: No, it follows the structure. Initially, like, my understanding was wrong.

56 00:04:47.740 00:04:50.810 Ashwini Sharma: But… but it does follow some… some structure.

57 00:04:51.080 00:05:03.229 Kyle Wandel: That is true, and that’s what actually I was telling, Kai and Kath when I first joined, too. It was interesting that, it has, like, it’s not as complete a mess as you originally thought, it’s just really siloed. Like, there’s just no, like, communication between the

58 00:05:03.910 00:05:05.080 Kyle Wandel: Right.

59 00:05:35.230 00:05:37.010 Ashwini Sharma: Are you all in Eastern Time?

60 00:05:39.230 00:05:40.070 Chi Quinn: Yes.

61 00:05:40.940 00:05:41.330 Ashwini Sharma: Cool.

62 00:05:41.330 00:05:41.890 Kyle Wandel: Yep.

63 00:05:42.090 00:05:44.369 Kyle Wandel: Merola, where are you located at?

64 00:05:45.020 00:05:45.809 Ashwini Sharma: in India.

65 00:05:46.050 00:05:47.720 Kyle Wandel: Oh, in India. Oh, nice!

66 00:05:48.090 00:05:50.330 Kyle Wandel: So this is… what time is it there for you right now?

67 00:05:50.490 00:05:56.430 Ashwini Sharma: It’s, half an hour past midnight, 12… 12.36 right now.

68 00:05:56.880 00:05:59.060 Kyle Wandel: Jeez, are you sorry, are you used to working at…

69 00:05:59.060 00:06:01.310 Ashwini Sharma: Yeah, yeah, I’m used to it.

70 00:06:01.310 00:06:02.080 Kyle Wandel: Oh, yeah.

71 00:06:02.800 00:06:05.360 Kyle Wandel: When have you… how long have you been working for Brainforge?

72 00:06:05.780 00:06:07.830 Ashwini Sharma: About a month now.

73 00:06:07.940 00:06:13.449 Ashwini Sharma: Initially, yeah, this is the first week that I’m doing 40 hours. Earlier, it was…

74 00:06:13.730 00:06:15.530 Ashwini Sharma: 20 hours a week, so…

75 00:06:15.530 00:06:16.240 Kyle Wandel: Yeah.

76 00:06:16.830 00:06:24.379 Kyle Wandel: That’s pretty sweet. Are you just more of, like, a consultant for them, or are you more so… are you a full-time employee? You just go into full-time?

77 00:06:24.380 00:06:28.260 Ashwini Sharma: Yeah, now it is 40 hours, so it’s more like full-time.

78 00:06:28.260 00:06:28.930 Kyle Wandel: Yeah.

79 00:06:31.370 00:06:34.080 Kyle Wandel: That’s pretty sweet. We work fast, so we appreciate it.

80 00:06:34.080 00:06:37.480 Ashwini Sharma: Yeah, yeah, people do really fast stuff over here.

81 00:06:43.040 00:06:55.170 Kyle Wandel: It’ll be interesting to see how you guys tackle this member engagement thing. I’ve created a couple things to send you your way, but, so you can take a look at it, let me know if you have any questions, but it’ll be interesting what you guys come up with.

82 00:06:55.400 00:06:56.080 Ashwini Sharma: Sure.

83 00:06:56.310 00:07:02.380 Ashwini Sharma: And this member engagement thing, this is derived out of the data that’s there in the remembers,

84 00:07:02.750 00:07:05.970 Ashwini Sharma: Data source, as well as some other data sources that.

85 00:07:05.970 00:07:20.229 Kyle Wandel: Yeah, it’ll be some other data sources as well, so I’ll give you the thing, but it really is, it’s like a… it’s touch points of, like, all of our different, activities that CTA has, so, there will be some flat… not some flat files, but some old files that either, A, need to be updated.

86 00:07:20.230 00:07:30.279 Kyle Wandel: Or B, just find out a new process in order to make them automatic. But there will be some flat files I’ll have to give you, especially for certain areas within the organization.

87 00:07:31.700 00:07:35.189 Kyle Wandel: And I think the biggest… the biggest thing I was gonna say is that, just…

88 00:07:36.100 00:07:41.520 Kyle Wandel: you can go one of two ways, which is A, A, recreate what, like, shift and lift, basically, or just…

89 00:07:41.710 00:07:50.010 Kyle Wandel: start from scratch and start using the new, systems, but, I think I talked to Catherine in terms of, like, what the game plan is gonna be.

90 00:07:50.370 00:07:50.950 Ashwini Sharma: Okay.

91 00:09:02.340 00:09:04.960 Kyle Wandel: Alright, she’s back online, but she’ll join pretty soon.

92 00:12:25.620 00:12:31.670 Kyle Wandel: Alright, I just pinged her one more time, and if she just says start without her, we can get started, but I’ll see what she says.

93 00:12:35.430 00:12:36.550 Ashwini Sharma: Gotcha, yeah.

94 00:12:41.860 00:12:43.149 Ashwini Sharma: Oh, that’s yours?

95 00:12:44.240 00:12:45.030 Kyle Wandel: There she is, yeah.

96 00:12:48.400 00:12:51.520 Katherine Bayless: There we go. Hello, sorry to be late, but…

97 00:12:51.520 00:12:52.470 Ashwini Sharma: No problem.

98 00:12:52.710 00:12:53.650 Katherine Bayless: Domaine.

99 00:12:54.230 00:12:57.609 Ashwini Sharma: Alright, okay, let me start then.

100 00:12:57.820 00:12:59.370 Ashwini Sharma: By sharing my screen.

101 00:13:05.580 00:13:08.659 Ashwini Sharma: Alright, let me know what you’re able to see.

102 00:13:09.520 00:13:12.240 Kyle Wandel: Yep, we can see the, PowerPoint.

103 00:13:12.540 00:13:13.340 Ashwini Sharma: Okay.

104 00:13:13.620 00:13:18.619 Ashwini Sharma: And, how about now? Can you see my terminal?

105 00:13:19.180 00:13:19.930 Kyle Wandel: Yeah.

106 00:13:20.560 00:13:23.530 Ashwini Sharma: Let me make the window a little bit smaller.

107 00:13:26.900 00:13:27.660 Ashwini Sharma: Okay.

108 00:13:38.870 00:13:42.639 Ashwini Sharma: Is this font okay, or should I just increase it a little bit?

109 00:13:44.350 00:13:45.000 Kyle Wandel: They’ll increase it.

110 00:13:45.000 00:13:46.829 Ashwini Sharma: Yeah, I was gonna say.

111 00:13:47.050 00:13:48.200 Katherine Bayless: I have old person eyes.

112 00:13:51.330 00:13:55.639 Ashwini Sharma: What is the shortcut to increase the fund only over here?

113 00:13:55.640 00:14:00.680 Katherine Bayless: I think you can do, like, Ctrl-Shift, or Apple key shift plus, or whatever.

114 00:14:00.680 00:14:02.180 Ashwini Sharma: Yeah, it didn’t work.

115 00:14:02.180 00:14:04.170 Katherine Bayless: Oh, oh, wow. Oh, that’s funny.

116 00:14:07.380 00:14:12.740 Katherine Bayless: Well, I can make two. I can zoom on Zoom, like, I can use the Zoom feature.

117 00:14:18.790 00:14:20.579 Ashwini Sharma: No, it’s not working.

118 00:14:27.490 00:14:30.429 Ashwini Sharma: Isn’t this the thing to change,

119 00:14:30.620 00:14:31.730 Kyle Wandel: Thank you.

120 00:14:35.110 00:14:37.200 Katherine Bayless: Oh, I wonder if that’s, like, as big as…

121 00:14:37.200 00:14:37.900 Ashwini Sharma: Indeed.

122 00:14:38.830 00:14:39.470 Katherine Bayless: Blue.

123 00:14:51.150 00:14:52.799 Ashwini Sharma: Is this okay? Did it increase?

124 00:14:54.830 00:15:00.510 Katherine Bayless: Hmm, it’s okay, it’s, I mean… Like I said, I’ll just…

125 00:15:00.510 00:15:01.280 Ashwini Sharma: Oh, man.

126 00:15:01.280 00:15:02.919 Katherine Bayless: Zoom in the video.

127 00:15:03.500 00:15:06.009 Kyle Wandel: If I make it, yeah, I can see, I can see one.

128 00:15:06.610 00:15:07.150 Katherine Bayless: Yeah.

129 00:15:13.060 00:15:14.260 Ashwini Sharma: I pulled this thing up.

130 00:15:14.440 00:15:15.310 Ashwini Sharma: So…

131 00:15:17.780 00:15:26.890 Ashwini Sharma: Okay, sorry about that, but I’ll make sure that I type over here. I say what I type over here.

132 00:15:30.120 00:15:33.720 Ashwini Sharma: Okay… Let me open a new terminal window.

133 00:15:36.850 00:15:38.379 Ashwini Sharma: Oh, not as big?

134 00:15:38.890 00:15:39.669 Kyle Wandel: Yep, there you go.

135 00:15:47.010 00:15:48.480 Ashwini Sharma: Alright, okay.

136 00:15:50.490 00:16:03.480 Ashwini Sharma: Okay, so this is going to be a basic one-on-one session on dbt. It’s not an advanced course, so I’ll try to keep it as simple as possible, so that all of us are on the same page, and we can at least begin with dbt, right?

137 00:16:03.720 00:16:11.560 Ashwini Sharma: In this session, what I’m going to cover is what are the prerequisites required for getting on with dbt.

138 00:16:11.660 00:16:19.180 Ashwini Sharma: what exactly is dbt, and then I’ll give a demo, and then we’ll do a walkthrough of the code that I’ve created.

139 00:16:19.330 00:16:26.249 Ashwini Sharma: And maybe, like, from that point onwards, even Kyle and Kai can play around with that thing.

140 00:16:26.810 00:16:31.379 Ashwini Sharma: so, basically, the prerequisites are,

141 00:16:31.720 00:16:34.769 Ashwini Sharma: You need to have a small amount of experience with Python.

142 00:16:35.280 00:16:43.450 Ashwini Sharma: Should be able to use command line tool, experience with Git, GitHub, or any other version control.

143 00:16:43.770 00:16:47.799 Ashwini Sharma: And obviously, SQL, that’s… I should have placed it on the top.

144 00:16:48.120 00:16:53.350 Ashwini Sharma: But yeah, if you know all these four, then I think it’s going to be pretty easy.

145 00:16:54.520 00:17:02.509 Ashwini Sharma: So what exactly is dbt, right? dbt is an open-source tool. It’s a… essentially, it’s a glorified SQL.

146 00:17:02.770 00:17:20.500 Ashwini Sharma: But it makes life very easier when you’re doing lots of data modeling, right? And what it does is, basically, it will run on top of your warehouse, so you have bought the data from different silos into the warehouse, right? So the traditional ETL

147 00:17:21.200 00:17:22.640 Ashwini Sharma: framework.

148 00:17:22.750 00:17:31.760 Ashwini Sharma: That is slightly changed. You bring the data into the warehouse, you load it into the warehouse, and then do the T part, right? So it… the ETL becomes ELT.

149 00:17:32.330 00:17:46.290 Ashwini Sharma: And using this, data analysts can, you know, do data modeling, test it, document the data models that they have created, and without using the, you know, without making it really complex.

150 00:17:46.800 00:18:01.940 Ashwini Sharma: And at the core, what it does is it will write queries, right? The dbt models translate into queries, and these queries do the data transformation. So, basically, it creates a data pipeline where you transform data from

151 00:18:02.070 00:18:10.940 Ashwini Sharma: the raw data that has been ingested into Stage 1, Stage 2, Stage 3, and then finally into MARTS, which will be used for, analysis.

152 00:18:13.660 00:18:18.100 Ashwini Sharma: If you have any questions in between, you know, just feel free to ask.

153 00:18:19.130 00:18:28.750 Ashwini Sharma: Here are some of the terms that will be frequently occurring when we are talking about dbt, right? The first thing is models. Models are basically SQL files.

154 00:18:29.270 00:18:37.660 Ashwini Sharma: Where they have a query written in them, and then it will also contain some of the CTEs, right, common table expressions within it.

155 00:18:37.890 00:18:41.419 Ashwini Sharma: But the final output of a model is a query.

156 00:18:43.080 00:18:47.449 Ashwini Sharma: And each model defines a single transformation step in the entire data pipeline.

157 00:18:48.090 00:18:51.369 Ashwini Sharma: The models refer to each other using refs.

158 00:18:52.770 00:18:59.059 Ashwini Sharma: So there is something called REF, and when one model is dependent on another model.

159 00:18:59.420 00:19:01.310 Ashwini Sharma: It… it will refer to the…

160 00:19:01.450 00:19:09.949 Ashwini Sharma: depending model by using REF, right? So, for example, if a source… if a model is dependent on a source database, right, it’s a source…

161 00:19:10.330 00:19:18.090 Ashwini Sharma: table, it’s not a… not another model. In that case, we’ll use something called source. I’ll show you how we are going to use it.

162 00:19:18.450 00:19:23.710 Ashwini Sharma: Then there are something called macros, right? Macros are basically,

163 00:19:23.860 00:19:27.569 Ashwini Sharma: some reusable pieces of Python code, that is.

164 00:19:27.860 00:19:34.299 Ashwini Sharma: You know, that can be used throughout your models. For example, like, I can give an example.

165 00:19:34.940 00:19:41.900 Ashwini Sharma: Let’s say, we want to, you know… Generate a composite key.

166 00:19:42.220 00:19:53.809 Ashwini Sharma: Right? Consisting of multiple columns, and that’s something like MD5 hash of a set of columns, right? So we could abstract that code into a model.

167 00:19:53.930 00:19:55.360 Ashwini Sharma: And then,

168 00:19:56.620 00:20:11.610 Ashwini Sharma: you know, basically utilize it within our SQL code, right? Recently, I had to create a macro where one of the fields in some table had a pretty long URL, right?

169 00:20:12.370 00:20:20.289 Ashwini Sharma: the ones that have lots of, you know, path parameters, and the idea was, like, I wanted to utilize one of the path parameters.

170 00:20:20.530 00:20:23.489 Ashwini Sharma: So, created a macro that does the parsing.

171 00:20:23.780 00:20:31.729 Ashwini Sharma: So it’s cleaner on the SQL, and it looks easy, right? I mean, it looks really clean to whoever is working on that SQL.

172 00:20:32.440 00:20:37.290 Ashwini Sharma: Seeds are lightweight CSV files. Sometimes when you’re working on Excel, you might

173 00:20:37.470 00:20:51.249 Ashwini Sharma: have a, you know, some table, right? Basically, small table, where… that you’ll use for lookup or mapping. So, if you have a similar kind of situation during your dbt modeling, we can do that via seeds.

174 00:20:51.840 00:21:01.190 Ashwini Sharma: Tests are some of the automated data quality checks that we can create. For example, like, let’s say we have a primary key in a model, and we want to ensure that it’s never null.

175 00:21:02.240 00:21:06.499 Ashwini Sharma: We can have that kind of test, in the… in the test file.

176 00:21:06.690 00:21:14.600 Ashwini Sharma: It can also check for, you know, a range of values, acceptable values, and referential integrity.

177 00:21:16.110 00:21:20.949 Ashwini Sharma: And docs are, basically documentation of models that we…

178 00:21:21.060 00:21:32.250 Ashwini Sharma: That we can add, and basically, they can be served right from local, or we can publish it to some site, and then people can, you know, see the lineage.

179 00:21:32.380 00:21:35.389 Ashwini Sharma: Look at the description of the various models, and so on.

180 00:21:37.810 00:21:50.640 Ashwini Sharma: And where does this fit, right? So, basically, we have the extract, where a tool brings in the data into the warehouse, right? Or a tool starts extracting data from your siloed application.

181 00:21:51.000 00:22:03.489 Ashwini Sharma: and then it loads data into the warehouse, in some database, some schema, and then that’s where… after that point onwards, dbt comes into picture. So it will do all the transformation.

182 00:22:03.690 00:22:06.710 Ashwini Sharma: That is needed for us to analyze the data.

183 00:22:09.600 00:22:11.760 Ashwini Sharma: So this is how,

184 00:22:12.100 00:22:22.230 Ashwini Sharma: data stack for CTA might look like. It’s not yet fully complete, but let’s see some of the data sources, right? We have remembers, and then we have Salesforce.

185 00:22:22.430 00:22:27.759 Ashwini Sharma: And then we have Salesforce Marketing Cloud, and then there is Mattress, and then so on, there are multiple sources, right?

186 00:22:27.870 00:22:35.600 Ashwini Sharma: And these sources, Polyatomic is going to create connectors for these sources. It will bring data into the raw layers in Snowflake.

187 00:22:35.920 00:22:47.569 Ashwini Sharma: And then from raw layer, we’ll convert it into staging, and then maybe intermediate, and then create mods, right? And this is the layer that will be exposed to the analysts.

188 00:22:47.960 00:22:51.970 Ashwini Sharma: Using, tools like Looker, or Sigma, or Power BI.

189 00:22:57.250 00:23:04.289 Ashwini Sharma: Okay, so that sort of, you know, is the theory on dbt, so we’ll jump over to demo.

190 00:23:10.190 00:23:15.859 Ashwini Sharma: So, like, as I said, right, dbt works, it’s SQL file,

191 00:23:16.110 00:23:20.320 Ashwini Sharma: I mean, under the hood. And SQL file has to execute on a warehouse.

192 00:23:20.870 00:23:24.449 Ashwini Sharma: So there is some kind of a connectivity between,

193 00:23:24.600 00:23:35.530 Ashwini Sharma: dbt and the warehouse, right? And that connectivity is provided via adapters, right? So when we run, dbt code in the local laptop, right.

194 00:23:35.910 00:23:46.039 Ashwini Sharma: There needs to be an adapter which sends the query to warehouse, where it executes and creates the tables and views and all other stuff, right?

195 00:23:46.790 00:23:49.110 Ashwini Sharma: So, what we’ll do is,

196 00:23:56.380 00:24:02.880 Ashwini Sharma: Alright, so first thing that we need to do is create a virtual environment where we’ll install all our dependencies.

197 00:24:19.370 00:24:24.929 Ashwini Sharma: Alright, so I’ve activated. Now we’re going to install dbt and the adapter.

198 00:24:25.150 00:24:31.080 Ashwini Sharma: So, in case, like, you know, the commands, it’s easy to forget,

199 00:24:42.900 00:24:56.130 Ashwini Sharma: So this is the one, right? We are using Snowflake Warehouse, so obviously we’ll install an adapter for Snowflake, and then we’ll be installing dbt Core. So I’m just going to copy this one. I already have Python installed in my machine.

200 00:24:56.440 00:25:02.130 Ashwini Sharma: But in case, if you don’t have, you’ll have to install Python before doing all of these things.

201 00:25:02.320 00:25:05.739 Ashwini Sharma: Before even activating, this environment.

202 00:25:25.680 00:25:29.639 Ashwini Sharma: Alright, so let’s check this one.

203 00:25:31.730 00:25:36.120 Ashwini Sharma: It should show a version for dbt Core, and then a version for dbt Snowflake.

204 00:25:36.840 00:25:38.489 Ashwini Sharma: Okay, there it is.

205 00:25:39.650 00:25:43.999 Ashwini Sharma: So now what we’ll do is, now that dbt is installed, we’ll create a dbt project.

206 00:25:44.150 00:25:50.369 Ashwini Sharma: Right? And, we’re going to add some models into it. I’ll explain the various components of the project.

207 00:25:50.510 00:25:53.630 Ashwini Sharma: And then, we’ll create some models.

208 00:26:06.070 00:26:12.420 Ashwini Sharma: So, here it will ask me, like, what is the adapter that I’m going to use, and since I have only one adapter installed.

209 00:26:12.700 00:26:14.700 Ashwini Sharma: That’s the only option available.

210 00:26:17.270 00:26:22.899 Ashwini Sharma: Okay, so here we are going to put some values. One of the values that we need to put is the…

211 00:26:23.140 00:26:24.949 Ashwini Sharma: Snowflake account.

212 00:26:25.740 00:26:31.550 Ashwini Sharma: Now, I do not remember exactly what what were the details? So…

213 00:26:32.530 00:26:40.380 Ashwini Sharma: What I’m going to do is, I’m going to look into… existing one.

214 00:27:00.430 00:27:03.819 Ashwini Sharma: Alright, this, this is… this is the account.

215 00:27:12.850 00:27:19.209 Ashwini Sharma: Username. Okay, I’m gonna give my username over here, but when you do that, just replace it with yours.

216 00:27:27.200 00:27:32.350 Ashwini Sharma: And these are a couple of options for how we are going to authenticate against Snowflake.

217 00:27:32.480 00:27:36.439 Ashwini Sharma: I’m not going to use any one of these, right? But…

218 00:27:36.710 00:27:54.459 Ashwini Sharma: you know, just to proceed, I’m going to use one of the password options. I’ll show you what has to be done for external authentication. So, like, when I run dbt code on CTA repo, I will generally go with a browser-based authentication, right? Where it opens a browser, and then I sign in, and then

219 00:27:54.460 00:27:57.509 Ashwini Sharma: Only it, you know, works further.

220 00:27:59.450 00:28:01.700 Katherine Bayless: So can we use the SSO option here?

221 00:28:02.700 00:28:06.289 Ashwini Sharma: I have not tried that. Maybe we can.

222 00:28:07.030 00:28:07.760 Katherine Bayless: Yeah.

223 00:28:08.240 00:28:09.400 Ashwini Sharma: Maybe we can.

224 00:28:10.100 00:28:13.540 Ashwini Sharma: Maybe, should I just terminate and then… okay.

225 00:28:15.400 00:28:21.490 Ashwini Sharma: Oh, it already created, so, but maybe… In it,

226 00:28:27.760 00:28:29.799 Ashwini Sharma: I’m going to create another one.

227 00:28:31.220 00:28:32.740 Ashwini Sharma: Project 2.

228 00:28:35.860 00:28:37.350 Ashwini Sharma: What is the account name?

229 00:28:52.010 00:28:59.769 Ashwini Sharma: Let’s go with third option. Okay, yeah, it works, right? So, we’ll select the default is external browser. Let’s select that.

230 00:28:59.930 00:29:03.239 Ashwini Sharma: Role. Role was, developer role?

231 00:29:05.890 00:29:07.129 Ashwini Sharma: What was the role?

232 00:29:08.220 00:29:09.500 Ashwini Sharma: Role developer.

233 00:29:14.360 00:29:18.019 Ashwini Sharma: warehouse name was Brainforce Warehouse?

234 00:29:21.350 00:29:23.010 Ashwini Sharma: No, this is the one.

235 00:29:26.860 00:29:32.200 Ashwini Sharma: default database… Dev?

236 00:29:36.980 00:29:38.670 Ashwini Sharma: World schema…

237 00:29:43.410 00:29:45.209 Ashwini Sharma: Okay, now,

238 00:29:45.550 00:29:52.190 Ashwini Sharma: Dev is a database that we have created in, in the staging environment, or sorry, in the Snowflake.

239 00:29:52.300 00:30:00.960 Ashwini Sharma: But it’s not just dev, right? It’s dev underscore something. When you go to Snowflake, you’ll see that, and I’ll explain, like, why it is like that.

240 00:30:01.260 00:30:16.509 Ashwini Sharma: But this schema thing is really important, right? And I’m giving my name underscore dev over here. This is to ensure that when I run this dbt code locally from my laptop, all the models, all the tables.

241 00:30:16.810 00:30:22.410 Ashwini Sharma: that dbt will create, will all be under this schema, Ashwini Dev.

242 00:30:22.660 00:30:36.560 Ashwini Sharma: So, if Kyle runs his model, it’ll be under his username, right? So they don’t… like, my changes does not impact the models that you create, and it’s always a good practice to follow this thing.

243 00:30:39.090 00:30:45.120 Ashwini Sharma: Threads is, the number of threads that will… that dbt will spin up to create these models.

244 00:30:45.300 00:30:51.960 Ashwini Sharma: the more number of threads, the faster it is going to be, right? So, let’s give 4 as a default.

245 00:30:54.270 00:31:03.340 Ashwini Sharma: And what it does is it creates a entry in the profiles.yaml file. I’ll explain that also, what it is.

246 00:31:04.300 00:31:09.960 Ashwini Sharma: And… yes, so… Let’s use this one, right?

247 00:31:12.190 00:31:15.100 Ashwini Sharma: Demo… Logic, too.

248 00:31:18.510 00:31:23.150 Ashwini Sharma: And I’m going to use cursor in order to make further edits into this.

249 00:31:27.740 00:31:30.030 Ashwini Sharma: Let me know if you’re able to see this thing.

250 00:31:30.770 00:31:31.600 Ashwini Sharma: Okay.

251 00:31:33.060 00:31:35.730 Ashwini Sharma: I’ll open a new terminal window,

252 00:31:39.910 00:31:45.669 Ashwini Sharma: Okay, I… Where did I create the virtual environment?

253 00:31:47.450 00:31:48.899 Ashwini Sharma: It was over here.

254 00:31:51.640 00:31:54.849 Ashwini Sharma: Oh, it has already activated it, so well and good, okay.

255 00:32:03.350 00:32:04.960 Ashwini Sharma: Alright.

256 00:32:04.960 00:32:07.790 Katherine Bayless: Do you want to be in Demo Project or Demo Project 2?

257 00:32:08.080 00:32:10.039 Ashwini Sharma: I’m in demo project… oh, so…

258 00:32:16.870 00:32:17.690 Ashwini Sharma: Okay.

259 00:32:18.140 00:32:22.019 Ashwini Sharma: Let’s look at some of the files that it has created.

260 00:32:23.660 00:32:26.269 Ashwini Sharma: This is the main file that it will create, right?

261 00:32:26.450 00:32:37.649 Ashwini Sharma: Now, here there is a name of the project, right? And there is also something called name of the profile that this project is going to use in order to establish connection with Snowflake.

262 00:32:38.020 00:32:38.700 Ashwini Sharma: Right?

263 00:32:40.330 00:32:45.289 Ashwini Sharma: And there are some other parts, like model paths, all the models are going to be in this directory.

264 00:32:45.500 00:32:48.870 Ashwini Sharma: Analysis is in this, test parts, these are all parts.

265 00:32:49.100 00:32:55.870 Ashwini Sharma: Clean target is when you run dbt clean, it’s going to clean everything inside targets and dbt packages.

266 00:32:56.110 00:32:58.790 Ashwini Sharma: And then here is the description of models.

267 00:32:59.160 00:33:06.370 Ashwini Sharma: So, in this case, what it has done is it has created a folder called Example over here, and then two models.

268 00:33:07.000 00:33:14.989 Ashwini Sharma: And that is what it is indicating. Like, models will be the top-level object, and then the second level will be the name of the project.

269 00:33:15.310 00:33:21.719 Ashwini Sharma: And then inside this will be an example, which is the name of the container for the first model.

270 00:33:22.070 00:33:28.549 Ashwini Sharma: And… This setting, it means everything under example will be

271 00:33:29.000 00:33:33.619 Ashwini Sharma: manifested as a view object. It will be mineralized as a view object.

272 00:33:34.140 00:33:36.610 Ashwini Sharma: So, when I run these things.

273 00:33:36.830 00:33:43.059 Ashwini Sharma: these two models from dbt, both of them will be created as view objects within Snowflake.

274 00:33:43.490 00:33:48.600 Ashwini Sharma: Right? Had I made this table by… this is the default setting, right? If I…

275 00:33:49.280 00:33:51.130 Ashwini Sharma: If I made something like table.

276 00:33:51.390 00:33:54.609 Ashwini Sharma: It will be created as table objects in the warehouse.

277 00:33:57.890 00:34:00.560 Ashwini Sharma: Alright, let’s leave it as View right now.

278 00:34:03.080 00:34:08.780 Ashwini Sharma: Let me go back to… To my slide…

279 00:34:11.030 00:34:21.530 Ashwini Sharma: Alright, so that, I just talked about dbt Project, now we’ll go to Profiles, and I’ll explain you what exactly will be there in the profiles file.

280 00:34:37.900 00:34:52.750 Ashwini Sharma: So, let’s take a look at these two, right? This is the one that I’m using currently for connecting to CTA’s Snowflake instance, and then running dbt code, and this is the one that it has created.

281 00:34:53.090 00:34:59.440 Ashwini Sharma: Now, you see, a lot of things are common, but there are some extra parameters over here, which was required

282 00:34:59.870 00:35:06.299 Ashwini Sharma: To, you know, reutilize the authentication tokens that we receive through SSO.

283 00:35:06.710 00:35:09.990 Ashwini Sharma: So I’m going to add these values in this one.

284 00:35:10.910 00:35:13.450 Ashwini Sharma: So, this is one of the values.

285 00:35:16.910 00:35:18.450 Ashwini Sharma: Session alive.

286 00:35:22.570 00:35:24.410 Ashwini Sharma: reuse connections.

287 00:35:30.090 00:35:32.440 Ashwini Sharma: It does not have to be in the same order.

288 00:35:33.710 00:35:35.179 Ashwini Sharma: And file path.

289 00:35:41.290 00:35:44.130 Ashwini Sharma: Snowflake, this thing, this thing, yeah.

290 00:35:45.010 00:35:50.339 Ashwini Sharma: Alright? Now, if you see here, there is a target element also, right?

291 00:35:51.370 00:35:53.789 Ashwini Sharma: This is the entry that it has created.

292 00:35:54.570 00:35:55.240 Ashwini Sharma: Right?

293 00:35:55.500 00:36:08.599 Ashwini Sharma: There is something called target element. Now, this target specifies that when you run dbt from your command line without providing any target, the default target that it is going to use is dev.

294 00:36:09.590 00:36:17.020 Ashwini Sharma: Right? Now, there… there will be, multiple search targets in a… In a production environment, right?

295 00:36:17.130 00:36:22.360 Ashwini Sharma: So, for example, this was a dev target. I can have a SDG target, right?

296 00:36:22.700 00:36:25.050 Ashwini Sharma: Where everything remains the same.

297 00:36:29.800 00:36:33.240 Ashwini Sharma: Except the database would be now STG.

298 00:36:34.720 00:36:35.480 Ashwini Sharma: Alright.

299 00:36:35.900 00:36:48.040 Ashwini Sharma: And what it means is when I run dbt, and then I specify, okay, use this target, use SDG target, it’s going to change the creation of objects into a different schema now.

300 00:36:48.370 00:36:59.849 Ashwini Sharma: different database. Schema remains the same, but it will create it in a different database called SDG, instead of creating the in-depth, right? But we’ll be using only dev.

301 00:37:00.450 00:37:02.969 Ashwini Sharma: When we’re running, locally.

302 00:37:05.550 00:37:12.290 Ashwini Sharma: Alright, this was profiles. I will send you a snapshot of this one, because,

303 00:37:14.230 00:37:19.019 Ashwini Sharma: I don’t think you have these, the other values, right? So, let’s…

304 00:37:23.780 00:37:33.379 Kyle Wandel: So when you create the dbt project, this profile, this YAML file is also created at the same time, and then you’d have to go in and modify it to the current connection, basically, the Snowflake connection?

305 00:37:33.740 00:37:38.100 Ashwini Sharma: Yes, it will add an entry into your profiles.eml file.

306 00:37:38.100 00:37:38.700 Kyle Wandel: Correct.

307 00:37:38.700 00:37:46.019 Ashwini Sharma: If it is not there, it’ll create this file and then add an entry. If it is already there, it will just append one section in it.

308 00:37:46.140 00:38:00.659 Ashwini Sharma: And some of these values you’ll be missing. For example, like this allow token, allow ID token, and token file path, and reuse connections. These are things… will not be there, so you can add them later on.

309 00:38:01.150 00:38:01.840 Kyle Wandel: Okay.

310 00:38:03.380 00:38:07.360 Ashwini Sharma: Alright, now, this is there.

311 00:38:08.520 00:38:12.130 Ashwini Sharma: Go back to my presentation. Profiles is done.

312 00:38:12.260 00:38:15.549 Ashwini Sharma: We’ll talk about schema.yaml, right? And…

313 00:38:20.540 00:38:31.620 Ashwini Sharma: before talking about schema.yaml, let me talk about something called source.yaml, right? So, now, now, I mean, this example would be a very, you know, basic thing.

314 00:38:31.800 00:38:37.100 Ashwini Sharma: in any DBT training goals, so I’ll not follow this one. But what we’ll do is…

315 00:38:37.540 00:38:41.890 Ashwini Sharma: let’s try to model some data that’s there in Remember’s, dataset.

316 00:38:42.540 00:38:48.660 Ashwini Sharma: It’ll be, you know, this will help us with a real, real kind of example, right?

317 00:38:48.840 00:38:56.600 Ashwini Sharma: So, what I’m going to do here is I’m going to create a folder called, de-aging.

318 00:38:59.830 00:39:06.249 Ashwini Sharma: And in this staging, we’ll be modeling the remembers data, right?

319 00:39:10.020 00:39:10.900 Ashwini Sharma: So…

320 00:39:11.980 00:39:25.940 Ashwini Sharma: And, think about it this way, right? Staging is a staging area where we do some basic cleaning stuff and all those things, right? Renaming stuff, aliasing, standard naming conventions,

321 00:39:26.510 00:39:33.390 Ashwini Sharma: You know, ensuring that all the columns follow a standard data type, and so on, right? And…

322 00:39:34.080 00:39:39.940 Ashwini Sharma: remembers is just one of the data sources that are going to be there, right? So…

323 00:39:42.190 00:39:47.660 Ashwini Sharma: tomorrow there is going to be Shopify, and then there’s going to be Salesforce, and then so on, right?

324 00:39:47.780 00:39:53.140 Ashwini Sharma: So, that’s why I’m trying to ensure that all the data remains separate.

325 00:39:53.290 00:40:00.190 Ashwini Sharma: And within this one, we’ll just mimic one of them, right? Let’s look at a CRM.

326 00:40:00.460 00:40:01.520 Ashwini Sharma: database.

327 00:40:03.720 00:40:04.580 Ashwini Sharma: Alright.

328 00:40:04.710 00:40:07.120 Ashwini Sharma: Now, within CRM,

329 00:40:07.270 00:40:13.130 Ashwini Sharma: Okay. Now, as we do these things, right, a lot of things will change, right? Here, we need to specify

330 00:40:13.370 00:40:16.990 Ashwini Sharma: That we have a… model called staging.

331 00:40:18.330 00:40:28.020 Ashwini Sharma: And the staging here, schema should be… What do I put? STG.

332 00:40:28.870 00:40:29.680 Ashwini Sharma: Alright.

333 00:40:30.070 00:40:34.879 Ashwini Sharma: Now, what it’ll do is, it is going to append this schema name.

334 00:40:35.200 00:40:38.780 Ashwini Sharma: With the schema that we provided in profiles.

335 00:40:39.020 00:40:49.529 Ashwini Sharma: And profiles, if you remember, I had made a schema name as Ashwini underscore dev, right? Now it’s going to add STG to that, and then it’s going to create objects within it.

336 00:40:52.510 00:40:56.950 Ashwini Sharma: So, let’s, let me go to Snowflake first.

337 00:42:14.520 00:42:21.700 Ashwini Sharma: Alright, so, if I just look into solutions… dot, the stomach.

338 00:42:33.480 00:42:34.110 Katherine Bayless: Thank you.

339 00:42:36.250 00:42:36.970 Ashwini Sharma: And…

340 00:42:40.460 00:42:43.460 Ashwini Sharma: Now, we have some… some data over here, right?

341 00:42:46.870 00:42:47.610 Ashwini Sharma: Okay.

342 00:42:49.930 00:42:50.690 Ashwini Sharma: Yep.

343 00:42:51.530 00:43:00.080 Ashwini Sharma: All right? Now, in this exercise, what we’ll do is we’ll try to model an individual, right? Or an organization, right?

344 00:43:00.250 00:43:07.109 Ashwini Sharma: And individuals and organizations, they seem to be like dimensions, right? So we’ll model them as a dimension object, right?

345 00:43:07.230 00:43:17.939 Ashwini Sharma: And if you see this one here, anything with an O is an organization, and anything with an I is an individual. And then there are two other attributes also, which I’m not sure what exactly they are.

346 00:43:25.250 00:43:27.690 Ashwini Sharma: Sorry, it’s CRM.individual.

347 00:43:30.740 00:43:47.049 Ashwini Sharma: Okay. Now, if you notice, right, I’ve selected a database over here. I also select, told the SQL that, okay, I want to read this table under this schema, right? And the same thing will have to do with dbt as well. We’ll have to tell it that this is

348 00:43:47.310 00:43:49.210 Ashwini Sharma: From where you’re going to read it.

349 00:43:49.890 00:43:52.509 Ashwini Sharma: So, let me go back to this one.

350 00:43:52.860 00:43:56.619 Ashwini Sharma: And, I’m going to utilize cursorp now, right?

351 00:43:58.100 00:44:00.440 Ashwini Sharma: So, let’s say,

352 00:44:05.550 00:44:08.970 Ashwini Sharma: Create a sources.yaml file.

353 00:44:09.320 00:44:12.140 Ashwini Sharma: under… remembers.

354 00:44:39.820 00:44:42.680 Kyle Wandel: And while that loads, what was remembers again, really quickly?

355 00:44:43.700 00:44:51.890 Ashwini Sharma: Remembers is, is, how I have containerized models for remembers application within this.

356 00:44:52.380 00:44:53.230 Ashwini Sharma: Boulder.

357 00:44:54.100 00:44:54.780 Kyle Wandel: Okay.

358 00:44:56.280 00:45:04.979 Ashwini Sharma: So, finally, this is how it’s going to look like, right? For example, let’s say Salesforce… Marketing Cloud, right?

359 00:45:05.350 00:45:10.960 Ashwini Sharma: And maybe after some time, we’ll have Shopify.

360 00:45:15.200 00:45:19.990 Ashwini Sharma: And right now, since we have only remembers, so we are only working with remembers data.

361 00:45:20.230 00:45:26.860 Ashwini Sharma: So, you see, the cursor has already created a file called this one, and this is how it is structured, right?

362 00:45:27.020 00:45:37.639 Ashwini Sharma: Now, we need to give some information to it. Database, right? Database is one of the information. So let me go back to this one, and I will copy this thing.

363 00:45:41.010 00:45:45.430 Ashwini Sharma: ESN… This one.

364 00:45:45.610 00:45:46.280 Ashwini Sharma: Yeah.

365 00:45:48.750 00:45:51.089 Ashwini Sharma: And this is the database name.

366 00:45:54.880 00:45:56.429 Ashwini Sharma: and schema.

367 00:46:00.610 00:46:02.610 Ashwini Sharma: Alright? Now, I need to…

368 00:46:02.810 00:46:09.170 Ashwini Sharma: tell it that this is the database, this is the name through which I want to refer

369 00:46:09.890 00:46:13.820 Ashwini Sharma: this database and this schema combination in my dbt models.

370 00:46:14.120 00:46:18.360 Ashwini Sharma: Right? So, let’s say I give it a name called Remember CRM.

371 00:46:21.130 00:46:21.990 Ashwini Sharma: Alright.

372 00:46:23.100 00:46:28.330 Ashwini Sharma: Now, we have certain tables within the CRM schema. One of the tables is Customer.

373 00:46:33.690 00:46:36.460 Ashwini Sharma: Was it customer or customers?

374 00:46:37.460 00:46:38.939 Ashwini Sharma: Customer, yeah.

375 00:46:39.400 00:46:43.510 Ashwini Sharma: And then the other table is… individual.

376 00:46:55.340 00:46:56.780 Ashwini Sharma: Alright, this is enough.

377 00:46:57.510 00:46:58.310 Ashwini Sharma: Okay.

378 00:46:58.490 00:47:05.080 Ashwini Sharma: Now, we’ll go to CRM, we’ll create a new file within CRM, Which is…

379 00:47:05.720 00:47:13.950 Ashwini Sharma: let’s say, let’s create a staging file first, right? STG… Individual data.

380 00:47:16.480 00:47:24.720 Ashwini Sharma: now this is… now, we’ve just created a very first model, right? If you want to see this one, we can… we can also take a look at this.

381 00:47:25.900 00:47:32.459 Ashwini Sharma: this configuration is saying that it needs to be materialized as a table. Now,

382 00:47:32.970 00:47:41.039 Ashwini Sharma: few minutes back, I had shown something else over here, right? Here it says that everything inside examples should be materialized as a view.

383 00:47:41.280 00:47:48.099 Ashwini Sharma: But when we go to this first dbt model, here it is saying this model has to be mineralized as a table.

384 00:47:48.240 00:48:00.410 Ashwini Sharma: So, this is a way to override a higher level property that is applied to all the models, right? Through this thing, we said everything under examples is view.

385 00:48:00.460 00:48:10.100 Ashwini Sharma: But here we are saying, yeah, maybe everything under example is maybe a view, but this particular model has to be a table, right? So we are overriding that particular property.

386 00:48:10.180 00:48:17.989 Ashwini Sharma: And see what it is doing. It’s defining a CTE over here, and then from this CTE, it’s doing a select star.

387 00:48:18.600 00:48:23.729 Ashwini Sharma: And that is what creates a model in the backend, in the warehouse, so…

388 00:48:24.070 00:48:37.039 Ashwini Sharma: per individual, we’ll do the same thing, right? And the way I like to do it is, like, I’d like to follow certain coding conventions so that later, when somebody else tries to visit your model, it’s…

389 00:48:37.320 00:48:43.240 Ashwini Sharma: kind of easier for that person to understand, right? Rather than… Reading through spaghetti code.

390 00:48:43.490 00:48:54.150 Ashwini Sharma: So let’s create a small CTE, right? I like to name my CTEs as C underscore, so that as soon as I see something as C underscore, I know that it’s a CTE and not something else.

391 00:48:54.520 00:49:01.570 Ashwini Sharma: Customer staging… as… Right?

392 00:49:01.750 00:49:16.250 Ashwini Sharma: What do we have? I can do a select star, but then again, these are, you know, bad ways to do it, so we can… Okay, for an exam… for this example, for this exercise, we’ll do something very smaller.

393 00:49:17.450 00:49:21.749 Ashwini Sharma: So, we know that we need contact ID, type.

394 00:49:23.790 00:49:27.499 Ashwini Sharma: and record number. I’ll just take 3 columns.

395 00:49:30.960 00:49:38.030 Ashwini Sharma: Contact ID as contact ID? I’m kind of, you know, converting it into snake case.

396 00:49:38.490 00:49:42.320 Ashwini Sharma: What is the other one?

397 00:49:44.220 00:49:45.240 Ashwini Sharma: Type…

398 00:49:48.620 00:49:53.040 Ashwini Sharma: And… Record number.

399 00:50:10.780 00:50:17.599 Ashwini Sharma: Okay, cursor populated it directly, but this is how we are going to refer to a source file.

400 00:50:19.120 00:50:22.210 Ashwini Sharma: You define a source, and then you say.

401 00:50:23.190 00:50:29.079 Ashwini Sharma: what is the name of this source? The source name is Remember CRM. That’s what we give over here, right?

402 00:50:29.390 00:50:30.890 Ashwini Sharma: Remember CRM.

403 00:50:31.180 00:50:34.029 Ashwini Sharma: And what is the table name? Table name is customer.

404 00:50:36.650 00:50:42.459 Ashwini Sharma: And, I’m going to create a different CT, where this is individual CT.

405 00:50:42.790 00:50:46.939 Ashwini Sharma: Was it the individual, or… Yeah, it was the individual on here.

406 00:50:48.590 00:50:55.690 Ashwini Sharma: But this is not fully correct, right? We also need to, aye.

407 00:50:56.070 00:50:58.409 Ashwini Sharma: Now it becomes only individual.

408 00:50:58.950 00:51:04.769 Ashwini Sharma: And from the individual staging, we need some other information.

409 00:51:06.430 00:51:12.540 Ashwini Sharma: So in this case, we’re going to take a lesser number of columns, just take the first name and last name.

410 00:51:13.250 00:51:17.389 Ashwini Sharma: And… Is there anything else?

411 00:51:18.190 00:51:18.740 Kyle Wandel: Total.

412 00:51:19.830 00:51:24.580 Ashwini Sharma: Title. Do we need title? Okay, let’s take title. Title, first name, last name.

413 00:51:26.920 00:51:35.110 Ashwini Sharma: I thought, and then is deleted. This is… this is important, right? We’ll take the username also.

414 00:51:38.370 00:51:41.599 Ashwini Sharma: Let me type it out of here, and then copy it to…

415 00:51:44.610 00:51:49.099 Ashwini Sharma: What we need is customer ID, right? Customer ID.

416 00:51:50.890 00:51:52.090 Ashwini Sharma: And,

417 00:52:02.820 00:52:03.950 Ashwini Sharma: Alex’s name.

418 00:52:06.900 00:52:08.510 Kyle Wandel: Username.

419 00:52:22.160 00:52:23.100 Ashwini Sharma: Sorry to you.

420 00:52:23.850 00:52:29.320 Ashwini Sharma: Okay, we’ll do aliasing now, as customer ID.

421 00:52:29.830 00:52:30.590 Ashwini Sharma: Right?

422 00:52:31.440 00:52:33.789 Ashwini Sharma: Title is okay, first name…

423 00:52:41.000 00:52:46.309 Ashwini Sharma: So cursor is doing a lot of stuff by itself. Username, I’ll keep it as username.

424 00:52:46.760 00:52:50.999 Ashwini Sharma: from… Individual, this is correct, right?

425 00:52:56.020 00:52:57.560 Ashwini Sharma: Where is…

426 00:53:02.080 00:53:08.490 Ashwini Sharma: false, right? And now what we can do is we can establish a join between these two CTEs.

427 00:53:11.980 00:53:16.889 Kyle Wandel: You’re gonna have to… I think it’s deleted is, all one… all one word, there’s no.

428 00:53:17.560 00:53:19.899 Ashwini Sharma: Oh, yes, that’s correct, right.

429 00:53:20.390 00:53:22.219 Ashwini Sharma: Is deleted is one word here.

430 00:53:22.600 00:53:23.150 Kyle Wandel: Nope.

431 00:53:27.890 00:53:30.990 Ashwini Sharma: In fact, we could do something like this, right?

432 00:53:31.600 00:53:32.260 Kyle Wandel: Yeah.

433 00:53:34.210 00:53:36.419 Ashwini Sharma: And then in Snowflake, you can use this.

434 00:53:37.170 00:53:38.059 Kyle Wandel: You can do either one.

435 00:53:38.460 00:53:42.040 Kyle Wandel: Can you do either one, or is it… is it just… has to be the new variable name?

436 00:53:42.560 00:53:51.990 Ashwini Sharma: Yeah, you can use, yeah, you can use either one. In BigQuery, you can’t do this, right? It doesn’t allow you to do that. Snowflake does it, so…

437 00:53:52.090 00:53:59.860 Ashwini Sharma: Again, let’s do… DC, and this is… CI…

438 00:54:10.760 00:54:15.580 Ashwini Sharma: Yeah, this is sometimes why cursor is… Frustrating.

439 00:54:16.090 00:54:18.819 Ashwini Sharma: Alright, so now, now we have this thing, right?

440 00:54:18.930 00:54:22.190 Ashwini Sharma: We have created our very first model over here.

441 00:54:22.300 00:54:27.120 Ashwini Sharma: Let’s… let’s go back to dbt Project once, and then see…

442 00:54:27.580 00:54:33.540 Ashwini Sharma: that I’ve given. I’ve given staging schema as SDG, This is correct,

443 00:54:35.650 00:54:39.819 Ashwini Sharma: We have the sources file, everything is there in place, let’s try to run it now.

444 00:54:40.530 00:54:45.030 Ashwini Sharma: Before running it, what we can do is I can run a dbt debug.

445 00:54:45.560 00:54:48.359 Ashwini Sharma: To see if my connection and everything works.

446 00:54:49.920 00:54:56.249 Ashwini Sharma: And this is what happens, right, when you try to run a dbt command. It opens up a browser window.

447 00:54:56.580 00:55:03.250 Ashwini Sharma: Right? If you are not authenticated, it will ask you to authenticate using user ID, password, and the token.

448 00:55:03.360 00:55:10.410 Ashwini Sharma: And once it is authenticated, you can close this. Next time, it will not ask again until your session is active.

449 00:55:14.030 00:55:17.159 Ashwini Sharma: Alright, you’re trying to authenticate…

450 00:55:17.970 00:55:20.929 Ashwini Sharma: As differs from the user currently logged in.

451 00:55:22.230 00:55:26.460 Ashwini Sharma: What just happened? Let me look at my profile files. Did I give something wrong?

452 00:55:38.370 00:55:44.079 Ashwini Sharma: Demo project 2… This is correct.

453 00:56:07.050 00:56:08.919 Ashwini Sharma: Let me run a different command.

454 00:56:47.540 00:56:50.099 Ashwini Sharma: What did I give wrong over here?

455 00:57:24.080 00:57:34.299 Katherine Bayless: Out of small curiosity, because I don’t feel like you had anything that was wrong, so I’m kind of curious if it’s more of an Okta thing.

456 00:57:34.830 00:57:41.510 Katherine Bayless: like… Were you… was it working this way before, kind of thing?

457 00:57:41.510 00:57:43.399 Ashwini Sharma: It… it was, yeah…

458 00:57:43.400 00:57:44.110 Katherine Bayless: Yeah.

459 00:57:44.110 00:57:46.520 Ashwini Sharma: It was working, let me see…

460 00:57:47.060 00:57:49.879 Ashwini Sharma: Let me look at the original code, right?

461 00:57:52.250 00:57:55.010 Ashwini Sharma: So if I do a decompile…

462 00:58:48.320 00:58:55.359 Katherine Bayless: Oh, I think it’s brainforge.ai instead of Brainforge.com. I think you have brainforge.com in the model we just built.

463 00:58:56.520 00:58:58.330 Ashwini Sharma: Did I get bins.com?

464 00:58:58.740 00:58:59.769 Katherine Bayless: I think so.

465 00:59:00.090 00:59:08.230 Kyle Wandel: It also… it keeps opening up more CTA Octa things, so I don’t know… I mean, that probably… I don’t know if that has something to do with it, but not… you started with 1, now you have 4, so…

466 00:59:08.750 00:59:13.220 Ashwini Sharma: Yeah, yeah, it opens 4 for the first time, and then it should not open after that.

467 00:59:16.320 00:59:23.219 Katherine Bayless: Yeah, I think if you go back to a cursor for the project, we were working, like, the demo project, I think it’s you at brainforge.com in this one.

468 00:59:23.840 00:59:26.180 Katherine Bayless: Oh, yeah, yeah, yeah.

469 00:59:28.030 00:59:32.029 Ashwini Sharma: Nice catch, yeah. Okay, let’s, let’s do that now.

470 00:59:33.180 00:59:35.870 Ashwini Sharma: Decomposite…

471 00:59:35.870 00:59:38.550 Kyle Wandel: I love coding, I mean, it’s just what it is.

472 00:59:39.010 00:59:40.120 Ashwini Sharma: Yeah.

473 00:59:41.220 00:59:51.809 Katherine Bayless: I mean, I always do laugh, because it’s like, people think we’re, like, brilliant, and it’s like, no, we’re just really good at finding the dumbest mistakes, you know, right? It’s like, it’s always the tiniest little thing that’s the actual problem.

474 00:59:52.060 01:00:05.020 Kyle Wandel: I think it was, UTOM’s, the way he said it was, we like to go, go, go, and ask questions later, whereas people like to plan and everything. Like, not really how coding works. Like, you just, you hit go, go, go in dev, and then it works, then you put it to abroad.

475 01:00:07.400 01:00:13.740 Ashwini Sharma: Okay, so dbt compile worked, and probably dbt debug should also have worked,

476 01:00:16.530 01:00:23.680 Ashwini Sharma: Yeah, so it’s working now, right? Let me send that to you again, because I made a mistake.

477 01:00:25.390 01:00:29.949 Ashwini Sharma: And this one… Let me change this one also, AI.

478 01:00:30.880 01:00:31.600 Ashwini Sharma: Right?

479 01:00:36.220 01:00:39.780 Ashwini Sharma: I’ll put this in Slack after the meeting, so that,

480 01:00:39.960 01:00:42.840 Ashwini Sharma: But for now, let it go over here, yeah.

481 01:00:43.380 01:00:52.779 Ashwini Sharma: All right, it’s working, and then you’ll notice, right? It’s not, opening the browsers again and again.

482 01:00:52.900 01:01:06.269 Ashwini Sharma: And this is something that I figured out very recently. Otherwise, like, I was doing that thing for four browsers every time I was trying to build the models, which is so frustrating. But let’s take a look at the compiled code now, right?

483 01:01:06.540 01:01:23.739 Ashwini Sharma: See, this is the compiled code, and what dbt did behind the scenes is, like, it created this CD, where we had given, you know, that, in those curlies, I’ve mentioned source, and then the source name, and the table name. That got translated to an actual

484 01:01:23.900 01:01:28.700 Ashwini Sharma: Table name, that’s there in Snowflake, right? See here?

485 01:01:29.310 01:01:33.100 Ashwini Sharma: And then, this was straightforward, right?

486 01:01:34.510 01:01:46.039 Ashwini Sharma: Yeah, so now it has created a staging model, which is basically a clean version of data that we are going to use for creating a dimension, let us say, right?

487 01:01:47.830 01:01:54.469 Ashwini Sharma: now what we can do is we can utilize, let’s run it now, right? I think this is,

488 01:01:56.620 01:01:59.739 Ashwini Sharma: And, while running, right, dbt is…

489 01:02:00.060 01:02:11.130 Ashwini Sharma: I think, it allows you to run only selected models. For example, let’s say during my development, I made changes only to SDG individual. Now, I don’t want to run all the models.

490 01:02:11.410 01:02:16.130 Ashwini Sharma: But… there may be a situation where I want to run the models.

491 01:02:16.330 01:02:19.859 Ashwini Sharma: which are dependent on SDG individual, right? Because

492 01:02:19.980 01:02:28.220 Ashwini Sharma: That makes sense, right? Maybe I deleted some column in SDG Individual, and that’s going to impact all the downstream dependencies, right?

493 01:02:28.340 01:02:36.180 Ashwini Sharma: Or, let us say, I want to rebuild the models on which SDG individual is dependent on.

494 01:02:37.310 01:02:46.530 Ashwini Sharma: Right? Maybe data is refreshed, and I want to rebuild everything. So in that case, what we can do is we can select a particular model.

495 01:02:47.180 01:02:54.539 Ashwini Sharma: Select, right, this command. And then you can specify a plus And then SDG underscore.

496 01:02:59.480 01:03:05.120 Ashwini Sharma: Dev, you can either put a plus in the front, or you can put a plus in the back.

497 01:03:05.240 01:03:10.980 Ashwini Sharma: When you put a plus in the front, it’s going to build all the models on which STG individual is dependent.

498 01:03:11.340 01:03:16.320 Ashwini Sharma: And if you put a plus in the back, it’s going to run all the models which are dependent on SDJ individual.

499 01:03:17.030 01:03:20.170 Ashwini Sharma: Right? In this case, right now, we don’t have anything that’s…

500 01:03:21.400 01:03:27.859 Ashwini Sharma: On which either this is dependent, or some other models, which is dependent on individual.

501 01:03:28.620 01:03:30.800 Ashwini Sharma: So let’s see what it does now.

502 01:03:33.170 01:03:36.890 Ashwini Sharma: Alright, how about listing schemas and database dev.

503 01:03:38.430 01:03:43.489 Ashwini Sharma: does not exist. Alright, so it’s saying object does not exist.

504 01:03:43.650 01:03:53.690 Ashwini Sharma: what I’m going to do is I’m going to make a minor change in the profiles. Okay, in the… it worked for this one, because I have a lot of macros in the background.

505 01:03:54.010 01:04:04.389 Ashwini Sharma: which runs, and then it converts that schema into something that… that is going to work, right? In this case, it will not work. So what we’ll do is, I’m going to make a…

506 01:04:06.750 01:04:12.720 Ashwini Sharma: a database that already exists in the Snowflake, and then use that to create my objects.

507 01:04:16.290 01:04:24.360 Ashwini Sharma: Okay, so DevStaging is a database that exists there, and I’m going to utilize this one. So let’s run it again.

508 01:04:32.540 01:04:33.580 Ashwini Sharma: All right.

509 01:04:34.260 01:04:35.849 Ashwini Sharma: So you see, it has done.

510 01:04:35.980 01:04:43.379 Ashwini Sharma: What it says is, it created a model, ashwiniDevSTG.SDGindividual, right?

511 01:04:43.630 01:04:46.720 Ashwini Sharma: If we go back to Snowflake and then see…

512 01:05:02.360 01:05:04.430 Ashwini Sharma: Individual. It’s a view.

513 01:05:16.060 01:05:20.640 Ashwini Sharma: So we have all the things that… that we had queried, right?

514 01:05:21.090 01:05:22.530 Ashwini Sharma: And it…

515 01:05:30.690 01:05:31.899 Ashwini Sharma: Right over here.

516 01:05:33.720 01:05:36.740 Ashwini Sharma: Now, if, it’s created as a view.

517 01:05:37.040 01:05:44.510 Ashwini Sharma: If we go and change this setting over here, where is dbt Project?

518 01:05:45.470 01:05:46.140 Ashwini Sharma: Right?

519 01:05:47.510 01:05:56.100 Ashwini Sharma: Okay, by default, it is view, but I can give a materialization over here, right? So, let’s say… table.

520 01:05:57.040 01:06:02.660 Ashwini Sharma: Now I’m saying that explicitly that it has to be materialized as a table, Let’s run it again.

521 01:06:13.320 01:06:17.730 Ashwini Sharma: Yeah, this time you see it says… created SQL table model.

522 01:06:20.020 01:06:20.680 Ashwini Sharma: Right.

523 01:06:20.880 01:06:29.380 Ashwini Sharma: And what it does is, when you’re running it, right, in the run folder, it will create the script that creates the,

524 01:06:29.710 01:06:32.750 Ashwini Sharma: the object in DB. So in this case, it is…

525 01:06:33.300 01:06:36.500 Ashwini Sharma: table. In the previous case, it would have been a view.

526 01:06:40.930 01:06:47.650 Ashwini Sharma: Okay, did I exceed the time, or do we still have some time?

527 01:06:49.220 01:06:54.649 Katherine Bayless: And I think we’re over the meeting time, but I… I’m not in. I don’t have anything else. I should keep going.

528 01:06:54.650 01:06:56.310 Ashwini Sharma: Oh, okay, okay, cool.

529 01:06:56.620 01:07:02.860 Ashwini Sharma: All right, yeah, a couple of things are remaining, so I thought I’ll just complete that, huh?

530 01:07:06.460 01:07:07.190 Ashwini Sharma: Just one minute.

531 01:07:07.610 01:07:13.239 Ashwini Sharma: No, there are multiple places where we can define these things.

532 01:07:14.360 01:07:20.860 Ashwini Sharma: One of the places that we defined a configuration for these models is DVD project, right?

533 01:07:20.980 01:07:27.299 Ashwini Sharma: The other place that we can define is a YAML file within, within these,

534 01:07:27.600 01:07:34.059 Ashwini Sharma: models itself, right? And the third place where we can define is within the model itself.

535 01:07:34.300 01:07:37.250 Ashwini Sharma: For example, like, this one, let us see, right?

536 01:07:37.370 01:07:39.850 Ashwini Sharma: Here, we had a config defined.

537 01:07:40.010 01:07:41.259 Ashwini Sharma: within the model.

538 01:07:42.140 01:07:48.949 Ashwini Sharma: Now, similarly, we can define this config in a YAML file, like, for example, this has a schema.yaml, right?

539 01:07:49.070 01:07:55.929 Ashwini Sharma: Similarly, we can create a schema.yaml for remembers, right? So, let’s, remember CRM.

540 01:07:56.180 01:07:58.020 Ashwini Sharma: So let’s do that.

541 01:08:00.050 01:08:01.679 Ashwini Sharma: Let’s create a…

542 01:08:05.670 01:08:08.420 Ashwini Sharma: schema.html file, paste it.

543 01:08:10.530 01:08:14.139 Ashwini Sharma: Model name is STG underscore individual.

544 01:08:14.940 01:08:16.229 Ashwini Sharma: It’s,

545 01:08:21.760 01:08:23.140 Ashwini Sharma: model, right?

546 01:08:23.580 01:08:28.830 Ashwini Sharma: column name. Let’s look at… Column name.

547 01:08:29.880 01:08:31.200 Ashwini Sharma: What do we have?

548 01:08:33.210 01:08:47.580 Ashwini Sharma: Okay, I pulled in both contact ID and customer ID. Ideally, I should not have done that, right? Instead of star, I should have done, let’s say, cc.customerID. Let’s say this is the primary key for us.

549 01:08:47.750 01:08:54.710 Ashwini Sharma: We have… CC dot… Type, and cc.recordnumber.

550 01:08:56.040 01:09:01.109 Ashwini Sharma: And then CI.Title, first name, last name.

551 01:09:01.500 01:09:06.359 Ashwini Sharma: Maybe is deleted is not even required, right? Delete this one.

552 01:09:08.950 01:09:09.800 Ashwini Sharma: Alright.

553 01:09:10.350 01:09:17.249 Ashwini Sharma: So this is our model, and we have customer ID column, which We want to ensure that

554 01:09:17.779 01:09:19.259 Ashwini Sharma: This is not null.

555 01:09:22.060 01:09:26.419 Ashwini Sharma: And it should be unique, it should not be null, right? This is the test that we are…

556 01:09:26.580 01:09:27.580 Ashwini Sharma: putting in.

557 01:09:34.080 01:09:37.360 Ashwini Sharma: Alright, now let’s rebuild this model.

558 01:09:44.630 01:09:49.290 Ashwini Sharma: Okay, something happened, right? Invalid identifier, cc.customerID.

559 01:09:50.250 01:09:52.010 Ashwini Sharma: Why is it invalid?

560 01:09:54.540 01:09:59.389 Ashwini Sharma: Our customer ID is… is for CI, so let’s do this.

561 01:10:00.470 01:10:03.080 Ashwini Sharma: Customer ID. It’s the same thing, right?

562 01:10:15.030 01:10:19.890 Ashwini Sharma: Alright, it created this thing. Now, what we’ll do is, we’ll try to test it, right?

563 01:10:30.990 01:10:33.030 Ashwini Sharma: Okay, it failed for something.

564 01:10:33.260 01:10:34.760 Ashwini Sharma: What did it fail for?

565 01:10:37.810 01:10:40.530 Ashwini Sharma: Unique, my first dbt model.

566 01:10:42.260 01:10:43.280 Ashwini Sharma: Digging.

567 01:10:43.880 01:10:45.260 Ashwini Sharma: They exist.

568 01:10:59.480 01:11:03.620 Ashwini Sharma: Okay, I think, because I didn’t give a schema definition for it.

569 01:11:17.470 01:11:20.160 Ashwini Sharma: Let’s run all the models in it, right?

570 01:11:20.690 01:11:22.349 Ashwini Sharma: And then we’ll do a test.

571 01:11:29.050 01:11:29.870 Ashwini Sharma: Okay.

572 01:11:30.370 01:11:32.420 Ashwini Sharma: Now I’m going to run dbt test.

573 01:11:39.690 01:11:41.140 Ashwini Sharma: Okay, what field?

574 01:11:46.320 01:11:47.090 Ashwini Sharma: Hmm.

575 01:11:53.140 01:11:55.390 Kyle Wandel: Unique and not null.

576 01:11:57.000 01:12:04.179 Katherine Bayless: Can we separate them into a unique test and a not null test? Because I’m willing to guess it’s a unique error.

577 01:12:04.660 01:12:05.150 Kyle Wandel: Undo.

578 01:12:05.800 01:12:08.700 Ashwini Sharma: My first not null is failing.

579 01:12:08.700 01:12:09.270 Katherine Bayless: ready.

580 01:12:09.270 01:12:09.869 Ashwini Sharma: Yeah, no.

581 01:12:09.870 01:12:11.989 Katherine Bayless: It does say. Oh, okay, cool, cool.

582 01:12:11.990 01:12:19.649 Ashwini Sharma: Maybe it became null. See, over here, right? So we had two, these things. One is select one as ID in the model.

583 01:12:19.760 01:12:22.189 Ashwini Sharma: And the other one was select null as ID.

584 01:12:22.790 01:12:23.310 Katherine Bayless: Okay.

585 01:12:23.310 01:12:32.169 Ashwini Sharma: So, one of the records has a null identifier, and that’s why it’s failing. So let’s remove this one, and then let’s do select 2 as ID.

586 01:12:32.600 01:12:37.469 Ashwini Sharma: Now, let’s try to run it once, and then we’ll run the test again.

587 01:12:38.050 01:12:43.210 Ashwini Sharma: I wasn’t prepared for all these kind of errors occurring in a demo.

588 01:12:43.590 01:12:49.280 Katherine Bayless: Actually, I mean, it’s really helpful, because it’s cool to get to see how this is actually gonna, like, work, yeah.

589 01:12:49.830 01:12:50.756 Ashwini Sharma: Okay, so…

590 01:12:51.220 01:12:58.019 Kyle Wandel: Is the… like, the most simple way to put this is almost like a big, glorified stored procedure that helps create views…

591 01:12:58.020 01:13:03.870 Ashwini Sharma: Yes, it’s a glorified sequel, right? That’s what I said in the first, right?

592 01:13:04.270 01:13:05.389 Kyle Wandel: Yeah, I mean…

593 01:13:05.390 01:13:05.720 Ashwini Sharma: Yeah.

594 01:13:05.720 01:13:09.460 Kyle Wandel: All this new language is fun to learn, but at the same time, it’s all the same thing, too.

595 01:13:09.460 01:13:18.880 Ashwini Sharma: Right. But it becomes really easy, right? As your models grow in numbers, and your transformations becomes really complex.

596 01:13:18.990 01:13:22.120 Ashwini Sharma: That’s where the power of dbt comes into picture.

597 01:13:22.440 01:13:29.620 Ashwini Sharma: So, you see, we did a dbt test, and everything is running now, all looks good and fun.

598 01:13:30.640 01:13:36.719 Ashwini Sharma: But believe me, when you start working on a real project again, these issues will come up again and again, right?

599 01:13:36.720 01:13:37.320 Kyle Wandel: Yeah.

600 01:13:37.510 01:13:39.140 Ashwini Sharma: So, what’s the result?

601 01:13:39.140 01:13:40.949 Kyle Wandel: With our clean data, you know?

602 01:13:42.440 01:13:46.449 Katherine Bayless: That’s why I was like, I bet it’s a duplicate in there somewhere.

603 01:13:46.450 01:13:47.720 Ashwini Sharma: Okay.

604 01:13:47.720 01:13:54.239 Kyle Wandel: So, now the next thing is, we’ll do a dbt docs generate, and that’s the last part of the topic, right?

605 01:13:54.390 01:13:55.890 Kyle Wandel: You know, dogs.

606 01:13:57.000 01:13:59.979 Katherine Bayless: Actually, one more question about the tests, sure, sure.

607 01:14:00.290 01:14:11.190 Katherine Bayless: like, just using it as a hypothetical example, like, let’s say, you know, our membership number is never less than 1,000, right? And so, would we want to…

608 01:14:11.260 01:14:28.690 Katherine Bayless: use, like, dbt tests to say, like, fail the build if the membership, or, like, the active membership count comes in under 1,000, or is that not really what the tests are meant for, and that should just be, like, a notification somewhere that’s, like, probably a data quality alert?

609 01:14:28.900 01:14:29.400 Ashwini Sharma: The Testament.

610 01:14:29.400 01:14:34.940 Katherine Bayless: Are they meant for just, like, don’t let this out, or are they meant for, like, also catching data quality pieces?

611 01:14:34.940 01:14:51.520 Ashwini Sharma: To some extent, yes, it is supposed to cache data quality, but the kind of condition that you’ve imposed, that might not be, you know, doable using dbt test. So, maybe we’ll have to rely on some other kind of data quality.

612 01:14:52.330 01:14:58.070 Ashwini Sharma: I haven’t done that, like, we are doing it, for a different client,

613 01:14:59.060 01:15:12.789 Ashwini Sharma: But we are using a metaplane called Data Observability Tool, and that catches when a certain table has a different row count, or it has a row count beyond a certain threshold, or less than a certain threshold.

614 01:15:13.300 01:15:14.609 Katherine Bayless: Or, or if it grew.

615 01:15:14.610 01:15:18.720 Ashwini Sharma: It goes beyond, you know, certain expected range.

616 01:15:19.420 01:15:35.959 Katherine Bayless: Right, right. Yeah, like, I think where we would actually use something for that would be with, like, the marketing cloud stuff, because I know sometimes Courtney would ask for, like, you know, can you get me a list of blah blah blah, and I’ll put it together, and she’s like, there’s no way that’s enough people on there, right? Like, that kind of thing. So, down the road, but just curious, yeah.

617 01:15:35.960 01:15:36.690 Ashwini Sharma: Yeah.

618 01:15:36.850 01:15:41.460 Ashwini Sharma: Okay, well, let’s do a dbt docs generate now,

619 01:15:43.630 01:15:47.569 Ashwini Sharma: Oh, sorry, I didn’t give a… generate.

620 01:15:58.560 01:16:01.170 Ashwini Sharma: So it creates a catalog over here.

621 01:16:01.330 01:16:04.090 Ashwini Sharma: Noggett.

622 01:16:04.380 01:16:06.689 Ashwini Sharma: catalog.json, which will contain

623 01:16:06.950 01:16:13.669 Ashwini Sharma: all the information that is needed to generate the documentation, right? Now, if I do a dbt docserve.

624 01:16:17.310 01:16:26.729 Ashwini Sharma: It will open a browser, and then you can see there is a source called customer and individual. This is referring to the tables that are there.

625 01:16:27.010 01:16:30.690 Ashwini Sharma: In, in the source, table.

626 01:16:30.850 01:16:37.819 Ashwini Sharma: And then here are the models, right? Let’s look at the models. Staging, remember, CRM, STC, individual.

627 01:16:38.010 01:16:48.599 Ashwini Sharma: So it might have a description, which I… obviously, I gave a wrong description, but you could describe properly what this model does, what are these columns.

628 01:16:48.740 01:16:53.090 Ashwini Sharma: What do they mean, right? All these things. And,

629 01:16:53.970 01:16:58.160 Ashwini Sharma: I think we can take a look at this one. Maybe this has a better description.

630 01:16:59.260 01:17:02.260 Ashwini Sharma: Right? You can see the compiled source code.

631 01:17:03.260 01:17:05.619 Ashwini Sharma: Source and the compiled code.

632 01:17:06.030 01:17:07.930 Ashwini Sharma: Same thing over here.

633 01:17:09.100 01:17:10.830 Ashwini Sharma: Code, compiled code.

634 01:17:11.120 01:17:15.509 Ashwini Sharma: And then you can also see a lineage. Where’s the lineage?

635 01:17:21.640 01:17:25.370 Ashwini Sharma: Oh, come on, what is the lineage?

636 01:17:27.600 01:17:36.899 Ashwini Sharma: Individual… So, it depends on customer and individual table. There used to be a lineage.

637 01:17:45.500 01:17:49.499 Katherine Bayless: Yeah, I feel like I saw lineage on one of the other tabs, or one of the other bottles, or something.

638 01:17:49.810 01:17:50.270 Katherine Bayless: I see.

639 01:17:50.270 01:17:52.829 Ashwini Sharma: Let’s do… I’m missing.

640 01:17:54.210 01:17:55.539 Kyle Wandel: This is gonna sound weird, but…

641 01:17:55.540 01:17:57.650 Ashwini Sharma: Oh, over here, hit it, hit it, hit.

642 01:17:57.650 01:17:58.650 Kyle Wandel: Oh, that’s pretty cool.

643 01:17:59.630 01:18:11.099 Kyle Wandel: Is there a way to, like, build a dbt model in this, like, interface, and then generate the code, or is it really just… you have to use the code, and then it just generates that?

644 01:18:11.960 01:18:15.289 Ashwini Sharma: So, sorry, I didn’t follow that, what is it?

645 01:18:15.290 01:18:30.380 Kyle Wandel: So, I see that this is, like, obviously this is on your local host, but does dbt have, like, a website where you can, or an application where you can literally input some of the, like, all of these things, basically, onto this type of UI, and then it generates the code for you? Is there anything like that out there?

646 01:18:30.380 01:18:37.979 Ashwini Sharma: No, no. So, dbt Cloud is there, right? Where you can do the same thing on a.

647 01:18:37.980 01:18:38.709 Kyle Wandel: Do what?

648 01:18:39.000 01:18:43.600 Ashwini Sharma: But yeah, not something like what you are thinking.

649 01:18:44.400 01:18:54.850 Ashwini Sharma: So you can see this kind of lineage, right? As our models get more and more complex, we can have multiple of such entities, and then we can track how a certain entity is being built.

650 01:18:55.080 01:18:59.200 Ashwini Sharma: Much easier when, when, when you’re tracking down

651 01:18:59.340 01:19:05.419 Ashwini Sharma: You know, dependencies and figuring out, okay, how did we derive this model visually?

652 01:19:06.110 01:19:10.220 Ashwini Sharma: So this is the graph, and that brings me…

653 01:19:11.830 01:19:16.080 Ashwini Sharma: Over here, yeah, we looked at all of these things.

654 01:19:16.890 01:19:23.320 Ashwini Sharma: And I can give a quick, walkthrough of our codebase.

655 01:19:25.730 01:19:26.430 Ashwini Sharma: Good.

656 01:19:42.270 01:19:50.389 Ashwini Sharma: So, this is the code base, and you can download it and then, you know, read it from the GitHub repo.

657 01:19:50.880 01:19:58.529 Ashwini Sharma: But I’ll just give an overview of what exactly is happening over here, right? So it’s a bit complex, compared to

658 01:19:58.630 01:20:09.480 Ashwini Sharma: what I’ve given in a demo, but overall, it follows the same kind of structure, right? So if you see models, right, models are broken into staging, intermediate, and mods.

659 01:20:09.580 01:20:13.539 Ashwini Sharma: In staging, it’s remembers, and then in remembers, we have

660 01:20:13.640 01:20:25.140 Ashwini Sharma: different, functional areas. There is a sources.yaml file where all the sources are listed down. See, all of these are done by a cursor only, like, I didn’t type any of these things.

661 01:20:25.320 01:20:30.810 Ashwini Sharma: And… let’s go back to… Oh, did I close it?

662 01:20:32.200 01:20:34.290 Ashwini Sharma: No, no, I didn’t close it, okay.

663 01:20:34.480 01:20:36.770 Ashwini Sharma: staging, right? Let’s go to CRM.

664 01:20:37.110 01:20:41.490 Ashwini Sharma: And if you see all this staging, these are…

665 01:20:42.850 01:20:44.880 Ashwini Sharma: So let’s look at some of these.

666 01:20:46.400 01:20:48.499 Ashwini Sharma: The individual, right?

667 01:20:49.620 01:20:55.880 Ashwini Sharma: See, what I’ve done is, in addition to renaming of columns, I’ve also done a typecasting to ensure that

668 01:20:56.370 01:21:04.659 Ashwini Sharma: You know, when we read this model, we know exactly what a certain data type is going to be, and it is going to remain consistent, right?

669 01:21:04.840 01:21:07.430 Ashwini Sharma: Yeah.

670 01:21:08.450 01:21:12.179 Kyle Wandel: And if you go to the… if you go to the YAML… the schema file.

671 01:21:12.920 01:21:13.879 Ashwini Sharma: Yep.

672 01:21:15.070 01:21:20.160 Kyle Wandel: This is your, like, what you think the descriptions of the values are. This is not from Impexium, right?

673 01:21:20.470 01:21:24.300 Ashwini Sharma: Yeah, yeah, this is cursor generated, so it might be wrong right now, yeah.

674 01:21:24.300 01:21:32.680 Kyle Wandel: No, no, no, no worries at all. I mean, I think we’ll have to go through this as well, but it’s good to know that, that all of it’s kind of laid out already, and we can kind of take a… start taking a look at it.

675 01:21:32.680 01:21:33.450 Ashwini Sharma: Right.

676 01:21:33.690 01:21:43.469 Ashwini Sharma: And so staging is all about, you know, renaming and, you know, and ensuring the data types are correct.

677 01:21:43.760 01:21:48.510 Ashwini Sharma: Let’s, go to one level above staging, right?

678 01:21:48.810 01:22:01.269 Ashwini Sharma: Let’s go to MARTS. And in March, what I’ve done is I’ve created a dimension called individual. So what I do over here is I’m taking a CTE, right, which is basically coming from customer email.

679 01:22:01.980 01:22:09.450 Ashwini Sharma: And I mentioned, right, we use ref when models are referring to each other, they use REF.

680 01:22:09.690 01:22:13.379 Ashwini Sharma: When models are referring to source, they use source.

681 01:22:13.860 01:22:20.320 Ashwini Sharma: So here, this model, them individual, is referring to a staging model that I’ve already created over here.

682 01:22:20.660 01:22:27.909 Ashwini Sharma: And this is exactly what, what the CT is. So, for every customer ID, I’m trying to

683 01:22:28.030 01:22:31.679 Ashwini Sharma: you know, get a corresponding email ID for it, right?

684 01:22:31.880 01:22:38.860 Ashwini Sharma: And, since each customer has multiple email IDs, I did some kind of partitioning logic over here.

685 01:22:39.040 01:22:42.930 Ashwini Sharma: Which reduces it to just one email ID based on

686 01:22:43.480 01:22:47.399 Ashwini Sharma: If there is a isPrimary, true?

687 01:22:47.550 01:22:53.129 Ashwini Sharma: then I take it, otherwise it’s based on the last modified email ID.

688 01:22:54.530 01:23:00.009 Ashwini Sharma: And then I take all the details from customer, join it with individual.

689 01:23:00.670 01:23:09.249 Ashwini Sharma: And join it with email, and then get all the necessary fields that I have selected over here.

690 01:23:09.890 01:23:15.699 Ashwini Sharma: And I follow the similar syntax for organization, except for organization, there is no email.

691 01:23:16.020 01:23:20.409 Ashwini Sharma: And it’s just a join between customer and organization staging tables.

692 01:23:20.730 01:23:23.440 Ashwini Sharma: And then there is the FACT membership.

693 01:23:23.770 01:23:32.349 Ashwini Sharma: Which is, some selected columns from this membership benefit table, which maps the organization to another organization.

694 01:23:32.730 01:23:38.399 Ashwini Sharma: and… And then… what I do is,

695 01:23:39.200 01:23:44.420 Ashwini Sharma: I joined this team organization that I’ve created with this city.

696 01:23:45.520 01:23:47.759 Ashwini Sharma: To get a parent organization name.

697 01:23:49.460 01:23:54.860 Ashwini Sharma: And then, and then this report comes, right? So this report…

698 01:23:55.080 01:23:59.699 Ashwini Sharma: Is, again, here, here is a related contact where

699 01:23:59.880 01:24:04.720 Ashwini Sharma: I’m trying to get a con- a single contact for… Any organization ID?

700 01:24:04.980 01:24:12.879 Ashwini Sharma: And that contact is sort of joined with the individual. Organization member benefit relationship.

701 01:24:13.230 01:24:19.150 Ashwini Sharma: And then joined it with related contact, and then this is joining with individual.

702 01:24:19.670 01:24:27.669 Ashwini Sharma: And we are looking at only organizations where this organization ID is null, right? Parent organization ID is null. This is…

703 01:24:27.810 01:24:29.100 Ashwini Sharma: benefits from.

704 01:24:29.390 01:24:31.890 Ashwini Sharma: That is empty.

705 01:24:34.640 01:24:46.960 Ashwini Sharma: Yeah, so this is… I mean, this is kind of, like, if you know SQL, it will be understandable, right? Nothing complex over here. I’ll walk you through things that are…

706 01:24:47.260 01:24:53.910 Ashwini Sharma: That might not be very familiar, right? Everything under models is going to be extremely familiar to you.

707 01:24:54.590 01:24:58.589 Ashwini Sharma: But what’s not familiar would be things like this.

708 01:25:00.960 01:25:03.709 Ashwini Sharma: For example, generate database name, right?

709 01:25:03.850 01:25:05.300 Ashwini Sharma: And what it does…

710 01:25:05.420 01:25:11.920 Ashwini Sharma: You see, this is a part of a Python code, and what it is doing here is…

711 01:25:12.370 01:25:20.580 Ashwini Sharma: In the profiles file, if you have noticed, right, I defined targets, and in targets, I define database name as well as the schema name.

712 01:25:21.350 01:25:23.550 Ashwini Sharma: Now, that is the target schema name.

713 01:25:24.020 01:25:27.310 Ashwini Sharma: what I’ve given in, in, profile file.

714 01:25:27.570 01:25:35.449 Ashwini Sharma: Now, in this one, in addition to profile file, if I define in dbt project, right,

715 01:25:36.540 01:25:38.710 Ashwini Sharma: If I define in dbt project.

716 01:25:39.080 01:25:54.390 Ashwini Sharma: schema to use, right? For example, in this case, I’m saying, you know, use this schema. Although I have defined a schema in profile file, I’ve also defined a database in profiles file, but in addition to that, I’m saying, use this database, right? This is called a custom

717 01:25:54.620 01:25:59.669 Ashwini Sharma: Custom database and custom schema name. Custom database name, custom schema name, right?

718 01:25:59.790 01:26:10.630 Ashwini Sharma: So when I look into this one, these two are, you know, overridden macros of dbt. So they will be invoked. When I give this name, generatedatabasename.sql.

719 01:26:10.870 01:26:27.339 Ashwini Sharma: and generate schemaName.sql, these two files will be invoked, functions will be invoked, and then these two functions generate database name and schema name, right? So while it is generating database name, all it does is it will look at the custom database name.

720 01:26:27.710 01:26:32.849 Ashwini Sharma: If it is not there, then it will just use the default database name, which is basically the target name.

721 01:26:33.050 01:26:38.369 Ashwini Sharma: Otherwise, if there is a custom database name, what it does is it will take the default database name.

722 01:26:39.310 01:26:45.660 Ashwini Sharma: put an underscore in it, and then use a custom database name, right? So, let’s say, let’s take a look at this one.

723 01:26:50.580 01:26:54.440 Ashwini Sharma: So, this is the target that I use for this project, right?

724 01:26:54.540 01:26:56.940 Ashwini Sharma: Here you see the database name is DEV.

725 01:26:57.350 01:26:58.080 Ashwini Sharma: Right?

726 01:26:58.750 01:27:03.829 Ashwini Sharma: It was dev, and in dbt project, I said, use staging.

727 01:27:05.240 01:27:06.710 Ashwini Sharma: So, what it did…

728 01:27:07.090 01:27:13.280 Ashwini Sharma: What this does is it will say, you know, use the database name dev underscore stG.

729 01:27:14.330 01:27:15.090 Ashwini Sharma: Alright.

730 01:27:15.250 01:27:24.700 Ashwini Sharma: And similar thing goes for schema name, right? Now, in the schema name, I’ve given in my profiles file, I’ve given Ashwini underscore dev, right?

731 01:27:25.000 01:27:28.839 Ashwini Sharma: So, when this runs, and in the DVD project, I said.

732 01:27:29.380 01:27:33.900 Ashwini Sharma: schema to be used, sorry, it’s here. Schema to be used is…

733 01:27:34.440 01:27:47.750 Ashwini Sharma: For accounting, it was STG remembers Accounting, for app, it was STG Remember App, and so on, right? For different, different functional areas, I put different schema. So when this runs, generate schema name.

734 01:27:48.870 01:27:53.889 Ashwini Sharma: Initially, it takes the target name, which is… Ashwini underscore dev.

735 01:27:54.110 01:28:01.390 Ashwini Sharma: And if I’ve not provided a custom schema, it’s just the target schema name.

736 01:28:01.530 01:28:06.570 Ashwini Sharma: But if the target name is Broad or SDG, right, which will…

737 01:28:06.880 01:28:22.019 Ashwini Sharma: work in a CICD environment, then we just take the custom schema name. But if it is a dev, then it should be the default schema name, which is the target schema name, Ashuni underscore dev, underscore the schema name that I’m providing for individual

738 01:28:22.270 01:28:28.739 Ashwini Sharma: Functional areas. So, let’s go back to this one, and…

739 01:28:29.040 01:28:31.439 Ashwini Sharma: If you see this thing, right?

740 01:28:32.010 01:28:33.820 Ashwini Sharma: Dev underscore staging.

741 01:28:34.120 01:28:35.610 Ashwini Sharma: You’ll see,

742 01:28:35.850 01:28:45.609 Ashwini Sharma: This is SFDev, SDG, remember? Now, SFDev is coming because in another profile file that I provide, when I run it from Snowflake.

743 01:28:45.890 01:28:49.520 Ashwini Sharma: it appends sfdev to the schema name.

744 01:28:51.120 01:29:06.129 Ashwini Sharma: And, okay, I have not run it from my schema name since yesterday. I deleted everything, so it does not show anything. But if I run it from my schema, it is going to show Ashwini underscore dev, and then it will show some files over here.

745 01:29:06.620 01:29:07.929 Ashwini Sharma: Schema all the way here.

746 01:29:11.650 01:29:15.720 Ashwini Sharma: Yeah, that… that covers almost the DBT.

747 01:29:16.160 01:29:17.230 Ashwini Sharma: module.

748 01:29:18.700 01:29:20.510 Ashwini Sharma: What else?

749 01:29:20.510 01:29:28.920 Katherine Bayless: I think this is really, like, awesome. It’s a lot to digest, for sure, but, I think, what would be a good exercise, maybe, is, like.

750 01:29:29.910 01:29:36.830 Katherine Bayless: so we have that, webhooks thing that I built for Chris, and, like, taking the, you know.

751 01:29:36.980 01:29:47.570 Katherine Bayless: what that is at the end of the little pipeline that we had built, and, like, recreating that as a proper dbt project, I think that’d be a good, like, tiny exercise for us to practice with, like…

752 01:29:47.570 01:29:48.310 Ashwini Sharma: Oh, shot, yeah.

753 01:29:48.310 01:29:49.190 Katherine Bayless: Yeah.

754 01:29:49.190 01:29:53.300 Ashwini Sharma: And if you run into any problems, you know, just ping me, and then I can…

755 01:29:53.940 01:29:57.380 Ashwini Sharma: Maybe. I mean, you’ll be doing it next week only, right now.

756 01:29:57.950 01:29:59.090 Katherine Bayless: Yeah, yeah, yeah.

757 01:29:59.120 01:30:00.260 Ashwini Sharma: Yeah, good.

758 01:30:00.400 01:30:15.740 Ashwini Sharma: This is, like, this is the workspace of Snowflake, so what we do is we pull in the dbt project that’s there in GitHub, right? It appears like this, and then you can do the same thing that I did from,

759 01:30:16.000 01:30:17.260 Ashwini Sharma: from cursor.

760 01:30:17.490 01:30:19.949 Ashwini Sharma: We can run dbt over here, right?

761 01:30:20.940 01:30:39.029 Ashwini Sharma: So, you see, this has a different profile file, right? The profile file that I use is local to me, right? The profile file that you’ll be using is local to you. But when it runs in a CI-CD environment, it needs a different profile file, and that’s why there is another additional profile file over here.

762 01:30:40.630 01:30:44.540 Ashwini Sharma: So you don’t need to use this, this is only for CICD environments.

763 01:30:44.660 01:30:48.760 Ashwini Sharma: Make sure that you have your own local profile,

764 01:30:50.240 01:31:00.439 Ashwini Sharma: profile file under .dbt folder. I mean, when you’re doing a dbt initiate it’s going to create it anyways, but if it doesn’t do, like, I’ve mentioned it over here, what you need.

765 01:31:04.980 01:31:06.529 Ashwini Sharma: In the README file.

766 01:31:08.540 01:31:11.519 Ashwini Sharma: Alright, any questions?

767 01:31:14.790 01:31:16.540 Katherine Bayless: Probably a lot, but next.

768 01:31:16.540 01:31:17.380 Kyle Wandel: Yeah, probably will be.

769 01:31:18.920 01:31:34.170 Kyle Wandel: I mean, this is a really good setup, though, and I think you’ve done a really good job of kind of, like, creating a template, basically, for us, and kind of, we’ll take the next steps and go from there, but I think, Catherine, you’re right, having some practice with that webhooks would be good. Keep our feet wet a little bit,

770 01:31:35.040 01:31:36.189 Kyle Wandel: And then go from there.

771 01:31:36.760 01:31:45.300 Katherine Bayless: Yeah, yeah. Starting with correcting what I called it and switching it to be Formstack, so that it follows the convention of naming it after the platform the data is from.

772 01:31:47.110 01:31:49.830 Katherine Bayless: Yeah, I like it. Thank you, Ashmini. Seriously.

773 01:31:49.830 01:31:53.570 Ashwini Sharma: Yeah, welcome, welcome. Yeah, and feel free to ping me anytime.

774 01:31:53.740 01:31:54.880 Ashwini Sharma: I was originally talking about…

775 01:31:55.380 01:31:58.120 Kyle Wandel: DPT code at 2AM in the morning to compress it.

776 01:31:59.900 01:32:00.440 Katherine Bayless: Yay!

777 01:32:00.570 01:32:05.470 Katherine Bayless: Yeah, I could not do it. I could not.

778 01:32:06.280 01:32:10.159 Ashwini Sharma: It’s 3PM here, and my brain’s already, fried.

779 01:32:11.740 01:32:12.300 Katherine Bayless: Yeah.

780 01:32:12.470 01:32:14.559 Ashwini Sharma: I’d rather have a meeting with them now.

781 01:32:17.750 01:32:21.590 Katherine Bayless: Will you share the recording in Slack, too, once it’s.

782 01:32:21.590 01:32:22.600 Kyle Wandel: Yeah, that’s great.

783 01:32:22.600 01:32:24.980 Ashwini Sharma: Sure, sure, yeah, I’ll do that, yeah.

784 01:32:25.440 01:32:26.380 Katherine Bayless: That’d be awesome.

785 01:32:27.050 01:32:28.840 Ashwini Sharma: Alright, okay.

786 01:32:29.650 01:32:30.410 Katherine Bayless: Well, that’s cool that.

787 01:32:30.410 01:32:34.590 Kyle Wandel: Bye-bye. Bye-bye.

788 01:32:34.590 01:32:36.710 Katherine Bayless: Have a nice weekend. Bye.