Meeting Title: dbt training Date: 2026-04-30 Meeting participants: Scratchpad Notetaker, Greg Stoutenburg, Caitlyn Vaughn, bpeiair, Nandika Jhunjhunwala
WEBVTT
1 00:00:45.770 ⇒ 00:00:47.550 Caitlyn Vaughn: Greg, we meet again.
2 00:00:47.770 ⇒ 00:00:49.340 Greg Stoutenburg: Hello again, it’s been a while.
3 00:00:50.000 ⇒ 00:00:52.100 Greg Stoutenburg: How have you been lately?
4 00:00:52.100 ⇒ 00:00:58.089 Caitlyn Vaughn: I’m, like, so tired whenever it’s, cloudy outside, it makes me so sleepy.
5 00:00:58.090 ⇒ 00:01:04.090 Greg Stoutenburg: Same. I can’t even handle it. Yesterday was like… it was like, why bother? Like, I’m just… I’m just not doing anything, forget it.
6 00:01:05.519 ⇒ 00:01:09.749 Greg Stoutenburg: And then, of course, it’s not what really happens. Instead, it’s just like, no, it’s just gonna feel much harder.
7 00:01:10.110 ⇒ 00:01:11.479 Caitlyn Vaughn: That’s exactly it.
8 00:01:14.540 ⇒ 00:01:15.560 Caitlyn Vaughn: So funny.
9 00:01:16.120 ⇒ 00:01:17.150 Greg Stoutenburg: Hey, Brian.
10 00:01:17.150 ⇒ 00:01:18.800 bpeiair: Hello. Hello, hello.
11 00:01:18.800 ⇒ 00:01:20.459 Caitlyn Vaughn: How’s it going?
12 00:01:20.750 ⇒ 00:01:22.310 bpeiair: Going well, nice to meet you.
13 00:01:22.310 ⇒ 00:01:23.880 Caitlyn Vaughn: Nice to meet you as well!
14 00:01:25.950 ⇒ 00:01:27.929 Greg Stoutenburg: Nandika will be on, right?
15 00:01:28.510 ⇒ 00:01:30.510 Caitlyn Vaughn: Yes, I think so.
16 00:01:30.510 ⇒ 00:01:33.890 Greg Stoutenburg: Yeah, just wanted to confirm. I mean, she had marked yes, but I just wanted to confirm.
17 00:01:42.040 ⇒ 00:01:43.870 Caitlyn Vaughn: to the stream.
18 00:01:49.450 ⇒ 00:01:50.709 Caitlyn Vaughn: Yeah, she should be here.
19 00:01:50.710 ⇒ 00:01:51.320 Greg Stoutenburg: Yep, great.
20 00:01:51.320 ⇒ 00:01:52.120 Caitlyn Vaughn: She’s coming.
21 00:01:52.120 ⇒ 00:02:16.929 Greg Stoutenburg: Yeah, well, okay, cool. Yeah, well, while we wait, Brian’s, done a bunch of work for Brainforge, and is a dbt pro, and, knows… also knows Utam from way back when. Brian, Caitlyn, and Utam met in person, in… in Austin once upon a time, so lots of Utam connections on the call. But yeah, I’ll just… I’ll be mostly just a passenger here, because I am, I’m out-experted.
22 00:02:16.930 ⇒ 00:02:20.310 Greg Stoutenburg: On this topic, and let you all go for it.
23 00:02:21.490 ⇒ 00:02:25.139 bpeiair: Cool. Is there anyone else I’m waiting for before I just, get going?
24 00:02:25.140 ⇒ 00:02:26.739 Caitlyn Vaughn: No, it’s just the two of us.
25 00:02:26.740 ⇒ 00:02:27.899 Greg Stoutenburg: Nandika’s joining, though.
26 00:02:27.900 ⇒ 00:02:29.950 bpeiair: Okay, perfect, okay.
27 00:02:30.380 ⇒ 00:02:33.799 bpeiair: So, let’s see… 50 minutes.
28 00:02:33.870 ⇒ 00:02:36.779 bpeiair: I am Brian,
29 00:02:36.780 ⇒ 00:02:54.610 bpeiair: I have been in data since… I graduated 2013, and I can’t really do the math right now, so 13 years, I guess? Yeah. I graduated also from Bucknell, which is where Utam graduated, and he was actually a couple years under me. And then we met in the data world after school.
30 00:02:54.790 ⇒ 00:03:02.839 bpeiair: With dbt experience, I looked it up. dbt came out in 2016.
31 00:03:02.870 ⇒ 00:03:19.050 bpeiair: And I started using it in 2017, and I believe I was one of the, like, rotating test user groups, because I used to work at WeWork, famously, on the news, WeWork, and WeWork was an early adopter of dbt, which is…
32 00:03:19.140 ⇒ 00:03:22.120 bpeiair: cool, and then not cool, what happened afterwards.
33 00:03:22.870 ⇒ 00:03:40.300 bpeiair: got fired. But, they had a really cool data team, to be honest. They were kind of on the forefront of one of the first Snowflake customers and dbt customers using Airflow to orchestrate everything together, Kubernetes, etc. And so, since then, I have…
34 00:03:40.320 ⇒ 00:03:53.300 bpeiair: hopped around, I did a lot of, client work with UTAM and without Utam, mostly riding solo, but I’ve… I’ve done at least over…
35 00:03:53.300 ⇒ 00:04:02.459 bpeiair: at least over 10 dbt implementations and over 20 of these kinds of talks, so hopefully, I will be able to
36 00:04:02.500 ⇒ 00:04:20.229 bpeiair: give a really good intro of why the industry uses dbt, what it’s used for. I have my own template, dbt Git repo, that I’ll screen share, because 80% of the skeleton of dbt organization is the same.
37 00:04:20.230 ⇒ 00:04:24.899 bpeiair: I was brought in to, just talk about this, so I don’t have…
38 00:04:24.900 ⇒ 00:04:41.829 bpeiair: context of the exact code in your repo, and it didn’t make sense to just give me access for one day. So I’ll do my best if there are questions of, if there’s a different setup, I should be able to answer those questions later on. But before I share my repo,
39 00:04:42.100 ⇒ 00:04:57.770 bpeiair: dbt, I kind of use it as SQL on steroids. It’s mostly just a SQL wrapper. I also apologize if I’m saying things that you all heard before, but, just gonna go with my spiel here.
40 00:04:57.770 ⇒ 00:04:59.230 Nandika Jhunjhunwala: That’s great, yeah.
41 00:04:59.500 ⇒ 00:05:17.260 bpeiair: Before… before dbt existed, I remember using Python to schedule SQL scripts in a folder on some remote desktop that’s always on, and that’s kind of like how we did quote-unquote data modeling in the past.
42 00:05:17.370 ⇒ 00:05:25.909 bpeiair: Data modeling being… and I read through a little bit of documentation here, it’s,
43 00:05:25.990 ⇒ 00:05:32.679 bpeiair: in its purest form, it’s just, if you… I always use the example, if you have a…
44 00:05:32.720 ⇒ 00:05:41.139 bpeiair: If your company wants to understand a customer, and you have Zendesk and Salesforce and Stripe and Workday.
45 00:05:41.140 ⇒ 00:05:56.399 bpeiair: And the customer ID is within multiple different applications. A data model is just like a DIM customer, where you take one object and you join it on all four, but you have to do it in a clean way that doesn’t lead to duplicates, you don’t want to…
46 00:05:56.400 ⇒ 00:06:03.029 bpeiair: It’s like that… I mean, it always gets me with, like, email joins. It’s like people have multiple emails, and then it creates 4 of the same customer.
47 00:06:03.030 ⇒ 00:06:18.050 bpeiair: And so data modeling is just that. It’s when you bring in all your data from these third-party apps, and you want to have one clean line item, or one clean customer, or one clean product, but you want it to have different dimensions and SKUs, and you want it all to be in the same place.
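The dim_customer idea described here can be sketched in plain SQL. All table and column names below are hypothetical, purely for illustration; the point is collapsing each source to one row per customer before joining, so things like multiple emails can't fan the join out into duplicate customers.

```sql
-- Hypothetical dim_customer sketch: one clean row per customer,
-- joined across several source systems. Each source is aggregated
-- to one row per customer_id first, so the joins cannot fan out.
with zendesk as (
    select customer_id, min(created_at) as first_ticket_at
    from zendesk_tickets
    group by customer_id
),

stripe as (
    select customer_id, sum(amount) as lifetime_revenue
    from stripe_charges
    group by customer_id
)

select
    sf.customer_id,
    sf.account_name,
    z.first_ticket_at,
    st.lifetime_revenue
from salesforce_accounts as sf
left join zendesk as z on z.customer_id = sf.customer_id
left join stripe as st on st.customer_id = sf.customer_id
```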
48 00:06:18.260 ⇒ 00:06:36.490 bpeiair: data modeling is, or data architecture, and it’s going through so many different terms, is just that. The raw data gets, hopefully, scheduled to be placed somewhere. I believe you guys are on Snowflake. So, you know, that raw data goes into Snowflake in all these different multiple databases.
49 00:06:36.490 ⇒ 00:06:41.229 bpeiair: And then dbt just sits on top of all of those databases, and…
50 00:06:41.230 ⇒ 00:06:46.820 bpeiair: through your work or AI, or whatever it is, it’s one…
51 00:06:46.820 ⇒ 00:06:57.860 bpeiair: repo, that has SQL transformation in it. And if I go back to, like, what we used to do, where we tried our best to schedule crons with, like.
52 00:06:58.040 ⇒ 00:07:13.609 bpeiair: before dbt, with singular SQL scripts, there was no concept of dependencies, there’s no concept of, if this table runs, run this after it succeeds, and not if it fails. It usually would just run every SQL model, and sometimes
53 00:07:13.610 ⇒ 00:07:25.230 bpeiair: something takes an hour, and the next day it takes 3 hours, and the second model runs faster, and then you just… you miss out on that data. And so, dbt is really great as,
54 00:07:25.560 ⇒ 00:07:31.159 bpeiair: Looking at it even more high level, it’s how,
55 00:07:31.400 ⇒ 00:07:40.079 bpeiair: what’s it called? The founders of dbt wanted to treat data models as software, and in the past, these data models
56 00:07:40.170 ⇒ 00:07:48.839 bpeiair: either didn’t really live in any repo, but even if SQL scripts were in GitHub, there was no…
57 00:07:48.840 ⇒ 00:08:12.169 bpeiair: easy process for developers to link things together, and so they wanted to… their kind of tagline was, we want to treat data as software, or as a service. And so, with something like dbt come the things that software engineering does, which is testing and unit testing and dependency tracking, and…
58 00:08:12.170 ⇒ 00:08:20.140 bpeiair: Scheduling and, loads and loads of documentation, version control… or not even version control, but just, like,
59 00:08:22.060 ⇒ 00:08:30.769 bpeiair: Yes, it is version control. Sorry, okay. But all those things that software can do, data, should also follow those guidelines.
60 00:08:30.830 ⇒ 00:08:43.800 bpeiair: And I think the most important thing that dbt does is what I think is one of the most important software engineering principles: DRY, do not repeat yourself. And before dbt.
61 00:08:43.799 ⇒ 00:08:56.719 bpeiair: We had, you know, let’s say multiple data analysts are all trying to define customer by revenue, and they all start from scratch, and, you know, it just… none of it talks to each other, and they all get different numbers.
62 00:08:56.720 ⇒ 00:09:03.170 bpeiair: And so the DRY, do-not-repeat-yourself principle works great in dbt, because if you define
63 00:09:03.170 ⇒ 00:09:28.150 bpeiair: let’s say you clean up Salesforce customers in one model, and then you use that in four different reporting models in another aggregate field or whatever. It’s all coming from one, place, as opposed to, like, either a raw version of Salesforce customers somewhere, or, like, multiple data scientists all saying that, I have the best version of data customers, use mine, and then someone’s like, use mine. And then you have five, and you try to join everything together.
64 00:09:28.150 ⇒ 00:09:28.750 bpeiair: doesn’t work.
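The one-clean-model-reused-everywhere pattern is what dbt's `ref()` function enables: a downstream model points at the cleaned model by name, and dbt both compiles that to the right table and records the dependency so models run in the correct order. The model names below are hypothetical.

```sql
-- models/marts/revenue_by_customer.sql (hypothetical model name).
-- {{ ref('...') }} points at another dbt model by its filename; dbt
-- resolves it to the actual database table and adds an edge to the
-- dependency graph, so this always runs after its staging models.
select
    c.customer_id,
    sum(ch.amount) as total_revenue
from {{ ref('stg_salesforce__customers') }} as c
left join {{ ref('stg_stripe__charges') }} as ch
    on ch.customer_id = c.customer_id
group by c.customer_id
```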
65 00:09:29.440 ⇒ 00:09:42.719 bpeiair: So, are there any questions of, of my, just, oh, nice. Okay, perfect. Of just, like, my, my broad, explanation? Otherwise, I’ll just share my screen.
66 00:09:43.220 ⇒ 00:09:53.870 Caitlyn Vaughn: No, I think all of that makes sense. I will say that we’re using MotherDuck and not Snowflake as the model underneath, or the database underneath Omni.
67 00:09:54.510 ⇒ 00:10:18.790 Caitlyn Vaughn: Yes, I’d be interested… I mean, we are planning on taking this over ourselves in the next couple of weeks, and, like, doing a lot of this modeling and, like, running with the data engineering stuff ourselves. So as much, you know, dbt context as you can give us would be great, and then if we have a little bit of time at the end, I would also be interested in you looking at the way things are set up, and maybe giving some advice on how we could, like, restructure to make it better.
68 00:10:19.450 ⇒ 00:10:30.429 bpeiair: Sure. Was the… the skeleton of your repo, was it set up at first by AI or Claude or anything, or was it set up by a human being?
69 00:10:31.000 ⇒ 00:10:33.470 Caitlyn Vaughn: Hopefully it was set up by a human being.
70 00:10:33.470 ⇒ 00:10:34.970 bpeiair: Cool. Okay.
71 00:10:34.970 ⇒ 00:10:42.080 Greg Stoutenburg: I can be sure that it would never be set up just by AI. I was not involved in that part of the implementation, but yeah.
72 00:10:42.870 ⇒ 00:10:45.790 bpeiair: And I wasn’t even saying, like, right in…
73 00:10:45.940 ⇒ 00:10:51.980 bpeiair: this day and age, it’s not a right or wrong answer anymore. It’s just now different.
74 00:10:52.140 ⇒ 00:11:02.820 bpeiair: I… but, again, I don’t know what’s going on as, in the previous, like, implementation. I personally am,
75 00:11:03.050 ⇒ 00:11:15.190 bpeiair: a recently changed AI advocate, where I pushed back against it for 2 years until I realized it’s pretty cool. But we can talk about all that stuff at another time. Later. So let me…
76 00:11:15.360 ⇒ 00:11:24.130 bpeiair: Hopefully I don’t have to restart Zoom to be able to share this. Let’s see… Hang on.
77 00:11:27.310 ⇒ 00:11:30.030 bpeiair: Isn’t this? Oh, God, I hope it’s this.
78 00:11:30.530 ⇒ 00:11:31.640 bpeiair: Share.
79 00:11:33.010 ⇒ 00:11:33.990 bpeiair: Okay.
80 00:11:34.300 ⇒ 00:11:40.999 bpeiair: Whoa, I don’t even know what that is. Let me increase this a little bit. Okay. Ignore, ignore that.
81 00:11:41.380 ⇒ 00:11:51.550 bpeiair: Okay, so this is what most dbt repo structures are.
82 00:11:51.710 ⇒ 00:12:03.220 bpeiair: And I’m kind of going to… I think it auto-populates with all these folders whenever you set up dbt for the first time. I’ll skip the README, I’ll say that…
83 00:12:03.590 ⇒ 00:12:11.840 bpeiair: We’ll dive into these folders. A lot of them you might not use. For example, I don’t think anyone uses this.
84 00:12:11.900 ⇒ 00:12:28.449 bpeiair: analyses. I believe analyses are just SQL files, where if somebody doesn’t want to save something in the repo, but it’s a one-off query that they want to use, they can save a SQL file in here, but this doesn’t get triggered, and usually nothing is in there. So let’s not even worry about this at all.
85 00:12:28.700 ⇒ 00:12:35.909 bpeiair: Models is the most important thing that I’m sure everyone knows, but I’m gonna save that for last, because that’s kind of like the meat and potatoes of what dbt does.
86 00:12:36.550 ⇒ 00:12:45.550 bpeiair: Your dbt project is a YAML file, that should have a lot of the metadata around the… the name of the…
87 00:12:45.550 ⇒ 00:13:08.450 bpeiair: the connections, the database, the schemas that you’re, setting up to, these are… these all will have probably been set up. This is, like, the one time, like, set up the secret key, give it a username, let it be able to write to your database. The only reason why this would ever change, is if you migrate databases, or if the user of…
88 00:13:08.550 ⇒ 00:13:25.809 bpeiair: if the user of dbt… the user that you allow dbt to use gets deleted or renamed, you would have to rename it here as well. But otherwise, this should all be set up, and you usually won’t have to worry about it that much. But it is…
89 00:13:25.860 ⇒ 00:13:47.979 bpeiair: in this YAML file, or if you need to set up new connections to, like, another database or something, like, you could set all that stuff up in here. And I would say AI is pretty good at doing this part. If you tell AI, like, the username and the keys and the server, AI will, like, ping it, test it, and then it’ll put it in this YAML. Oh, and I should say, in the past year, I have been
90 00:13:47.980 ⇒ 00:13:53.900 bpeiair: using AI for dbt just in my own personal time to kind of, like, play around to see what’s possible.
91 00:13:53.940 ⇒ 00:13:58.410 bpeiair: So AI is pretty good at that, and I’ll also tell you what AI’s not good at in a little bit.
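One clarification on where these settings conventionally live: `dbt_project.yml` holds the project name and folder-level configuration, while connection credentials (server, user, keys) go in a separate `profiles.yml`. A minimal sketch for a MotherDuck connection, assuming the dbt-duckdb adapter; the profile and database names are hypothetical:

```yaml
# profiles.yml (usually kept in ~/.dbt/, outside the repo, since it
# holds credentials). Sketch assuming the dbt-duckdb adapter.
my_project:
  target: dev
  outputs:
    dev:
      type: duckdb
      path: "md:analytics"   # the md: prefix points DuckDB at MotherDuck
      threads: 4
# The MotherDuck token is typically supplied through an environment
# variable (e.g. motherduck_token) rather than written into this file.
```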
92 00:13:58.570 ⇒ 00:14:07.570 bpeiair: Okay, so, I’m gonna go through the least important ones… First,
93 00:14:07.990 ⇒ 00:14:21.430 bpeiair: Actually, they’re all important, but, like, probably not used. Macros are SQL files, but macros are not really SQL. It is something called Jinja, and, it is really…
94 00:14:21.430 ⇒ 00:14:36.809 bpeiair: mostly used for the… for DRY, again, do not repeat yourself. I don’t have a good example of this, but I use macros for, like, let’s say you have a CASE WHEN statement of, like, a…
95 00:14:37.070 ⇒ 00:14:53.340 bpeiair: where you’re doing CASE WHEN… I’m just gonna make something up. In your database, you have United States, and you want to rename it to US, and so you have a giant CASE WHEN statement for, like, countries. If that is, like, your business… your business wants to do the
96 00:14:53.550 ⇒ 00:14:58.959 bpeiair: the code, instead of… and obviously you can get this, but just as an example,
97 00:14:59.230 ⇒ 00:15:12.860 bpeiair: there’s CASE WHEN statements that sometimes are repeated everywhere. Macros give you a chance to… it can inject this macro into a model, based on… you would call this macro
98 00:15:12.860 ⇒ 00:15:22.369 bpeiair: in SQL, and it would just put in whatever’s in here. So if this was a CASE WHEN statement for the United States equals US blah blah blah,
99 00:15:22.630 ⇒ 00:15:39.990 bpeiair: it… let’s say you use that in, like, 6 or 7 different models. Instead of, you know, copy-pasting that giant CASE WHEN statement, you would put that CASE WHEN statement here one time, and then whenever you need to change United States to US, you would just call this macro instead of, you know, writing it. So…
100 00:15:40.000 ⇒ 00:15:52.779 bpeiair: very DRY stuff. There is obviously, like, crazy stuff you can do outside of SQL. Like, what we’re doing here, is just something… let’s see, this,
101 00:15:53.010 ⇒ 00:16:02.399 bpeiair: if you’re in production, change the target, or sorry, trim the schema name. Like, there is other stuff you can do here, I’m just trying not to overcomplicate things.
102 00:16:02.400 ⇒ 00:16:15.089 bpeiair: But that would be, like, data engineering stuff that, when you’re starting up, you probably don’t need. It’s also… you can also do stuff in macros, for…
103 00:16:15.090 ⇒ 00:16:40.069 bpeiair: When you have a bigger team, and depending on somebody’s role, if somebody’s, like, a data analyst, and you don’t want them to be destructive in your environment, you can set up a role that’s, like, if that role is business analyst, then never push to prod, or only use dev. So there’s a lot of, like, staging environments, like, all that stuff. I’m not sure where you guys are at with that. I’ve seen people just push to prod, and it doesn’t matter because it’s small.
104 00:16:40.070 ⇒ 00:16:51.919 bpeiair: versus, we need to have a staging area, and you promote tables from dev to prod. A lot of this stuff happens here. It’s very boring. If you want to talk about it, I’m happy to talk about it later.
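The country-renaming example above might look like this as a macro. The macro name and country mappings are made up; the Jinja `{% macro %}` / `{% endmacro %}` syntax and the `{{ ... }}` call site are standard dbt.

```sql
-- macros/country_label.sql (hypothetical macro name).
-- Defines the repeated CASE WHEN once; any model can then inject it.
{% macro country_label(column_name) %}
    case
        when {{ column_name }} = 'United States' then 'US'
        when {{ column_name }} = 'United Kingdom' then 'UK'
        else {{ column_name }}
    end
{% endmacro %}
```

Inside a model you would write `select {{ country_label('country') }} as country from …`, and dbt pastes the CASE WHEN in at compile time.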
105 00:16:52.010 ⇒ 00:17:07.479 bpeiair: Seeds… see… I don’t think I have any here. Seeds are just CSVs, basically. If you have a hard-coded table… if you have a hard-coded CSV, the most common one I see are, currency exchange rates.
106 00:17:07.480 ⇒ 00:17:13.740 bpeiair: You literally just, like, you can export that from Google Sheets or Excel, save it as a CSV,
107 00:17:14.000 ⇒ 00:17:24.050 bpeiair: put that whole CSV in Seeds, and then dbt will treat that CSV file as a table. And then you can reference that CSV file anywhere you want.
108 00:17:24.349 ⇒ 00:17:27.970 bpeiair: So, yeah, so that’s good for just stuff that you don’t…
109 00:17:28.359 ⇒ 00:17:36.660 bpeiair: It’s not worth the effort to set something up, to push something into a database, and usually it, like, never changes. That’s what a seed is for.
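As a sketch of the seed workflow being described: a CSV checked into the seeds/ folder becomes a table after running `dbt seed`, and models can `ref()` it like anything else. The currency-rates file and the orders model here are hypothetical.

```sql
-- seeds/currency_rates.csv would be a plain CSV, e.g.:
--   currency,usd_rate
--   EUR,1.08
--   GBP,1.27
-- After running `dbt seed`, it is referenced like any model:
select
    o.order_id,
    o.amount * r.usd_rate as amount_usd
from {{ ref('orders') }} as o
join {{ ref('currency_rates') }} as r
    on r.currency = o.currency
```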
110 00:17:39.720 ⇒ 00:17:49.529 bpeiair: Tests are… there’s two different kinds of tests, and I’m gonna open models now, but I’m not gonna get into the SQL models quite yet. You’re gonna…
111 00:17:49.740 ⇒ 00:17:57.099 bpeiair: probably have two different types of YAMLs in your models. This is where all the magic happens, and where all the SQL models, like, actually live.
112 00:17:57.240 ⇒ 00:18:04.550 bpeiair: But there should be a… some sort of YAML for the… all of the models
113 00:18:05.130 ⇒ 00:18:12.780 bpeiair: in your repo, where… and this isn’t necessary, you don’t need to do this, but you can… if you type in a description.
114 00:18:12.920 ⇒ 00:18:29.550 bpeiair: in the, depending, actually, on your data warehouse, this description will, like, kind of, like, feed into that table, so when you open the data warehouse and you click it, you’ll at least have some sort of description. And then the reason that I brought this up is because there’s two different kinds of dbt tests.
115 00:18:29.550 ⇒ 00:18:44.729 bpeiair: There are basic default dbt tests, which you can define in the schema, and this is… these are the two most popular ones, so this is basically, like, whenever this table gets, materialized, or finished.
116 00:18:44.730 ⇒ 00:19:02.359 bpeiair: building in however you’re triggering dbt, run the test, not unique, not null, on customer ID. So the customer ID column should never be null. If it is null, this test will say failure, and then wherever you’re orchestrating dbt.
117 00:19:02.360 ⇒ 00:19:05.039 bpeiair: You’ll see the not null test failed.
118 00:19:05.080 ⇒ 00:19:09.879 bpeiair: It lets the analyst know there’s a null here, there should never be a null.
119 00:19:10.110 ⇒ 00:19:19.530 bpeiair: And then the same way that a primary key should always be unique, there’s a unique test as well. dbt will basically do a…
120 00:19:20.030 ⇒ 00:19:28.260 bpeiair: count of all rows, and then a count distinct, and compare them. And if they match, great, and if they don’t, it fails the unique test.
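The two built-in tests being described attach to a column in the models YAML. A minimal sketch, with a hypothetical model name:

```yaml
# models/schema.yml -- table/column descriptions plus the two most
# common built-in tests, as described above.
version: 2

models:
  - name: dim_customer          # hypothetical model name
    description: "One clean row per customer."
    columns:
      - name: customer_id
        description: "Primary key; one row per customer."
        tests:
          - unique
          - not_null
```

`dbt test` runs these; recent dbt versions also accept the key `data_tests` in place of `tests`.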
121 00:19:30.040 ⇒ 00:19:31.200 bpeiair: I really haven’t…
122 00:19:31.200 ⇒ 00:19:31.540 Nandika Jhunjhunwala: So…
123 00:19:31.970 ⇒ 00:19:32.779 bpeiair: Oh, yeah, what’s up?
124 00:19:32.780 ⇒ 00:19:39.899 Nandika Jhunjhunwala: Yeah, go for it. Can we also define descriptions for column names? Are we allowed to put this in the.
125 00:19:39.900 ⇒ 00:19:40.370 bpeiair: Oh, yeah.
126 00:19:40.370 ⇒ 00:19:42.290 Nandika Jhunjhunwala: Or is it just at the table level?
127 00:19:42.290 ⇒ 00:19:46.489 bpeiair: Yeah, this doesn’t have everything in it, but yeah, you totally can.
128 00:19:46.490 ⇒ 00:19:46.900 Nandika Jhunjhunwala: Miss.
129 00:19:46.900 ⇒ 00:19:48.029 bpeiair: or a customer.
130 00:19:48.300 ⇒ 00:19:52.180 bpeiair: You totally can have a description for the table and a description for the column.
131 00:19:53.460 ⇒ 00:19:56.440 Nandika Jhunjhunwala: And then for the macros,
132 00:19:57.090 ⇒ 00:20:09.550 Nandika Jhunjhunwala: it sounds like there’s, like, multiple use cases for it. It’s, like, global settings, and you can also kind of use it for normalization. Is that, like, a common use case? Like, for normalizing, like, certain, like.
133 00:20:10.150 ⇒ 00:20:16.099 Nandika Jhunjhunwala: Text or, some sort of column that you need it to be in a certain format, always.
134 00:20:16.310 ⇒ 00:20:17.360 Nandika Jhunjhunwala: Type stuff.
135 00:20:17.360 ⇒ 00:20:22.520 bpeiair: Basically… basically, yeah, yeah, bas… Or, like, if there’s…
136 00:20:22.780 ⇒ 00:20:28.460 bpeiair: what I’ve seen recently is, like, marketing always wants to remove…
137 00:20:28.680 ⇒ 00:20:41.860 bpeiair: coupons for employees and coupons that say 100% off or whatever. That’s just, like, a where clause filter, but they save that WHERE clause filter as a macro instead of, in every model, putting in where discount code equals employee.
138 00:20:41.860 ⇒ 00:20:42.300 Nandika Jhunjhunwala: Got it.
139 00:20:43.760 ⇒ 00:20:57.880 bpeiair: It’s, again, not necessary. It’s kind of like, if you have the time, you can play around with it, but if y’all are on a time crunch to get this out the door, you don’t have to worry about it, but just knowing that it’s there will be nice in the future when you want to optimize some of your stuff.
140 00:20:58.320 ⇒ 00:20:59.500 bpeiair: Yeah, totally.
141 00:21:00.250 ⇒ 00:21:08.990 Caitlyn Vaughn: When you save a macro, does that automatically apply to every model that you create, or is it, like, you just reference it?
142 00:21:09.240 ⇒ 00:21:13.270 bpeiair: Nope, when you save anything in macros, it won’t…
143 00:21:13.970 ⇒ 00:21:19.630 bpeiair: I think I said before, it says SQL file, but it’s not going… your orchestration engine won’t run these.
144 00:21:19.930 ⇒ 00:21:20.830 Caitlyn Vaughn: Oh, right.
145 00:21:20.830 ⇒ 00:21:26.779 bpeiair: Unless, in your models, you do… I can’t remember the… it’s like.
146 00:21:30.820 ⇒ 00:21:32.400 bpeiair: Yeah, it’s, it’s like this.
147 00:21:32.400 ⇒ 00:21:33.090 Caitlyn Vaughn: Okay.
148 00:21:33.330 ⇒ 00:21:36.770 bpeiair: So, whenever you reference
149 00:21:37.270 ⇒ 00:21:43.799 bpeiair: in the curly brackets, a macro, it’ll grab this and then put it into the SQL file, whatever it is, whatever it may be.
150 00:21:43.980 ⇒ 00:21:45.200 Caitlyn Vaughn: Okay, that makes sense.
151 00:21:45.200 ⇒ 00:21:51.880 bpeiair: Yeah, if it sits here, then it’s not gonna do anything, unless you, target it in the models folder.
152 00:21:53.560 ⇒ 00:22:11.309 bpeiair: Oh, yeah. The reason that I wanted to show these tests is because, there’s also a tests folder, and so that’s kind of confusing. There are two different, very distinct kinds of tests in dbt. These basic tests live under models.
153 00:22:11.310 ⇒ 00:22:25.269 bpeiair: And then, if you want to get crazy… crazy style, there are business logic-specific tests where you write SQL, and it treats it as, like.
154 00:22:25.390 ⇒ 00:22:40.949 bpeiair: for example, I don’t have the example on here, but, when you sum net revenue for the past 3 months, it should always be positive, it should never be negative. It’s just, like, a very specific
155 00:22:41.170 ⇒ 00:22:42.530 bpeiair: numeric…
156 00:22:42.670 ⇒ 00:22:50.459 bpeiair: rule that isn’t… that’s not a basic test, it’s just something that you define. Or even, like, you know, in…
157 00:22:50.670 ⇒ 00:23:08.769 bpeiair: In Argentina, these numbers should never be under 10. Something very, very specific. You would write that SQL as a SQL query, where if you put that SQL query under tests, and you reference… or, sorry.
158 00:23:09.230 ⇒ 00:23:11.380 bpeiair: Your orchestration engine
159 00:23:11.850 ⇒ 00:23:28.569 bpeiair: this gets into the weeds, but if it points to tests and actually runs the tests, it’ll run that test every time a run finishes. If the test query returns no rows, then the test passed. If it returns any rows, then the test fails.
160 00:23:28.570 ⇒ 00:23:33.720 bpeiair: So you kind of have to write backward SQL. It’s… you have to write it in a way that’s, like.
161 00:23:33.720 ⇒ 00:23:36.480 bpeiair: some…
162 00:23:37.620 ⇒ 00:23:56.289 bpeiair: sum the revenue, and then with, like, a WHERE clause, like, where, revenue is less than or equal to zero, like, negative, right? And so, it should never be negative, so that query will return no rows, but if it ever returns something, that means a negative invoice got aggregated somewhere, and it will show up.
163 00:23:56.290 ⇒ 00:23:59.340 bpeiair: And it’ll, it’ll, it’ll flag as a test.
164 00:23:59.500 ⇒ 00:24:12.479 bpeiair: AI is good at writing these tests, and it’s good at, like, doing that backward SQL that I was talking about, because it messes me up sometimes, too, where I’m just writing the SQL, and the test fails every day, and I’m like, oh, I forgot to do… it’s greater than and not less than, whatever.
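A sketch of the "backward SQL" singular test described here: the query selects the bad rows, so returning zero rows means the test passes, and any returned row is reported as a failure. The model and column names are hypothetical.

```sql
-- tests/assert_net_revenue_positive.sql (hypothetical file name).
-- Singular test: select the rows that should NOT exist. dbt passes
-- the test only when this query returns no rows.
select
    date_trunc('month', invoice_date) as revenue_month,
    sum(net_revenue) as monthly_revenue
from {{ ref('fct_invoices') }}
where invoice_date >= current_date - interval '3 months'
group by 1
having sum(net_revenue) <= 0
```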
165 00:24:12.950 ⇒ 00:24:22.239 bpeiair: I don’t think you are going to need snapshots, but do you guys know slowly changing dimension type 2?
166 00:24:24.250 ⇒ 00:24:25.300 bpeiair: Great! We don’t ever.
167 00:24:25.300 ⇒ 00:24:28.340 Nandika Jhunjhunwala: We have snapshots, though, in our modeling.
168 00:24:29.110 ⇒ 00:24:29.800 Nandika Jhunjhunwala: huh?
169 00:24:29.930 ⇒ 00:24:34.990 Nandika Jhunjhunwala: We have some tables with, like, the name, like, snapshots underscore.
170 00:24:34.990 ⇒ 00:24:42.680 bpeiair: Okay. This might be confusing. The concept of a snapshot is… just…
171 00:24:43.120 ⇒ 00:25:00.889 bpeiair: is like an incremental, table, right? It’s like, you want all of your customers run today, April 30th, and then tomorrow, May 1st, you want all those customers run again, and then you append it to the same table, so you keep the history. Snapshots is a… is a…
172 00:25:01.700 ⇒ 00:25:14.859 bpeiair: Not new, but it’s a… it’s a slowly changing dimension table, meaning that instead of doing that incremental insert thing, which you can do in models, you just have to write, like, incremental, always append.
173 00:25:15.140 ⇒ 00:25:18.780 bpeiair: I’m gonna talk about this, I’m getting myself confused.
174 00:25:18.860 ⇒ 00:25:31.180 bpeiair: If you ever see SCD2 or Snapshot specifically in dbt, it’s talking about a method that is hard for an analyst to actually write in SQL,
175 00:25:31.180 ⇒ 00:25:43.259 bpeiair: But it basically does a changelog for an entity. I need to make this quick. So if you have a Salesforce customer, and you…
176 00:25:43.330 ⇒ 00:25:50.910 bpeiair: and the Salesforce customer never changes their address or any of their information, and you do an incremental table, that customer will stay the same
177 00:25:51.380 ⇒ 00:26:02.980 bpeiair: yesterday, today, tomorrow, etc. And so that customer will live in all those dates, which, depending on the size of your company, can lead to just, like, extra data that you don’t really need.
178 00:26:03.160 ⇒ 00:26:10.070 bpeiair: SCD2, or slowly changing dimension type 2, means that instead,
179 00:26:10.470 ⇒ 00:26:29.440 bpeiair: the dbt will take me, let’s say I’m that customer, it’ll take me, and it’ll have a date start and a date end, with a bunch of columns for dimensions, and it’ll be like, I was created, let’s say, last month, and I haven’t done anything with my Salesforce account, and so I’m one row of data.
180 00:26:29.470 ⇒ 00:26:45.490 bpeiair: And then today, I update my address. So that’s… that’s a changelog update. It’ll take that first row of data for me, it’ll say, end today, it’ll create a new row, and it’ll say Brian Pay, it’ll update my address, and it’ll have a new start date.
181 00:26:45.620 ⇒ 00:26:54.589 bpeiair: So, in my example of me being a customer for 30 days, instead of snapshotting me 30 times, once per day.
182 00:26:54.780 ⇒ 00:27:00.789 bpeiair: I’m snapshotted twice, and it gives you a date range of when my activity changed.
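The SCD2 behavior just described maps to dbt's snapshot block: dbt adds validity-window columns (`dbt_valid_from` / `dbt_valid_to`) and only writes a new row when one of the tracked columns changes. Names below are hypothetical; the `check` strategy is the one that compares specific columns between runs.

```sql
-- snapshots/customers_snapshot.sql (hypothetical) -- classic dbt
-- snapshot syntax for the changelog pattern described above.
{% snapshot customers_snapshot %}

{{
    config(
        target_schema='snapshots',
        unique_key='customer_id',
        strategy='check',
        check_cols=['address', 'phone_number']
    )
}}

select
    customer_id,
    address,
    phone_number
from {{ ref('stg_salesforce__customers') }}

{% endsnapshot %}
```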
183 00:27:01.270 ⇒ 00:27:13.230 Caitlyn Vaughn: Is this something that you would use in parallel to, like, a specific grain on a table where you wanted the grain to be X, so you would set up a snapshot that would, like, update that table?
184 00:27:13.990 ⇒ 00:27:24.599 bpeiair: it would be only if you know specifically, like, in that case, like, I only care if Brian changes his phone number. If you’re doing targeted.
185 00:27:25.430 ⇒ 00:27:42.059 bpeiair: To be honest with you, these kinds of snapshots are really just to save space, because if you multiply my example by a million customers, then you have hundreds of millions of rows versus a couple million rows, if you’re doing the everyday append strategy.
186 00:27:42.230 ⇒ 00:27:43.429 Caitlyn Vaughn: Hmm. It’s not real.
187 00:27:43.430 ⇒ 00:27:44.599 bpeiair: Probably, yeah, for.
188 00:27:44.600 ⇒ 00:28:03.249 Caitlyn Vaughn: Yeah, just to say this back to you, because I’m trying to understand. So if it is giving you a new incremental daily snapshot, let’s say, and I guess you’re saying, like, the only time we would really care to know is when something changes, right? So would that information then be passed on, or is it, like.
189 00:28:03.300 ⇒ 00:28:07.839 Caitlyn Vaughn: We just have the full history, and you, like, re-query that every single time.
190 00:28:09.320 ⇒ 00:28:15.040 bpeiair: in the… in the… In the complicated snapshot world.
191 00:28:15.240 ⇒ 00:28:19.880 bpeiair: dbt scans me in that table to see if there’s a change.
192 00:28:19.880 ⇒ 00:28:20.560 Caitlyn Vaughn: Yeah.
193 00:28:20.870 ⇒ 00:28:24.839 bpeiair: And only adds a… yeah, only adds a row if there was a change.
194 00:28:24.840 ⇒ 00:28:25.780 Caitlyn Vaughn: Okay, for the first one.
195 00:28:25.780 ⇒ 00:28:30.600 bpeiair: is yes, but… But you can get the same results
196 00:28:30.890 ⇒ 00:28:40.789 bpeiair: in an incremental model, like, it’s just… it’s just as fine that I exist as a customer every single day in, like, a customer
197 00:28:41.070 ⇒ 00:28:47.860 bpeiair: a full history customer table that has, like, a run date. It’s fine for me to also exist 30 times in the past 30 days.
198 00:28:47.970 ⇒ 00:28:54.250 bpeiair: Because when you’re, like, counting distinct active customers by day, you want me to show up every day.
199 00:28:54.760 ⇒ 00:28:59.919 bpeiair: It all depends on, like, the metric that you’re… like, I know people who do both.
200 00:29:00.700 ⇒ 00:29:04.330 bpeiair: But it might be, like, when this situation arises.
201 00:29:04.550 ⇒ 00:29:08.799 bpeiair: maybe you guys can reach out to me, and we can, like, discuss further.
202 00:29:09.300 ⇒ 00:29:14.810 bpeiair: wanted to say out loud, since it is one of the features of dbt, I was like, I might as well just say what it does.
203 00:29:14.960 ⇒ 00:29:20.679 bpeiair: usually people are like, okay, whatever, but you all actually are interested in it, so I just kept talking about it.
204 00:29:20.920 ⇒ 00:29:22.959 Caitlyn Vaughn: Okay, alright, let’s move on.
205 00:29:22.960 ⇒ 00:29:42.490 bpeiair: No, it is a good question, and I’m probably missing… there’s a lot of use cases for both. There’s no right or wrong answer, it’s just, like, how big the data is, and how analysts like to write, like, their SQL, and/or how AI likes to write SQL for any sort of, like, reporting tool.
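For reference, a dbt snapshot using the check strategy Brian describes might look like this in the classic Jinja-block form; the source, key, and column names here are illustrative, not from their project:

```sql
-- snapshots/customer_snapshot.sql (all names illustrative)
{% snapshot customer_snapshot %}

{{
    config(
      target_schema='snapshots',
      unique_key='customer_id',
      strategy='check',
      check_cols=['status', 'plan']
    )
}}

-- dbt compares these columns against the latest snapshotted row and
-- only inserts a new row when one of them has changed
select * from {{ source('salesforce', 'customer') }}

{% endsnapshot %}
```

The everyday-append alternative he contrasts this with would be an incremental model keyed on a run date, where every customer appears once per day whether or not anything changed.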
206 00:29:43.070 ⇒ 00:29:55.900 bpeiair: Okay, let’s see, okay, this is the last part that I had to talk about, so we’re almost there. Under models, this is where this can be whatever… this…
207 00:29:56.180 ⇒ 00:30:00.099 bpeiair: Structure, folder structure, is,
208 00:30:00.970 ⇒ 00:30:07.429 bpeiair: doesn’t follow any sort of, like, dbt structure. I’ve seen hundreds of variations. In this
209 00:30:07.640 ⇒ 00:30:11.480 bpeiair: specific structure, I have a folder for raw, meaning
210 00:30:11.530 ⇒ 00:30:26.859 bpeiair: like, the customer data in Salesforce. Like, the raw table, and then an intermediate part where you’re doing some sort of cleaning and whatever, and then you have marts, which is, like, the final table. So in this example, there’s these…
211 00:30:26.860 ⇒ 00:30:38.450 bpeiair: sources… I actually missed this part. There’s a sources YAML. You don’t have to do this, you can hard code tables if you want to, but dbt also allows you to, in a YAML file, name…
212 00:30:39.090 ⇒ 00:30:45.240 bpeiair: not name, what are the names of the tables that you want to use? And put them in here, and it just connects to it.
213 00:30:45.340 ⇒ 00:31:03.630 bpeiair: so that when you have… when you want to use the order table from e-commerce, you can just say from, and then this is like the dbt Jinja, you never hardcode a table, it’s always curly brackets, and then the name of the model or source. So…
214 00:31:04.180 ⇒ 00:31:08.219 bpeiair: intermediate Amazon order is just, like, it takes,
215 00:31:08.910 ⇒ 00:31:14.789 bpeiair: the couple of columns that you need for Amazon orders from the source, and then it saves it as a table.
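For reference, the sources YAML Brian is pointing at might look roughly like this (all names are illustrative):

```yaml
# models/sources.yml (illustrative)
version: 2

sources:
  - name: ecommerce          # how models refer to this source
    database: raw            # where the loader lands the data
    tables:
      - name: orders
      - name: customers
```

A model then selects from it with `select * from {{ source('ecommerce', 'orders') }}` instead of hardcoding the warehouse table name.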
216 00:31:15.140 ⇒ 00:31:18.040 bpeiair: You’ll see…
217 00:31:18.040 ⇒ 00:31:19.819 Nandika Jhunjhunwala: versus coming from raw?
218 00:31:20.420 ⇒ 00:31:25.480 bpeiair: Yeah, in this example, yeah, yeah, it’s coming from… it’s coming from some database.
219 00:31:25.760 ⇒ 00:31:34.419 bpeiair: That already exists, where it’s set up that, yeah, that data is not being manipulated by real people, it’s just a dump of the application.
220 00:31:34.840 ⇒ 00:31:41.289 bpeiair: And then I want to take that, and I want to cast stuff, or do whatever I need to do to, like, clean it up.
221 00:31:41.630 ⇒ 00:31:57.860 bpeiair: And then… so, like, in this example, this person has Amazon orders, and they have Shopify orders, and so for fact orders, they want to bring in Shopify and Amazon and join it together and make, like, a master order table.
222 00:31:58.370 ⇒ 00:32:03.340 bpeiair: And this is how… so the ref is how you reference…
223 00:32:03.480 ⇒ 00:32:16.880 bpeiair: a model that already exists in here, and this is… this little line of Jinja is, like, 90% of why people use dbt, because when this runs, dbt makes sure that this
224 00:32:17.280 ⇒ 00:32:18.410 bpeiair: Updates.
225 00:32:19.250 ⇒ 00:32:28.060 bpeiair: successfully first, before moving on to fact orders. So, depending on your lineage, if you have, like, 10 models, and you never want the data to be wrong.
226 00:32:28.210 ⇒ 00:32:31.720 bpeiair: It’ll run step-by-step, it’ll finish.
227 00:32:31.920 ⇒ 00:32:48.390 bpeiair: whatever’s downstream, or sorry, upstream, I always get that wrong. It’ll finish whatever’s upstream, it’ll run a test, it’ll say that data’s clean, move on to the next step, and it just keeps going, versus what I said I did 10 years ago, which is just, like, pray that they run at the same time.
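A minimal sketch of the ref pattern he is describing, assuming two upstream intermediate models with these hypothetical names:

```sql
-- models/marts/fct_orders.sql (model names are hypothetical)
-- {{ ref() }} does two things: it resolves to the real table name at
-- compile time, and it tells dbt that the upstream models must build
-- (and pass their tests) successfully before this one runs.
with amazon as (
    select * from {{ ref('int_amazon_orders') }}
),

shopify as (
    select * from {{ ref('int_shopify_orders') }}
)

select * from amazon
union all
select * from shopify
```

This dependency graph is what lets dbt run everything step by step in order, rather than hoping independent jobs happen to finish at the same time.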
228 00:32:48.550 ⇒ 00:32:56.310 bpeiair: So, what was I gonna say? So… yeah, usually,
229 00:32:56.410 ⇒ 00:33:07.889 bpeiair: companies like to do, like, here are all my sources, I’ll do a little bit of cleaning here, and then I’ll join them all together after they’re clean in some sort of, like, mart.
230 00:33:08.050 ⇒ 00:33:10.599 bpeiair: And then this table gets pushed
231 00:33:10.770 ⇒ 00:33:14.880 bpeiair: Into your database, or data warehouse, and then…
232 00:33:15.390 ⇒ 00:33:24.259 bpeiair: So, orchestration doesn’t really happen here. I don’t know how you’re orchestrating it, but you can set them up on crons, or whatever schedule that you want, and it’ll run every day, or twice a day, or whatever it is.
233 00:33:24.460 ⇒ 00:33:26.480 bpeiair: And…
234 00:33:27.360 ⇒ 00:33:33.270 bpeiair: And yeah, and then you just kind of, like… you kind of go from there. I’ve seen,
235 00:33:33.990 ⇒ 00:33:40.760 bpeiair: logic layers and dimension layers, and then, like, an aggregate reporting layer, but I believe from your documentation, you want
236 00:33:40.760 ⇒ 00:33:54.179 bpeiair: the BI tool to handle all the aggregation, which is great, love that, because otherwise, some clients have, like, 20 reporting agg tables in here, and then nobody knows what’s what and which is what. So, I think that’s great.
237 00:33:55.200 ⇒ 00:33:56.139 Nandika Jhunjhunwala: We have that.
238 00:33:56.830 ⇒ 00:33:58.140 bpeiair: Yeah, that’s awesome.
239 00:33:58.140 ⇒ 00:34:03.629 Nandika Jhunjhunwala: No, we currently have, like, 20 reporting tables, so we’re trying to move towards the…
240 00:34:03.630 ⇒ 00:34:05.179 bpeiair: Oh, just kidding, that’s awesome.
241 00:34:05.180 ⇒ 00:34:06.260 Nandika Jhunjhunwala: Yeah.
242 00:34:06.260 ⇒ 00:34:24.139 Greg Stoutenburg: Yeah, yeah, yeah, that’s the way to do it. Yeah, thanks, Brian. Yeah, no, the documentation that I sent you, one of the things that I sent you was, something that, you know, they were saying, like, let’s do this by way of first principles. So, rather than a description of reality as it stands now.
243 00:34:24.210 ⇒ 00:34:24.920 Caitlyn Vaughn: Yeah.
244 00:34:25.189 ⇒ 00:34:31.139 Caitlyn Vaughn: So, the doc that I sent over is… we basically have been digging into…
245 00:34:31.369 ⇒ 00:34:38.299 Caitlyn Vaughn: like, the actual data engineering side of things, so that Nanda and I can, like, start handling this going forward.
246 00:34:38.369 ⇒ 00:34:42.189 Caitlyn Vaughn: And that was the doc we created, because…
247 00:34:42.219 ⇒ 00:35:07.129 Caitlyn Vaughn: basically downstream, like, the things that I could see before were, like, Blobby, right? In Omni, Blobby is the AI, chat in there, where you can generate charts and, like, ask questions, and every time we used it, it would give us, you know, basically, like, incorrect responses for a variety of reasons, part… you know, partially our fault, partially theirs. But in the actual modeling, when we started digging in, there’s basically… we wanted
248 00:35:07.129 ⇒ 00:35:20.619 Caitlyn Vaughn: to go with, like, the Kimball. That document I sent you is what we decided is, like, our version of good. Like, this is what we’re measuring right and wrong against, and then we went through and basically figured out there are fact and dim tables, but, like.
249 00:35:20.619 ⇒ 00:35:30.679 Caitlyn Vaughn: the dim tables are basically… there’s, like, 5 customer dim tables, and each one is, like, a different combination of sources that’s, like, re-sliced with different definitions, and then…
250 00:35:30.769 ⇒ 00:35:44.909 Caitlyn Vaughn: There’s also, basically intermediate tables, which are, like, the tables where those tables are, like, massively joined into, like, one massive table, and then pushed into Omni, so I think that’s why we’re getting
251 00:35:44.959 ⇒ 00:35:54.379 Caitlyn Vaughn: problems, because they’re, like, specifically set for, you know, certain amounts of time, or, like, certain sources, instead of just having, like.
252 00:35:54.789 ⇒ 00:36:00.139 Caitlyn Vaughn: order table. It’s, like, Salesforce order table, Hyperline order table, kind of a thing.
253 00:36:01.400 ⇒ 00:36:09.630 bpeiair: That’s why, AI is super new for me as well. What I’ve started to do is, really…
254 00:36:10.440 ⇒ 00:36:20.599 bpeiair: break down specificities of the names of a data… of a database or a schema, or whatever you want to call it. That way, in these kinds of BI tools.
255 00:36:20.650 ⇒ 00:36:30.579 bpeiair: I say, using only this database, please do X, Y, or Z. Otherwise, if it sees a customer table in, like, 5 different databases, it’s just gonna guess.
256 00:36:30.580 ⇒ 00:36:32.559 Caitlyn Vaughn: Totally.
257 00:36:32.880 ⇒ 00:36:37.260 Caitlyn Vaughn: But that customer table should be the same in all 5 anyway, right?
258 00:36:37.760 ⇒ 00:36:52.850 bpeiair: the customer table, knowing nothing about your setup, if… there’s usually a customer object, and like, I brought this up before, like, Stripe and Zendesk and Salesforce, so they are… they are named customer, but they have different contents in it.
259 00:36:52.850 ⇒ 00:36:53.480 Caitlyn Vaughn: Right.
260 00:36:53.480 ⇒ 00:36:56.180 bpeiair: Which is… which is why we… we try to, like…
261 00:36:56.640 ⇒ 00:36:59.349 bpeiair: I don’t know, prompt engineering is this whole thing.
262 00:36:59.810 ⇒ 00:37:02.160 bpeiair: that… is dumb.
263 00:37:04.770 ⇒ 00:37:28.679 Greg Stoutenburg: Yeah, so Brian, just for a little bit of context on their setup, so the way that it’s configured is inside of Omni, there are these things called topics that are, basically curated joins with some AI context built into them. So when someone asks a question in the text field, it directs to the appropriate topic, where we’ve sort of, like, where we’ve sort of curated to avoid that sort of thing that you’re talking about.
264 00:37:28.680 ⇒ 00:37:28.995 bpeiair: Oh.
265 00:37:29.310 ⇒ 00:37:30.209 Greg Stoutenburg: Now, that’s not to say that it’s.
266 00:37:30.210 ⇒ 00:37:32.080 bpeiair: You should have stopped me 10 minutes ago, I was rambling.
267 00:37:32.080 ⇒ 00:37:48.389 Greg Stoutenburg: No, no, no, no, no, well, I mean, well, yeah, you’re here because they want to, get in on what’s behind the scenes here in DBT, so… But yeah, as of… the way that it’s implemented right now is that the intended cleanup of the kind of things.
268 00:37:48.390 ⇒ 00:37:49.270 bpeiair: Yeah, it’s perfect.
269 00:37:49.270 ⇒ 00:37:52.380 Greg Stoutenburg: performed in Omni with, with topics, yeah.
270 00:37:52.840 ⇒ 00:37:57.939 bpeiair: If I spoke out of turn, that’s my bad, Greg. Sorry, but it’s all good.
271 00:37:57.940 ⇒ 00:38:01.949 Greg Stoutenburg: No, no, you’re good. I just wanted to clarify that that’s… that that’s where that piece of it exists, that’s all.
272 00:38:01.950 ⇒ 00:38:08.290 bpeiair: Yeah. I’m actually… I’m unfamiliar with this BI stack, so I, wouldn’t be able to…
273 00:38:09.090 ⇒ 00:38:28.929 bpeiair: Everything’s fine. We’re all good. Everyone’s happy, we all love data. The last thing that you also probably know, but I think is really important, is in all the SQL models, there is a config. There’s… actually, you don’t need a config, but I think there should always be a config at the top of everything, where you can do, a ton of different stuff.
274 00:38:28.960 ⇒ 00:38:33.950 bpeiair: Tags, when you, are you using tags?
275 00:38:35.880 ⇒ 00:38:39.969 Greg Stoutenburg: I didn’t do the dbt work either, so we’re in a, yeah, just, yeah.
276 00:38:39.970 ⇒ 00:38:40.850 bpeiair: 15 people…
277 00:38:40.850 ⇒ 00:38:42.399 Caitlyn Vaughn: Is this in a fact table?
278 00:38:43.130 ⇒ 00:38:48.119 bpeiair: This… in this example, it’s a fact table, but, just…
279 00:38:48.300 ⇒ 00:38:53.670 bpeiair: in a broad sense, let me just write some stuff, Amazon, and then, like.
280 00:38:53.980 ⇒ 00:38:55.439 Caitlyn Vaughn: We do have configs.
281 00:38:55.610 ⇒ 00:38:59.510 bpeiair: Okay, tags are a way where…
282 00:39:00.150 ⇒ 00:39:17.259 bpeiair: when teams are small and it’s a monolithic, like, dbt structure, you literally… you can just do dbt run and it runs everything. In one repo, if you want to modularize runs, we use tags. And so, like, if you have customers and products
283 00:39:17.280 ⇒ 00:39:23.309 bpeiair: Coming from two different sources, and you want customers, the whole customer thing, to run
284 00:39:23.590 ⇒ 00:39:33.220 bpeiair: once a day, but the product thing to only run, like, once a week, whatever it is. You would separate that with tags, and you would basically do dbt run tag equals
285 00:39:33.320 ⇒ 00:39:34.450 bpeiair: Customer?
286 00:39:34.590 ⇒ 00:39:47.600 bpeiair: And only the SQL models tagged with customer will run in your repo, and then you can put that on a separate cron job, separate from, like, a production run, or a, you know, test run, or, like, whatever you want, so…
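The tag setup he is describing is roughly: tags go in a model’s config, and the run command selects on them. The tag name here is made up, matching his example:

```sql
-- at the top of any customer-related model
{{ config(tags=['customer']) }}
```

```shell
# runs only models tagged 'customer'; everything else is skipped
dbt run --select tag:customer
```

Each tagged subset can then sit on its own cron schedule, which is the modularization he contrasts with a single monolithic `dbt run`.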
287 00:39:47.600 ⇒ 00:39:50.130 Caitlyn Vaughn: Can you give an example of when this would be used?
288 00:39:50.690 ⇒ 00:39:58.660 bpeiair: For… For finance, we want invoices.
289 00:39:58.760 ⇒ 00:40:05.770 bpeiair: the invoice facts table to run every 3 hours, because that is the Stripe,
290 00:40:05.940 ⇒ 00:40:20.559 bpeiair: Stripe refresh cron, so… I want my… I want fact invoices to run every 3 hours, but I want my customer tables to only run once a day, because they don’t have to run, every 3 hours.
291 00:40:21.130 ⇒ 00:40:28.700 bpeiair: I made that up, but, like, that’s something… like, does that make sense? Like, just timing versus, with…
292 00:40:29.290 ⇒ 00:40:32.589 bpeiair: The other big thing is, like, in a monolithic run.
293 00:40:32.670 ⇒ 00:40:49.890 bpeiair: If something fails, everything fails. So, if you have orders and invoices and customers and products all in one place, if customers fail, the whole thing fails. If you modularize it, then everything else can run successfully, and just customers will fail, and you can just figure out the bug for customers and fix that.
294 00:40:49.960 ⇒ 00:40:55.629 bpeiair: Instead of having to wait, instead of everything going down, having to fix one bug somewhere.
295 00:40:55.980 ⇒ 00:41:06.030 bpeiair: And then, you know, re-triggering it. So it’s kind of like… that’s… again, like, people do both things. I like modularization. Takes a little bit longer to set up.
296 00:41:06.840 ⇒ 00:41:07.570 bpeiair: Sure.
297 00:41:07.920 ⇒ 00:41:14.120 bpeiair: your bugs won’t crash your whole run. It’ll only crash the customer run, or the invoice run.
298 00:41:14.870 ⇒ 00:41:20.129 Caitlyn Vaughn: Okay, I think I’m starting to get it. I guess my question is…
299 00:41:20.550 ⇒ 00:41:31.990 Caitlyn Vaughn: I guess, A, where does the kind of cron part of this come in? Like, how are we scheduling things? I guess I’m not sure how that’s related to tags. And then…
300 00:41:31.990 ⇒ 00:41:32.630 bpeiair: Yep.
301 00:41:32.630 ⇒ 00:41:32.950 Caitlyn Vaughn: B.
302 00:41:33.560 ⇒ 00:41:34.530 Caitlyn Vaughn: Think, seriously?
303 00:41:34.530 ⇒ 00:41:35.150 bpeiair: it going.
304 00:41:35.150 ⇒ 00:41:39.150 Caitlyn Vaughn: sorry, for tags, there’s customer test production. So is that, like.
305 00:41:39.150 ⇒ 00:41:41.590 bpeiair: I made these up. It can be anything you want, but yeah.
306 00:41:41.590 ⇒ 00:41:45.450 Caitlyn Vaughn: Yeah, yeah, yeah, so is that, like… are these, like, models, or tables, or…
307 00:41:45.790 ⇒ 00:41:50.529 bpeiair: So… Here, I’ll… I can… I’ll try to do this really, really quickly.
308 00:41:51.640 ⇒ 00:41:52.850 Caitlyn Vaughn: Fuel’s important.
309 00:42:00.080 ⇒ 00:42:03.140 bpeiair: Let’s see… this is Amazon? This is Shopify?
310 00:42:04.450 ⇒ 00:42:05.400 bpeiair: Amazon.
311 00:42:06.310 ⇒ 00:42:07.300 bpeiair: Shopify.
312 00:42:11.710 ⇒ 00:42:12.840 bpeiair: Shopify.
313 00:42:13.010 ⇒ 00:42:14.100 bpeiair: No!
314 00:42:14.290 ⇒ 00:42:17.450 bpeiair: Stop it, cursor. I hate you. Amazon.
315 00:42:17.760 ⇒ 00:42:21.740 bpeiair: Shopify… and Amazon.
316 00:42:22.870 ⇒ 00:42:34.020 bpeiair: Okay, so… I don’t know your orchestration engine, but by default, it just does the command dbt run.
317 00:42:34.280 ⇒ 00:42:37.630 bpeiair: And dbt run looks at this repo and runs everything.
318 00:42:37.630 ⇒ 00:42:38.050 Caitlyn Vaughn: Frances.
319 00:42:38.050 ⇒ 00:42:45.340 bpeiair: Or you do dbt run select… Tag… Amazon.
320 00:42:46.070 ⇒ 00:42:48.430 bpeiair: If this gets fed into your orchestration engine.
321 00:42:48.570 ⇒ 00:42:52.940 bpeiair: It’ll only run models tagged with Amazon, which would be this Amazon order.
322 00:42:53.130 ⇒ 00:42:53.620 Caitlyn Vaughn: In fact.
323 00:42:53.620 ⇒ 00:42:54.500 bpeiair: fact orders.
324 00:42:54.830 ⇒ 00:43:00.219 bpeiair: And any other SQL file that has Amazon, but it won’t run Shopify, because Shopify has a Shopify tag.
325 00:43:02.630 ⇒ 00:43:07.079 bpeiair: And then you can be… This, you can do this, like.
326 00:43:07.440 ⇒ 00:43:15.040 bpeiair: run every morning, and Shopify run every night. It’s just… it allows you to modularize however you want.
327 00:43:15.040 ⇒ 00:43:15.860 Caitlyn Vaughn: Hmm.
328 00:43:18.240 ⇒ 00:43:21.580 Nandika Jhunjhunwala: And we schedule runs on the intermediate tables.
329 00:43:23.390 ⇒ 00:43:26.779 bpeiair: You can schedule… I don’t know what your scheduling looks like.
330 00:43:26.910 ⇒ 00:43:28.170 bpeiair: Unfortunately.
331 00:43:29.320 ⇒ 00:43:33.039 bpeiair: I would assume it runs every morning.
332 00:43:33.530 ⇒ 00:43:35.280 bpeiair: Probably the entirety.
333 00:43:35.680 ⇒ 00:43:38.130 bpeiair: But, but I, but I have no idea.
334 00:43:38.310 ⇒ 00:43:42.380 bpeiair: I’m just showing you the broad use case for tags, because I think
335 00:43:42.840 ⇒ 00:43:52.580 bpeiair: growing pains comes from monolithic runs, and so I just wanted to show you that it is an option that exists if you want to break things out.
336 00:43:53.880 ⇒ 00:44:04.480 Caitlyn Vaughn: Yeah, we do have tags, I don’t… I can’t see, like, where the runtime… I think… I’m pretty sure it’s running once a day. I just know that, like, in passing, but I can’t.
337 00:44:04.480 ⇒ 00:44:05.700 bpeiair: I think, I think it’s…
338 00:44:05.700 ⇒ 00:44:06.569 Caitlyn Vaughn: At some point…
339 00:44:06.750 ⇒ 00:44:13.709 bpeiair: Yeah, you’ll figure out… orchestration can be so many different things that I don’t know, but,
340 00:44:14.850 ⇒ 00:44:20.309 bpeiair: Yeah, I think if somebody shows you it, I think it’ll be really easy to just be like, oh, this is how it’s working, then you can, like, change the crons and stuff.
341 00:44:21.140 ⇒ 00:44:22.090 bpeiair: I think I’m gonna take…
342 00:44:22.090 ⇒ 00:44:22.650 Nandika Jhunjhunwala: So.
343 00:44:22.650 ⇒ 00:44:23.620 bpeiair: It’s finished. Oh.
344 00:44:23.620 ⇒ 00:44:28.009 Nandika Jhunjhunwala: are… sorry, sorry, yes, I can ask questions at the end.
345 00:44:28.010 ⇒ 00:44:31.230 bpeiair: I was gonna rush through the rest of the config stuff, and then…
346 00:44:31.230 ⇒ 00:44:32.590 Nandika Jhunjhunwala: Yes, yes.
347 00:44:32.590 ⇒ 00:44:47.740 bpeiair: I will have time for questions. I can go till 3 as well. Really important… well, not really important. I hate saying really important, just in case you guys aren’t doing it, and you decide that it’s, like, end of the world. It’s not, but it’s fine.
348 00:44:48.030 ⇒ 00:45:00.650 bpeiair: You can do database equals… database name… and… schema equals mart… schema?
349 00:45:01.360 ⇒ 00:45:07.199 bpeiair: to… distinguish… If you want different tables to go to different places.
350 00:45:07.350 ⇒ 00:45:12.779 bpeiair: If you want this to go to database.mart.schema, and you want something else to go to database.int.
351 00:45:13.100 ⇒ 00:45:15.489 bpeiair: You would… you would let it know here.
352 00:45:16.610 ⇒ 00:45:20.870 bpeiair: If you don’t have… if models don’t have this at all.
353 00:45:21.080 ⇒ 00:45:39.409 bpeiair: it defaults to whatever the default database and schema is in your dbt_project.yml file, like your setup file. There’s usually some sort of default, if you don’t have it, but if you want to tell it things to go to different places, yeah, database…
354 00:45:39.520 ⇒ 00:45:40.350 bpeiair: schema.
355 00:45:40.780 ⇒ 00:45:53.000 bpeiair: For organizational purposes, or if you ever have, like, PII, and you want that to go not in the main repo, like, there’s all these different reasons why people would want different tables to, like, point different directions.
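A sketch of the config he is typing, with hypothetical database and schema names:

```sql
{{ config(
    database='analytics',   -- hypothetical target database
    schema='marts'          -- hypothetical target schema
) }}
```

One caveat worth knowing: out of the box, dbt’s `generate_schema_name` macro usually appends the custom schema to the target’s default schema (producing something like `dbt_marts`) rather than using it verbatim, unless that macro is overridden in the project.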
356 00:45:53.450 ⇒ 00:45:56.370 Caitlyn Vaughn: So the schemas are the YML files.
357 00:45:57.140 ⇒ 00:46:00.270 bpeiair: The, this, sorry, this schema is the,
358 00:46:00.600 ⇒ 00:46:10.210 bpeiair: in your data warehouse, I’m in BigQuery mode. It would be, like, it would be actually data warehouse name, and then database name.
359 00:46:10.390 ⇒ 00:46:11.280 bpeiair: like.
360 00:46:11.280 ⇒ 00:46:11.840 Caitlyn Vaughn: Hmm.
361 00:46:11.840 ⇒ 00:46:19.430 bpeiair: the name of your data warehouse, and then when you open it, it has, like, all those, like, icons, that it would be whatever that name is. And then when you open that up.
362 00:46:19.650 ⇒ 00:46:22.719 bpeiair: There might be, like, sub… Let’s just call them folders.
363 00:46:22.720 ⇒ 00:46:23.410 Nandika Jhunjhunwala: Tables.
364 00:46:23.410 ⇒ 00:46:25.630 bpeiair: Yeah, it’s the folders within your data warehouse.
365 00:46:26.170 ⇒ 00:46:28.420 Nandika Jhunjhunwala: So, I think the way I understand it as well is, like.
366 00:46:28.630 ⇒ 00:46:32.439 Nandika Jhunjhunwala: A schema is, like, a bunch of tables together.
367 00:46:32.830 ⇒ 00:46:36.749 Nandika Jhunjhunwala: And so it’s like a folder within MotherDuck.
368 00:46:36.930 ⇒ 00:46:39.810 Nandika Jhunjhunwala: And then that folder will have multiple tables set.
369 00:46:40.430 ⇒ 00:46:45.800 Nandika Jhunjhunwala: Is it, like, it’s a combination of tables, is like a schema, and that schema can…
370 00:46:46.140 ⇒ 00:46:48.539 Nandika Jhunjhunwala: Live in a database.
371 00:46:49.780 ⇒ 00:46:50.800 Nandika Jhunjhunwala: Yup. Right?
372 00:46:51.240 ⇒ 00:46:54.740 bpeiair: Exactly. It’s for… it’s for database organization.
373 00:46:55.000 ⇒ 00:46:55.360 Nandika Jhunjhunwala: Yep.
374 00:46:55.360 ⇒ 00:47:03.460 bpeiair: If you want everything to live in, like, one database versus in multiple, it’s a design choice.
375 00:47:04.110 ⇒ 00:47:14.089 bpeiair: Here’s the final thing that I don’t know if y’all do, but I hope you do. Type is the most important
376 00:47:14.770 ⇒ 00:47:23.879 bpeiair: thing in a config that I can think of, and it probably defaults to table, but there are four.
377 00:47:24.420 ⇒ 00:47:27.849 bpeiair: I’ll write them out. Yep, there’s… oh! Look at that!
378 00:47:28.370 ⇒ 00:47:29.659 bpeiair: That’s not real.
379 00:47:31.180 ⇒ 00:47:38.379 bpeiair: Okay, there it is, and then… They’re ephemeral. Okay.
380 00:47:38.520 ⇒ 00:47:40.259 bpeiair: These are the four.
381 00:47:40.520 ⇒ 00:47:45.720 bpeiair: table types that you can choose from.
382 00:47:46.040 ⇒ 00:47:49.389 bpeiair: If you make a SQL file a table.
383 00:47:49.550 ⇒ 00:47:54.249 bpeiair: And it runs. It treats it as a… Truncate and replace?
384 00:47:54.480 ⇒ 00:48:02.590 bpeiair: Every single day. So you have a SQL file, whatever the SQL’s in there, if the type is a table, it’ll drop that table and recreate the table.
385 00:48:02.790 ⇒ 00:48:04.240 bpeiair: Every single, every single day.
386 00:48:05.110 ⇒ 00:48:08.279 bpeiair: A view is a SQL view,
387 00:48:09.330 ⇒ 00:48:17.240 bpeiair: So, it just takes the query, and as a view in your data warehouse, that query
388 00:48:17.580 ⇒ 00:48:22.620 bpeiair: Runs when a user queries the view, as opposed to the data
389 00:48:23.140 ⇒ 00:48:28.269 bpeiair: Being pre-run for somebody, which saves them time, because otherwise they’re…
390 00:48:28.650 ⇒ 00:48:34.390 bpeiair: running the entire query, if they look at a view. Nobody uses views, don’t worry about that, just know it exists.
391 00:48:34.630 ⇒ 00:48:50.230 bpeiair: Incremental, is what I was talking about earlier, where instead of dropping… truncating and replacing every day, based on whatever the date column is, it looks at the date column, and every day it looks for new
392 00:48:50.610 ⇒ 00:49:02.800 bpeiair: Data for the date that you’re running, and instead of dropping and replacing the table, it keeps that table… it persists that table, and incrementally appends the new data every day into the table.
393 00:49:02.910 ⇒ 00:49:22.259 bpeiair: People like it because it’s faster. Instead of dropping and recreating, like, 5 years worth of data every day, you… you do it one time, and then from there on, you start inserting day after day after day. Makes it faster. But some people also like table because, if somebody changes a, like.
394 00:49:22.730 ⇒ 00:49:39.610 bpeiair: the definition of a product or a SKU, they want that definition to persist through their entire history, which you can’t do if it’s incremental. You would have to rewrite history with whatever new definition you have. It’s, again, user preference, depending on the use case of the business.
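A minimal incremental model of the kind he is contrasting with plain table materialization, with hypothetical source and column names:

```sql
-- models/fct_events.sql (names hypothetical)
{{ config(materialized='incremental', unique_key='event_id') }}

select event_id, event_date, customer_id
from {{ source('app', 'events') }}

{% if is_incremental() %}
  -- on incremental runs, only pull rows newer than what's already loaded;
  -- {{ this }} refers to the existing table in the warehouse
  where event_date > (select max(event_date) from {{ this }})
{% endif %}
```

On the first run (or with `dbt run --full-refresh`) the `is_incremental()` branch is skipped and the whole history is rebuilt, which is also how you would rewrite history after a definition change.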
395 00:49:39.730 ⇒ 00:49:47.610 bpeiair: And then ephemeral, means that it treats this SQL as a subquery. It doesn’t save… it doesn’t…
396 00:49:48.270 ⇒ 00:49:55.109 bpeiair: compute anything, it doesn’t run anything, but I like Ephemeral, because…
397 00:49:55.180 ⇒ 00:50:09.990 bpeiair: if I want this logic to exist as a reference, but I don’t… but nobody needs to ever query it, and I never want… need to see it in my warehouse, I just need to be able to have this business logic saved.
398 00:50:10.370 ⇒ 00:50:13.209 bpeiair: somewhere so that I can reference it.
399 00:50:13.680 ⇒ 00:50:22.360 bpeiair: In another model that actually gets materialized, I use Ephemeral. It effectively acts as a subquery.
400 00:50:22.360 ⇒ 00:50:38.339 bpeiair: it’s not saved anywhere in the data warehouse, but if this is ephemeral, and I use fact orders somewhere else in another SQL file, it will take this SQL, make it a subquery, and inject it into the new model that I have.
401 00:50:39.400 ⇒ 00:50:46.130 bpeiair: I like ephemeral a lot, because you don’t need, sometimes, 100 staging tables, it confuses people.
402 00:50:50.700 ⇒ 00:50:52.769 bpeiair: I lost my train of thought. We can go to…
403 00:50:52.770 ⇒ 00:50:55.349 Caitlyn Vaughn: That would be great. Do you have an example for ephemeral?
404 00:50:55.710 ⇒ 00:51:12.620 bpeiair: Yeah, let’s say this is, ephemeral… Let’s see… new file… Ephemeral… Customer.
405 00:51:19.350 ⇒ 00:51:20.390 bpeiair: Ephemeral.
406 00:51:21.000 ⇒ 00:51:23.209 bpeiair: Select star from Salesforce…
407 00:51:24.140 ⇒ 00:51:34.190 bpeiair: customer, where employee equals false, and start date is greater than or equal to 2025-01-01.
408 00:51:34.300 ⇒ 00:51:47.389 bpeiair: And then here, I want to, select star from… If… e-customer… okay.
409 00:51:47.650 ⇒ 00:51:50.210 bpeiair: And then I want to left join.
410 00:51:51.150 ⇒ 00:51:54.050 bpeiair: beef Amazon customer.
411 00:51:55.050 ⇒ 00:51:58.629 Caitlyn Vaughn: Nandika, do you have, like, are you able to find type anywhere?
412 00:51:59.930 ⇒ 00:52:06.079 Caitlyn Vaughn: I have been looking through the fact tables, I can’t find it. I don’t know if I’m just looking in the wrong place.
413 00:52:06.860 ⇒ 00:52:07.880 Nandika Jhunjhunwala: Let me check.
414 00:52:10.110 ⇒ 00:52:12.490 bpeiair: Everything in the config is optional.
415 00:52:12.610 ⇒ 00:52:13.759 bpeiair: But it defaults…
416 00:52:13.760 ⇒ 00:52:14.380 Nandika Jhunjhunwala: Totally.
417 00:52:14.380 ⇒ 00:52:17.930 bpeiair: It defaults to… Whatever your default is.
418 00:52:17.930 ⇒ 00:52:18.410 Nandika Jhunjhunwala: table.
419 00:52:18.410 ⇒ 00:52:19.670 bpeiair: I think table, yeah.
420 00:52:19.840 ⇒ 00:52:21.779 Caitlyn Vaughn: And how would we find what the default is?
421 00:52:22.540 ⇒ 00:52:23.830 Nandika Jhunjhunwala: If it’s not mentioned.
422 00:52:24.050 ⇒ 00:52:28.890 bpeiair: If it’s not mentioned, it might be in dbt_project.yml.
423 00:52:29.550 ⇒ 00:52:33.880 bpeiair: But that question’s kind of like a… I don’t think I would be able to answer it with…
424 00:52:33.880 ⇒ 00:52:35.439 Caitlyn Vaughn: Okay, whoever designed it.
425 00:52:35.440 ⇒ 00:52:37.160 bpeiair: I know about the environment.
426 00:52:37.160 ⇒ 00:52:37.560 Caitlyn Vaughn: Yeah.
427 00:52:37.560 ⇒ 00:52:43.570 bpeiair: And, again, I’m showing you features, you don’t have to use them, but if…
428 00:52:43.570 ⇒ 00:52:43.900 Caitlyn Vaughn: This is great.
429 00:52:43.900 ⇒ 00:52:47.540 bpeiair: Since y’all are curious, I’m trying my best to.
430 00:52:47.720 ⇒ 00:52:48.390 Caitlyn Vaughn: Yeah.
431 00:52:48.390 ⇒ 00:53:00.840 bpeiair: So what I would do with Ephemeral is, I have, like, I have Salesforce customer where I have this very small, like, business logic, I don’t want employees in it, and I only want start dates.
432 00:53:00.850 ⇒ 00:53:09.030 bpeiair: And then, let’s say I have a customer table from Zendesk where it’s the same thing, and I have, like, 5 different customer tables.
433 00:53:09.110 ⇒ 00:53:15.260 bpeiair: that then I join all 5 into, this’ll be fact… this’ll be, sorry, DIM customer.
434 00:53:17.690 ⇒ 00:53:22.610 Nandika Jhunjhunwala: I mean, we would use the 5 as ephemeral tables, so we can just have, like, that one.
435 00:53:22.610 ⇒ 00:53:23.600 bpeiair: You can.
436 00:53:23.600 ⇒ 00:53:24.650 Nandika Jhunjhunwala: vision. Okay.
437 00:53:24.650 ⇒ 00:53:29.249 bpeiair: Yeah, you can. So, if you didn’t have them as ephemeral, you would have…
438 00:53:29.570 ⇒ 00:53:36.209 bpeiair: Six, like, cleaned-up customer tables of every single application somewhere in your data warehouse.
439 00:53:36.340 ⇒ 00:53:44.159 bpeiair: which is, you know, it’s compute and storage. And then you would have DIM customer, which joins it all together, somewhere else, which is compute and storage.
440 00:53:44.270 ⇒ 00:53:57.659 bpeiair: But if you don’t want anybody to look at DIM customer Salesforce, or if nobody has any interest in looking at it, then you can just make them ephemeral so that it removes clutter.
441 00:53:57.760 ⇒ 00:54:06.439 bpeiair: And it removes, like, storage, because if these… if these aren’t ephemeral, if these are 5…
442 00:54:06.590 ⇒ 00:54:22.720 bpeiair: third-party applications that you have that all have customer information, and these aren’t ephemeral, then you are 5X-ing your customer data in your warehouse, because you have your customers cleaned out of Salesforce, and your customers cleaned out of Stripe, and your customers clean… and then you join it all together, and you have customers again.
443 00:54:22.800 ⇒ 00:54:33.589 bpeiair: you don’t necessarily need all that, unless somebody’s doing, like, you know, a deep dive analysis into Amazon customers, I don’t know. But for the most part, you could, you know.
444 00:54:34.080 ⇒ 00:54:39.560 bpeiair: squished down… Your storage and compute, instead of
445 00:54:39.780 ⇒ 00:54:45.479 bpeiair: dbt building 7 customer tables just to get a clean, DIM customer.
446 00:54:45.800 ⇒ 00:54:52.220 bpeiair: you’re just running DIM Customer, And the ephemeral models act as subqueries.
447 00:54:52.890 ⇒ 00:54:54.939 bpeiair: And they won’t get saved in the warehouse.
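A cleaned-up sketch of the ephemeral example he live-coded; the file names, join key, and filter values are hypothetical:

```sql
-- models/staging/eph_salesforce_customer.sql (hypothetical)
{{ config(materialized='ephemeral') }}

select *
from {{ source('salesforce', 'customer') }}
where employee = false
  and start_date >= '2025-01-01'
```

```sql
-- models/marts/dim_customer.sql (hypothetical)
-- because the model above is ephemeral, dbt inlines its SQL here at
-- compile time; nothing named eph_salesforce_customer is ever
-- created in the warehouse
select *
from {{ ref('eph_salesforce_customer') }}
left join {{ ref('eph_amazon_customer') }} using (customer_id)
```

Strictly speaking, dbt injects ephemeral models as named CTEs rather than subqueries, but the effect he describes is the same: the business logic stays versioned and reusable without being materialized or stored.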
448 00:54:55.590 ⇒ 00:54:58.650 Caitlyn Vaughn: Okay, this is interesting. We had,
449 00:54:59.340 ⇒ 00:55:07.269 Caitlyn Vaughn: We have 3 tools. We have Salesforce Hyperline and Omni, right? And all of those have our…
450 00:55:07.570 ⇒ 00:55:11.029 Caitlyn Vaughn: ARR count. Like, how much revenue we have.
451 00:55:11.440 ⇒ 00:55:24.739 Caitlyn Vaughn: Total, or in the year, whatever. And each of them were different, and we wanted to pull all three to figure out why the numbers were off, specifically, so we could talk to our board about it.
452 00:55:25.070 ⇒ 00:55:39.699 Caitlyn Vaughn: So, in this scenario, would something like this work for that? Because we obviously… we don’t want the raw data in, in dbt, because… or in a topic, because we don’t want it querying or using that raw data, right?
453 00:55:40.370 ⇒ 00:55:42.070 Caitlyn Vaughn: So, would something like this work?
454 00:55:44.180 ⇒ 00:55:45.839 bpeiair: From a design perspective.
455 00:55:48.600 ⇒ 00:56:02.200 bpeiair: I probably don’t feel comfortable saying yes or no to that in a one-time teaching situation, but I would… if you… if you’re getting ARR from 3 different systems.
456 00:56:02.740 ⇒ 00:56:19.369 bpeiair: I see that as separate from what’s happening here, where you need to actually make 3 different ARR tables only using those 3 separate sources, so that in a SQL query, when you’re doing a variance analysis, those 3 tables do have to exist.
457 00:56:19.380 ⇒ 00:56:27.539 bpeiair: So I think they would need to exist for you to do an analysis on it, versus ephemeral, you wouldn’t be able to find in the database.
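The variance analysis Brian refers to only works against materialized (non-ephemeral) tables you can query directly; a minimal sketch, assuming three hypothetical per-source ARR tables:

```sql
-- Hypothetical table names; each must be materialized as a table or view,
-- not ephemeral, to be queried side by side like this.
select
    coalesce(sf.month, hl.month, om.month) as month,
    sf.arr as salesforce_arr,
    hl.arr as hyperline_arr,
    om.arr as omni_arr,
    sf.arr - hl.arr as salesforce_vs_hyperline_variance
from arr_salesforce sf
full outer join arr_hyperline hl using (month)
full outer join arr_omni om using (month)
order by month
```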
458 00:56:27.540 ⇒ 00:56:33.439 Caitlyn Vaughn: Yeah. Okay, I’m following. Okay, I think this is a separate question for later, but that is interesting.
459 00:56:34.560 ⇒ 00:56:36.300 bpeiair: Yeah, unless…
460 00:56:37.990 ⇒ 00:56:56.010 bpeiair: you do… unless you use dbt as a validation tool, where you’re using those tests that I mentioned, and you want to run your tests using dbt, as opposed to, like, writing a SQL query and, like, copy and sharing those results out to somebody.
461 00:56:56.220 ⇒ 00:56:56.670 Caitlyn Vaughn: Hmm.
462 00:56:56.790 ⇒ 00:56:58.740 bpeiair: But…
463 00:56:59.030 ⇒ 00:57:16.130 Greg Stoutenburg: Yeah, so in this particular case, the data… so Salesforce is the source, but then what was appearing in Omni was a calculated… basically, the way that a field was calculated was using dbt, where it took…
464 00:57:16.190 ⇒ 00:57:19.469 Greg Stoutenburg: it took data from Salesforce, and then, like.
465 00:57:19.490 ⇒ 00:57:29.490 Greg Stoutenburg: basically added, like, a buffer. So, for example, there were some customers that were appearing in Omni as, like, their churn or renewal state.
466 00:57:29.490 ⇒ 00:57:45.350 Greg Stoutenburg: as reflected in Salesforce was affecting whether something counted as ARR or not, and that’s because in dbt, there was a… there was something that was like, you know, grab… grab this number, which would be, like, the ARR number, and then look at status.
467 00:57:45.580 ⇒ 00:58:00.409 Greg Stoutenburg: and renewal date. And if, you know, this much time has passed since the renewal date, then don’t count it as ARR. So it was like that kind of thing, where there was, like, sort of this intermediary here that was affecting what was showing up in the final reporting, and as a result, there was a discrepancy between
468 00:58:00.490 ⇒ 00:58:05.869 Greg Stoutenburg: an ARR number reflected in Salesforce as a total versus what was appearing in Omni.
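The kind of dbt logic Greg is describing, where status and time-since-renewal gate whether revenue counts as ARR, might look roughly like this; the model name, column names, and 90-day buffer are all hypothetical reconstructions, not the actual implementation:

```sql
-- Hypothetical sketch of the status/renewal-date gating Greg describes.
select
    account_id,
    case
        -- churned accounts contribute nothing
        when status = 'churned' then 0
        -- past the (assumed) grace window after renewal date: stop counting
        when datediff('day', renewal_date, current_date) > 90 then 0
        else arr
    end as reported_arr
from {{ ref('salesforce_accounts') }}
```

Logic like this is exactly how a dbt layer can make a downstream tool like Omni disagree with the raw Salesforce total.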
469 00:58:06.240 ⇒ 00:58:07.360 bpeiair: Gotcha, okay.
470 00:58:08.410 ⇒ 00:58:15.330 bpeiair: Yeah, tale as old as time. That happens in every single company that’s ever existed. It’s really annoying. I’m sorry that’s happening.
471 00:58:21.020 ⇒ 00:58:22.550 Caitlyn Vaughn: Also, do you have a hard stop?
472 00:58:23.660 ⇒ 00:58:28.180 bpeiair: I… I also do, okay. I was only…
473 00:58:28.360 ⇒ 00:58:39.909 bpeiair: signed up by Greg and Utam to do this session, but if you want another session to review that, I think probably go through Greg. I’m kind of a sideliner here.
474 00:58:39.910 ⇒ 00:58:40.810 Greg Stoutenburg: Yeah.
475 00:58:40.810 ⇒ 00:58:43.710 bpeiair: I’d be happy to do any other follow-up.
476 00:58:43.710 ⇒ 00:58:44.160 Greg Stoutenburg: Yeah, thanks.
477 00:58:44.160 ⇒ 00:58:44.570 Caitlyn Vaughn: fiscal.
478 00:58:44.570 ⇒ 00:58:53.709 Greg Stoutenburg: Yeah, what I thought we’d do here is sort of, like, take this as the general overview, and make sure that Caitlin and Annika feel like they get dbt and how to work in here. And then…
479 00:58:53.710 ⇒ 00:59:14.029 Greg Stoutenburg: consolidate questions offline, and then move forward from there. So maybe the way that would look, Brian, is something like, we might say, hey, can we do another half an hour, like, specifically on this piece, or can you walk through this, you know, maybe here’s some specifics about the implementation that we have here. But that can… that can just be a follow-up, and so, yeah, thanks very much for the general overview, that was great.
480 00:59:14.380 ⇒ 00:59:15.440 bpeiair: You’re welcome. Thank you so much.
481 00:59:15.790 ⇒ 00:59:17.469 Nandika Jhunjhunwala: This is really fun, yeah.
482 00:59:17.470 ⇒ 00:59:22.430 bpeiair: Oh, I’m glad. I do want to say that, because I went through, like, every feature.
483 00:59:22.550 ⇒ 00:59:35.269 bpeiair: don’t feel like you have to overcomplicate doing every single thing that I said. I just wanted to show you everything that’s available, but I don’t want you to stress out going back and being like, there’s no ephemeral models, like, you know, it’s just.
484 00:59:35.270 ⇒ 00:59:37.230 Nandika Jhunjhunwala: No, totally.
485 00:59:37.230 ⇒ 00:59:38.770 bpeiair: That we can… that we can do in the future.
486 00:59:38.770 ⇒ 00:59:39.690 Nandika Jhunjhunwala: Totally,
487 00:59:40.140 ⇒ 00:59:41.940 bpeiair: But, yeah, no, this was great.
488 00:59:41.940 ⇒ 00:59:42.470 Caitlyn Vaughn: Yeah, fun!
489 00:59:42.470 ⇒ 00:59:43.880 bpeiair: This was fun. I appreciate it.
490 00:59:44.030 ⇒ 00:59:53.320 Greg Stoutenburg: Yeah, thanks a ton, Brian. And Caitlin and Annika, if you want to start, like, razzing the team in the client channel, say, like, hey, I didn’t… I looked around, I didn’t… why wasn’t there an ephemeral model for this, like.
491 00:59:53.750 ⇒ 00:59:55.210 Nandika Jhunjhunwala: Stuff like that.
492 00:59:55.210 ⇒ 00:59:56.190 Greg Stoutenburg: Cool.
493 00:59:56.190 ⇒ 00:59:57.330 bpeiair: Buzzwords.
494 00:59:57.330 ⇒ 01:00:11.680 Greg Stoutenburg: Yeah, exactly, just wherever possible, just roll it in the conversation. Yeah. Okay. All right, well, thanks, everyone. This is really great, and Caitlin and Annika, I opened that thread for us to consolidate some questions, and let’s just take those and go from there.
495 01:00:11.980 ⇒ 01:00:13.980 Caitlyn Vaughn: Amazing, thank you so much.
496 01:00:13.980 ⇒ 01:00:14.380 Nandika Jhunjhunwala: Wow.
497 01:00:14.380 ⇒ 01:00:15.090 bpeiair: Nice meeting you.
498 01:00:15.090 ⇒ 01:00:15.580 Greg Stoutenburg: Thanks again, Brian.
499 01:00:15.580 ⇒ 01:00:16.290 bpeiair: Thanks, everyone.
500 01:00:16.350 ⇒ 01:00:17.190 Greg Stoutenburg: Alright, bye.