Meeting Title: MotherDuck Content Piece Interview Date: 2025-09-30 Meeting participants: Jake Nathan, Jacob Matson
WEBVTT
1 00:05:34.410 ⇒ 00:05:35.240 Jake Nathan: Hey, Jacob.
2 00:05:37.550 ⇒ 00:05:39.000 Jacob Matson: Ugh, here we go. Alright.
3 00:05:39.230 ⇒ 00:05:41.140 Jacob Matson: Am I online? Yes, perfect.
4 00:05:42.120 ⇒ 00:05:43.370 Jake Nathan: Yes, how are you doing?
5 00:05:43.370 ⇒ 00:05:48.040 Jacob Matson: I’m doing well, I’m doing well. Sorry I’m late, I was distracted and the calendar invite didn’t pop up.
6 00:05:48.990 ⇒ 00:05:53.009 Jake Nathan: Oh, no worries at all. I like, I like your background. You got…
7 00:05:53.460 ⇒ 00:05:55.470 Jake Nathan: I can see you’re a big Star Wars fan.
8 00:05:56.270 ⇒ 00:05:59.320 Jacob Matson: Lots of fun stuff back there, for sure.
9 00:06:00.240 ⇒ 00:06:14.009 Jake Nathan: That’s awesome. Well, cool. Well, yeah, thank you again for making time. I think we slacked a little bit, and it sounded like you were saying you’re also working on, kind of, a content piece about,
10 00:06:14.310 ⇒ 00:06:18.359 Jake Nathan: just data warehouses and how MotherDuck fits into that?
11 00:06:19.040 ⇒ 00:06:20.150 Jacob Matson: Yeah, absolutely.
12 00:06:21.790 ⇒ 00:06:41.670 Jake Nathan: Sweet! Well, yeah, just for the sake of time, we can… we can get started. And yeah, like I said, we’ll be doing this interview, and then after that, I’ll kind of read over our transcript again, re-watch the interview, and then start making a content piece. Obviously, anything you say, like, I’ll… I’ll send y’all a draft first, so,
13 00:06:42.170 ⇒ 00:06:43.970 Jake Nathan: And then we can go from there.
14 00:06:45.340 ⇒ 00:06:57.769 Jake Nathan: So, yeah, the first question I had, was, in general, it seems like there’s this kind of explosion in, cloud warehouse costs, and so…
15 00:06:57.970 ⇒ 00:07:03.839 Jake Nathan: Why, you know, what’s contributing to that, and why are teams now starting to figure that out?
16 00:07:06.310 ⇒ 00:07:09.859 Jacob Matson: Okay, that’s a really good question, like, what contributed to that?
17 00:07:10.770 ⇒ 00:07:16.450 Jacob Matson: I think there’s multiple things… there’s multiple things at play here. It’s not just, like, one thing, it’s a lot of things.
18 00:07:16.670 ⇒ 00:07:19.829 Jacob Matson: You know, I think the first thing is,
19 00:07:20.730 ⇒ 00:07:26.770 Jacob Matson: We went from very, very constrained on data, from, like, let’s say… let’s say if you go back 15 years ago.
20 00:07:27.070 ⇒ 00:07:34.350 Jacob Matson: Pretty much, unless you could afford an on-premise data warehouse, like… Vertica.
21 00:07:35.190 ⇒ 00:07:44.590 Jacob Matson: you were running a pretty expensive other database, maybe Oracle or SQL Server, kind of as a data warehouse.
22 00:07:45.010 ⇒ 00:07:57.839 Jacob Matson: And you were running it on your own hardware, or maybe in Rackspace or something, but most likely on-premise, and it was very, very, very expensive to even, like, start doing it. Like, you know,
23 00:07:59.370 ⇒ 00:08:16.500 Jacob Matson: Yeah, very, very expensive, right? Because you need a big base load to handle the spiky nature of analytics, right? You need a lot of cores. And, you know, that means distributed systems, potentially, the hardware was not quite up to speed for what we have now, and so on. So it just was, like, very, very complex.
24 00:08:16.850 ⇒ 00:08:22.159 Jacob Matson: It got immediately easier with two things. The first one was AWS with Redshift.
25 00:08:22.460 ⇒ 00:08:28.160 Jacob Matson: Which took you from, you know, the first time I put in a data warehouse, it took me…
26 00:08:28.310 ⇒ 00:08:34.950 Jacob Matson: probably 6 months before I even had a single server in my… in my… in my, like, on-premise deployment.
27 00:08:35.090 ⇒ 00:08:36.570 Jacob Matson: Right? So…
28 00:08:36.780 ⇒ 00:08:42.309 Jacob Matson: I’m not talking about things being racked and stacked and ready to go, I’m talking about a single server, like, being in the box.
29 00:08:42.590 ⇒ 00:08:56.360 Jacob Matson: Right? Because I’m going through all the, like, here’s why we need this, here’s all the reasons, you know, here’s justifications, okay, now let’s do purchasing, you know, now let’s do procurement, etc, etc, right? Had to buy all these pieces, right? Had to get SQL Server licenses.
30 00:08:56.360 ⇒ 00:09:04.429 Jacob Matson: You know, I had to, you know, partner across all these different teams to get this all happen… all this to happen, right? That goes to basically zero with AWS Redshift.
31 00:09:04.860 ⇒ 00:09:05.630 Jacob Matson: Right?
32 00:09:07.080 ⇒ 00:09:14.860 Jacob Matson: So, I can, I can create something that’s, that, I can start paying for and using immediately.
33 00:09:15.340 ⇒ 00:09:28.920 Jacob Matson: So that was kind of step one, is, like, we had this, like, really big transition from, like, really expensive to, like, very easy to start. And then Snowflake figured out how to make that really, really easy, and built really good abstractions so that you can spend a bunch of money.
34 00:09:29.090 ⇒ 00:09:36.550 Jacob Matson: Right? Just as a side example, we were dealing with a customer who was,
35 00:09:37.320 ⇒ 00:09:42.900 Jacob Matson: a data scientist working in Hex on top of MotherDuck, and they’re like, hey, this is, like, really bad performance, like, what’s going on?
36 00:09:43.300 ⇒ 00:09:44.440 Jacob Matson: And…
37 00:09:45.080 ⇒ 00:09:55.719 Jacob Matson: We looked at it, and we said, oh, here, you need to flip a couple switches here in Hex, and then it’ll be fine, and they did that, and it was fine. But in Snowflake, what you do when that happens is you just turn the knob, and you get a bigger warehouse.
38 00:09:56.580 ⇒ 00:10:08.790 Jacob Matson: So you kind of just, like, skip the fact that you have to do, like, that you should, like, set up your tools right, you just turn the knob. You should be like, alright, need a bigger thing, I’m just gonna go to 3XL, it’s fine. And it works, and the data scientist is happy, and the finance team is sad.
39 00:10:08.870 ⇒ 00:10:22.630 Jacob Matson: But yeah, so it got really, really easy just to, like, spend money. And because of that, too, like, in the old days, like, 15 years ago, we needed to, like, have a real business case for why we were doing analytics, right? We needed to prove ROI.
40 00:10:22.630 ⇒ 00:10:34.660 Jacob Matson: And then we went to, like, you know, zero incremental cost to start, and it was like, I don’t know, just do it. Data is the new oil. We gotta do… we gotta do the data thing. This other company we compete with has a data warehouse, we need a data warehouse.
41 00:10:34.770 ⇒ 00:10:54.720 Jacob Matson: just build the thing. And so we went from, like, you know, a big shift from, like, kind of a waterfall-style approach to, like, Agile, but, like, with people who are, like, probably not used to delivering value, because there’s no imperative there, right? Like, the imperative for data teams is, like, we need data. It’s not like we need revenue.
42 00:10:55.300 ⇒ 00:11:13.829 Jacob Matson: Right? And so the imperatives are misaligned also, I think. And a lot of data teams, you know, kind of are, like, shipping dashboards and metrics, and like, okay, do we need those? Like, are they actually additive? Who knows? It’s not… it hasn’t been important. Obviously, a lot changed in, like, 2022, 2023, right? Interest rates are now a lot higher.
43 00:11:13.830 ⇒ 00:11:16.550 Jacob Matson: Than they were for the last, like, you know, 10 years.
44 00:11:18.050 ⇒ 00:11:27.289 Jacob Matson: And so I think, what happened in the meantime is we just started spending a bunch more money, we didn’t know why, you know, were we spending more than we were before?
45 00:11:27.390 ⇒ 00:11:44.439 Jacob Matson: Probably not. Or, I mean, sorry, probably yes, but, like, more people were doing it because it was so much easier to get started, right? You can set up a very small stack really, really quickly with Snowflake, and then, you know, immediately the sales guys are upgrading… upselling you to the next size, which is great! I mean, they’ve got a great motion.
46 00:11:44.600 ⇒ 00:12:03.470 Jacob Matson: And same is true for Databricks and others, too. So, yeah, that’s kind of the long answer to that question, which is, like, we’re spending a bunch of money, like, you know, we just were doing it because we needed to keep up. It was way easier compared to, you know, 15 years ago when you did a huge project to do it.
47 00:12:03.630 ⇒ 00:12:08.170 Jacob Matson: And yeah, that’s kind of the… that’s kind of the… that’s kind of my perspective.
48 00:12:09.510 ⇒ 00:12:23.879 Jake Nathan: Yeah, I really appreciate that rundown, and I have a lot of follow-up questions there, but, you know, going off of what you said of a lot of times it seems like, the solution was just to, you know, turn the knob, increase the size, and…
49 00:12:23.880 ⇒ 00:12:32.359 Jake Nathan: Going off that point, like, what sort of architectural choices make these traditional warehouses, like, more expensive than
50 00:12:32.360 ⇒ 00:12:33.610 Jake Nathan: Usual.
51 00:12:34.890 ⇒ 00:12:41.410 Jacob Matson: Yeah, I wouldn’t say more expensive than usual, I would just say, like, you know, they basically…
52 00:12:41.680 ⇒ 00:12:45.099 Jacob Matson: They make it easy for you to trade off on
53 00:12:45.840 ⇒ 00:12:50.760 Jacob Matson: expensive developer time for a SaaS cost.
54 00:12:50.940 ⇒ 00:12:55.329 Jacob Matson: Right? So instead of me being a data scientist and, like, learning how my tools work.
55 00:12:55.680 ⇒ 00:13:03.190 Jacob Matson: Right? Or a data engineer, not picking on a data scientist, but, like, being a data professional and learning how my tools work, I just turn the knob and go about my day.
56 00:13:03.700 ⇒ 00:13:04.520 Jacob Matson: Right?
57 00:13:05.970 ⇒ 00:13:16.580 Jacob Matson: And so, like, that abstraction is really good, because it means I don’t have to think about what I’m doing, right? But it’s the same problem that we have today with, like, LLMs. It’s like, okay, Claude, how do I do whatever? It’s like, great, turn the brain off.
58 00:13:16.650 ⇒ 00:13:26.690 Jacob Matson: copy-paste the result into my Python or whatever, and away we go, right? Snowflake is great, like, and Redshift and others, right? Like, being able to increase the size of your
59 00:13:26.690 ⇒ 00:13:38.839 Jacob Matson: instance is amazing, right? For those who have never experienced the pain of being like, oh man, I gotta provision another… another server in my cluster, like, this is gonna take me 3 months, the business is, like, falling over.
60 00:13:38.840 ⇒ 00:13:52.549 Jacob Matson: Like, this is really bad. So it’s amazing, it’s amazing, it’s amazing. There’s a lot of pain there, right? There’s a lot of pain around the… and the way that you, solve your pain is by spending more money. That’s a great abstraction. Everyone should have a loop like that.
61 00:13:54.740 ⇒ 00:13:57.590 Jake Nathan: That makes sense. And, I…
62 00:13:57.600 ⇒ 00:14:09.979 Jake Nathan: I was looking at your LinkedIn, it seemed like you were, you majored in computer science and accounting, so I feel like you’ve alluded to this, kind of, in your answers of… it seems like a mistake that
63 00:14:09.980 ⇒ 00:14:18.350 Jake Nathan: data teams can make is not thinking about the financial impact of what they’re doing. So, is that something you see across,
64 00:14:18.660 ⇒ 00:14:27.789 Jake Nathan: different clients you work with, or is that, I guess, how would you encourage or talk about how data teams can be more, you know.
65 00:14:27.940 ⇒ 00:14:32.770 Jake Nathan: More tied to the revenue or to the financial impact of what they’re doing?
66 00:14:33.040 ⇒ 00:14:50.299 Jacob Matson: Yeah, that’s a really good question. You know, I think… I think the thing that I would say is, like, as soon as, like, you know, your CFO is asking about ROI of the data team, you’ve lost the game, right? So, like, I would never talk about ROI explicitly with a data team. What I would talk about is outcomes. Like, how are we helping drive outcomes for the business?
67 00:14:50.300 ⇒ 00:14:55.889 Jacob Matson: And, like, anchor it… always anchor it, towards that. Like, what things are we enabling that are not possible?
68 00:14:55.890 ⇒ 00:15:08.420 Jacob Matson: You know, how do we tie these pieces together in a way that lines up to, like, big picture, you know, whatever the big goals of the company are, or KPIs, or… or whatever, right?
69 00:15:09.170 ⇒ 00:15:15.170 Jacob Matson: Ultimately, you know, I think the idea with a data team is that you’re replacing
70 00:15:15.290 ⇒ 00:15:25.819 Jacob Matson: you’re replacing, you know, manual labor hours with software. And so, you know, having a perspective on, like, how you’re actually doing that, and not just adding spend for the sake of adding spend, I think is critical.
71 00:15:28.320 ⇒ 00:15:43.209 Jake Nathan: That makes sense, and it seems like, like you said, if the CFO is kind of asking you, then it’s… you’re already in trouble, so would you kind of recommend, like you’re saying, tying them to outcomes and almost being proactive about it, and like.
72 00:15:43.210 ⇒ 00:15:53.739 Jake Nathan: Like, would you… would you try to present those outcomes to the CFO, or to… to the larger team, or do you keep those, like, as a stash, just in case? Like, how… how would you go about…
73 00:15:53.740 ⇒ 00:15:55.940 Jake Nathan: I guess, communicating those outcomes.
74 00:15:56.580 ⇒ 00:16:01.780 Jacob Matson: Yeah, I think, I think that, like, my perspective on this is, like, very simple.
75 00:16:01.990 ⇒ 00:16:04.580 Jacob Matson: Which is that, your business is complex.
76 00:16:04.990 ⇒ 00:16:13.409 Jacob Matson: We use data to understand it. But data is a shadow. It doesn’t necessarily reflect all of reality, it reflects some of reality.
77 00:16:13.660 ⇒ 00:16:21.319 Jacob Matson: Right? And so, what we need to do is we need to collect this data, and we need to actually look at it on a consistent basis, so that we can understand how our business works.
78 00:16:21.470 ⇒ 00:16:28.719 Jacob Matson: Right? So the outcome that we want to think about is, okay, what levers can we pull that we can see in our data
79 00:16:28.810 ⇒ 00:16:36.270 Jacob Matson: So that we can positively impact the business on whatever axis we want to impact it on, right?
80 00:16:36.330 ⇒ 00:16:50.699 Jacob Matson: I would focus around that. It’s about… it’s about, like, holistic approach. It’s not about, like, you know, buying a data warehouse and having a data capability. I don’t really care about your data capability. Like, if you’re not… you could have the best data engineers in the world. If you don’t look at the data every week, you’re not gonna have a good outcome.
81 00:16:51.150 ⇒ 00:16:53.480 Jacob Matson: Right? And so, so, like…
82 00:16:53.630 ⇒ 00:17:06.019 Jacob Matson: The first part for me is, like, you need to… you need to be able to, quickly respond to what the business asks are, but, like, also you need to do it within the frame of, like.
83 00:17:06.839 ⇒ 00:17:21.220 Jacob Matson: how do we… how do we build this into a rigorous kind of process? Not… rigorous is actually the wrong word, actually. I think, actually, by merely looking at the data on, like, a weekly basis, you’re probably in, like, the top 10% of, like, data teams, if you can drive that for your executive team.
84 00:17:21.520 ⇒ 00:17:26.179 Jacob Matson: The reality is they probably look at it at the board meeting, and then, like, maybe there’s some actions.
85 00:17:26.530 ⇒ 00:17:33.849 Jacob Matson: At least in my experience. So, like, anyways, I don’t know if that… that’s not an exact answer, I think, but, like…
86 00:17:34.060 ⇒ 00:17:41.770 Jacob Matson: You know, I would say, like, think about the system of which the data team is a part of, not about, like, the data team as a closed loop, it’s not.
87 00:17:42.310 ⇒ 00:17:49.499 Jake Nathan: Yeah, that makes sense to me, and like you said, it’s almost… it’s more about the cadence of actually looking at the data if you’re not…
88 00:17:49.500 ⇒ 00:17:50.080 Jacob Matson: Yep.
89 00:17:50.080 ⇒ 00:18:00.259 Jake Nathan: If you don’t have a cadence, then… And you mentioned, I like what you said, data is a shadow. Obviously it changes from business to business, but,
90 00:18:00.600 ⇒ 00:18:15.620 Jake Nathan: Are there any similarities that you’ve seen across different businesses of, like you said, there’s a lot of things that data obviously shows you and does… shows you really well, but then there’s, if it’s a shadow, there’s also things that data doesn’t show you, so is there… are there…
91 00:18:15.690 ⇒ 00:18:22.149 Jake Nathan: Kind of commonalities between the businesses that you see that maybe data doesn’t show you this side of things?
92 00:18:24.550 ⇒ 00:18:31.349 Jacob Matson: I mean, I think it’s really hard, to see and understand the impact of, like.
93 00:18:32.460 ⇒ 00:18:44.280 Jacob Matson: three or more variables at once going into something, right? Just broadly, like, the problem space expands so much. So, like, you know, what impacted… you know, sometimes you can see things very clearly, like.
94 00:18:44.400 ⇒ 00:18:49.319 Jacob Matson: Alright, well, like, why did revenue go up? It’s like, okay, we added more customers, great.
95 00:18:49.610 ⇒ 00:18:57.029 Jacob Matson: But, like, not everything is that simple. So when there’s, like, more complex, like.
96 00:18:57.320 ⇒ 00:19:01.669 Jacob Matson: causality, or causation? I think causation, I think, is the word, like, in your business.
97 00:19:01.850 ⇒ 00:19:16.350 Jacob Matson: like, you need, actually, that’s why you need a lot of metrics, and so that you can kind of, like, see which things change together that cause some outcome to change, right?
98 00:19:17.180 ⇒ 00:19:28.599 Jacob Matson: it’s really hard to do it. One failure mode that I would say is, like, North Star metrics is something that executive teams have heard talk about, and, I’ve talked about it, I thought it was a great idea,
99 00:19:28.710 ⇒ 00:19:40.209 Jacob Matson: you know, single source of truth type of, you know, also is another thing that I see, like, oh, we just need a single source of truth. Well, good luck, good luck on that, it’s a shadow. Like, let’s all acknowledge that it’s a shadow, but, like, it’s a great way to get a budget for a data team.
100 00:19:40.250 ⇒ 00:19:53.380 Jacob Matson: But the thing about North Star metrics is, like, it often focuses on something too narrow, and then you end up in, like, a situation where, like, the team is just trying to min-max, you know, a single metric, or a small set of metrics. And the reality is, like.
101 00:19:53.380 ⇒ 00:20:02.509 Jacob Matson: You know, once it… once, you know, once it becomes… well, I can’t remember, what is it, Goodhart’s Law or whatever? Like, once it’s… it becomes a… it becomes a target, and then it’s useless.
102 00:20:02.560 ⇒ 00:20:12.250 Jacob Matson: Anyway, so, like, I think, like, that’s the other, the other failure mode, too, is like, alright, here’s the metrics we care about, and it’s like, alright, well, I’m just gonna try to max that.
103 00:20:12.390 ⇒ 00:20:15.239 Jake Nathan: Okay, great. Well, now we have a different problem.
104 00:20:15.240 ⇒ 00:20:35.069 Jacob Matson: So, like, what we need is more metrics. But seriously, though, I think actually the answer is, like, you know, a broad set of metrics that are reviewed regularly, and then, you know, I think a lot… I’m a big believer in the approach, laid out in the Working Backwards book by the Amazon guys, Bill Carr and others.
105 00:20:36.440 ⇒ 00:20:47.210 Jacob Matson: Which is really kind of about, you know, okay, what outputs do you desire, what inputs influence that? And then, like, how do you build a cadence to really understand what… how to change your inputs?
106 00:20:47.560 ⇒ 00:20:49.410 Jacob Matson: Yeah.
107 00:20:49.800 ⇒ 00:21:01.249 Jake Nathan: Yeah, I… I really like that. It’s, it’s like, it’s a great answer, and it’s a tough answer, just because everyone’s looking for, like you’re saying, like, that one North Star metric that they’re missing, and oh, if I just…
108 00:21:01.530 ⇒ 00:21:03.460 Jake Nathan: Looked at that, then all of a sudden, like.
109 00:21:03.460 ⇒ 00:21:10.040 Jacob Matson: Yeah, I find it to be, like, no silver bullets, just thousands of small ones, and, like…
110 00:21:10.250 ⇒ 00:21:21.279 Jacob Matson: you know, the way that you build an advantage is by being really good just every day. And if you’re really good every day, then you build an advantage every day, and you get better every day, and, you know, that compounds.
111 00:21:22.680 ⇒ 00:21:27.969 Jake Nathan: Totally. Yeah, I feel like this is half, tactical, half almost just mental for, like, the.
112 00:21:27.970 ⇒ 00:21:31.230 Jacob Matson: Oh, totally, it totally is, yeah, exactly.
113 00:21:31.370 ⇒ 00:21:41.859 Jake Nathan: I love it. So yeah, we talked about, like, kind of traditional data warehouses. Now I’d love to talk about MotherDuck. So you’re, you know, we talked about how
114 00:21:42.370 ⇒ 00:21:54.670 Jake Nathan: kind of the makeup of data warehouses can potentially make them more expensive. How does MotherDuck’s business model work, and how does that kind of change the cost structure of things?
115 00:21:55.970 ⇒ 00:21:59.210 Jacob Matson: Yeah, so I think that the main thing about,
116 00:22:00.310 ⇒ 00:22:03.870 Jacob Matson: MotherDuck is… we were talking earlier about how…
117 00:22:04.370 ⇒ 00:22:10.169 Jacob Matson: you know, in other systems, you can just, like, turn your brain off and be like, how do I solve this problem?
118 00:22:10.370 ⇒ 00:22:16.080 Jacob Matson: MotherDuck gives you more controls, more knobs that you can pull on.
119 00:22:16.250 ⇒ 00:22:24.640 Jacob Matson: So, that means it is, more flexible in how you can deploy it.
120 00:22:24.900 ⇒ 00:22:32.720 Jacob Matson: Than other solutions, and underlying that is really the underlying, kind of, core DuckDB tech, right?
121 00:22:32.990 ⇒ 00:22:42.370 Jacob Matson: So, DuckDB, you know, is an in-process columnar, analytics engine, and,
122 00:22:42.820 ⇒ 00:22:56.750 Jacob Matson: basically, we give you something called a duckling that contains a DuckDB database on it, and a duckling is our compute instance. You can get as many of those as you want, with varying sizes. So we’ve got 5 sizes, kind of from a very small one to, like, a fairly big one.
123 00:22:56.860 ⇒ 00:22:58.690 Jacob Matson: You know.
124 00:22:59.260 ⇒ 00:23:13.140 Jacob Matson: We do not publish, like, how many cores or how much RAM they have, but I would say, like, the smallest instance is somewhere around, you know, an eighth of a laptop.
125 00:23:13.350 ⇒ 00:23:17.850 Jacob Matson: And a… one of our biggest instances is something like
126 00:23:18.120 ⇒ 00:23:22.130 Jacob Matson: 8 laptops or 16 laptops, depending on how big your laptop is.
127 00:23:22.510 ⇒ 00:23:32.690 Jacob Matson: Just, like, as a ballpark, like, for modern hardware. So, like, they go from very small to, like, very big, and you can get different sizes. Those are our duckling sizes.
128 00:23:32.800 ⇒ 00:23:41.749 Jacob Matson: And then you’re only billed for compute. It’s purely serverless, right? So when you’re running it, you get billed, otherwise you don’t get billed.
129 00:23:42.320 ⇒ 00:23:51.839 Jacob Matson: And that’s really nice, because analytics workloads are typically very spiky. Maybe you have, like, a… you, like, a data scientist is logging in.
130 00:23:51.840 ⇒ 00:24:03.620 Jacob Matson: And, they need 64 cores right now, right? So they’re gonna run it on a bigger instance, they’re going to, like, use it for an hour, maybe 2 hours, and then be done, right?
131 00:24:03.620 ⇒ 00:24:11.679 Jacob Matson: And then maybe that turns into a pipeline, you know, item that runs, you know, overnight, you know, for 5 or 10 minutes on a bigger…
132 00:24:11.820 ⇒ 00:24:16.710 Jacob Matson: On a bigger… a bigger node, or whatever, right? So, like.
133 00:24:17.120 ⇒ 00:24:28.989 Jacob Matson: In a traditional… where that kind of contrasts with a traditional system, is, like, if you’re doing that type of work in,
134 00:24:30.550 ⇒ 00:24:40.420 Jacob Matson: let’s say Redshift, you need to provision a really big node to do… to have all of that compute available to you, and it’s shared with all your other users who are also running jobs at the same time.
135 00:24:40.530 ⇒ 00:24:56.290 Jacob Matson: In MotherDuck, when I get my compute, it’s just dedicated for me. No other users are using it. And so, I can kind of have the flexibility to do, like, my crazy job while I… or, you know, run some insane queries without, like, tanking everyone else’s instance.
136 00:24:56.440 ⇒ 00:24:58.100 Jacob Matson: Right?
137 00:24:58.470 ⇒ 00:25:14.100 Jacob Matson: Which is very, very helpful, to actually allow you to, again, like, hey, I need to do this investigative analysis, we saw these set of metrics change, like, why? Okay, I need to do the deep dive, I need to take a look at, you know, potentially billions of rows to identify, you know, anomalies or whatever.
138 00:25:14.960 ⇒ 00:25:27.010 Jacob Matson: You know, so I think that… that’s one way we’re different, and then from, like, a… so… so, like I said, you get billed for what you use, based on query time, and then, we also, you know, have a fairly lightweight model for billing for,
139 00:25:27.170 ⇒ 00:25:34.350 Jacob Matson: storage. We’re not trying to make a ton of money on storage, just kind of… obviously, we need it there so that you can enable your, compute.
140 00:25:34.880 ⇒ 00:25:35.920 Jacob Matson: usage.
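To make the billing contrast above concrete, here is a minimal arithmetic sketch. It assumes the workload described in the conversation (roughly two hours of ad hoc analysis per working day, plus a ten-minute nightly pipeline). The hourly rate, the 22 working days, and the function names are hypothetical placeholders, not published MotherDuck or Redshift prices, and using the same rate for both models is a simplification.

```python
# Hypothetical numbers for illustration only; not real vendor pricing.
HOURS_PER_MONTH = 730  # average hours in a calendar month


def provisioned_cost(hourly_rate, hours=HOURS_PER_MONTH):
    """Cost of a node that is billed whether or not queries are running."""
    return hourly_rate * hours


def usage_cost(hourly_rate, busy_hours):
    """Cost when billing stops as soon as the compute goes idle."""
    return hourly_rate * busy_hours


# Spiky workload from the conversation (day counts are assumptions):
adhoc_hours = 2 * 22                # ~2 hours per working day, 22 working days
pipeline_hours = (10 * 30) / 60     # 10-minute nightly job, 30 nights
busy_hours = adhoc_hours + pipeline_hours

rate = 4.00  # hypothetical dollars per hour for a large instance

always_on = provisioned_cost(rate)
on_demand = usage_cost(rate, busy_hours)
utilization = busy_hours / HOURS_PER_MONTH

print(f"always-on: ${always_on:,.2f}")
print(f"usage-based: ${on_demand:,.2f}")
print(f"utilization of the always-on node: {utilization:.1%}")
```

The point of the sketch is the utilization figure: when the busy hours are a small fraction of the month, most of an always-on node's cost pays for idle time.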
141 00:25:37.370 ⇒ 00:25:39.359 Jake Nathan: That makes sense, and…
142 00:25:39.750 ⇒ 00:25:54.020 Jake Nathan: I’m guessing that someone who’s using MotherDuck, like we talked about, could be concerned with their traditional cloud warehouse, and maybe they’re concerned specifically about pricing.
143 00:25:54.500 ⇒ 00:26:02.430 Jake Nathan: are there any, I guess, trade-offs of… if you’re trying to optimize purely for… Cost,
144 00:26:02.900 ⇒ 00:26:05.909 Jake Nathan: You know, just how to think about that, and are there…
145 00:26:06.310 ⇒ 00:26:13.400 Jake Nathan: Are there, time… or mistakes you can avoid when you’re trying to optimize, like, purely for cost, if you’re thinking about
146 00:26:13.570 ⇒ 00:26:15.820 Jake Nathan: That factor.
147 00:26:16.430 ⇒ 00:26:27.539 Jacob Matson: Yeah, okay, great question. So, the best way to optimize cost at MotherDuck is, like, think about your data as you ingest it. And so, what I mean by that is partition it naturally.
148 00:26:27.800 ⇒ 00:26:37.010 Jacob Matson: The best way to use MotherDuck and DuckDB is… is make your big data small.
149 00:26:37.450 ⇒ 00:26:48.319 Jacob Matson: What I mean by that is, break it into logical chunks, often by partitioning, like by day, by week, by year, by customer, etc. It really just depends.
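As an illustration of "breaking it into logical chunks," here is a minimal sketch that splits incoming rows into one file per day at ingest time. The `event_time` column, the `day=YYYY-MM-DD` directory naming, and the `partition_by_day` helper are assumptions made for this example; the interview does not prescribe a layout, and in practice the partition key could just as well be week, year, or customer.

```python
import csv
import tempfile
from collections import defaultdict
from pathlib import Path


def partition_by_day(rows, out_dir):
    """Write rows into out_dir/day=YYYY-MM-DD/data.csv, one file per day.

    Assumes each row is a dict with an ISO-8601 'event_time' string
    (hypothetical column name). Returns a dict mapping day -> file path.
    """
    buckets = defaultdict(list)
    for row in rows:
        day = row["event_time"][:10]  # first 10 chars of an ISO timestamp are the date
        buckets[day].append(row)

    written = {}
    for day, day_rows in sorted(buckets.items()):
        part_dir = Path(out_dir) / f"day={day}"
        part_dir.mkdir(parents=True, exist_ok=True)
        path = part_dir / "data.csv"
        with path.open("w", newline="") as f:
            writer = csv.DictWriter(f, fieldnames=list(day_rows[0].keys()))
            writer.writeheader()
            writer.writerows(day_rows)
        written[day] = path
    return written


# Made-up sample data for the demo.
sample_rows = [
    {"event_time": "2025-09-29T23:59:10", "device": "a", "reading": "1.5"},
    {"event_time": "2025-09-30T00:00:05", "device": "a", "reading": "1.7"},
    {"event_time": "2025-09-30T08:15:00", "device": "b", "reading": "2.1"},
]

with tempfile.TemporaryDirectory() as tmp:
    for day, path in partition_by_day(sample_rows, tmp).items():
        print(day, path.relative_to(tmp))
```

With a layout like this, a query that only needs one day can read a single small file instead of scanning everything, which is the sense in which big data gets handled as small data.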
150 00:26:48.320 ⇒ 00:27:03.220 Jacob Matson: But also, like, you know, now that we have bigger nodes, like, the definition of what small data is, is into the terabytes, okay? So, like, you know, it’s not… we’re not saying, hey, you need to, like, you know, know everything that you’re doing in advance.
151 00:27:03.220 ⇒ 00:27:09.180 Jacob Matson: But, like, handling small data before it comes… before it becomes big data is, like, a very helpful…
152 00:27:09.210 ⇒ 00:27:20.449 Jacob Matson: way to frame the way that you can use, MotherDuck effectively. You know, obviously, you won’t know everything perfectly, all the time,
153 00:27:20.880 ⇒ 00:27:25.929 Jacob Matson: But, you know, it’s so fast to read and write from MotherDuck that, you know.
154 00:27:26.420 ⇒ 00:27:41.009 Jacob Matson: materializing data sets so they’re faster to query later, et cetera, is, like, a very… is a very easy thing to do. And because you get unlimited storage, because of how our model’s built, you can just store as much as you want. You know, obviously you need to, like, you know, keep… keep your storage in mind, like, don’t,
155 00:27:41.180 ⇒ 00:27:45.830 Jacob Matson: you know, don’t blow it out. Obviously, it could be very expensive if you do that, but, like,
156 00:27:46.200 ⇒ 00:27:48.869 Jacob Matson: You know, it’s very cheap to store, so just…
157 00:27:49.090 ⇒ 00:27:54.770 Jacob Matson: You know, don’t be afraid to store the data so it’s easy and fast for your users to actually utilize.
158 00:27:55.950 ⇒ 00:27:59.220 Jacob Matson: So that’s the kind of two things. I’m trying to think, like…
159 00:27:59.700 ⇒ 00:28:15.249 Jacob Matson: I think where it gets hairy is, like, when customers don’t think, or prospects don’t think about, kind of, their load enough, and they’re like, okay, we’ve got, like, one petabyte, and we’re like, yeah, okay, you need to… you need to, kind of, like, break that up into chunks so you can process it in MotherDuck. They’re just, like, if they don’t have the capacity to, like.
160 00:28:15.410 ⇒ 00:28:23.309 Jacob Matson: reason about that, it’s very difficult for us to help them. Obviously, like, I don’t know how you’d frame that in the article, but, like,
161 00:28:23.670 ⇒ 00:28:41.480 Jacob Matson: it’s not that it can’t work at that scale, it’s just that, like, you kind of have to be a skilled data engineer, kind of who really cares about the craft. Ultimately, I think a lot about what MotherDuck does is it enables people who really care about the craft to build amazing stuff, but you have to care about the craft.
162 00:28:41.800 ⇒ 00:28:51.150 Jacob Matson: Right? If you don’t care about the craft, then, I think the thing… the results you get, you know, you should just go to Snowflake and spend a bunch of money.
163 00:28:51.400 ⇒ 00:29:05.320 Jacob Matson: Don’t print that, but, like, you know, my perspective is, like, you know, we’re giving you all these really nice abstractions so that you can build amazing things, but, like, our expectation of users is, like, hey, you should care about this too.
164 00:29:05.520 ⇒ 00:29:16.039 Jacob Matson: Right? It’s not like, this is not intentionally, like, a dumb tool that’s just, like, you know, gonna solve all your problems. No, it’s built in a very specific way so that you can utilize it, you know.
165 00:29:16.100 ⇒ 00:29:27.130 Jacob Matson: Like, we’re not giving you a screwdriver that you can use to beat nails into the wall. Like, don’t do that, right? Like, like, we’re giving you a hammer. Like, use it with the right tools.
166 00:29:27.130 ⇒ 00:29:36.340 Jacob Matson: Or use it, you know, use the right things together. And so, you know, that’s a big challenge for my side on, like, the DevRel side, is, like, education, like, making sure people know how to hold the duck the right way.
167 00:29:37.260 ⇒ 00:29:50.259 Jake Nathan: Yeah, it’s not… it’s not a set-it-and-forget-it platform. It’s still… yeah, like you said, it almost reminds me of the metaphor you were saying earlier with Claude, like, just copy and past… like, people might be looking for, like, hey, just do this for me, or something, but you’re…
168 00:29:50.260 ⇒ 00:29:55.019 Jacob Matson: Yeah, yeah. By the way, I can go… I see we’re running close to time here. I can go long if you need to.
169 00:29:55.020 ⇒ 00:30:01.569 Jake Nathan: Okay, cool, yeah, just, a few, a few more questions. So, yeah, we’ve,
170 00:30:01.570 ⇒ 00:30:19.039 Jake Nathan: you’ve alluded to it a little bit, but, let’s say, you know, I’m deeply entrenched in Snowflake right now. It’s, you know, these sorts of things take, as you know, like, a lot of, it would take a lot of time, to migrate, so if I want to just start, I’m deeply entrenched in one
171 00:30:19.190 ⇒ 00:30:27.149 Jake Nathan: warehouse like Snowflake, what’s kind of the best way to start using MotherDuck without completely replacing my whole system?
172 00:30:27.540 ⇒ 00:30:34.110 Jacob Matson: Yeah, totally great question. So the best way, you know, the best way… I’ll tell you what sold me on DuckDB the first time.
173 00:30:34.140 ⇒ 00:30:45.510 Jacob Matson: what sold me on DuckDB is I was working in an IoT company, we were getting all these CSVs, they were gigabytes in size, they were too big for Excel, they were too big for SQL Server.
174 00:30:45.510 ⇒ 00:30:58.469 Jacob Matson: When I say… when I say too big, I just mean, like, it took too long to load them into my database, right? I don’t mean, like, SQL Server couldn’t handle it, like, you know, SQL Server’s a great database, but, like, it just… it took too long. And so,
175 00:30:58.760 ⇒ 00:31:05.859 Jacob Matson: I just was like, okay, I have this… I have this… I’ve heard about this DuckDB thing, let me see if I can, like, parse my data and handle it with this… with this,
176 00:31:06.210 ⇒ 00:31:15.320 Jacob Matson: With this other engine here, and it, you know, took something that was taking me, like, literally 3 hours to parse into SQL Server, and it handled it, I’m not joking you, like, 3 seconds.
177 00:31:15.940 ⇒ 00:31:23.109 Jacob Matson: Right? Yeah, and it’s like, oh, shit. Whoa, this is… I can do this? Like, what?
178 00:31:23.170 ⇒ 00:31:30.890 Jacob Matson: Totally amazing experience. And so I think, like, the thing I would say is, like, just start on, like, the data you have now. If you’re using Snowflake, that’s cool.
179 00:31:30.890 ⇒ 00:31:45.879 Jacob Matson: You know, if you’ve got data in object storage, that’s a great way to start, is it can just read data out of object storage, has its own secrets manager, so you just can… you can pop a secret in there, obviously, like, be secure about what you’re… how you’re implementing it. And then just, like, read Parquet, JSON, CSV, Iceberg.
180 00:31:45.880 ⇒ 00:32:08.480 Jacob Matson: Delta right out of your object store and get started. Now, one thing that’s happened in the time since I’ve started using DuckDB and now is when I started using it, it was, like, kind of a command line only tool, or you could use, like, DBeaver or something with it to kind of, like, access the data. What’s happened recently is we’ve launched something called the UI, the DuckDB UI, and so you can launch it from the command line with duckdb -ui,
181 00:32:08.650 ⇒ 00:32:13.729 Jacob Matson: And it gives you this really nice IDE experience, right, as baked in as part of DuckDB.
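Editor's note: the "just read the files where they are" workflow described above might look roughly like the sketch below, using DuckDB's Python package. The sample data, column names, and helper functions are made up for illustration; only `read_csv` and `duckdb.sql` are DuckDB's own. The same query shape should apply to an object-store URL (for example a hypothetical `s3://my-bucket/readings/*.csv`), assuming a secret has been configured first.

```python
import csv
import os
import tempfile

try:
    import duckdb  # third-party package: pip install duckdb
except ImportError:  # keep the sketch usable without DuckDB installed
    duckdb = None


def write_sample_csv(path: str) -> None:
    """Write a tiny stand-in for a large IoT CSV export (made-up data)."""
    rows = [("sensor_a", 1.0), ("sensor_a", 3.0), ("sensor_b", 10.0)]
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["device_id", "reading"])
        writer.writerows(rows)


def summary_query(source: str) -> str:
    """Build a query that aggregates straight from a file path or URL."""
    # read_csv infers column names and types from the file itself.
    return (
        "SELECT device_id, avg(reading) AS avg_reading "
        f"FROM read_csv('{source}') "
        "GROUP BY device_id ORDER BY device_id"
    )


csv_path = os.path.join(tempfile.mkdtemp(), "readings.csv")
write_sample_csv(csv_path)
query = summary_query(csv_path)

if duckdb is not None:
    # No separate load step: DuckDB scans the file in place.
    print(duckdb.sql(query).fetchall())
```

The point of the sketch is that there is no import step between the file and the query, which is where the hours-to-seconds difference in the anecdote would come from.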
182 00:32:14.440 ⇒ 00:32:23.319 Jacob Matson: And so, that part is free, you can just start using it, it’s a great experience, and then when you’re ready for MotherDuck, there’s a little button in the UI that says, sign in with MotherDuck.
183 00:32:25.020 ⇒ 00:32:32.470 Jacob Matson: So that kind of takes you… takes you to the next step, and then, you know, once… once you have data in DuckDB, it’s very, very easy to write it into…
184 00:32:32.660 ⇒ 00:32:49.739 Jacob Matson: into MotherDuck. We find that, like, you know, especially for, like, gold layer exploratory, you know, type of data, like serving, MotherDuck is very effective in that sitting on top of, things like, like, traditional, traditional kind of, solutions like Databricks or Snowflake. So, you know.
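Editor's note: a minimal sketch of the "once you have data in DuckDB, write it into MotherDuck" step, under stated assumptions. It assumes a MotherDuck token is available in the `motherduck_token` environment variable and that a cloud database named `my_db` already exists. The table name `gold_daily_orders` and the helper function are hypothetical.

```python
import os

try:
    import duckdb  # third-party package: pip install duckdb
except ImportError:
    duckdb = None


def publish_statements(local_table: str, cloud_db: str = "my_db") -> list[str]:
    """SQL to copy a local DuckDB table into a MotherDuck database."""
    return [
        # 'md:' attaches the MotherDuck databases for the signed-in account.
        "ATTACH 'md:'",
        # Copy the local table into the cloud database's main schema.
        f"CREATE OR REPLACE TABLE {cloud_db}.main.{local_table} "
        f"AS SELECT * FROM {local_table}",
    ]


statements = publish_statements("gold_daily_orders")

# Only talk to MotherDuck when DuckDB is installed and a token is present.
if duckdb is not None and os.environ.get("motherduck_token"):
    con = duckdb.connect()  # local, in-memory DuckDB
    con.execute(
        "CREATE TABLE gold_daily_orders AS SELECT 1 AS order_day, 42 AS orders"
    )
    for stmt in statements:
        con.execute(stmt)
```

In this sketch the heavy transformation stays wherever it already runs, and only the small, serving-ready table is copied up.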
185 00:32:49.740 ⇒ 00:33:00.230 Jacob Matson: Yeah, just, I think, like, just use it, like, try it and see if it works for you. My kind of goal, from an overarching perspective, is that, like, using DuckDB ruins every other database for you.
186 00:33:00.560 ⇒ 00:33:19.670 Jacob Matson: Right? And so, like, just use it, like, you know, like a drug dealer. Just try it, man, it’s free. First one’s free. But, like, it’s so, like, it’s such a powerful experience the first time you see it, that, like, you know, it’s like, okay, well, how do I fit this more into this? And, you know, I think what I would say, like, going back to the point about…
187 00:33:19.820 ⇒ 00:33:24.939 Jacob Matson: like, teams using, you know, using data and, like, proving ROI and all these things.
188 00:33:25.320 ⇒ 00:33:34.499 Jacob Matson: DuckDB is super fast, but we shouldn’t care about the fact that it’s fast, right? It’s great that it’s fast. Why it’s good that it’s fast is because we can iterate more quickly.
189 00:33:34.750 ⇒ 00:33:49.059 Jacob Matson: Right? A lot of these, a lot of requirements in data are discovered. They are not given to us. They, you know, it’s like, people don’t know what they want until they see it, and so we want to make sure it’s as fast and as easy to get, kind of, into that iterative process so that you can build it.
190 00:33:49.060 ⇒ 00:33:59.470 Jacob Matson: And so I think that, like, that is super, super, super powerful, and, you know, so yeah, use it however you want, but, I think, you know, start… starting with the UI is where I would start today.
191 00:34:00.260 ⇒ 00:34:14.370 Jake Nathan: I love that. That’s great. And, yeah, last question, wrapping things up, just, are there any features or things that you’re excited about MotherDuck, like, that either have come out recently, or that you’re kind of looking forward to ahead?
192 00:34:15.120 ⇒ 00:34:17.529 Jacob Matson: Yeah, I think there’s a couple things,
193 00:34:18.520 ⇒ 00:34:30.609 Jacob Matson: The first one that I’ll talk about is DuckLake. So DuckLake is an open table format, like Iceberg and Delta Lake, that instead of using metadata in…
194 00:34:30.920 ⇒ 00:34:39.670 Jacob Matson: your object storage stores the metadata in a database. You can use DuckDB, or SQLite, or MySQL, or Postgres today.
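Editor's note: a rough sketch of what "metadata in a database, data files in storage" could look like with a local DuckDB file as the catalog. The catalog file name, data path, lake alias, and table are all hypothetical, and the demo function is not called automatically because installing the `ducklake` extension needs network access.

```python
try:
    import duckdb  # third-party package: pip install duckdb
except ImportError:
    duckdb = None


def ducklake_statements(
    metadata_file: str = "metadata.ducklake", data_path: str = "lake_data/"
) -> list[str]:
    """SQL to attach a DuckLake whose catalog lives in a DuckDB file."""
    return [
        "INSTALL ducklake",
        "LOAD ducklake",
        # The catalog (metadata) goes in metadata_file; table data is
        # written as files under data_path.
        f"ATTACH 'ducklake:{metadata_file}' AS my_lake (DATA_PATH '{data_path}')",
        "CREATE TABLE IF NOT EXISTS my_lake.readings "
        "(device_id VARCHAR, reading DOUBLE)",
        "INSERT INTO my_lake.readings VALUES ('sensor_a', 1.5)",
    ]


def run_demo():
    """Execute the statements above and return the row count."""
    if duckdb is None:
        raise RuntimeError("duckdb is not installed")
    con = duckdb.connect()
    for stmt in ducklake_statements():
        con.execute(stmt)
    return con.sql("SELECT count(*) FROM my_lake.readings").fetchall()


statements = ducklake_statements()
```

Calling `run_demo()` on a machine with DuckDB installed should create the catalog file and write data files under the given path. Pointing the catalog at Postgres, MySQL, or SQLite instead would change only the `ATTACH` string.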
195 00:34:40.150 ⇒ 00:34:49.879 Jacob Matson: It enables lots of really, really cool things in terms of, just interoperability, but also scale. So,
196 00:34:50.650 ⇒ 00:34:56.439 Jacob Matson: Yeah, it… Jordan, I think Jordan showed a demo at Big Data London of, you know, handling
197 00:34:57.010 ⇒ 00:35:11.009 Jacob Matson: I can’t remember, many, many terabyte-sized data sets using DuckLake and querying it super, super fast. So that’s what I’m excited about. Again, so I would say that that falls into the category of making big data feel small.
198 00:35:11.120 ⇒ 00:35:27.389 Jacob Matson: But then also, the definition of small data continues to… that number continues to increase every day, right? And that’s super, super exciting. The second thing I would say is, that I’m excited about is all the stuff we’re doing with AI.
199 00:35:27.440 ⇒ 00:35:31.719 Jacob Matson: And, like, what I mean by that is not, like, oh, like, now there’s, like, a…
200 00:35:31.820 ⇒ 00:35:41.109 Jacob Matson: you know, MCP, or now there’s, like, chat in the app, or whatever. It’s more like… we’re definitely carefully considering… I think when we talked about craft.
201 00:35:41.270 ⇒ 00:35:53.909 Jacob Matson: earlier, one thing that I love about what we’re building at MotherDuck is, like, we care about the craft of software. And so, for us, we’re always thinking about when we’re working AI tools into our workflows, like, what’s the actual…
202 00:35:54.040 ⇒ 00:35:56.360 Jacob Matson: Workflow here for the practitioner.
203 00:35:56.450 ⇒ 00:36:02.590 Jacob Matson: Right? Like, how do we… how do we supercharge it? How do we make it feel native? How do we make it feel really nice to use?
204 00:36:02.650 ⇒ 00:36:12.499 Jacob Matson: Right? We don’t… we’re not interested in building experiences that are just, like, bolting AI in, right? And so where we… where we are using it is we’re finding very… like, a very powerful pattern.
205 00:36:12.540 ⇒ 00:36:27.939 Jacob Matson: Right? And I think that’s, like, really important, because it lets us have a perspective that is very selective about where we use AI, but also use it in places that are super powerful. I’ll give you one example. One example is we launched a feature called Instant SQL.
206 00:36:28.300 ⇒ 00:36:42.680 Jacob Matson: And what that does is it lets you use DuckDB to kind of render your query results as you type, right? Well, we’ve also… we overlaid a feature called Command-K, which lets you chat with your… would chat with an AI and just say, hey, like, modify this query to do X.
207 00:36:42.810 ⇒ 00:36:53.539 Jacob Matson: And what it does when you have Instant SQL mode on, and you’re using this Command-K, is you get the results back instantly from the AI suggesting what the… what you… how you should change your query.
208 00:36:53.660 ⇒ 00:37:05.329 Jacob Matson: Right? And that’s a really powerful pattern, right? Because it lets me quickly kind of discover the actual underlying primitives of my business without getting too bogged down in the syntax of SQL.
209 00:37:05.460 ⇒ 00:37:17.309 Jacob Matson: Right? And I think that’s really, really kind of what we’re trying to get at here, which is, like, this will be the best place to do business analysis, and, like, we’ll put AI in the right places to enable that to happen in a very tasteful way.
210 00:37:18.480 ⇒ 00:37:28.360 Jake Nathan: I love that. I completely hear you. I feel like every day, like, some company that I use pretty much just bolts on, you know, something plus AI, and I’m just like…
211 00:37:28.410 ⇒ 00:37:30.100 Jacob Matson: Yeah. Like, it’s…
212 00:37:30.340 ⇒ 00:37:49.189 Jake Nathan: It’s meaningless, but that’s… yeah, it really does seem cool that y’all care about the craft, and that seems to be, like you said, yeah, there’s just kind of a common thread here of, like, yeah, you actually have to care, and actually have to… like, there’s no… you have to use your brain here, you can’t just,
213 00:37:49.240 ⇒ 00:38:01.140 Jake Nathan: there’s nothing set-it-and-forget-it. So, yeah, this was… this was awesome. It was… it was really, I think, like I said, tactical and mental. I’m, like, fired up, just from a… from a mental perspective, so…
214 00:38:01.140 ⇒ 00:38:12.970 Jake Nathan: Yeah, thank you again for making time to do this, and like I said, I’ll kind of take this now, rewatch it, and start working on the content piece, and then send it over to y’all to look at, and we can go from there.
215 00:38:13.340 ⇒ 00:38:14.399 Jacob Matson: Sounds great.
216 00:38:14.580 ⇒ 00:38:16.039 Jake Nathan: Awesome. Thanks, Jacob, nice to meet you.
217 00:38:16.040 ⇒ 00:38:17.750 Jacob Matson: Alright, Jake, chat later. Bye.