Meeting Title: Brainforge Interview w- Demilade
Date: 2026-02-18
Meeting participants: Selenge Tulga, Demilade Agboola
WEBVTT
1 00:01:38.070 ⇒ 00:01:39.170 Selenge Tulga: Hello!
2 00:01:39.910 ⇒ 00:01:41.340 Demilade Agboola: Hello, how are you?
3 00:01:42.010 ⇒ 00:01:43.650 Selenge Tulga: I’m good, how are you doing?
4 00:01:43.830 ⇒ 00:01:48.039 Demilade Agboola: I’m doing very well. First things first, how do I pronounce your name?
5 00:01:48.390 ⇒ 00:01:50.140 Selenge Tulga: It’s Selenge.
6 00:01:50.610 ⇒ 00:01:51.760 Demilade Agboola: Selenge, nice.
7 00:01:51.760 ⇒ 00:01:52.950 Selenge Tulga: Like, yes.
8 00:01:52.950 ⇒ 00:01:54.490 Demilade Agboola: It’s Demilade.
9 00:01:54.900 ⇒ 00:01:55.520 Demilade Agboola: Okay.
10 00:01:56.050 ⇒ 00:01:56.800 Selenge Tulga: Enjoy it.
11 00:01:57.300 ⇒ 00:02:04.539 Demilade Agboola: Okay, so yeah, I think you’ve met a couple of my colleagues so far.
12 00:02:04.940 ⇒ 00:02:06.890 Demilade Agboola: And so far, they… they…
13 00:02:07.210 ⇒ 00:02:12.389 Demilade Agboola: You know, you’ve made it this far, so, like, you must have been very impressed, so that’s good to hear.
14 00:02:12.790 ⇒ 00:02:18.990 Demilade Agboola: Just first things first, I believe you already know, like, Brainforge, what we do, and how we…
15 00:02:19.380 ⇒ 00:02:22.620 Demilade Agboola: Do you have any questions about that, or should we just go straight into it?
16 00:02:23.280 ⇒ 00:02:28.700 Selenge Tulga: Yeah, I, I already met Juram and the lead engineer, yeah. I’m fine, yeah.
17 00:02:28.920 ⇒ 00:02:30.690 Selenge Tulga: Okay, alright, sounds good. Sweet.
18 00:02:31.700 ⇒ 00:02:39.210 Demilade Agboola: Alright, I think from my perspective, it would just be, like, can you just briefly, like, introduce yourself and what you’ve done?
19 00:02:41.000 ⇒ 00:02:42.279 Demilade Agboola: And then we can start from there.
20 00:02:42.810 ⇒ 00:02:44.609 Selenge Tulga: Yeah, okay,
21 00:02:44.770 ⇒ 00:03:00.970 Selenge Tulga: My name is Selenge, and I’m a data engineer with over 7 years of industry experience. I say it that way because I started my career as a software engineer, and as my team needed a scalable data solution, I naturally transitioned into data engineering.
22 00:03:01.060 ⇒ 00:03:05.420 Selenge Tulga: Okay. And my core,
23 00:03:05.700 ⇒ 00:03:20.140 Selenge Tulga: tech stack is, of course, SQL, Python, dbt, and Snowflake, and sometimes, if it’s cloud, AWS services, yeah. And I worked with a railway company, and
24 00:03:20.540 ⇒ 00:03:28.909 Selenge Tulga: After that, I pursued my master’s degree in the US, and my most recent role is also at a consulting company, and yeah.
25 00:03:30.150 ⇒ 00:03:31.369 Demilade Agboola: That’s pretty cool, that’s pretty cool.
26 00:03:31.370 ⇒ 00:03:31.870 Selenge Tulga: Oof.
27 00:03:31.870 ⇒ 00:03:37.180 Demilade Agboola: So, I think, just to follow up on that, I will…
28 00:03:38.140 ⇒ 00:03:41.140 Demilade Agboola: Just to have an idea of, like, technically what you’ve done.
29 00:03:41.750 ⇒ 00:03:42.240 Selenge Tulga: What would you say?
30 00:03:42.240 ⇒ 00:03:48.920 Demilade Agboola: your most complex, like, data pipeline has been? What was the one that you found the most challenging, and you were like, okay, this took…
31 00:03:49.040 ⇒ 00:03:57.819 Demilade Agboola: A lot of, like, brainpower, a lot of, like, you know, skills, whatever, like, it took for you to be able to come together and put, you know, the most complex data pipeline you’ve come up with.
32 00:03:58.500 ⇒ 00:04:06.390 Selenge Tulga: Complex… I think most complex data pipeline is a furious pipeline, right? It is… because,
33 00:04:06.470 ⇒ 00:04:23.499 Selenge Tulga: in a railway company, and I was… I said I was a software engineer, right? And at that time, we just had all the OLTP systems, right, so all data went in through data entry, but we already had all the,
34 00:04:23.610 ⇒ 00:04:29.230 Selenge Tulga: The railway signals data we already had, but we couldn’t translate the data, and
35 00:04:29.410 ⇒ 00:04:33.449 Selenge Tulga: They call this a data translator, and
36 00:04:33.780 ⇒ 00:04:46.709 Selenge Tulga: Okay, let me give you a small insight. I come from Mongolia. Mongolia is a country between China and Russia, and we have a lot of transit,
37 00:04:46.980 ⇒ 00:05:04.210 Selenge Tulga: freight, and every locomotive has a small card and a translator, and even though we had the capability to get all this data, they just refused to do that, and after that, each,
38 00:05:04.360 ⇒ 00:05:25.129 Selenge Tulga: trip, they needed to get all the data to the translator, see, okay, there is the data, and then do it manually; they had all the OLTP systems, and they needed to do data entry. And at first they said, oh, we need to… because it was Java… I think it’s Java or Delphi, it is,
39 00:05:25.630 ⇒ 00:05:31.990 Selenge Tulga: Such an old-fashioned system, and they just wanted to say, oh, we need to do some…
40 00:05:32.070 ⇒ 00:05:51.180 Selenge Tulga: new React UI, right? Because now React is so popular, we need to do that. And at that time, my team and I worked on the root cause, because it’s not just UI, we needed to automate this, because people were spending 40 to 50 hours just doing the
41 00:05:51.180 ⇒ 00:05:52.450 Selenge Tulga: data entry.
42 00:05:52.820 ⇒ 00:06:09.409 Selenge Tulga: And it is the most complex data work, because, you know, Mongolia is a vast country, and it is so hard to get all data in real time, because sometimes we don’t have a network connection between the stations, and my team and I,
43 00:06:09.610 ⇒ 00:06:28.089 Selenge Tulga: we architected it as, like, a hybrid, because, by law, we need to keep all data on-premises, and we can’t use AWS services like that, and,
44 00:06:28.220 ⇒ 00:06:39.020 Selenge Tulga: for the real-time part we used Kafka and Flink, and it was, yeah, it was complex, and for that, we need to get all,
45 00:06:39.240 ⇒ 00:06:50.649 Selenge Tulga: the data produced using Kafka and Flink, and if we don’t have any network connection, we need to batch all the data, right?
46 00:06:51.710 ⇒ 00:07:02.240 Selenge Tulga: And, yeah, because first we need to translate the data, and to get a dictionary for how to… because it’s just, like,
47 00:07:02.570 ⇒ 00:07:20.769 Selenge Tulga: text files, and we need to extract them, then translate. It is the most difficult part, and after that, we need to handle the late data, because after the locomotive gets a network connection, we need to handle this data.
48 00:07:21.390 ⇒ 00:07:35.820 Selenge Tulga: And, yeah, sometimes the driver of the locomotive… sometimes they just forget to turn on the system, right? And after that trip, we need to get all the data, yeah.
49 00:07:36.060 ⇒ 00:07:47.729 Selenge Tulga: It is the most complex data I worked with, but the data was, like, clean, and if we can extract data,
50 00:07:48.060 ⇒ 00:07:50.559 Selenge Tulga: Using a dictionary, it’s clean, just…
51 00:07:51.160 ⇒ 00:07:56.810 Selenge Tulga: It is… the most difficult part is how to deal with the networking.
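The store-and-forward idea in this answer (stream readings while a locomotive has a connection, hold them while it does not, and deliver the backlog once the link returns) could be sketched roughly as below. This is only an illustration under stated assumptions: the class name `StoreAndForward`, the `record` and `flush` methods, and the in-memory `sent` list standing in for the streaming sink are all hypothetical and are not taken from the project described in the interview.

```python
from dataclasses import dataclass, field


@dataclass
class StoreAndForward:
    """Hypothetical buffer: holds readings while offline, flushes when online."""

    sent: list = field(default_factory=list)     # stand-in for the streaming sink
    pending: list = field(default_factory=list)  # readings held while offline

    def record(self, reading: dict, online: bool) -> None:
        if online:
            # Deliver the backlog first so late data arrives in its original order.
            self.flush()
            self.sent.append(reading)
        else:
            self.pending.append(reading)

    def flush(self) -> None:
        # Move every held reading to the sink, oldest first, then empty the buffer.
        self.sent.extend(self.pending)
        self.pending.clear()
```

Downstream consumers would still need to treat the flushed readings as late-arriving data, for example by keying on the reading's own event time instead of its arrival time.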
52 00:07:56.810 ⇒ 00:08:00.299 Demilade Agboola: Yeah, it does sound like that part was very tricky, because.
53 00:08:00.300 ⇒ 00:08:01.190 Selenge Tulga: Yeah.
54 00:08:01.390 ⇒ 00:08:05.070 Demilade Agboola: Most, at least now, a lot of things are connected, so you
55 00:08:05.490 ⇒ 00:08:15.920 Demilade Agboola: As quickly as possible, but when you have to deal with the fact that it would… some might come now, some might come in the future, there might be a delay, that does add some complexity to it.
56 00:08:16.340 ⇒ 00:08:23.670 Demilade Agboola: I also want to, like, understand how… How you decide…
57 00:08:24.570 ⇒ 00:08:31.659 Demilade Agboola: to build out your pipelines and your, like, infrastructure. So, let’s say you’re… right now, we have a client.
58 00:08:32.679 ⇒ 00:08:35.829 Demilade Agboola: They have the Stripe data source.
59 00:08:36.570 ⇒ 00:08:42.360 Demilade Agboola: They have Salesforce, and then they have maybe Google Ads data.
60 00:08:43.460 ⇒ 00:08:47.279 Demilade Agboola: And we have, you know, a warehouse, any warehouse, it doesn’t really matter which.
61 00:08:47.500 ⇒ 00:08:51.880 Demilade Agboola: And they want you to, like, build out the pipeline for them.
62 00:08:52.470 ⇒ 00:08:52.910 Selenge Tulga: Oh,
63 00:08:53.060 ⇒ 00:08:58.469 Demilade Agboola: go about the process of, you know, building out that pipeline for them? What requirements do you need?
64 00:08:59.370 ⇒ 00:09:02.180 Demilade Agboola: How would you ensure that you come up with a… the…
65 00:09:02.180 ⇒ 00:09:03.030 Selenge Tulga: Oh.
66 00:09:03.030 ⇒ 00:09:03.980 Demilade Agboola: those their needs.
67 00:09:04.550 ⇒ 00:09:11.990 Selenge Tulga: Okay, first, what are the latency requirements? Is it batch, or are they wanting real-time?
68 00:09:12.790 ⇒ 00:09:13.809 Demilade Agboola: That’s a very good question.
69 00:09:13.810 ⇒ 00:09:14.929 Selenge Tulga: Okay,
70 00:09:15.140 ⇒ 00:09:22.910 Demilade Agboola: So let’s just say… now, that’s… that’s perfect. That’s… I like that you asked that question. It shows that you’re thinking about it. So let’s say they want batch.
71 00:09:22.910 ⇒ 00:09:23.710 Selenge Tulga: Okay.
72 00:09:23.780 ⇒ 00:09:25.350 Demilade Agboola: Right. What is…
73 00:09:25.770 ⇒ 00:09:27.890 Selenge Tulga: Expected volume, right?
74 00:09:28.260 ⇒ 00:09:30.330 Selenge Tulga: How big is the data?
75 00:09:30.680 ⇒ 00:09:35.530 Demilade Agboola: Alright, so let’s say we’re doing about a million rows a day.
76 00:09:35.910 ⇒ 00:09:43.739 Selenge Tulga: Okay. First, we need to think about… okay, you already said the sources, and they are,
77 00:09:44.810 ⇒ 00:09:52.029 Selenge Tulga: B2B business services and SaaS sources. And we need to think about ingestion.
78 00:09:52.160 ⇒ 00:09:54.049 Selenge Tulga: And after that,
79 00:09:54.220 ⇒ 00:10:10.419 Selenge Tulga: First we need to think about, I think, source and destination: where do we need to get it from, and what do we want to see? And, you know, if we think in layers, we need to think about ingestion: how to ingest the data, and where do we save it.
80 00:10:10.820 ⇒ 00:10:27.309 Selenge Tulga: And also, we need to think about transformation and processing, serving the data, and how to monitor the data quality. And if we need to do the system design, I think you already said it is Stripe, and
81 00:10:27.690 ⇒ 00:10:33.159 Selenge Tulga: Google Ads; I think for the ingestion, I will use
82 00:10:33.610 ⇒ 00:10:41.279 Selenge Tulga: Fivetran, because they already have the connectors for that. Maybe if they have any custom
83 00:10:41.560 ⇒ 00:10:47.949 Selenge Tulga: APIs, we can also think about Airflow, maybe. And for the storage, and…
84 00:10:48.230 ⇒ 00:10:52.849 Selenge Tulga: And I am thinking more like ELT.
85 00:10:53.490 ⇒ 00:11:10.719 Selenge Tulga: And after ingesting all data through Fivetran, we can use Snowflake to load all the raw data, because we need to see what the data is. And after the raw data, maybe we can use dbt for the transformation, and
86 00:11:11.030 ⇒ 00:11:26.899 Selenge Tulga: Now we already have raw data. In dbt, we can do staging, transformation, and data marts, and with dbt, I think, we can do version control.
87 00:11:27.200 ⇒ 00:11:28.629 Selenge Tulga: And also, we can…
88 00:11:28.830 ⇒ 00:11:36.089 Selenge Tulga: do maybe data quality tests, like, if it’s a not nu… We do the unique, and…
89 00:11:36.510 ⇒ 00:11:43.639 Selenge Tulga: unique and not null, and all data quality checks. And, for this,
90 00:11:44.280 ⇒ 00:11:47.890 Selenge Tulga: For staging, maybe we can just rename the tables.
91 00:11:48.060 ⇒ 00:11:49.730 Selenge Tulga: And,
92 00:11:50.050 ⇒ 00:11:59.749 Selenge Tulga: no heavy transformation, and also an intermediate stage, where we can do the joins and the complex logic, and for,
93 00:11:59.870 ⇒ 00:12:03.149 Selenge Tulga: Depending on the clients, we can also, no…
94 00:12:03.190 ⇒ 00:12:20.180 Selenge Tulga: We can do data marts, and yeah, dbt works with Git versioning, so we can make changes, and also we can create PRs, and the data quality checks, and the governance, and…
95 00:12:20.580 ⇒ 00:12:26.209 Selenge Tulga: And after transforming the data, if a client wants,
96 00:12:26.360 ⇒ 00:12:35.270 Selenge Tulga: the… on the basis of the client’s needs, we can use… so what data is it? I think if it’s Fivetran, then we can use,
97 00:12:35.460 ⇒ 00:12:37.599 Selenge Tulga: Yeah, some BI tools.
98 00:12:38.270 ⇒ 00:12:41.509 Selenge Tulga: Tableau, or Rill, Omni, yeah.
99 00:12:42.070 ⇒ 00:12:44.459 Selenge Tulga: And, yeah, it is…
100 00:12:44.580 ⇒ 00:12:53.389 Selenge Tulga: for the data governance, we can see what the SLA is, and when data needs to arrive, and what can be…
101 00:12:54.040 ⇒ 00:12:56.320 Selenge Tulga: Though, what’s the reason, is it? Yeah?
102 00:12:56.960 ⇒ 00:12:59.480 Demilade Agboola: Okay, alright, that’s fair, that’s fair.
103 00:12:59.980 ⇒ 00:13:02.889 Demilade Agboola: Okay, so let me… let me add a new layer to that question.
104 00:13:02.890 ⇒ 00:13:07.870 Selenge Tulga: You answered it very well. I like your answer to that question. I think the next layer to that question would be…
105 00:13:08.470 ⇒ 00:13:12.210 Demilade Agboola: If the player… if the… the…
106 00:13:13.050 ⇒ 00:13:18.399 Demilade Agboola: client has cost constraints, because Fivetran can be very expensive sometimes, right?
107 00:13:18.540 ⇒ 00:13:19.179 Selenge Tulga: Mmm, yeah.
108 00:13:19.510 ⇒ 00:13:22.640 Demilade Agboola: the… Client has cost constraints.
109 00:13:23.030 ⇒ 00:13:29.530 Demilade Agboola: What tools have you used to be able to build cheaper ingestion?
110 00:13:29.530 ⇒ 00:13:30.180 Selenge Tulga: Mmm.
111 00:13:31.740 ⇒ 00:13:38.710 Selenge Tulga: Yeah, it is. For a consulting company, we need to think about that, right? And for my,
112 00:13:39.000 ⇒ 00:13:46.830 Selenge Tulga: last project, I used Airflow, because Airflow is open source, right? And…
113 00:13:47.280 ⇒ 00:13:49.370 Selenge Tulga: We just need to, maybe…
114 00:13:49.470 ⇒ 00:13:57.089 Selenge Tulga: run Airflow with Docker. And with Airflow, yeah, we can do the extraction from all…
115 00:13:57.300 ⇒ 00:13:59.709 Selenge Tulga: these APIs using Python.
116 00:14:00.180 ⇒ 00:14:00.920 Demilade Agboola: Okay.
117 00:14:00.920 ⇒ 00:14:16.620 Selenge Tulga: And then we can use the PythonOperator and do the orchestration using Airflow. And also, if we are talking about cost, dbt Core is also free, and Airflow is free, and…
118 00:14:17.790 ⇒ 00:14:26.430 Selenge Tulga: Yeah, I can change it, but I am… honestly, I am not sure how, the BI tools
119 00:14:26.610 ⇒ 00:14:28.220 Selenge Tulga: space it could be here.
120 00:14:28.930 ⇒ 00:14:37.400 Demilade Agboola: Okay, fair, fair. No, that’s fine. Also, like, I know you said you use Airflow. Have you used, like, Dagster or Prefect, or is it just, like, Airflow that you’ve used?
121 00:14:37.400 ⇒ 00:14:37.790 Selenge Tulga: Oh.
122 00:14:38.560 ⇒ 00:14:47.150 Selenge Tulga: I used Prefect. It is more Python-based, right? It is easier, and if it’s one small pipeline, I…
123 00:14:47.400 ⇒ 00:14:50.760 Selenge Tulga: I tried with Dagster, yeah.
124 00:14:50.950 ⇒ 00:15:00.160 Selenge Tulga: It is more, like, yeah, data assets, right? And Snowflake is more, like, scheduled, yeah. But I don’t have much experience with,
125 00:15:00.500 ⇒ 00:15:03.309 Selenge Tulga: Dagster, but I know how it works.
126 00:15:03.310 ⇒ 00:15:04.299 Demilade Agboola: investment, that was fine.
127 00:15:04.300 ⇒ 00:15:04.860 Selenge Tulga: Yeah.
128 00:15:05.410 ⇒ 00:15:12.169 Demilade Agboola: Alright, so let’s ask… so now we have the data, so let’s just say we have the data in our pipeline now. We have…
129 00:15:12.620 ⇒ 00:15:14.340 Demilade Agboola: Our infrastructure built out.
130 00:15:14.760 ⇒ 00:15:22.929 Demilade Agboola: Now, within our dbt, we have a, you know, a table, or…
131 00:15:23.170 ⇒ 00:15:25.890 Demilade Agboola: We have a model that has, like, 400…
132 00:15:26.000 ⇒ 00:15:29.030 Demilade Agboola: rows, 400 million rows, so it’s a heavy, you know…
133 00:15:30.440 ⇒ 00:15:36.329 Demilade Agboola: And every time it runs in the morning, or, you know, because we have batched it up, so we’re saying, okay.
134 00:15:37.590 ⇒ 00:15:40.770 Demilade Agboola: In… it takes a very long time to run.
135 00:15:42.400 ⇒ 00:15:47.119 Demilade Agboola: How do we optimize this query? Or how do we ensure that we speed up this query?
136 00:15:47.570 ⇒ 00:15:52.449 Selenge Tulga: Is it, okay, let me ask, is it, are you building…
137 00:15:52.670 ⇒ 00:15:58.739 Selenge Tulga: the whole table fully, or are you using incremental?
138 00:15:59.230 ⇒ 00:16:00.990 Selenge Tulga: Why is it so slow?
139 00:16:01.170 ⇒ 00:16:05.520 Selenge Tulga: So, like, this is… this is part of, like, so just imagine you…
140 00:16:05.520 ⇒ 00:16:12.190 Demilade Agboola: You just got a new… the query, like, you just got handed… a dbt infrastructure, right?
141 00:16:13.210 ⇒ 00:16:15.770 Demilade Agboola: it was slow. What are the things you start looking out for?
142 00:16:15.770 ⇒ 00:16:17.120 Selenge Tulga: Okay.
143 00:16:17.120 ⇒ 00:16:20.489 Demilade Agboola: are the things you start looking and say, okay, let me check this, let me check that, let me.
144 00:16:21.520 ⇒ 00:16:22.990 Demilade Agboola: So… It’s…
145 00:16:22.990 ⇒ 00:16:41.620 Selenge Tulga: Yeah, that’s great. For the dbt… okay, I’m thinking, firstly, how much data we are scanning, right? Sometimes we’re just getting all the columns unnecessarily, like a select all, something like that. Technically, it’s wrong, because
146 00:16:41.640 ⇒ 00:16:47.810 Selenge Tulga: Okay, we are scanning a lot of columns, and the second one is… it is error-prone.
147 00:16:48.150 ⇒ 00:17:01.370 Selenge Tulga: Because, yeah, even if a column name changes, it is also very, very dangerous. And, okay,
148 00:17:01.520 ⇒ 00:17:05.810 Selenge Tulga: Okay, first, I see what data they are getting.
149 00:17:06.010 ⇒ 00:17:09.219 Selenge Tulga: And how much data they are querying, and if…
150 00:17:09.520 ⇒ 00:17:14.539 Selenge Tulga: If it’s huge data, we always need to,
151 00:17:15.420 ⇒ 00:17:33.089 Selenge Tulga: think about the cost, and what else; if we have this huge data, we need to always do, I think, incremental, because if you already have maybe 4K rows, and next day maybe you just need to add 10K rows, why are you doing
152 00:17:33.090 ⇒ 00:17:37.940 Selenge Tulga: the whole thing again, right? Because it’s not sustainable, so incremental. And…
153 00:17:38.220 ⇒ 00:17:54.130 Selenge Tulga: maybe think about whether we are using a table. If it’s a table, we build and rebuild it again. Maybe we can change the materialization to incremental, and if it changes, we can, yeah, upsert, merge, and…
154 00:17:55.290 ⇒ 00:18:02.510 Selenge Tulga: Yeah, do it incrementally. Yeah, and also think about what the joins are.
155 00:18:02.690 ⇒ 00:18:04.379 Selenge Tulga: Yeah, it is.
156 00:18:04.630 ⇒ 00:18:07.729 Selenge Tulga: Is this an unnecessary join or not? Yeah.
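The incremental approach described in this answer (merge only new or changed rows instead of rebuilding the whole table) can be illustrated with a small upsert. This is a rough sketch and not dbt itself: it uses Python's standard `sqlite3` module, and the `orders` table, its columns, and the `incremental_merge` function are hypothetical names chosen for illustration. It assumes `order_id` is the primary key and an SQLite build recent enough to support `ON CONFLICT ... DO UPDATE`.

```python
import sqlite3


def incremental_merge(conn: sqlite3.Connection, new_rows: list) -> None:
    """Upsert only the incoming rows; existing rows are left untouched."""
    conn.executemany(
        """
        INSERT INTO orders (order_id, amount, updated_at)
        VALUES (?, ?, ?)
        ON CONFLICT(order_id) DO UPDATE SET
            amount = excluded.amount,
            updated_at = excluded.updated_at
        -- Only overwrite when the incoming row is strictly newer.
        WHERE excluded.updated_at > orders.updated_at
        """,
        new_rows,
    )
    conn.commit()
```

Because rows are matched on the key and only overwritten by strictly newer versions, running the same batch twice leaves the table unchanged, which is the property that makes reruns and backfills safe.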
157 00:18:08.220 ⇒ 00:18:08.840 Demilade Agboola: Okay.
158 00:18:09.510 ⇒ 00:18:12.599 Demilade Agboola: There is one thing that I know that really helps that we could think of.
159 00:18:13.370 ⇒ 00:18:14.810 Demilade Agboola: Indexing. Indexing is also…
160 00:18:14.810 ⇒ 00:18:16.639 Selenge Tulga: Oh, yeah, indexing. Yeah.
161 00:18:16.640 ⇒ 00:18:18.049 Demilade Agboola: I’m gonna say very good.
162 00:18:18.050 ⇒ 00:18:18.870 Selenge Tulga: Yeah.
163 00:18:19.950 ⇒ 00:18:22.130 Demilade Agboola: But yeah, I do like that. It’s…
164 00:18:22.950 ⇒ 00:18:27.089 Demilade Agboola: Important that whenever you’re doing, optimization.
165 00:18:27.090 ⇒ 00:18:27.990 Selenge Tulga: Yeah.
166 00:18:27.990 ⇒ 00:18:33.030 Demilade Agboola: You look at, like, all these things, and ensure that you are able to…
167 00:18:34.940 ⇒ 00:18:38.040 Demilade Agboola: Able to find out, like, anywhere you can save
168 00:18:38.160 ⇒ 00:18:41.799 Demilade Agboola: Table scans and the amount of data you’re reading.
169 00:18:42.320 ⇒ 00:18:49.890 Demilade Agboola: Okay, so now we have all our data in there, we have everything within the pipeline, we’ve updated
170 00:18:50.570 ⇒ 00:18:54.299 Demilade Agboola: dbt, things are running fast, right?
171 00:18:54.570 ⇒ 00:18:58.180 Demilade Agboola: In terms of building out the marts, how do you
172 00:18:58.560 ⇒ 00:19:02.480 Demilade Agboola: that what the material building. Okay, so first things first.
173 00:19:03.200 ⇒ 00:19:05.450 Demilade Agboola: When you’re building out your mart,
174 00:19:05.840 ⇒ 00:19:14.399 Demilade Agboola: how do you num… how do you think of building out your schemas within the mart? Do you think of doing star schemas, or do you think of doing, like, big tables?
175 00:19:14.990 ⇒ 00:19:15.730 Demilade Agboola: And when.
176 00:19:15.730 ⇒ 00:19:16.315 Selenge Tulga: Ugh.
177 00:19:17.330 ⇒ 00:19:19.160 Demilade Agboola: Like, what situations do you use?
178 00:19:19.320 ⇒ 00:19:23.029 Demilade Agboola: Oh, like, sometimes you also do a combination of both, but when.
179 00:19:23.650 ⇒ 00:19:28.949 Demilade Agboola: a normalized or flat table schema when I’m using the snowflake schema.
180 00:19:29.590 ⇒ 00:19:35.509 Selenge Tulga: Yeah, it… It is… mostly I choose the star.
181 00:19:35.920 ⇒ 00:19:50.569 Selenge Tulga: Because with star, in the mart, you need to create your data model, and with star you have fact tables and denormalized dimension tables, right?
182 00:19:50.620 ⇒ 00:19:57.559 Selenge Tulga: In OLTP, maybe you need to remove all this redundancy, but in a…
183 00:19:57.660 ⇒ 00:20:11.830 Selenge Tulga: warehouse, we have huge data, and we need to gather all the data. I think with snowflake, you join… maybe you have customer data, but customer type is…
184 00:20:12.190 ⇒ 00:20:18.970 Selenge Tulga: in another table; that’s like a snowflake, but I think it’s not efficient, and in
185 00:20:19.200 ⇒ 00:20:26.289 Selenge Tulga: the mart, I mostly choose the star schema, with fact and dimension tables, because…
186 00:20:26.650 ⇒ 00:20:31.869 Selenge Tulga: Sometimes storing data is cheaper than these complex joins.
187 00:20:32.310 ⇒ 00:20:32.980 Selenge Tulga: Yeah.
188 00:20:35.530 ⇒ 00:20:36.589 Selenge Tulga: I can hear you.
189 00:20:36.590 ⇒ 00:20:42.860 Demilade Agboola: Oh, sorry, I muted myself. Okay. Okay, that’s fine, that’s fair. Okay, so that’s the first part. Second part is…
190 00:20:46.200 ⇒ 00:20:50.319 Demilade Agboola: If you have not yet gotten the metrics from, you know, the client that they need.
191 00:20:51.330 ⇒ 00:20:59.190 Demilade Agboola: How do you go about modeling the mart, or how do you ensure… like, because again, when consulting, so we have different clients, you know, clients…
192 00:21:00.260 ⇒ 00:21:07.430 Demilade Agboola: Some give you, like, I work with some clients that when you join the team, they give you, like, an… like, you join the project, they give you an onboarding packet.
193 00:21:08.970 ⇒ 00:21:11.230 Demilade Agboola: So I’m gonna give you, like, a lot of stuff, right?
194 00:21:11.410 ⇒ 00:21:13.809 Demilade Agboola: Yeah. So how do you ensure that, like.
195 00:21:14.020 ⇒ 00:21:19.339 Demilade Agboola: You get everything you need, so that you’re not, like, just working, you know, blindly.
196 00:21:20.310 ⇒ 00:21:25.100 Selenge Tulga: Yeah, it is… it is a very good question, because sometimes client…
197 00:21:25.290 ⇒ 00:21:33.840 Selenge Tulga: Yeah, I work with some clients, a lot of the clients, not a lot of, yeah, it’s a couple of the clients, but sometimes they just don’t know what they want, right?
198 00:21:33.910 ⇒ 00:21:36.940 Demilade Agboola: Okay. They just… they just want to… okay.
199 00:21:36.940 ⇒ 00:21:41.989 Selenge Tulga: This is the data, we want the dashboard, but they don’t know what dashboard they want.
200 00:21:42.110 ⇒ 00:21:44.749 Demilade Agboola: It is the what KPIs they need, right?
201 00:21:45.140 ⇒ 00:21:56.979 Selenge Tulga: And, I think, okay, the mart is the layer we are using with the BI or the web app, right? And what they see. And for,
202 00:21:57.510 ⇒ 00:22:11.190 Selenge Tulga: I think, first things first, I want to work with the client on exactly what they want, and what metrics they need. Do they want a daily,
203 00:22:11.780 ⇒ 00:22:24.729 Selenge Tulga: do they want the KPI daily, or by 3 months, and what measures they are wanting: revenue, conversion rates, and…
204 00:22:25.570 ⇒ 00:22:31.539 Selenge Tulga: And first, I want to see what the sources are, what it is that they want, and…
205 00:22:31.780 ⇒ 00:22:35.480 Selenge Tulga: from there, I’m thinking backward, and
206 00:22:35.690 ⇒ 00:22:41.869 Selenge Tulga: if they need this KPI, how is this KPI calculated.
207 00:22:42.010 ⇒ 00:22:49.260 Selenge Tulga: And, yeah, if they have maybe some… because in my last project, all…
208 00:22:49.870 ⇒ 00:22:59.390 Selenge Tulga: the calculations were in the… in Power BI, they were doing all these calculations in Power BI, and
209 00:22:59.640 ⇒ 00:23:13.819 Selenge Tulga: I want to see, with the client, what they want, and what measures I need to create, and how these measures and the revenue are calculated, and how to ensure
210 00:23:14.030 ⇒ 00:23:19.499 Selenge Tulga: this… after I’m done modeling all the data and creating the dashboard, how…
211 00:23:19.620 ⇒ 00:23:23.909 Selenge Tulga: Can we know it is accurate, and what the calculation is?
212 00:23:24.720 ⇒ 00:23:27.129 Demilade Agboola: Yep, that’s fair, that’s fair.
213 00:23:27.870 ⇒ 00:23:36.379 Demilade Agboola: Okay, so now we’ve done all of that. I think my final question is, okay, how do we ensure that data quality is of the highest level possible?
214 00:23:36.620 ⇒ 00:23:39.880 Demilade Agboola: And how do we ensure that if things go bad.
215 00:23:40.310 ⇒ 00:23:42.390 Demilade Agboola: We know before the client knows.
216 00:23:43.260 ⇒ 00:23:45.959 Selenge Tulga: Yeah, I think even…
217 00:23:46.090 ⇒ 00:23:58.149 Selenge Tulga: I… even if we have a 10X engineer with AI, I think the most critical issue now is still the data quality, right? Because AI can’t know this data isn’t right, okay.
218 00:23:58.330 ⇒ 00:24:05.429 Selenge Tulga: I think dbt can introduce a lot of testing at all stages, and this is,
219 00:24:05.770 ⇒ 00:24:16.919 Selenge Tulga: We need to do all these data quality checks, and the SLAs, at each layer, and, okay, for the raw data, and
220 00:24:17.550 ⇒ 00:24:30.650 Selenge Tulga: Is it unique, and is it idempotent, because sometimes we need to do backfilling, or rerun the batch, and if it’s not idempotent, because
221 00:24:30.760 ⇒ 00:24:44.730 Selenge Tulga: maybe we run it three times a day, we can double or triple this data, right? And we need to do testing in each layer, and with dbt, maybe singular tests, or…
222 00:24:45.480 ⇒ 00:24:56.810 Selenge Tulga: just unique, not null, and also, thinking about the SLA, data needs to be fresh, and…
223 00:24:56.950 ⇒ 00:24:58.659 Selenge Tulga: I think,
224 00:24:59.340 ⇒ 00:25:16.119 Selenge Tulga: the… because we need to always be ready for schema changes, and we need to do all the data quality things. Instead of silent failure, we need to monitor all the data for when it breaks, and
225 00:25:17.010 ⇒ 00:25:25.050 Selenge Tulga: to monitor. Because sometimes the biggest failure is that we don’t know when the data fails.
226 00:25:25.550 ⇒ 00:25:35.530 Selenge Tulga: And yeah, we do all the quality checks in dbt, and can also implement some checks in the warehouse.
227 00:25:35.830 ⇒ 00:25:42.150 Selenge Tulga: And for the… maybe dark, we can also do what is the freshness and the… where we…
228 00:25:42.330 ⇒ 00:25:43.390 Selenge Tulga: To see that.
229 00:25:44.050 ⇒ 00:25:45.779 Selenge Tulga: Implement the monitoring, yeah.
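The checks named in this answer (unique, not null, and freshness against an SLA) could be sketched in plain Python as below. This is only an illustration of what such checks test and not how dbt implements them; the function names, the row-as-dict representation, and the `max_age` parameter are assumptions made for the example.

```python
from datetime import datetime, timedelta


def check_not_null(rows: list, column: str) -> list:
    """Return the rows where the column is missing or None."""
    return [r for r in rows if r.get(column) is None]


def check_unique(rows: list, column: str) -> list:
    """Return each repeated value of the column, once per extra occurrence."""
    seen, duplicates = set(), []
    for r in rows:
        value = r.get(column)
        if value in seen:
            duplicates.append(value)
        seen.add(value)
    return duplicates


def check_freshness(rows: list, column: str, now: datetime, max_age: timedelta) -> bool:
    """True when the newest timestamp in the column is within max_age of now."""
    if not rows:
        return False  # no data at all counts as stale
    latest = max(r[column] for r in rows)
    return now - latest <= max_age
```

Running checks like these after every load, and alerting when any of them returns failures, is one way to learn about a break before the client does.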
230 00:25:46.970 ⇒ 00:25:53.150 Demilade Agboola: Okay, that’s fair. Okay, that’s fair.
231 00:25:53.490 ⇒ 00:25:55.960 Demilade Agboola: Okay, do you have any questions about, you know.
232 00:25:55.960 ⇒ 00:25:57.819 Selenge Tulga: Mmm, yeah, okay.
233 00:25:57.870 ⇒ 00:26:08.739 Selenge Tulga: I just… yeah, weird. So… okay, I’m just curious, because now I am learning a lot of things, and…
234 00:26:08.740 ⇒ 00:26:22.839 Selenge Tulga: how to use them with a client, because sometimes we are just stuck with our current stack, right? Sometimes I have anxiety or fear about testing new tools with clients, because if it’s
235 00:26:23.360 ⇒ 00:26:40.470 Selenge Tulga: wrong… sometimes I just stick with Snowflake, because I’m comfortable with Snowflake. But how do you usually decide which new tech is actually worth using with the client, and when do we need to refuse?
236 00:26:42.070 ⇒ 00:26:48.929 Demilade Agboola: So, usually we determine on a number of things. One is cost. I think that’s.
237 00:26:49.610 ⇒ 00:26:50.160 Demilade Agboola: most important.
238 00:26:51.280 ⇒ 00:26:54.460 Demilade Agboola: So if we know the client will…
239 00:26:55.280 ⇒ 00:26:58.199 Demilade Agboola: Clients, we know the client’s budgets, we know that the content.
240 00:26:58.200 ⇒ 00:26:58.880 Selenge Tulga: Unfortunately.
241 00:26:58.880 ⇒ 00:27:04.200 Demilade Agboola: tools. If we can have them a cheaper tool, or a tool that fits their budget, we will do that.
242 00:27:04.370 ⇒ 00:27:08.579 Demilade Agboola: Two is also, skillset on the team.
243 00:27:09.460 ⇒ 00:27:15.549 Demilade Agboola: Usually for warehouses, that’s not an issue, but sometimes for, like, BI tools, it.
244 00:27:16.100 ⇒ 00:27:17.309 Demilade Agboola: That can be an issue.
245 00:27:17.410 ⇒ 00:27:23.739 Demilade Agboola: because, you know, across warehouses, we’ve used MotherDuck, Snowflake, Redshift.
246 00:27:24.150 ⇒ 00:27:28.209 Demilade Agboola: Like, we can all kind of go across different teams and figure that out.
247 00:27:29.040 ⇒ 00:27:37.890 Demilade Agboola: But with BI tools, if we’re going to give you a BI tool that we need you to use every day, and you have, you know, multiple use cases, we need to be sure that there’s someone on the team.
248 00:27:38.240 ⇒ 00:27:40.090 Demilade Agboola: That can consistently answer your question.
249 00:27:40.090 ⇒ 00:27:40.610 Selenge Tulga: Jones, I think.
250 00:27:40.610 ⇒ 00:27:42.320 Demilade Agboola: About the dashboards you need to.
251 00:27:42.550 ⇒ 00:27:45.210 Demilade Agboola: We try to keep it within
252 00:27:45.510 ⇒ 00:27:52.769 Demilade Agboola: certain tools. So we use Rill a lot, we use Omni a lot, we do use a little bit of Tableau, but we’re moving away from Tableau.
253 00:27:56.350 ⇒ 00:28:03.200 Demilade Agboola: What else? Usually, yeah, those are the kind of things we think about. So, the skill set on the team, the cost of the client, like, the cost…
254 00:28:03.430 ⇒ 00:28:04.619 Demilade Agboola: The budget of the clients.
255 00:28:06.040 ⇒ 00:28:16.799 Demilade Agboola: As well as the actual technical use case. So, there’s no point using a warehouse or a tool like Fivetran that doesn’t have the connector that they need. There’s no point.
256 00:28:17.370 ⇒ 00:28:21.350 Demilade Agboola: And it would take forever for them to build out the connector, because Fivetran takes a while.
257 00:28:22.140 ⇒ 00:28:27.249 Demilade Agboola: It’s always better for us, for instance, to use Polytomic, because Polytomic, we can talk to Polytomic.
258 00:28:27.880 ⇒ 00:28:30.299 Demilade Agboola: And Polytomic will then give us the…
259 00:28:31.210 ⇒ 00:28:35.430 Demilade Agboola: Or they might not have it, but Polytomic will create a custom connector in 7 days.
260 00:28:35.430 ⇒ 00:28:36.260 Selenge Tulga: Mmm. Right.
261 00:28:36.260 ⇒ 00:28:39.549 Demilade Agboola: So that’s better for us and the use case of the clients, because
262 00:28:40.440 ⇒ 00:28:42.020 Demilade Agboola: They would get what they need.
263 00:28:42.320 ⇒ 00:28:43.880 Demilade Agboola: With a custom connector.
264 00:28:44.950 ⇒ 00:28:48.580 Demilade Agboola: Whereas with Fivetran, even if you get them that Fivetran.
265 00:28:49.050 ⇒ 00:28:51.270 Demilade Agboola: If the connector is not there, there’s no point.
266 00:28:51.270 ⇒ 00:28:52.980 Selenge Tulga: Yeah. Absolutely, I do.
267 00:28:52.980 ⇒ 00:28:57.880 Demilade Agboola: Yeah, so we just use use case, budgets, as well as just, like, the skill set on the team.
268 00:28:57.880 ⇒ 00:29:01.510 Selenge Tulga: Yeah, that’s a great answer, thank you for that.
269 00:29:01.640 ⇒ 00:29:02.680 Selenge Tulga: I have no problem.
270 00:29:03.260 ⇒ 00:29:05.210 Demilade Agboola: But yeah, I think…
271 00:29:05.690 ⇒ 00:29:11.860 Demilade Agboola: Do you have any other questions? And if not, we… No! Yeah, you were already answered them. Alright, that’s all good.
272 00:29:11.860 ⇒ 00:29:13.130 Selenge Tulga: Thank you.
273 00:29:13.130 ⇒ 00:29:15.300 Demilade Agboola: Yeah, so I’ll give my feedback to the team.
274 00:29:15.900 ⇒ 00:29:17.609 Demilade Agboola: I’m sure they’ll be in touch with you.
275 00:29:17.740 ⇒ 00:29:19.350 Demilade Agboola: Over the next couple of days.
276 00:29:19.820 ⇒ 00:29:21.500 Selenge Tulga: Yeah, thank you, Dong.
277 00:29:21.500 ⇒ 00:29:22.220 Demilade Agboola: Alright then. Take care.
278 00:29:22.220 ⇒ 00:29:24.609 Selenge Tulga: You’re made right there. Yeah, thank you, bye-bye!
279 00:29:24.610 ⇒ 00:29:25.280 Demilade Agboola: Bye.