Meeting Title: Brainforge Interview w- Demilade Date: 2026-04-17 Meeting participants: Rhodes, Demilade Agboola
WEBVTT
1 00:02:00.340 ⇒ 00:02:01.170 Demilade Agboola: Hi, Rhodes.
2 00:02:01.500 ⇒ 00:02:02.420 Rhodes: How you doing?
3 00:02:02.630 ⇒ 00:02:04.010 Demilade Agboola: I’m pretty good, how are you?
4 00:02:04.310 ⇒ 00:02:05.550 Rhodes: Good.
5 00:02:05.950 ⇒ 00:02:09.859 Demilade Agboola: That’s good to hear. Can you hear me, Clayton?
6 00:02:10.610 ⇒ 00:02:12.579 Rhodes: Yeah, I can hear you good.
7 00:02:12.580 ⇒ 00:02:29.440 Demilade Agboola: Alright, that’s all good then. Alright, so, I am Demilade. I work with the Brainforge team as a Senior Analytics Engineer, and for this interview, we’ll just kind of be going through
8 00:02:32.250 ⇒ 00:02:44.149 Demilade Agboola: a bit of, like, an architectural design sort of phase, where we’ll be talking about different things you will do, and just trying to understand how you work and how you think through data problems. Okay.
9 00:02:45.120 ⇒ 00:02:49.980 Demilade Agboola: So that sounds good. So I think we can just start from the top, where we can kind of…
10 00:02:50.460 ⇒ 00:02:57.390 Demilade Agboola: Can we get, like, can I get an understanding of, like, your experience and, your stack?
11 00:02:58.090 ⇒ 00:03:03.210 Rhodes: Yeah, 100%. So, just to give you kind of a high…
12 00:03:03.480 ⇒ 00:03:13.129 Rhodes: level experience. I most recently worked at Corios as a data analytics engineer in a consulting capacity.
13 00:03:13.240 ⇒ 00:03:25.119 Rhodes: I work across multiple client environments where the primary focus was migrating legacy systems to modern lake houses using Spark-based architectures.
14 00:03:25.700 ⇒ 00:03:26.030 Demilade Agboola: Okay.
15 00:03:27.080 ⇒ 00:03:31.709 Rhodes: I would say kind of, like, the hardest things that I worked on was in…
16 00:03:31.820 ⇒ 00:03:35.690 Rhodes: Like, next to the pipeline implementations was kind of taking…
17 00:03:36.190 ⇒ 00:03:49.989 Rhodes: maybe, like, implicit or inconsistent business logic, and making that into, like, structured data models, so I had to resolve a lot of things, like grain mismatches, aligning definitions across teams.
18 00:03:50.060 ⇒ 00:03:56.490 Rhodes: Making non-determinisms, like sorting issues, explicit.
19 00:03:56.930 ⇒ 00:04:06.529 Rhodes: And then, kind of, a big part of my role is working with stakeholders to validate those transformations and make sure it supported decision making.
20 00:04:07.040 ⇒ 00:04:17.230 Rhodes: And then, outside of pipelines, I also contributed to things like building validation frameworks, dependency mapping, and documentation standards.
21 00:04:17.730 ⇒ 00:04:25.009 Demilade Agboola: I’ll get this answered. A follow-up question to that would be, what would you say the most complex
22 00:04:25.280 ⇒ 00:04:30.459 Demilade Agboola: thing you’ve worked on was, and what made it comp… like, complicated or quite complex?
23 00:04:31.630 ⇒ 00:04:36.180 Rhodes: Yeah, well, probably when I was working with Sun Life.
24 00:04:36.350 ⇒ 00:04:44.959 Rhodes: We were seeing a lot of inconsistencies, across, the dental pricing unit that I worked on.
25 00:04:45.240 ⇒ 00:04:48.609 Rhodes: There was a lot of…
26 00:04:49.090 ⇒ 00:04:52.409 Rhodes: Well, I kind of separated the issues into two.
27 00:04:52.470 ⇒ 00:05:03.690 Rhodes: sort of routes. One was, definition issues. So, for example, in one case for an end-of-quarter pricing report, different teams
28 00:05:03.750 ⇒ 00:05:15.180 Rhodes: were calculating revenue at different grains. So, like, one was using the billed definition, and the other was applying the collections rate on top of it.
29 00:05:15.650 ⇒ 00:05:18.269 Rhodes: And then those are being aggregated together.
30 00:05:18.530 ⇒ 00:05:26.579 Rhodes: So, I kind of had to trace that and map it to where that divergence happened, and show a side-by-side comparison.
31 00:05:27.140 ⇒ 00:05:34.780 Rhodes: Kind of shift that conversation from what’s right to what each represents and what we wanted to standardize.
32 00:05:35.350 ⇒ 00:05:36.470 Rhodes: And then…
33 00:05:36.640 ⇒ 00:05:49.679 Rhodes: rebuilding the transformation logic in Spark, there’s also a lot of cases where we dealt with not enough sorting or non-deterministic, like, mode calculations for risk adjustment.
34 00:05:49.770 ⇒ 00:05:58.200 Rhodes: So, aligning on those metrics and kind of restoring trust in that pipeline was probably the hardest thing I worked on.
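The non-deterministic mode calculation mentioned here can be illustrated with a small sketch. It assumes the problem was ties between equally frequent values being resolved by whatever row order the engine happened to return; the function name and sample codes are hypothetical, not taken from the engagement described.

```python
from collections import Counter


def deterministic_mode(values):
    """Return the most frequent value; ties always go to the smallest value."""
    counts = Counter(values)
    # Rank by descending count, then by the value itself, so the answer
    # does not depend on the order in which rows arrive.
    return min(counts.items(), key=lambda item: (-item[1], item[0]))[0]


# Same data in two different orders: "A" and "B" are tied at two each.
codes_a = ["B", "A", "B", "A", "C"]
codes_b = ["A", "B", "A", "B", "C"]
result_a = deterministic_mode(codes_a)
result_b = deterministic_mode(codes_b)
```

The explicit tie-break rule is the part stakeholders would need to agree on; picking the smallest value is only one possible convention.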
35 00:06:02.160 ⇒ 00:06:03.869 Demilade Agboola: Sorry, I just sneeze coming in.
36 00:06:04.080 ⇒ 00:06:06.029 Rhodes: On the night, it happens.
37 00:06:07.270 ⇒ 00:06:10.929 Demilade Agboola: I was gonna then ask, okay, that was, that’s good,
38 00:06:11.240 ⇒ 00:06:13.159 Demilade Agboola: Just, like, following up on that.
39 00:06:14.480 ⇒ 00:06:20.989 Demilade Agboola: If we’re to… this is now where we start, like, the scenario, like, walking through how you would, like…
40 00:06:21.270 ⇒ 00:06:24.009 Demilade Agboola: Build out its, data infrastructure.
41 00:06:24.240 ⇒ 00:06:29.659 Demilade Agboola: So just in here, just, like, the idea of this is I’ll ask you a question.
42 00:06:29.840 ⇒ 00:06:36.439 Demilade Agboola: As much as possible, I would like to know what assumptions you make, what questions you want to ask.
43 00:06:36.810 ⇒ 00:06:39.989 Demilade Agboola: How you think through the problem.
44 00:06:40.970 ⇒ 00:06:41.470 Rhodes: Okay.
45 00:06:41.470 ⇒ 00:06:49.170 Demilade Agboola: Again, ultimately, there isn’t, like, a fixed answer, but the idea is just truly understanding how you think through, like, data problems, right?
46 00:06:49.470 ⇒ 00:06:50.090 Rhodes: Okay.
47 00:06:50.260 ⇒ 00:06:59.539 Demilade Agboola: Alright, so… If we had a client, that wants to build out a daily, like, revenue mart.
48 00:07:01.250 ⇒ 00:07:07.209 Demilade Agboola: And there are 3 different sources, so we have Stripe, Salesforce, and Google Ads.
49 00:07:07.940 ⇒ 00:07:08.770 Rhodes: Okay.
50 00:07:09.120 ⇒ 00:07:12.240 Demilade Agboola: How would you design the solution?
51 00:07:13.990 ⇒ 00:07:14.990 Rhodes: Okay.
52 00:07:16.920 ⇒ 00:07:23.939 Rhodes: So… I guess, right, so first…
53 00:07:24.530 ⇒ 00:07:27.479 Rhodes: I guess I’m kind of asking, like, what…
54 00:07:28.240 ⇒ 00:07:34.240 Rhodes: Decisions… What decisions is this supposed to support?
55 00:07:34.580 ⇒ 00:07:41.480 Rhodes: You know, do we want something more real-time? Do we want streaming? Do we want…
56 00:07:41.810 ⇒ 00:07:43.800 Rhodes: Batch, do we want something?
57 00:07:44.100 ⇒ 00:07:49.640 Rhodes: adjusted… Who’s using it?
58 00:07:50.990 ⇒ 00:08:02.560 Rhodes: I mean, I think, right, like, the first thing I look to do is kind of standardize the definition, so, like, is this recognized revenue, the cash collected?
59 00:08:02.760 ⇒ 00:08:06.359 Rhodes: Something like that, kind of similar to the Sun Life situation.
60 00:08:06.940 ⇒ 00:08:13.919 Rhodes: And then, maybe as well, right, like, how are we gonna model it? Do we need…
61 00:08:14.250 ⇒ 00:08:19.279 Rhodes: The level of detail to be on the transaction side.
62 00:08:19.480 ⇒ 00:08:25.800 Rhodes: Or maybe, like, the customer by the product, something like that, so…
63 00:08:26.050 ⇒ 00:08:35.719 Rhodes: On the infrastructure end as well, right? Like, I was saying, like, streaming versus batch, like, I guess in that way, I’m thinking of…
64 00:08:35.870 ⇒ 00:08:37.000 Rhodes: performance.
65 00:08:37.510 ⇒ 00:08:38.950 Rhodes: So…
66 00:08:40.659 ⇒ 00:08:40.999 Demilade Agboola: Okay.
67 00:08:41.000 ⇒ 00:08:49.220 Rhodes: Yeah, so now I’m just kind of thinking of, how I would go through, designing the, the pipeline.
68 00:08:51.480 ⇒ 00:08:58.929 Rhodes: So, I guess first, I want to… right, so first I want to define the grain, because that’s going to determine the model.
69 00:08:59.270 ⇒ 00:09:05.819 Rhodes: You said it’s a… it’s a mart, right? So it’s, like, gold level, so we probably…
70 00:09:06.370 ⇒ 00:09:14.030 Rhodes: want… a core fact table grain to look like.
71 00:09:14.030 ⇒ 00:09:23.499 Demilade Agboola: Before we get into the mart aspect of it, how would you handle, like, ingestion, and how would you handle storage and transformation, like, the…
72 00:09:23.500 ⇒ 00:09:23.890 Rhodes: Okay.
73 00:09:24.180 ⇒ 00:09:25.909 Demilade Agboola: Probably gets all of them to the mart.
74 00:09:26.700 ⇒ 00:09:33.289 Rhodes: Okay, gotcha. So, this is…
75 00:09:33.620 ⇒ 00:09:38.350 Rhodes: I’m assuming this is, like, kind of like a…
76 00:09:38.580 ⇒ 00:09:47.059 Rhodes: ETL versus, like, an ELT, or I don’t want to assume, but, I guess, right, I would kind of start with…
77 00:09:47.280 ⇒ 00:09:52.100 Rhodes: The staging layer, so… Let’s see…
78 00:09:52.410 ⇒ 00:09:56.619 Rhodes: Or, sorry, I’ll start with the ingestion layer, so…
79 00:09:59.740 ⇒ 00:10:07.840 Rhodes: Let’s see… maybe… Something like… Managed connectors, or, like.
80 00:10:08.530 ⇒ 00:10:13.610 Rhodes: an EL… just an EL tool that’s, like, available and fits their stack.
81 00:10:13.860 ⇒ 00:10:24.070 Rhodes: I guess at first, maybe, like, Fivetran, since it’s, like, kind of for different sources,
82 00:10:24.690 ⇒ 00:10:26.420 Rhodes: But I’d kinda wanna…
83 00:10:26.610 ⇒ 00:10:35.910 Rhodes: ingest each into, like, a raw layer with minimal, like, without doing much transformation, so just start with the raw tables.
84 00:10:36.090 ⇒ 00:10:39.720 Rhodes: And keep that close to the source so that we can kind of trace.
85 00:10:39.870 ⇒ 00:10:41.809 Rhodes: What’s coming in and out.
86 00:10:43.430 ⇒ 00:10:45.950 Rhodes: And then, yeah, so I’m thinking, right, like…
87 00:10:47.530 ⇒ 00:10:51.650 Rhodes: What cadence do we want based on the business needs, so…
88 00:10:52.060 ⇒ 00:10:56.750 Rhodes: Stripe is like a… it’s like a payment, source, right?
89 00:10:57.050 ⇒ 00:10:57.390 Demilade Agboola: Yes.
90 00:10:57.390 ⇒ 00:11:05.739 Rhodes: Okay, so probably, like, more frequent, or, like, more real-time, for that.
91 00:11:05.910 ⇒ 00:11:12.450 Rhodes: I think, from what I understand of Salesforce, that can be more, like, scheduled based on…
92 00:11:12.720 ⇒ 00:11:15.249 Rhodes: Pipeline reporting needs,
93 00:11:16.000 ⇒ 00:11:23.060 Rhodes: Not sure, I guess, the use case for Google would be a little ambiguous, maybe… Like, it’s…
94 00:11:23.630 ⇒ 00:11:28.809 Rhodes: I don’t know, I guess it could be either, depending on what it’s for,
95 00:11:28.990 ⇒ 00:11:32.550 Rhodes: Maybe there’s, like, a strong real-time, you know, case for it.
96 00:11:32.760 ⇒ 00:11:39.499 Rhodes: But yeah, so, just to clarify a couple things, right, I look, like, full refresh versus incremental.
97 00:11:39.610 ⇒ 00:11:42.180 Rhodes: That’s something that I’ve worked on.
98 00:11:42.510 ⇒ 00:11:45.449 Rhodes: Extensively, my experience,
99 00:11:45.750 ⇒ 00:11:53.389 Rhodes: Right, so, like, do we need the full, records, for, like, analysis? Can we just append the table?
100 00:11:53.530 ⇒ 00:12:03.449 Rhodes: In one case, they wanted both, one for historical records, one for analysis, so I split that into a slowly changing dimension type 4.
101 00:12:05.580 ⇒ 00:12:09.790 Rhodes: on the API side, maybe, like, rate limits, something like that.
102 00:12:09.890 ⇒ 00:12:13.910 Rhodes: And then make sure that it’s catch… capturing metadata.
103 00:12:14.130 ⇒ 00:12:21.999 Rhodes: Like a load timestamp, source system, maybe like a… like a batch ID, something like that.
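As a rough sketch of the metadata capture described in this answer, the snippet below stamps each raw record with a timestamp, source system, and batch ID while leaving the source fields untouched. The function name, the underscore-prefixed column names, and the sample Stripe-like records are assumptions for illustration only.

```python
import uuid
from datetime import datetime, timezone


def add_load_metadata(records, source_system, batch_id=None):
    """Return copies of raw records with lineage columns attached."""
    # One batch ID and one timestamp per load, shared by every row in it.
    batch_id = batch_id or str(uuid.uuid4())
    loaded_at = datetime.now(timezone.utc).isoformat()
    return [
        {
            **record,  # source fields stay exactly as received
            "_source_system": source_system,
            "_batch_id": batch_id,
            "_loaded_at": loaded_at,
        }
        for record in records
    ]


raw_stripe = [{"id": "ch_1", "amount": 1200}, {"id": "ch_2", "amount": 800}]
landed = add_load_metadata(raw_stripe, source_system="stripe", batch_id="batch-001")
```

Keeping these columns in the raw layer is what makes it possible to trace a bad number in the mart back to a specific load.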
104 00:12:22.960 ⇒ 00:12:23.929 Demilade Agboola: Okay, alright.
105 00:12:25.150 ⇒ 00:12:28.439 Rhodes: Yeah. Do you want me to move into the transformation?
106 00:12:28.970 ⇒ 00:12:31.570 Demilade Agboola: Yes, you can talk, we can talk transformation now.
107 00:12:31.920 ⇒ 00:12:32.960 Rhodes: Okay, cool.
108 00:12:33.450 ⇒ 00:12:39.280 Rhodes: So… For transformation layers… I kind of…
109 00:12:40.020 ⇒ 00:12:42.840 Rhodes: Right, I guess I kind of think of it…
110 00:12:42.980 ⇒ 00:12:48.970 Rhodes: How can we keep the logic kind of, like, modular and easy to validate?
111 00:12:49.410 ⇒ 00:12:59.909 Rhodes: So… I kind of think about, right, like, cleaning and standardizing each source independently, maybe, like, normalized timestamps.
112 00:13:02.380 ⇒ 00:13:11.800 Rhodes: I don’t know, standard, like, let’s say we have payments coming in from multiple regions, like normalized, or standardized currencies in that case.
113 00:13:12.020 ⇒ 00:13:16.400 Rhodes: Okay. Deduplicate, if you have to deduplicate.
114 00:13:17.580 ⇒ 00:13:19.550 Rhodes: Or else you’ll end up with,
115 00:13:19.910 ⇒ 00:13:29.259 Rhodes: silent explosions or wrong aggregation, stuff like that. And then also making keys consistent is really important. That’s one thing I’ve ran into.
116 00:13:29.780 ⇒ 00:13:37.060 Rhodes: It’s, you know, you have, especially in the particular case I was working in.
117 00:13:37.200 ⇒ 00:13:42.970 Rhodes: like, SAS treats any variation of the same… You know.
118 00:13:43.690 ⇒ 00:13:55.970 Rhodes: characters as the same thing, but, I think most systems, it doesn’t work like that, so making sure that’s consistent as well, and renaming fields so that they’re clear and usable is also important.
119 00:13:56.260 ⇒ 00:13:58.020 Rhodes: Okay. Yep.
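The staging steps listed in this answer (normalized timestamps, consistent keys, deduplication) might look roughly like the following. The field names and the "keep the latest version of each payment" rule are assumptions; the interview does not specify them.

```python
from datetime import datetime, timezone


def standardize(rows):
    """Normalize timestamps and keys, then keep the latest row per payment."""
    latest = {}
    for row in rows:
        clean = {
            "payment_id": row["payment_id"],
            # Trim and upper-case so "ab-01" and " AB-01 " match as one key.
            "customer_key": row["customer_key"].strip().upper(),
            # Convert every offset-aware timestamp to UTC.
            "updated_at": datetime.fromisoformat(row["updated_at"]).astimezone(timezone.utc),
            "amount": row["amount"],
        }
        current = latest.get(clean["payment_id"])
        if current is None or clean["updated_at"] > current["updated_at"]:
            latest[clean["payment_id"]] = clean
    return [latest[key] for key in sorted(latest)]


raw_rows = [
    {"payment_id": "p1", "customer_key": " ab-01 ", "updated_at": "2026-04-01T10:00:00+02:00", "amount": 100},
    {"payment_id": "p1", "customer_key": "AB-01", "updated_at": "2026-04-01T09:30:00+00:00", "amount": 120},
    {"payment_id": "p2", "customer_key": "ab-02", "updated_at": "2026-04-01T12:00:00+00:00", "amount": 50},
]
staged = standardize(raw_rows)
```

Note that the first `p1` row looks later on the wall clock but is earlier once converted to UTC, which is the kind of silent error timestamp normalization is meant to prevent.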
120 00:13:59.670 ⇒ 00:14:00.400 Demilade Agboola: Alright.
121 00:14:04.020 ⇒ 00:14:10.189 Demilade Agboola: So, I just want to ask a question based off that. So, say the mart models, would you want to use a star schema?
122 00:14:10.310 ⇒ 00:14:12.310 Demilade Agboola: Or a normalized schema.
123 00:14:12.840 ⇒ 00:14:16.849 Demilade Agboola: And… When do you choose…
124 00:14:17.550 ⇒ 00:14:23.089 Demilade Agboola: Which, like, when would you lean on a star schema versus when would you lean on a normalized schema?
125 00:14:24.040 ⇒ 00:14:26.810 Rhodes: Yeah, that’s a good question.
126 00:14:27.060 ⇒ 00:14:32.490 Rhodes: I think… for… Right, so…
127 00:14:32.590 ⇒ 00:14:50.099 Rhodes: I mean, the general difference, right, is that a star schema has a little bit less of a specified grain, which makes it a little bit easier, to query on the analytics side. I think
128 00:14:50.910 ⇒ 00:15:10.129 Rhodes: you know, at least most of the cases that I… that I had worked in, at Corios, right, like, it was business users without a lot of, like, engineering knowledge, using it, so we would generally defer to star schema, just because it keeps, like, the model easier to understand.
129 00:15:10.420 ⇒ 00:15:17.819 Rhodes: you know, just having, like, one… one fact table for the business event or measurement, kind of.
130 00:15:19.330 ⇒ 00:15:26.629 Rhodes: like, I guess so the primary goal is reporting or analytics, or, you know, users need, like, simple,
131 00:15:27.000 ⇒ 00:15:33.949 Rhodes: like, access to the metrics. Like, I would use a star schema. Also, if I want to reduce the number of joins,
132 00:15:34.480 ⇒ 00:15:40.249 Rhodes: Or, you know, performance, for performance reasons as well, or usability.
133 00:15:40.540 ⇒ 00:15:46.140 Rhodes: then I think that would be the way to go, in terms of, yeah, like, a normalized schema.
134 00:15:46.670 ⇒ 00:15:55.199 Rhodes: Guess that makes sense, maybe, if, the workflows are closer to, like, engineering or operations.
135 00:15:55.760 ⇒ 00:16:03.650 Rhodes: I think in that case, right, like, you can… it does make sense in some cases to have
136 00:16:03.760 ⇒ 00:16:10.710 Rhodes: A more specified grain, like, especially if…
137 00:16:11.760 ⇒ 00:16:18.890 Rhodes: Maybe, like, if duplication would be really difficult to manage, or the use case is more system-oriented than
138 00:16:19.240 ⇒ 00:16:22.619 Rhodes: You know, it might be more straightforward in that way.
139 00:16:23.190 ⇒ 00:16:24.130 Rhodes: Okay.
140 00:16:25.130 ⇒ 00:16:32.030 Demilade Agboola: So just off… This, in a normalized schema.
141 00:16:32.840 ⇒ 00:16:37.100 Demilade Agboola: how would your output be? So, for the revenue model, what’s your, like…
142 00:16:37.280 ⇒ 00:16:47.899 Demilade Agboola: what would your outputs be? So, like, table names, like, just an idea of… I want to be able to fully key into how you visualize it, versus if we’re using a star schema, what would the outputs be?
143 00:16:48.750 ⇒ 00:16:59.110 Rhodes: Yeah, yeah, so… I guess, like, for this case… Right? Like… Let’s just take the…
144 00:16:59.230 ⇒ 00:17:03.189 Rhodes: transactions table, like, from Stripe.
145 00:17:03.370 ⇒ 00:17:11.000 Rhodes: So… you would have… Like, a normalized output might look like.
146 00:17:11.490 ⇒ 00:17:20.079 Rhodes: It would be, like, the fact table would be… like, transactions… per region.
147 00:17:20.190 ⇒ 00:17:28.620 Rhodes: Be like, yeah, like, transactions per product per region, or, like, transactions per…
148 00:17:29.220 ⇒ 00:17:36.710 Rhodes: customer ID per region versus, like, in a star schema, right? Like, it might end up being just, like.
149 00:17:37.120 ⇒ 00:17:38.800 Rhodes: transactions.
150 00:17:39.090 ⇒ 00:17:45.470 Rhodes: like, joined with product, or something like that. Like, it would be less, less specific.
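For reference, one conventional reading of a star-schema output for this revenue mart is a single transaction-grain fact table with foreign keys to small dimension tables. The sketch below uses SQLite only so that it runs anywhere; the table names (`fct_transactions`, `dim_product`, `dim_customer`) and sample rows are hypothetical and are not the outputs either speaker named.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE dim_product (product_key INTEGER PRIMARY KEY, product_name TEXT);
    CREATE TABLE dim_customer (customer_key INTEGER PRIMARY KEY, region TEXT);
    -- Fact table: one row per transaction, with keys into each dimension.
    CREATE TABLE fct_transactions (
        transaction_id TEXT PRIMARY KEY,
        transaction_date TEXT,
        product_key INTEGER REFERENCES dim_product (product_key),
        customer_key INTEGER REFERENCES dim_customer (customer_key),
        amount REAL
    );
    INSERT INTO dim_product VALUES (1, 'Basic'), (2, 'Pro');
    INSERT INTO dim_customer VALUES (10, 'EMEA'), (11, 'AMER');
    INSERT INTO fct_transactions VALUES
        ('t1', '2026-04-01', 1, 10, 100.0),
        ('t2', '2026-04-01', 2, 10, 250.0),
        ('t3', '2026-04-01', 2, 11, 250.0);
    """
)

# Daily revenue by region: one join from the fact table to one dimension.
daily_revenue = conn.execute(
    """
    SELECT f.transaction_date, c.region, SUM(f.amount)
    FROM fct_transactions AS f
    JOIN dim_customer AS c ON c.customer_key = f.customer_key
    GROUP BY f.transaction_date, c.region
    ORDER BY c.region
    """
).fetchall()
```

A more normalized layout would split attributes such as region into further tables of their own, trading extra joins for less duplication.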
151 00:17:46.090 ⇒ 00:17:46.440 Demilade Agboola: Anytime.
152 00:17:48.430 ⇒ 00:17:49.280 Rhodes: Alright.
153 00:17:50.170 ⇒ 00:18:00.530 Demilade Agboola: So another question now is, say, we’ve started to build out these models, and these, like, queries.
154 00:18:01.860 ⇒ 00:18:07.679 Demilade Agboola: And we noticed that over time, we’ve gotten to a point where we have 400 million rows in one of these tables.
155 00:18:08.090 ⇒ 00:18:12.480 Demilade Agboola: And the query’s taken… the query runtime has exploded, basically.
156 00:18:13.080 ⇒ 00:18:14.049 Demilade Agboola: How do we…
157 00:18:14.570 ⇒ 00:18:20.779 Demilade Agboola: work on optimizing that query? How would you try to work on the efficiency of that query?
158 00:18:21.000 ⇒ 00:18:33.719 Demilade Agboola: And I assume worst case scenario, so, like, you’re going in there, and it’s the worst possible way it could have been set up. How would you try and, like, what would you be looking out for? How would you be trying to ensure that you can get that runtime down?
159 00:18:35.000 ⇒ 00:18:36.060 Rhodes: Okay.
160 00:18:36.220 ⇒ 00:18:41.900 Rhodes: Yeah, that’s a good question. I mean… I guess the first…
161 00:18:42.210 ⇒ 00:18:49.019 Rhodes: I mean, in general, right, like, the first thing I do is avoid Jumping to a conclusion,
162 00:18:49.150 ⇒ 00:18:50.319 Rhodes: I think, like.
163 00:18:50.790 ⇒ 00:19:09.960 Rhodes: generally… well, you know, when I worked in Spark, it was, like, kind of easy to use the Spark UI to kind of find where the slowdowns happen, but let’s take that out of the equation. I’d want to look for things like where is too many… too much data being scanned?
164 00:19:10.290 ⇒ 00:19:17.010 Rhodes: Where are, like, inefficient joins happening? Where is there a lack of deduplication?
165 00:19:17.310 ⇒ 00:19:23.660 Rhodes: Or, you know, where is, like, a grain mismatch or overly wide models?
166 00:19:24.050 ⇒ 00:19:25.820 Rhodes: So…
167 00:19:26.350 ⇒ 00:19:44.030 Rhodes: I mean, also, I’ve, like, faced cases where users are querying directly from intermediate tables instead of, like, the final gold table, but I guess, right, like, my first step is sort of diagnosing it and kind of going backwards to find…
168 00:19:44.330 ⇒ 00:19:51.560 Rhodes: Where the slowdown’s actually… actually happening, so… First.
169 00:19:51.880 ⇒ 00:19:56.739 Rhodes: I kind of want to… Maybe look at model design.
170 00:19:57.080 ⇒ 00:20:01.820 Rhodes: Maybe we need narrower fact tables.
171 00:20:02.200 ⇒ 00:20:06.550 Rhodes: I think, like, sometimes the… Right.
172 00:20:06.710 ⇒ 00:20:17.639 Rhodes: like, choice. It might not be engine level, but maybe a better final model, or maybe we have, like, a flat table that can, like, in ingestion, or…
173 00:20:17.920 ⇒ 00:20:28.299 Rhodes: Yeah, like, maybe, like, in a silver layer that can be broken into, like, another fact and dimension table, right? Like, maybe it’s overly wide in that way.
174 00:20:30.040 ⇒ 00:20:35.230 Rhodes: Another thing I would do would be to… maybe…
175 00:20:35.830 ⇒ 00:20:49.399 Rhodes: like, reduce scan volume, so I think one thing that happens often is that too much data gets brought too far, so often what that looks like is columns you don’t need.
176 00:20:49.590 ⇒ 00:20:56.550 Rhodes: Or… Like, dates that are strings, or…
177 00:20:56.820 ⇒ 00:21:13.159 Rhodes: I’m trying to think of, like, another… trying to think of another example, but that’s kind of what comes to mind. Or actually, yeah, another thing that comes to mind, too, with that is that sometimes in workflows, you have intermediate tables in, like, the same step.
178 00:21:13.390 ⇒ 00:21:18.320 Rhodes: That don’t need to be, like, they can… you can move to…
179 00:21:18.660 ⇒ 00:21:29.319 Rhodes: the final table that’s, like, joined with another final table, right? But then, like, garbage collect the intermediate tables, before that point, instead of keeping them all together.
180 00:21:29.720 ⇒ 00:21:39.489 Rhodes: And then… And then, yeah, also, yeah, joins, so join strategy. So, I look for things like…
181 00:21:39.720 ⇒ 00:21:43.140 Rhodes: Are we joining on strings instead of…
182 00:21:43.380 ⇒ 00:21:48.380 Rhodes: Like, integer keys, are we missing deduplications?
183 00:21:48.490 ⇒ 00:21:54.890 Rhodes: And with that, right, I look for, like, many-to-many relationships, because those will cause blow-ups.
184 00:21:55.230 ⇒ 00:21:59.140 Rhodes: And I look to use, broadcast joins as well.
185 00:21:59.440 ⇒ 00:22:03.549 Rhodes: For… Smaller dimensions. And then…
186 00:22:04.460 ⇒ 00:22:10.300 Rhodes: Okay, yeah, sorry, I know this is, like, going long, but okay, two… two more things,
187 00:22:11.300 ⇒ 00:22:16.749 Rhodes: One, reducing unnecessary shuffles and skew.
188 00:22:16.950 ⇒ 00:22:28.480 Rhodes: And two, using incremental, incremental, Why am I forgetting this?
189 00:22:28.650 ⇒ 00:22:36.300 Rhodes: okay, sorry. I’ll end there.
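The many-to-many blow-up mentioned in this answer can be checked before a join ever runs. The sketch below, in plain Python with hypothetical helper names and sample rows, finds duplicated join keys and computes how many rows an inner join would produce.

```python
from collections import Counter


def duplicate_keys(rows, key):
    """Return join-key values that appear more than once, with their counts."""
    counts = Counter(row[key] for row in rows)
    return {value: n for value, n in counts.items() if n > 1}


def expected_join_rows(left, right, key):
    """Row count an inner join on `key` would produce."""
    left_counts = Counter(row[key] for row in left)
    right_counts = Counter(row[key] for row in right)
    # Each key contributes (left matches) x (right matches) output rows.
    return sum(n * right_counts[value] for value, n in left_counts.items())


orders = [{"customer_id": 1}, {"customer_id": 1}, {"customer_id": 2}]
customers = [
    {"customer_id": 1, "segment": "a"},
    {"customer_id": 1, "segment": "b"},  # duplicate that should not exist
    {"customer_id": 2, "segment": "c"},
]

dupes = duplicate_keys(customers, "customer_id")
joined_rows = expected_join_rows(orders, customers, "customer_id")
```

Here three orders would become five joined rows, so any revenue summed after the join would be overstated; deduplicating the dimension first brings the count back to three.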
190 00:22:36.300 ⇒ 00:22:38.340 Demilade Agboola: So, the incremental materialization.
191 00:22:39.520 ⇒ 00:22:41.699 Rhodes: Hmm, yeah, yeah, yeah, so.
192 00:22:42.910 ⇒ 00:22:45.960 Demilade Agboola: You materialize the… you materialize it incrementally.
193 00:22:46.610 ⇒ 00:22:51.430 Rhodes: Yeah, yeah, yeah, yeah. So kind of,
194 00:22:51.650 ⇒ 00:22:56.030 Rhodes: Can we process… yeah, so can we process only new or changed data?
195 00:22:56.330 ⇒ 00:22:59.850 Rhodes: While keeping the historical ones, stable.
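A minimal sketch of the "process only new or changed data" idea, assuming a high-watermark on an `updated_at` column and an upsert keyed on `id`. Both column names and the in-memory dictionary standing in for the target table are assumptions for illustration.

```python
def incremental_merge(target, source, watermark):
    """Upsert rows newer than the watermark; return the new watermark and row count."""
    # ISO-8601 date strings of the same format compare correctly as plain strings.
    changed = [row for row in source if row["updated_at"] > watermark]
    for row in changed:
        target[row["id"]] = row  # insert new ids, overwrite changed ones
    new_watermark = max((row["updated_at"] for row in changed), default=watermark)
    return new_watermark, len(changed)


target = {"a": {"id": "a", "updated_at": "2026-04-01", "amount": 10}}
source = [
    {"id": "a", "updated_at": "2026-04-01", "amount": 10},  # already loaded
    {"id": "b", "updated_at": "2026-04-02", "amount": 20},
    {"id": "c", "updated_at": "2026-04-03", "amount": 30},
]

watermark, processed = incremental_merge(target, source, watermark="2026-04-01")
```

Only two of the three source rows are touched, and historical row `a` stays exactly as it was. Late-arriving records older than the watermark would be missed by this simple version, which is why real pipelines often reprocess a lookback window.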
196 00:23:00.370 ⇒ 00:23:01.969 Demilade Agboola: Okay. Sounds good.
197 00:23:02.130 ⇒ 00:23:11.689 Demilade Agboola: So final question, and after this, once this will bring, like, a 2-minute sort of question, we shall have about 5 more minutes, and you can ask me any questions you want to ask.
198 00:23:12.050 ⇒ 00:23:12.700 Rhodes: Okay.
199 00:23:13.260 ⇒ 00:23:15.150 Demilade Agboola: So…
200 00:23:15.320 ⇒ 00:23:23.080 Demilade Agboola: If we have a client that… so, this is away from, like, the scenario, like, that we’ve built so far. This is just, like, if we have a client that has
201 00:23:23.660 ⇒ 00:23:25.449 Demilade Agboola: A need for a dashboard.
202 00:23:25.810 ⇒ 00:23:33.399 Demilade Agboola: But they don’t like… Clearly state what metrics they need, or, like, clearly bring what that dashboard
203 00:23:33.590 ⇒ 00:23:44.590 Demilade Agboola: they don’t clearly bring the dashboard needs over to you. How are you able to interact with the client, and how are you able to crystallize that before you get into
204 00:23:44.700 ⇒ 00:23:48.160 Demilade Agboola: Model development and ingestion and everything.
205 00:23:48.460 ⇒ 00:23:55.010 Demilade Agboola: And two, if that same client also has a fast turnaround time.
206 00:23:55.370 ⇒ 00:23:56.050 Rhodes: Okay.
207 00:23:56.050 ⇒ 00:24:07.799 Demilade Agboola: but you’re aware that, this would affect… potentially affect technical excellence, or, like, the quality of what goes out the door. How do you balance those two things?
208 00:24:09.070 ⇒ 00:24:16.980 Rhodes: Yeah, no, that’s, pretty, pretty common scenario, so, okay, so first…
209 00:24:17.240 ⇒ 00:24:25.980 Rhodes: Right, so first, kind of before any technical work, a client comes to me, and they say, we need a dashboard.
210 00:24:26.170 ⇒ 00:24:30.329 Rhodes: I would start… I wouldn’t start with…
211 00:24:30.510 ⇒ 00:24:37.860 Rhodes: like, tools or data, it starts with decisions and kind of aligning, goals, right? So kind of asking.
212 00:24:38.120 ⇒ 00:24:41.279 Rhodes: You know, what do you need to support?
213 00:24:41.470 ⇒ 00:24:48.109 Rhodes: That’s gonna define everything else. I wanna make sure that we’re on the same page, I want to make sure that
214 00:24:48.360 ⇒ 00:24:55.860 Rhodes: What their business use case is, is what’s prioritized and what it’s optimized for.
215 00:24:56.520 ⇒ 00:25:01.069 Rhodes: And then also, right, I’m also kind of asking…
216 00:25:01.680 ⇒ 00:25:06.889 Rhodes: You know, who’s the primary user in this? How often do you need to…
217 00:25:07.020 ⇒ 00:25:12.530 Rhodes: Use this dashboard, so we can, like, design how often it pings the server.
218 00:25:12.830 ⇒ 00:25:16.540 Rhodes: And, yeah, kind of look at, like, hey, like.
219 00:25:16.660 ⇒ 00:25:19.810 Rhodes: Do you have something similar that,
220 00:25:20.030 ⇒ 00:25:29.720 Rhodes: supports this today, versus, like, what are the gaps in that? Or if they don’t have anything, right? Like, kind of maybe try to anchor it on some specific decision.
221 00:25:30.020 ⇒ 00:25:41.310 Rhodes: Then, yeah, so then, like, once I kind of understand the use case, I’ll try to, instead of just defining a full dashboard up front, proposing, like, a small
222 00:25:41.420 ⇒ 00:25:44.250 Rhodes: kind of MVP style.
223 00:25:44.400 ⇒ 00:25:48.849 Rhodes: Focused on a few key metrics, and validated early.
224 00:25:49.120 ⇒ 00:25:53.230 Rhodes: And then… Yeah, and then make sure that you just walk it through.
225 00:25:53.440 ⇒ 00:25:58.109 Rhodes: To kind of confirm we’re solving the right problem and scale from there.
226 00:25:58.580 ⇒ 00:26:07.269 Rhodes: And then, as far as your second question about, Turnaround time versus quality.
227 00:26:07.730 ⇒ 00:26:09.809 Rhodes: You know, I think,
228 00:26:11.130 ⇒ 00:26:27.939 Rhodes: I think I try to not push back directly, or try not to, like, make the decision for them, but what I do want to do is make the risks of moving forward immediately clear, and kind of frame
229 00:26:28.320 ⇒ 00:26:29.880 Rhodes: the option…
230 00:26:30.300 ⇒ 00:26:47.379 Rhodes: you know, around that speed and quality trade-off, right? So, like, maybe we can define what’s good enough, so we can deliver that, like, initial quick win and iterate from there. But definitely, I think, giving them the options of, like, hey.
231 00:26:47.540 ⇒ 00:26:51.039 Rhodes: We can do a, like, smaller…
232 00:26:51.220 ⇒ 00:26:58.340 Rhodes: You know, high… higher confidence in the quality set in the same timeframe, or… we can…
233 00:26:58.460 ⇒ 00:27:02.800 Rhodes: You know, expand the timeline a bit, but make it, like, fully…
234 00:27:03.140 ⇒ 00:27:05.729 Rhodes: Fully integrate what you’re looking for there.
235 00:27:06.210 ⇒ 00:27:07.230 Rhodes: Okay.
236 00:27:07.770 ⇒ 00:27:08.810 Demilade Agboola: Fair enough, fair enough.
237 00:27:09.170 ⇒ 00:27:12.080 Demilade Agboola: Alright, thank you very much for the answers to your question.
238 00:27:12.080 ⇒ 00:27:12.630 Rhodes: Thank you.
239 00:27:12.630 ⇒ 00:27:17.000 Demilade Agboola: I’m not sure if you have any questions for me as well.
240 00:27:17.180 ⇒ 00:27:35.419 Rhodes: Yeah, yeah, def… I do. So, I think first, I was kind of… I kind of want to get a better sense of how you see this role, this data engineering role, interacting with your role as a… as the analytics engineer.
241 00:27:36.190 ⇒ 00:27:46.000 Demilade Agboola: Yeah, sure. So, in terms of how your role will interact with my role, we do have, like, verticals within the data team.
242 00:27:46.160 ⇒ 00:27:58.059 Demilade Agboola: So one vertical is the data platform team, the other vertical is the data modeling team. So I kind of, like, lead things around data modeling.
243 00:27:58.200 ⇒ 00:28:08.789 Demilade Agboola: So in terms of, like, the data, platforms and infrastructure, that’s kind of our wishes side of things, and the way that would work would be,
244 00:28:10.390 ⇒ 00:28:12.609 Demilade Agboola: Things around, like,
245 00:28:12.710 ⇒ 00:28:26.439 Demilade Agboola: monitoring things around, like, infra setup. So, we just got a new client, we’re gonna set up, like, ingestion pipelines, set up, like, GitHub Actions, set up just things around that sort of nature.
246 00:28:26.840 ⇒ 00:28:30.389 Demilade Agboola: As well as also just consistently creating, like.
247 00:28:30.530 ⇒ 00:28:36.340 Demilade Agboola: new tools internally. For instance, this week we’ve come up with a data diff tool.
248 00:28:36.500 ⇒ 00:28:38.880 Demilade Agboola: That, based off every.
249 00:28:39.650 ⇒ 00:28:40.700 Rhodes: PR.
250 00:28:40.860 ⇒ 00:28:47.580 Demilade Agboola: works and runs specifically just the models that have changed, just to ensure that, so…
251 00:28:47.740 ⇒ 00:28:53.279 Demilade Agboola: And we can deploy it across all our clients all at once, so that we can ensure that…
252 00:28:54.210 ⇒ 00:28:56.919 Demilade Agboola: We are doing, like, what needs to be done.
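The internal data diff tool is not described in any detail here, so the following is only a generic sketch of what a keyed diff between a production build and a pull-request build of one model could look like. The function name, the `id` key, and the sample rows are all hypothetical and say nothing about how the actual tool works.

```python
def diff_rows(before, after, key):
    """Compare two versions of a table keyed on `key`."""
    before_by_key = {row[key]: row for row in before}
    after_by_key = {row[key]: row for row in after}
    shared = before_by_key.keys() & after_by_key.keys()
    return {
        "added": sorted(after_by_key.keys() - before_by_key.keys()),
        "removed": sorted(before_by_key.keys() - after_by_key.keys()),
        "changed": sorted(k for k in shared if before_by_key[k] != after_by_key[k]),
    }


prod_build = [
    {"id": 1, "revenue": 100},
    {"id": 2, "revenue": 200},
    {"id": 3, "revenue": 300},
]
pr_build = [
    {"id": 1, "revenue": 100},
    {"id": 2, "revenue": 250},
    {"id": 4, "revenue": 400},
]

report = diff_rows(prod_build, pr_build, key="id")
```

Running a comparison like this only for models touched by a change keeps the check cheap enough to run on every pull request.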
253 00:28:57.190 ⇒ 00:29:08.570 Demilade Agboola: So yeah, that would be, what your role will, like, look like in that sense. It would be more of, like, the data platforms, while, like, my role will be more of, like.
254 00:29:08.750 ⇒ 00:29:15.109 Demilade Agboola: on the actual modeling and transformations, and I try my best to come up with rules around, like.
255 00:29:15.270 ⇒ 00:29:25.439 Demilade Agboola: How we model and what we do to ensure that things are typecasting, where we do that, all that stuff, so that we ensure that the quality that goes out is still good.
256 00:29:25.730 ⇒ 00:29:28.399 Demilade Agboola: Well, like, the platform team
257 00:29:28.890 ⇒ 00:29:38.760 Demilade Agboola: works around that in terms of being sure that everything is… And also, there are also, like, new cases. Sometimes the AI team needs to load some data or move data across different things.
258 00:29:38.960 ⇒ 00:29:44.409 Demilade Agboola: Or sometimes we… I know, like, for one of our clients, we helped them set up a…
259 00:29:44.660 ⇒ 00:29:47.589 Demilade Agboola: We’ll help them with the process of setting up their…
260 00:29:48.360 ⇒ 00:29:51.129 Demilade Agboola: new OS, like, the new software.
261 00:29:51.280 ⇒ 00:30:00.049 Demilade Agboola: But where the data team came in was because they were defining things from scratch, we were able to say, hey, these are the things we struggled with on the previous
262 00:30:00.430 ⇒ 00:30:01.670 Demilade Agboola: infrastructure.
263 00:30:01.780 ⇒ 00:30:15.390 Demilade Agboola: that we will need you to create, like, like, structure for that in this new software that you’re building, so that it will be easier to calculate this. So again, these are the kind of things that you would work on, largely around, like, infrastructure and, like, platforms.
264 00:30:15.930 ⇒ 00:30:17.340 Rhodes: Okay, awesome.
265 00:30:17.520 ⇒ 00:30:20.679 Rhodes: And, one last question, like.
266 00:30:20.890 ⇒ 00:30:24.550 Rhodes: Across, so, across engagements, that you…
267 00:30:24.920 ⇒ 00:30:31.699 Rhodes: Like, what kind of… what tends to be the most difficult problems? Like, where do things tend to break down?
268 00:30:33.650 ⇒ 00:30:38.440 Demilade Agboola: That’s a bit of a tricky one, because things break down for different reasons.
269 00:30:39.250 ⇒ 00:30:40.310 Rhodes: That’s fair.
270 00:30:40.440 ⇒ 00:30:50.549 Demilade Agboola: But I do feel like one of the most common themes that will always help and can prevent a lot of breakdowns is clear, proper communication.
271 00:30:50.840 ⇒ 00:30:56.769 Demilade Agboola: So being able to communicate, like, whatever issue… whenever issues are coming up, being able to…
272 00:30:56.930 ⇒ 00:31:03.560 Demilade Agboola: Communicate, like, change in deadlines, being able to communicate, like, change in scope, being able to communicate, like.
273 00:31:04.710 ⇒ 00:31:05.949 Demilade Agboola: Just…
274 00:31:06.860 ⇒ 00:31:15.390 Demilade Agboola: what the difficulties are, very important in, like, consulting, because, I mean, ultimately, clients are never really immersed into everything.
275 00:31:15.630 ⇒ 00:31:24.750 Demilade Agboola: And so sometimes, when they feel like things are going sideways, they really aren’t, but because we’ve not communicated and been in front of the clients, it’d be like, hey.
276 00:31:26.330 ⇒ 00:31:42.560 Demilade Agboola: This… this is not… this is delayed by a week because, you know, we need access to this. This is delayed by two weeks because, we had the assumption that this was the scope of things, but it’s actually worse than what we thought it was prior to, like, scoping.
277 00:31:42.680 ⇒ 00:31:51.109 Demilade Agboola: I’ve only seen situations where, like, clients were upset that a dashboard looked untidy, like, it just looked all over the place.
278 00:31:51.710 ⇒ 00:31:52.850 Demilade Agboola: But…
279 00:31:52.980 ⇒ 00:32:06.099 Demilade Agboola: it was a dashboard that was in development. So, like, just literally just putting it in a development folder would have made the client know that this isn’t ready, this isn’t necessary for my view yet,
280 00:32:06.430 ⇒ 00:32:18.049 Demilade Agboola: And, yeah, just that kind of thing. When you are able to have a spirit of communication, that translates into things like documentation, that translates into things like
281 00:32:19.250 ⇒ 00:32:20.160 Rhodes: Clear.
282 00:32:20.160 ⇒ 00:32:30.230 Demilade Agboola: end-of-day reports, or just clear end-of-week reports, and just being in front of the client. I feel like that makes a huge difference, because for the most part, we do know what to do for the most part.
283 00:32:30.230 ⇒ 00:32:30.590 Rhodes: For sure.
284 00:32:30.590 ⇒ 00:32:32.070 Demilade Agboola: are, like, competent.
285 00:32:32.190 ⇒ 00:32:37.710 Demilade Agboola: But being able to show that competence is, you know, the very important part.
286 00:32:39.510 ⇒ 00:32:40.530 Rhodes: Yep, awesome.
287 00:32:41.200 ⇒ 00:32:48.660 Rhodes: Yeah, well, in the interest of time, I’ll hold my questions there, but, really enjoyed the conversation.
288 00:32:48.660 ⇒ 00:32:55.220 Demilade Agboola: Oh, I did, I did as well. I will definitely give my feedback to Kayla, and I’m sure she’ll be in touch with you soon.
289 00:32:55.660 ⇒ 00:32:57.649 Rhodes: Awesome. Have a good rest of your day.
290 00:32:57.650 ⇒ 00:32:59.060 Demilade Agboola: You too, bye.