Meeting Title: CTA Semantic Model Discussion
Date: 2026-04-09
Meeting participants: Awaish Kumar, Amber Lin


WEBVTT

1 00:00:56.590 00:00:57.810 Amber Lin: Hello!

2 00:00:58.920 00:00:59.650 Awaish Kumar: B.

3 00:01:00.230 00:01:08.070 Amber Lin: Hey, I’ve… got… on CTA, got, like, a first semantic model.

4 00:01:08.070 00:01:17.169 Amber Lin: in there, but then they’re pulling from all these different reports, which is… is my goal of working today. I want to ask you, like,

5 00:01:17.170 00:01:28.589 Amber Lin: what model should I use? Because there’s, like, reports, and then there’s also, like, aggregate. I’m not, like, 100% sure what you recommend.

6 00:01:28.640 00:01:30.190 Amber Lin: So, I’m gonna…

7 00:01:30.190 00:01:33.610 Awaish Kumar: Thing is, okay, let’s… So, you can…

8 00:01:33.610 00:01:36.650 Amber Lin: account. Wesley.

9 00:01:37.220 00:01:38.800 Awaish Kumar: Well, I’m saying that you can.

10 00:01:40.010 00:01:42.429 Awaish Kumar: Anything which is in ProdMarts.

11 00:01:42.810 00:01:43.770 Awaish Kumar: Right?

12 00:01:44.520 00:01:45.940 Amber Lin: Yeah, there’s also, like.

13 00:01:45.940 00:01:47.160 Awaish Kumar: Can be organized.

14 00:01:47.160 00:01:49.470 Amber Lin: Is that also what it is?

15 00:01:49.470 00:01:49.790 Awaish Kumar: Sweet.

16 00:01:51.510 00:01:54.100 Awaish Kumar: I understand what you are saying, I…

17 00:01:54.820 00:02:02.190 Awaish Kumar: I completely understand what you’re trying to say. There are reports, there are RPT models, there are aggregates.

18 00:02:02.410 00:02:03.100 Awaish Kumar: Get on.

19 00:02:03.100 00:02:06.109 Amber Lin: Do you want to share screen with my… logging in.

20 00:02:06.110 00:02:08.590 Awaish Kumar: Okay, I mean, not in laptop, but.

21 00:02:08.590 00:02:11.220 Amber Lin: Oh, oh, cool, cool, then that’s fine, I can…

22 00:02:11.220 00:02:12.369 Awaish Kumar: But… but what I’m…

23 00:02:12.370 00:02:12.790 Amber Lin: Here.

24 00:02:13.660 00:02:22.029 Awaish Kumar: But I’m… I understand, you are saying there are dim/fact tables, there are agg tables, and then there are RPT tables, right? Report tables.

25 00:02:22.300 00:02:27.970 Awaish Kumar: Well, the thing is, why we have it in such a way is that… so the basic…

26 00:02:29.190 00:02:32.619 Awaish Kumar: from any raw data is… are dim and fact tables.

27 00:02:33.020 00:02:35.749 Awaish Kumar: So we have dimensions, we have facts. And…

28 00:02:36.920 00:02:37.790 Awaish Kumar: Glitely…

29 00:02:38.840 00:02:54.270 Awaish Kumar: It’s more like… so, when you want to answer a question, like, there is an event, a registration event happened, right? We have a fact table for that. Then, we have a dim table, say, dim person. So, you are going to join this

30 00:02:54.550 00:02:58.650 Awaish Kumar: fact table with a dim table to identify, in this registration,

31 00:02:59.430 00:03:04.840 Awaish Kumar: who was the person that… that joined… that… that actually made that registration, right?

32 00:03:05.030 00:03:10.410 Awaish Kumar: So, in the RPT, we basically simplify that by joining it beforehand.
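
A minimal sketch of the fact-to-dimension join described here, and of a report view that performs it ahead of time. It uses Python's built-in `sqlite3` rather than Snowflake, and every table, column, and row (`fact_registration`, `dim_person`, `rpt_registration`) is a hypothetical stand-in, not one of the actual CTA models.

```python
import sqlite3

# Hypothetical schema: names and rows are illustrative only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dim_person (person_id INTEGER PRIMARY KEY, full_name TEXT);
    CREATE TABLE fact_registration (
        registration_id INTEGER PRIMARY KEY,
        person_id INTEGER REFERENCES dim_person(person_id),
        event_name TEXT
    );
    INSERT INTO dim_person VALUES (1, 'Ada'), (2, 'Grace');
    INSERT INTO fact_registration VALUES (10, 1, 'CES'), (11, 2, 'CES');

    -- The report view does the fact-to-dimension join once, ahead of time,
    -- so consumers no longer have to write the join themselves.
    CREATE VIEW rpt_registration AS
    SELECT f.registration_id, f.event_name, p.full_name
    FROM fact_registration AS f
    JOIN dim_person AS p ON p.person_id = f.person_id;
""")

# Who made each registration? Answered from the pre-joined view.
rows = conn.execute(
    "SELECT registration_id, full_name "
    "FROM rpt_registration ORDER BY registration_id"
).fetchall()
print(rows)
```

Querying the view returns the person for each registration without the caller touching `dim_person` directly, which is the simplification the RPT layer is described as providing.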

33 00:03:11.200 00:03:27.690 Amber Lin: Yeah, I guess my question is, should I be relying on that in the, in the semantic view, or should I actually reference, like, deeper, like, more base-level ones? Because right now, I’m…

34 00:03:27.690 00:03:28.649 Awaish Kumar: Using things in…

35 00:03:28.650 00:03:30.580 Amber Lin: reports, and… Right.

36 00:03:31.160 00:03:33.610 Awaish Kumar: Yeah, let me complete what I was saying.

37 00:03:33.610 00:03:34.789 Amber Lin: Yeah, yeah, sorry.

38 00:03:34.790 00:03:37.160 Awaish Kumar: Then we also have a semantic layer.

39 00:03:37.300 00:03:50.070 Awaish Kumar: Right? After the reports. Semantic layers are the views that are basically standardizing the definitions. So, for CTA, we want to make

40 00:03:50.270 00:03:52.850 Awaish Kumar: We want to use this,

41 00:03:52.850 00:03:53.320 Amber Lin: Mmm.

42 00:03:53.320 00:03:55.240 Awaish Kumar: semantically. Why?

43 00:03:55.240 00:03:55.560 Amber Lin: Okay.

44 00:03:55.560 00:04:02.149 Awaish Kumar: Because using Cortex code, or the Snowflake’s Cortex code, or chatting.

45 00:04:02.380 00:04:08.859 Awaish Kumar: like, I don’t know how, like, how you will standardize. Somebody asks a question, like,

46 00:04:09.010 00:04:13.799 Awaish Kumar: return the top 20 list of companies

47 00:04:14.220 00:04:17.550 Awaish Kumar: that joined… that were present at the CES event,

48 00:04:17.779 00:04:22.339 Awaish Kumar: which are also in the Fortune 500, right? So…

49 00:04:23.460 00:04:28.230 Awaish Kumar: Like, you might have different answers for that.

50 00:04:28.510 00:04:31.689 Awaish Kumar: And that is the reason, right? Like, not for the.

51 00:04:31.690 00:04:32.290 Amber Lin: Cool, cool.

52 00:04:32.290 00:04:33.710 Awaish Kumar: It doesn’t look good.

53 00:04:34.040 00:04:39.560 Awaish Kumar: But there are some examples where the definitions change.

54 00:04:39.660 00:04:47.530 Awaish Kumar: For… for different… different… questions, there are different ways to get the answer.

55 00:04:47.880 00:05:05.629 Awaish Kumar: And we want to standardize that by semantic layers. So, simple way is, you can use dim and fact tables, right? That is the main ideal thing, that using dim and fact tables, you can basically, with very granular level, you can connect the Cortex code, and it can give you the answers.

56 00:05:06.120 00:05:14.269 Awaish Kumar: That’s the ideal situation. But the kind of data we are getting in for CTA is a little bit messy, and also, there are uncertain…

57 00:05:14.400 00:05:18.340 Awaish Kumar: Their answers change based on who is asking.

58 00:05:19.200 00:05:19.710 Amber Lin: Mmm.

59 00:05:19.710 00:05:25.340 Awaish Kumar: So, for example, if… If CEO is asking, or if a marketing team is asking.

60 00:05:26.630 00:05:31.549 Awaish Kumar: Member engagement team is asking. So, based on the different,

61 00:05:32.180 00:05:40.039 Awaish Kumar: person, like, persona, who is asking the question? Based on that, your answer changes. Your definition changes.

62 00:05:40.150 00:05:41.040 Awaish Kumar: That’s why.

63 00:05:41.470 00:05:55.100 Awaish Kumar: It’s hard to get those things from AI, because then you have to maybe write some guidelines around it. I don’t know if you have to put some… like, in our Cursor, we have playbooks, right? I’m not sure if we can put the playbooks

64 00:05:55.260 00:06:03.010 Awaish Kumar: in Cortex. Then, like, if it is a question by marketing team, then use this definition. If it is a question by…

65 00:06:03.560 00:06:07.909 Awaish Kumar: some other, team, then… Yeah.
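
A minimal sketch of the playbook idea raised here: the asking team selects which definition gets applied. The team names come from the discussion, but the two definitions of an "attending company", the field names, and the sample rows are all hypothetical, and this is plain Python, not a Cortex feature.

```python
# Hypothetical playbook: which definition of "attending company" each team uses.
PLAYBOOK = {
    "marketing": lambda c: c["registered"],
    "member_engagement": lambda c: c["registered"] and c["is_member"],
}

# Hypothetical sample data.
companies = [
    {"name": "Acme", "registered": True, "is_member": True},
    {"name": "Globex", "registered": True, "is_member": False},
    {"name": "Initech", "registered": False, "is_member": True},
]


def attending_companies(team, rows):
    """Apply the team's definition; fail loudly for a team with no entry."""
    if team not in PLAYBOOK:
        raise KeyError(f"no definition recorded for team {team!r}")
    rule = PLAYBOOK[team]
    return [r["name"] for r in rows if rule(r)]


marketing = attending_companies("marketing", companies)
engagement = attending_companies("member_engagement", companies)
print(marketing, engagement)
```

The same question gives two different answers depending on who asks, which is why the definitions need to be written down somewhere the assistant can consult.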

66 00:06:07.910 00:06:12.340 Amber Lin: I see. I do think that’s possible down the line, but yeah, continue.

67 00:06:12.340 00:06:19.850 Awaish Kumar: Yeah, and then one more thing, and also, it’s… there are, like, in this,

68 00:06:21.720 00:06:29.429 Awaish Kumar: In this agg, not everything might be just, like, some… there may be some filter, maybe some,

69 00:06:29.670 00:06:34.970 Awaish Kumar: some pivoting because of the shape of the answer, like,

70 00:06:35.430 00:06:41.460 Awaish Kumar: It is… so if… what percentage of the media companies joined CES?

71 00:06:42.450 00:06:52.520 Awaish Kumar: Find the total, you need to find the media, and for the media, like, there are also different types of categories, so different media categories, and then for each of those categories, you are going to

72 00:06:52.840 00:06:56.229 Awaish Kumar: find the percentages. So, these are… aggregates are doing that.
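
A minimal sketch of the kind of calculation an aggregate table would precompute: find the total, find the media companies, then take a percentage per media category. It assumes the percentage is taken over all attending companies, which is one possible reading of the question. The rows and field names are hypothetical.

```python
from collections import Counter

# Hypothetical rows: one per attending company; category is None for non-media.
attendees = [
    {"company": "A", "media_category": "broadcast"},
    {"company": "B", "media_category": "broadcast"},
    {"company": "C", "media_category": "print"},
    {"company": "D", "media_category": None},
]


def media_share_by_category(rows):
    """Percentage of all attending companies that fall in each media category."""
    total = len(rows)
    if total == 0:
        return {}
    counts = Counter(
        r["media_category"] for r in rows if r["media_category"] is not None
    )
    return {category: 100.0 * n / total for category, n in counts.items()}


shares = media_share_by_category(attendees)
print(shares)
```

Changing the denominator (for example, to media companies only) changes every percentage, which is exactly the kind of definition the aggregates pin down.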

73 00:06:56.370 00:07:08.279 Awaish Kumar: So, it might help you with the standardization, but if you find a way using Snowflake that, okay, we don’t need to create these aggregations, but instead, if we can write

74 00:07:08.750 00:07:12.399 Awaish Kumar: That down as a playbook, it will work, then that’s good.

75 00:07:12.530 00:07:13.309 Awaish Kumar: That is also…

76 00:07:13.310 00:07:14.720 Amber Lin: Yeah. Okay.

77 00:07:14.720 00:07:15.760 Awaish Kumar: Okay, we can vote.

78 00:07:15.760 00:07:30.329 Amber Lin: Sounds… sounds good. Right now, I’m using these aggregate tables and these, like, reports, which are also, like, pre-aggregated things. I think what I’m gonna try today is,

79 00:07:30.700 00:07:37.820 Amber Lin: I have it in here. Like, right now, it’s all… it’s all based on these aggregated tables.

80 00:07:38.490 00:07:44.729 Amber Lin: Sorry, this is very messy, but I can define… and then I can define these metrics.

81 00:07:44.790 00:08:03.040 Amber Lin: based on that. So, I think it is possible for me to directly use dim tables, but I need to check if that’s the best practice. But, like, I… that’s kind of what I was, like, asking you was helpful, because I’m gonna… I’ll experiment with these two approaches.

82 00:08:03.170 00:08:07.459 Amber Lin: I feel like we’re gonna end up using aggregate tables.

83 00:08:07.940 00:08:24.029 Amber Lin: It’s just that it’s harder to join between these aggregate tables, so I, like, I wouldn’t be able to answer product categories immediately, or from… like, we’ll see, but, like, that’s helpful to know that these are usable.

84 00:08:25.260 00:08:25.709 Awaish Kumar: Yeah, I know.

85 00:08:25.710 00:08:33.350 Amber Lin: this says audit, and I’m not sure, like, if this was just, like, one… if this was just, like, a one-time table.

86 00:08:33.710 00:08:41.390 Awaish Kumar: Yeah, this is basically based on just one of the reports they wanted to create, and that is called CES Audit Report.

87 00:08:41.780 00:08:42.400 Amber Lin: Well, they don’t.

88 00:08:42.409 00:08:53.189 Awaish Kumar: So we wanted to standardize the definitions, so we came up with these aggregated views, but, like, that’s my point, like, if you can come up with

89 00:08:53.349 00:09:05.359 Awaish Kumar: this definition inside of… the… the context itself, right? If it is a… if somebody is asking for… media…

90 00:09:05.459 00:09:08.959 Awaish Kumar: companies and their percentages in CES events.

91 00:09:09.179 00:09:24.539 Awaish Kumar: And then what is the answer? And if it is asked by one team versus the other, then what are the different, like, the variations we can do? If we can do that using AI, we don’t need to get these aggregations, we can just go from base tables.

92 00:09:24.860 00:09:39.209 Amber Lin: Yeah, yeah, okay, cool. I… I… that’s what I’ll experiment… experiment with today, but, like, at least they have a semantic thing they can use to ask from these. Like, I’ll… I’ll try and probably also…

93 00:09:39.580 00:09:50.110 Amber Lin: check online what the best practices are. Yeah. That was all my questions. I think Utam said he’ll join, so I’ll stay on, but, like, that’s all I had.

94 00:09:50.510 00:09:50.980 Amber Lin: To add.

95 00:09:50.980 00:09:52.409 Awaish Kumar: Okay, cool, thank you.

96 00:09:52.410 00:09:53.120 Amber Lin: Thank you.