Meeting Title: DataOps Planning
Date: Jan 26
Meeting participants: Ashwini Sharma, Katherine Bayless, Kyle Wandel, Chi Quinn

Transcript:

Them: Be pretty dangerous, but I would love to have you guys and their team kind of maybe get on one of the calls and just talk about, like, what’s possible, because I think a lot of times they’re sort of, you know, somewhat limited by our ability to kind of deliver technically on our side as well, but. If we’re going to start persisting the data ops ID kind of across systems, then they want to be able to leverage it too. And I do have commitment from the marketing team that they will start using UTMS properly this year.
Me: Cool. Cool. Yeah, I don’t know if they’ve given you a plan or if we can even help with putting that all together, but, yeah, maybe a call. On my team, Zoran, he runs a lot of tag manager pixels, so I’m happy just for us to get on the phone and then sort of help be the middle layer.
Them: Yeah, that’d be awesome. We’ll definitely take you up on that. Okay, the improve on site attendance tracking. I mean I think we could all agree it would be nice if we didn’t have broken scatter data. There’s got to be better ways then the same with the special programs processing. And so this was the sort of process that I was running up through CES this past year to determine, like, you know, which reg path and benefits you were eligible for. Frankly, I think some of the improvements are going to come from, like, just not having 30 different reg paths. I mean, really like, that, to me, seems like the easiest place to fix some of the chaos, but probably we’ll still have 20. And so figuring out, you know, what a better kind of pipeline looks like for that also a dependency there is. I’m hoping that the registration vendor Merits will be able to set up a web hook for us so that it’s not just this, like, nightly drop of a CSV file onto an FTP, but we can actually send these across in real time. They said they would be able to. We’ll see what they come back with this year. And then post session surveys. I know that obviously this kind of overlaps a little bit with the market research team in terms of like the surveying piece and potentially the analysis. But frankly, I mean, I think we should have those basic four question, you know, QR code at the end of every session at ces, type surveys in place. And so I think if this team is the one that has to kind of, like, push to make it happen, that’s fine, because I just think we need to start getting more survey data kind of coming through the flywheel so that we’ve got that qualitative and quantitative picture. Okay, then under should do the aws, the landing zone and control tower work. This is probably going to happen. I don’t know why. I’m just still dragging my feet about it a little bit. Same with the Shopify stuff. I’m assuming that this will end up going forward. 
I just am waiting for confirmation from the marketing team that they’re willing to kind of kick that one over the fence. The Zoom AI companion piece. Is sort of saying, like, the pipeline that I want to put in place for, like, our team calls and all of the sort of digital exhaust of our conversations. I’d like to kind of scale that out to some of the other teams. And there are use cases on the membership team in particular, where they are doing sort of, like, very specific division board meetings that have a certain, like, sort of template for their meeting minutes and that kind of stuff. And so, like, I just see an opportunity for AI to bring some Zen to all of these workflows and sort of with us as the guinea pig. Salesforce and DocuSign. This is something I’m hoping we’ll be able to tackle a little bit this year. Truthfully, it probably looks more like bringing in somebody from the outside to just kind of really go deep with that team and understand exactly where the issues are with the current config and salesforce and docusign and what can be improved. So I see us in sort of an advisory and steward role for this work, just kind of making it happen, but not necessarily being the ones to do the technical lift. Okay. Yeah. Consolidating data platforms with market research. I assume that as long as we are, you know, continue to be friendly with them, that they’ll be fine with this. But I just think if we can bring all of the organization’s data onto the platform world building, that makes more sense than continuing to maintain dozens of SQL servers running around. The next couple do get into the CES stuff. So event point we currently use for speakers and content at the conference. On the conference side of the event. There’s interest in using it for more of the event registration capabilities. 
So potentially replacing CVENT and some of the stuff that gets captured there and inform Stack eventually maybe making a move to replace Merits, but that would be further down the road at least kind of corralling some of the smaller events I think is a good short term goal, and then as part of that, Probably starting to see if event point makes sense as a CRM for CES right now. Merits keeps like one or two years back of our data, but then they do delete it and so like our copies, those CSVs that we have are all that exists. I think we can probably figure a way to use event point as the, you know, sort of CRM system of record on an ongoing basis, so that it’s not just the data warehouse that has this source of information. And I think the team would be pretty friendly to that, too. Mobile app navigation and UI enhancements. To be honest. I think our mobile app’s OK enough. I mean, I don’t love mobile apps. I’m just right. I’m not the right audience here, but it does sound like there’s a lot of, like, interest in, like, some tiny things that could be better. And so if the CES tech stuff comes this direction, then kind of figuring those out will be part of our purview with the vendor. Of course. Like, we wouldn’t have to be coding the mobile app per se. And then the last one is kind of just maybe a little bit more nebulous, but like better data capture from our exhibitors. I see this as twofold. One is I know there’s more stuff we’d like to know about them. Like this year we find or we first started asking about, like, funding status more directly via the exhibitor dashboard. I think there’s probably other data points we could, you know, optimistically put in front of. Them and see if they respond to them. But I also think there are a lot of touch points to exhibitors that are sort of outside the sales funnel, like the Torch, for example. I know market research team did a lot of very specific outreach to exhibitors around the tours. That’s probably data. 
We could try to, like, more passively collect so that market research can just use it versus need to chase it down. And also from a cohesion and branding standpoint, like, probably makes sense to have all communications coming from one channel. Because I know Chris did mention that sometimes people are like, is this real or is this phishing? So yeah, I think just kind of improving some of these non-sales-related touch points with the exhibitors. Okay. And then just to kind of round it out, under could do. We’ll see what happens. Box implementation. I think Jay is still open to this. I’m not sure where he’s kind of thinking about it at the moment, but I hope that we will be able to do some of that work in this year. Similarly kind of up to Jay and where the roadmap wants to go sort of holistically, but separating concerns between the workforce Okta, so like the employee management and SSO and all those things, and then the customer side. Frankly, I think there’s merit to separating the two. We’d still leave them, like, linked within Okta. But in terms of just, like, the oversight of the strategy, I think there’s some merit in separating them because they really are very different use cases. Managing a workforce versus managing an audience. I think this would free us up to look at alternative platforms like Clerk. Also to see if maybe, you know, if we are going to continue using Okta, could we leverage the dashboard better? I know that not a lot of our audience tends to sign in, but seems like if we’ve got the ability to serve them at least something kind of nice when they do, makes sense to try and get value out of that. And then there also seems to be some persistent, like, at least people think there are limitations around Okta and, like, UTMs and stuff like that. So some of these, like, technical weeds to disentangle. And then the last one that addresses the Map Your Show API limitations. 
Obviously probably ought to be under won’t do, because I just don’t know what we can realistically do about another company’s API. But there is a lot of frustration with their API and the rate limits and the bandwidth and it going down occasionally during the show. So, like, I don’t know if there’s things we can do, but we can at least keep it on our, you know, roadmap and explore. And if there is something we can bring to the table, fine. But I don’t expect us to be able to do much with that, to be honest. Probably just replace Map Your Show. And then under won’t do, the Gary bot, which Kyle will know what that means. Nobody else will. Actually, Kyle, why don’t you explain this one? Sure. Very quickly: it’s our CEO’s attempt to basically recreate himself in AI, so that way he’s also there. Yeah.
Me: Common ask. Heard this ask before.
Them: Yeah, I’m not surprised at all. Yeah. He just. I mean, I don’t know how much I can say. Who knows how much longer he’ll be around. So this is one thing that he really wants to get out. But it’s something that I don’t think is super necessary, but, I mean, he does have 47 years of institutional knowledge, so who knows? Yeah. Yeah, I think there could be some value in, like, the work insofar as it would force us to really digitize and collect and formalize some of the old stuff that we have running around. But, yeah, beyond that, I’m pretty sure everybody’s content to let it stay just a Glean bot that, you know, gets it right most of the time. Similarly, the finance team would like Concur and Iron Fud integrated. As a user of both systems, I can understand why. Doesn’t feel like the most important fire for the team to put out, but if we get some free time throughout the year, why not? And then the ExpoCAD upgrade. This one’s actually in won’t do. Not because there’s no value, but because increasingly, the more I start to learn about the system, the more I’m like, I don’t know, this might be a case where old is better than new. I mean, it’s a really clunky, antiquated platform, but I’m not sure that just, like, migrating to the cloud version of it is going to make it better. In fact, it might kind of be worse in a weird way. So I don’t know. I’m still kind of learning and exploring the realities of ExpoCAD, but we also have layers on top of it. Like, true interaction with ExpoCAD is mostly just Tom. Beyond that, the cloud version’s already what’s being used with the sales team because it’s integrated into Salesforce CRM. And then for the attendees on site, we use Map Your Show. And so, like, I don’t know, I’m not feeling the same urgency around tinkering with ExpoCAD as I was maybe back when I had started and it seemed like everybody was just complaining about it constantly. So. 
Yeah, so that’s at a high level. The things that I think we can tackle and move the needle on in this year. Ambitious, probably, but got to start somewhere.
Me: I feel good, too. I’m excited for the Zoom piece. We’ve worked with the Zoom SDK, so there are probably some workflows that we can build around that, which would be really cool. Yeah, and I wonder if maybe some of the AWS Control Tower work, too, we can knock out pretty quickly once we sort of scope that out.
Them: Yeah. I think probably that work becomes a must do if the CES tech stack stuff really does come this direction. Like, I think if there’s a chance that we’re going to be building stuff beyond sort of like, you know, data pipelines, then I’d really rather build it in the fresh accounts versus the old ones and then migrate it later. Just seems like it’ll be easier to build it there versus lift and shift, but yeah. O. Kay. Cool. And I shall turn off the screen. Share. Okay, so let’s talk about our friends in membership, which then kind of opens conversation to some of the other pieces. So I admit. When we got the email from Alicia on Friday, I was like. Like, I don’t know. I just. I feel like Friday, my inbox just went sideways on me. Like, every email was somebody that was, like, looking for a thing and been waiting and all that. I was just like, oh, my God, I’m very overwhelmed. Also, not for nothing, but I think this team is basically, we’re all the same person. In this way, we totally bit off the biggest one first. Rather than starting small, we’re going to deliver the report that has 36 different data sources. Right. Okay. Dream big. So I think I spend a lot of time over the weekend trying to figure out, like, okay, can we just, like, smash this thing across the finish line? What is the finish line even look like? But I think I probably made more progress in identifying the roadblocks than I did in writing any code. Because where I really kind of kept circling back to was the Power BI question, and I did find that old inventory of our Power BI reports and their usage, and I’ll put that in slack after this. But there’s 82 of them according to this, and there’s probably a handful more that have cropped up since whenever the spreadsheet was last updated. I’m sure most of them aren’t super necessary. I mean, kind of glancing through it, I do see a lot that are like, you know, 2024 conferences, 2025 conferences. 
And those could be one conferences report with, you know, a filter. I don’t love the idea of updating stuff in Power bi, but I also, you know, going back to the conversations that we had had around, like, we’re just not quite ready for a new tool yet. I kind of landed on probably makes the most sense, even though it’s unglamorous for a lot of these sort of like post show power BI reports that just need updating. Like the conferences one. Connecting it to Snowflake is probably overkill, especially if there isn’t a plan to use Power BI on an ongoing basis. Right? So, like, I think if we can get the data into a CSV and connect that out to the old dashboard, because it won’t be changing, right? Like, the data is the data. I mean, I guess with the audit there might be a handful that we would have to replace, but I think it’s probably a lighter lift to connect power BI to CSVs rather than deal with getting it into Snowflake. I also kind of like this from a messaging perspective because it sort of, you know,
Me: Okay?
Them: Works in the direction ultimately of moving off of it. Right. It’s like, yeah, look, we’re, we’re trying to get away from power bi. We’re going to continue putting some, you know, data behind a few of these that are really critical that are still in use, but the goal is to get everything. Into Snowflake as quickly as possible. So then I was starting this things for that and I was like, oh, gosh, ok, the member engagement report, if we deliver it via Snowflake, I mean, they’re probably going to have it open like all day on a bunch of different workstations. And we do pay for the consumption for Snowflake. Like, that’s the pricing model there. I don’t think that there’s going to be so much usage of this one dashboard that we would, like, totally go crazy on snowflake billing. But I do think it’s probably something to keep an eye to, because if we do start to see us really increasing costs on the snowflake side, then maybe it does make more sense to move up the, like, bi tool conversation and try to, like, push things more quickly in a different direction. But yeah, I just don’t really know what to expect, to be honest. But I could see where the team might just kind of have this report, like, open all day long. Especially once they find out it does refresh every four hours. I mean, that’s just fun. They’re going to want to see that happen.
Me: So there’s a lot of things on the snowflake side that we can do with caching, and I actually don’t think reads are going to be the most expensive.
Them: Kyle.
Me: And we can tune the read warehouse. But I actually don’t think it’s going to be a whole lot of cash. It’s really going to be the modeling. And once we ingest smoke, we pay when the warehouse is up. And so during business hours is when people are querying. Otherwise, it’s just on. The transform is where there’s going to be a lot of compute.
Them: Yeah. Yeah. I mean, it makes sense. Like, even if they have it up, if they’re not engaging with it, then, yeah.
Me: Yeah.
Them: Yeah. Yeah. So then I was like, okay, great, decisions made. We will persist Power BI via spreadsheet for the ones that are absolutely necessary; otherwise, moving into Snowflake. Then I was like, okay, well, now the next hurdle is the RBAC. And like, okay, I can ask Ian today to let, you know, all of the membership team into Snowflake, but I wasn’t sure, like, well, what role do we want to assign them and what database do they need permissions on? And I dabbled with setting up a Streamlit app, which we can circle back to in a second, but I was like, I don’t even know what kind of permissions I need to give that, so I just left it with, like, I’m the account admin and I’m building this for the moment. So I think we don’t need to necessarily figure out all of RBAC, but I do think we need at least to come up with the decision around, like, what does this initial stage, where it’s probably a little chaotic and messy, look like, so that we aren’t stuck cleaning up a bunch of permissions later, but we can get our, you know, colleagues in there without a bunch of fuss.
Me: Yeah. So we have a typical role-based access control setup that we usually suggest. I can send you both the script and a little bit of a write-up on how we think about it. Maybe, Katherine, if you want to just take a look at that and basically say, this is overkill, or we need to adjust, we could go ahead and execute that script. It just creates all the roles and then grants the right privileges. I don’t know, Ashwini, if that’s what we already did coming in. But if not, then this would be the path forward there. So I can send you a write-up on that.
Them: Okay? And then I guess, probably as part of it, it makes sense to just, like, ask Ian to, like, add lots of people to Snowflake, and then we can sort of get out of that public role. Yeah.
Me: Yes. Just do the role grants. Yeah. So we’ll put them into the role. And then basically, kind of the way we architect it is each environment has a write and a read role. And then we grant those roles to a few other kind of group roles, which are by team or by title. Like, if it’s an analyst role, that gets read and write on certain things. And that way, yes, we just grant people into those higher-level roles.
Them: Okay? Okay?
Me: The biggest thing here is we just don’t want to have to deal with people saying, oh, I don’t have access to this. I don’t want that. Instead,
Them: Okay?
Me: It’s like we have global access to a few schemas, and then anytime we drop something in there, they sort of get access to that. And then also, let’s say the business changes. Well, the fundamental read/write roles on the environments don’t change, just the higher-level ones we can swap. Okay, now we have, like, data engineers; they need write access on broader things. Or we have a different flavor of analysts; they need to change a little bit. So I’ll send you our share script and doc on N, and I think go from there.
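[Editor's illustration] The layered role design described above (a read and a write role per environment, team or title roles that inherit from them, and people granted only into the team roles) can be sketched as a small generator of Snowflake grant statements. This is a minimal sketch under stated assumptions, not the script mentioned in the call: the environment names, team roles, and user name are hypothetical, and it assumes each environment is its own database. The future grants are what give "anytime we drop something in there, they get access" behavior for tables; views and other object types would need their own grants.

```python
from typing import Dict, List

# Hypothetical placeholders: none of these names come from the meeting.
# Assumption: each environment maps to one Snowflake database of the same name.
ENVIRONMENTS: List[str] = ["DEV", "PROD"]

# Team/title roles and the environment access roles they inherit.
TEAM_ROLES: Dict[str, List[str]] = {
    "ANALYST": ["PROD_READ"],
    "DATA_ENGINEER": ["DEV_WRITE", "PROD_WRITE"],
}

# People are only ever granted a team role, never an environment role directly.
USER_ROLES: Dict[str, str] = {"EXAMPLE_USER": "ANALYST"}


def environment_role_sql(env: str) -> List[str]:
    """Create the read and write access roles for one environment database."""
    read, write = f"{env}_READ", f"{env}_WRITE"
    return [
        f"CREATE ROLE IF NOT EXISTS {read};",
        f"CREATE ROLE IF NOT EXISTS {write};",
        f"GRANT USAGE ON DATABASE {env} TO ROLE {read};",
        f"GRANT USAGE ON ALL SCHEMAS IN DATABASE {env} TO ROLE {read};",
        f"GRANT USAGE ON FUTURE SCHEMAS IN DATABASE {env} TO ROLE {read};",
        f"GRANT SELECT ON ALL TABLES IN DATABASE {env} TO ROLE {read};",
        # Future grants cover tables created after the script runs.
        f"GRANT SELECT ON FUTURE TABLES IN DATABASE {env} TO ROLE {read};",
        # The write role inherits everything the read role can do.
        f"GRANT ROLE {read} TO ROLE {write};",
        f"GRANT CREATE SCHEMA ON DATABASE {env} TO ROLE {write};",
    ]


def build_rbac_sql(
    environments: List[str],
    team_roles: Dict[str, List[str]],
    user_roles: Dict[str, str],
) -> List[str]:
    """Return the full ordered list of statements for the role hierarchy."""
    statements: List[str] = []
    for env in environments:
        statements.extend(environment_role_sql(env))
    for team_role, access_roles in team_roles.items():
        statements.append(f"CREATE ROLE IF NOT EXISTS {team_role};")
        for access_role in access_roles:
            statements.append(f"GRANT ROLE {access_role} TO ROLE {team_role};")
    for user, team_role in user_roles.items():
        statements.append(f"GRANT ROLE {team_role} TO USER {user};")
    return statements


if __name__ == "__main__":
    print("\n".join(build_rbac_sql(ENVIRONMENTS, TEAM_ROLES, USER_ROLES)))
```

If the business changes, only the TEAM_ROLES mapping would need editing in this sketch; the environment read and write roles stay as they are.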
Them: Cool. So then I was like, okay, all right. Come on. Surely I can solve one problem this weekend. No, I can’t. So, the Master Data Management entity resolution piece. I think this is actually the blocking factor. Not just, like, blocking in terms of, like, getting things delivered, but not creating a mountain of tech debt. Because this is where I kind of got stuck was like, okay, we have all of the data that we need in this membership engagement report. But the dim organization table, for example, like, those are just the organizations that are in Remembers as they are named and labeled in Remembers. And I can’t remember if it’s only active ones or if that’s filtered out later, okay? Yeah. So at least it’s all of them. But once we start bringing in these other data sources from the exhibitors, from the conference sessions, from the registration, like, we’re going to have companies that don’t exist in Remembers at all. Now, often those are going to be cases where they’re going to want to create a record because they’ll wind up using it for prospecting. But until that happens, we need some way for these companies to exist and get matched. And then I just tied myself in about eight hours’ worth of knots as to where on earth do we start with this. And I think what we can do, because it’s yet another boil-the-ocean type problem, right, I think if we start with the data from Remembers as sort of the backbone for this, and then we’re kind of either adding things in that don’t exist in there and, you know, ideally tying things to it. The reason I think we can start with that as a backbone is because if nothing else, the membership team has done a lot of work to capture all of these identifiers for the different companies. 
Not maybe in the same way that we would through entity resolution work more formally, but they do have all of the Expo Cat IDs for the companies, wherever those are, you know, known and present in one company can have at least as far as David’s report would suggest, up to seven expo cat IDs. I do think like small side note on that, it seems like there is a one of the issues that’s coming out of the Salesforce stuff is that there’s a proliferation of duplicates happening into Expo cad, which I think is where we’re getting some of these extra IDs. But still, we are looking at, you know, one remember’s ID keying out to multiple exhibitor IDs in a lot of cases, but they’ve also been capturing the domains, which is what we really use for the majority of the matching. Because as far as like system identifiers for a company, it really comes down to that. Remember’s ID and the Expo CAD IDs. There aren’t a ton of other places where we have ID for a company. Most everything else is being matched on though either web or email domain of a person that we want to tie the activity back to a company for. So I think the membership team, again, to their credit, they have a lot of work that they’ve done on this and it is all I did verify wherever it is in the Snowflake data share, I think it’s customer alias, I would say, or customer link, maybe both of those. But like the data that they’ve been capturing for these companies exists at Snowflake. And so I think if we start with building a DBT model around company aliases or something along those lines, or, you know, company matching using that data, and then find a way to kind of augment it as we go and identify companies that are like, genuinely not present yet in our data, but should be recognized. I think that at least gives us a place to start talking from, because we know that our team can’t solve this, like, you know, holistically, permanently. 
We’re going to need all of the teams to care about managing their data and, you know, tying it into this and not making more chaos as we go, but we’ve got to kind of draw a line in the sand to start somewhere. So I mean, from my perspective, pretty much the entire day I’ve been also thinking about this, and I did add all of the 2023-26 registration data to Snowflake and started playing around with that and trying to basically tie it into Remembers. And I think we can just do the email domain address situation. But I mean, to Katherine’s point, I do think that the identity resolution is a nice little starting point. And so I think if we can start to figure that out, that would help my process out a lot. But that really is, like, the starting point: how companies tie into our different systems and what are the identifiers? And I think at some point, like, Katherine’s right, we just kind of, like, redo it, basically, and so just have a list of companies, and I don’t know the best starting point for that. I would argue that a combination of CES and Remembers is the best starting point. But that, I feel like, is a good first step, and that gets us to where we want to start going, basically. Because, like Katherine, the entire day I was trying to figure out how to tie in CES registration to make it work with Remembers, and then I was looking back at Ashwini’s prior logic to determine, okay, what is an active member versus what is the current, most recent member ID, all that fun stuff. So I feel like that entity resolution is the first part. Yeah. Yeah, I think it’s just if we don’t have that, at least, again, like, something to scaffold against, I feel like we’re going to wind up building a lot of, like, duplicative models because we’ll be kind of doing the same thing each place, and it’ll be a bear to join them together. So I think, yeah.
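[Editor's illustration] The domain-based matching discussed above, with the membership system's captured domains as the backbone and everything else tied back to it, might look roughly like the sketch below. It is only a sketch under stated assumptions: the alias table, member IDs, and email addresses are made up, and a real version would presumably live in a dbt model rather than Python. Unmatched domains are tallied so they can be reviewed and, where appropriate, added to the source system.

```python
from collections import Counter
from typing import Dict, Iterable, List, Optional, Tuple


def email_domain(email: str) -> Optional[str]:
    """Return the lowercased domain of an email address, or None if malformed."""
    _, sep, domain = email.strip().lower().rpartition("@")
    return domain if sep and domain else None


def match_registrations(
    emails: Iterable[str],
    domain_to_member_id: Dict[str, str],
) -> Tuple[List[Tuple[str, str]], Counter]:
    """Match registrant emails to member IDs by domain.

    Returns (matched, unmatched), where matched is a list of
    (email, member_id) pairs and unmatched counts the domains that had
    no alias. Malformed addresses are counted under the empty string.
    """
    matched: List[Tuple[str, str]] = []
    unmatched: Counter = Counter()
    for email in emails:
        domain = email_domain(email)
        member_id = domain_to_member_id.get(domain) if domain else None
        if member_id is not None:
            matched.append((email, member_id))
        else:
            unmatched[domain or ""] += 1
    return matched, unmatched


if __name__ == "__main__":
    # Hypothetical alias table: one member ID can own several domains.
    aliases = {"example.com": "M-001", "example.co.uk": "M-001"}
    registrations = [
        "Ann@Example.com",
        "bob@example.co.uk",
        "cy@unknown.test",
        "not-an-email",
    ]
    matched, unmatched = match_registrations(registrations, aliases)
    print(matched)
    print(unmatched.most_common())
```

Sorting the unmatched counts by frequency gives a natural work queue: the domains with the most unmatched registrants are the ones most worth adding as aliases first.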
Me: Okay? Yeah. I also think that once we see broken joins, that’ll be, like, the catalyst to go ask people to
Them: Yeah.
Me: Go update in the source system.
Them: Exactly, Exactly. Like, I think that if we can get to a point where, you know, they’re looking at the data in the dashboard and saying, like, I swear there should be more, you know, Samsung attendees in 2024. And it’s like, OK, well, let’s take a look. And like, oh, well, it looks like, though, you know, domain that was used then was slightly different, and so it’s not matching, but if we add it here, then it’ll come through. And I think that is another advantage to leaning on the REMEMBERS data as the starting point, because we can tell people specifically what to do, like, go to that record. In remembers, add that domain in the way, you know, whatever sort of business process N has got defined for that so that it will flow through into our data. Because that was another thought I had, was like, oh, God, we need to, like, manage some sort of, like, queue for, like, updates and edits. And then I’m like, nice. This is getting crazy. I gotta. I gotta go to work on Monday. I gotta have something to say to the team. So I think leaning on Remembers as the system, that will at least be mostly the source of truth for this is where we can start. Okay, so then the last piece on the membership stuff, I think really is just, you know, garden variety communications work, which is my job, which is ironic because it might be the thing I am absolutely the worst at. But, yeah, I do think this is on me to kind of help manage the, like, the narrative and the understanding of. Like, yes, it’s gonna take a little time. We’re working on building things out as rapidly as we can. We want to work with the team on prioritizing the right things because, yes, there’s a mountain of data that’s in remembers. Not all of it is stuff they need right away, and then I think working with the team to kind of deliver iteratively as we go, which segue to the silly little dashboard that I created. Let me see if I can pull this up while I’m talking. 
Like, I think if we can paint the picture of we are building, we are delivering stuff as it comes. Available, but we’re not necessarily going to do, you know, sort of like a, you know, big bang release of like, ta da. It’s all done. And so that communications is something I can work on. Okay, let’s see. And honestly, like, please, this is not necessarily intended to be impressive. I just kind of took like a one shot at having something come together in streamlit to replace it. And so it basically pulled together in the World wars color palette. You know, we’ve got our company drop down for the filtering. It pulled through the bits. That we do still have or that we already have from Ashwini active members report model. And then, yeah, just kind of coming soon for the other pieces. I think stuff like this will help build trust, like, and it’ll help also build understanding of like. Okay, well, we’re working through the data and. Getting it all modeled and pushed out. And so we are aware of this need. We just haven’t gotten there yet. But, yeah, definitely could use a little bit of love and improvement. But it, you know, just as a suggestion of how we could kind of handle this. This from a day that like file data set or is this straight from remembers basically? Yeah. So this is all from the REMEMBERS data, which is why all of it’s really showing up. Is the active member. I was going to say. Yeah, the active members is the starting point. And then. Okay. Yeah. And so I didn’t pull in anything that wasn’t in, like, the prodmarts. So if they’re like, the stuff that you’ve been working on for the CES data isn’t here because it’s not in the, like, prod or it’s zone yet, but, yeah. Actually small side note, too, as part of this. I think, like, another thing that this will do, delivering, you know, kind of incrementally as we get stuff ready, is surface the pieces that are kind of funkily missing. 
So, for example, one of the components of that member engagement report that I don’t think is the most important piece of it by any stretch. But is the, like, primary, primary category or primary business tag that they have. I can find the category data in the data share, and I can find, you know, the company and all of the categories it’s associated to. But I don’t see anywhere that we know which one is primary. I have a sneaking suspicion it is simply whatever is the oldest alphabetically first that’s coming like through his primary on the UI side. But there is a place for a user to like check the box for primary when adding the data so it’s clearly like the very least, the UI thinks this data point exists, but it’s not on the backend for our purposes. I think all we need to do is say, like, can’t find that? And then, you know, go back out to remembers and I can, you know, handle that piece right, and just say, like, hey, there’s a, you know, a data flag that we don’t seem to see in the tables. Can you help us understand? Like, do you just need to add it? Where is it hiding? Why doesn’t it come through? Is the UI a lot clarification from them, but what I’ve been told, not from them in the past, but from my understanding, just the tech industry, that everything changes a lot. And so when you have Samsung or like Google, who’s in like four or five, like probably 10 to 15 different verticals. Like, how do you classify them? Are they a retailer, are they a manufacturer? What are they, basically? Because, I mean. I mean, ideally, they’re probably a manufacture. They’re probably manufactured, but at the same time, they probably don’t want to be viewed that way. So I don’t. Know. Yeah. Yeah, yeah, I agree. That’s why I’m like, I don’t think this is the most high value data point, but I do think surfacing it as like something that remembers, needs to know is not coming through in their data share. Yeah, like, kind of interested at the moment. But yeah. 
I feel like there’s one other piece I was going to. I don’t know. It’ll come back to me. Anyway. So, yeah, those were. Those are kind of all of my thoughts around the membership piece. Other thoughts? Questions, disagreement? Concerns.
Me: Yeah, I guess so. That’s what we can focus some of our modeling time on this week. And just continue to land as many data sources into Snowflake as possible. And maybe, Ashwini, we can work together on, like, trying to create a first version of maybe these join IDs.
Them: And I have. I do have stuff going, so if you guys want help on and you want to spread it out, please let me know. Like, I have metric names coming out where the the source data is. Not necessarily the exact table, because I haven’t put it in a snowflake yet, but I have put some of the stuff into Snowflake as well. Mainly just to see yes reg data, but I can put more in. Really, whatever we want to do. Definitely help. Would love to help you guys do this.
Me: Cool.
Them: Yeah. Yeah, I mean, I honestly, like, I. I don’t mind helping either, like, to the extent possible. I mean, I know theoretically quite busy, but, like, I was. When I was working on it this weekend, like, it does feel pretty easy to pull together these DBT models. Like I said, I just kept running into, like. But we need to give them access and we need to handle matching and, like. Right. But, like, it seems like, yeah. Like, if we have the data structure and the mappings and, like, we’re pretty familiar already with the type of SQL we were writing against it to do reporting and analytics. And so it’s like, it translates pretty nicely and AI helps with following sort of the pattern. So, yeah, I was telling them on Friday, Katherine that Swini did on Thursday, and I’ve now been able to do it by myself a few more times. So you’re right, it’s not too too difficult. It really is. Just figuring out the SQL. And what I’ve been doing and doing the postgres in data warehouse, doing the SQL there and then converting it over. So, I mean. It really is just like trying to figure out the exact business logic behind everything and how everything maps and do we use straight up email or domain or do we try to have a better version of how to connect it? And so that’s. And then where I got stuck on this morning is like, what is obviously not obviously. But in the organizations table, in the customers table, like, there’s some organizations that have, like four or five different impacts you might use. It’s like, why is that? And like, how the heck. And this is where I was trying to look through the logic for screening in terms of, like, what is the most recent modified whatever version of this name, company or whatever. So, like, little stuff like that, which we don’t know. Would be great to know. Yeah, I know. Actually, it’s when I was, like. 
When I was working with the data briefly, was down a very silly rabbit hole that I abandoned, but I happened upon the, like, records for Amazon and I was like, okay, these are, like, neither an incorrectly consolidated set of records. Like, it’s not just Amazon.com and Amazon Web Services and then the rest, it’s like basically whoever we’ve had to sign a contract with. Kind of seems to be the logic, right? Like, if we had to sign a document with a company with this name, they got a record. And I don’t, I don’t have any answers, as to whether or not that’s how we should continue to do things. Part of me is like, yeah, I guess the need for a document to have a certain name is really the business driver of record creation. So maybe that is the differentiator, even if it’s not really reflective of the reality of their actual legal status? I don’t know. But I do think if we can start pushing things out, we’ll start figuring out how much we really care about some of these questions. Like, you know what I mean? Like, I feel like sometimes when you’re, like, looking at everything.
Me: Yeah. You just want someone else to review and make the tie break.
Them: Right? That was the rabbit hole I went down. Was. Could we just build a little streamlit app where they could go in and. And confirm or deny matches? And I was like, no, this carries. Well, why would we do this? This is not the way to do it. But, yeah, like, I think. I think we’ll figure. It out as we go, but it’s going to be kind of messy. Messy, messy. Oh, I know what I was going to say. This is a tiny note, but I need to put it into the ASANA board as well. But with the session scan data, Jackie Black from the conferences team reached out and had an interesting idea which was could we take the data from the events in Cvent and essentially kind of fake scans because I guess they do record the, like, attendance in cvent for the lit dinner and some of these breakfasts and stuff like that. And so I think, I mean, it’s probably a pretty light lift. Just a matter of formatting the C vent data to match the scanner data and then bringing it. In that way and somehow identifying, like, that this wasn’t really from a scanner, but. But, yeah, I thought. I mean, not a bad idea. And it does kind of help again, with, like, that narrative around, you know, we do want all of this to be in one place eventually, so we can do some manual work initially, but. You know, wouldn’t it be great if everything did just come through naturally? So, yeah. Okay? So with that, then I think the other pieces I had were actually. Yeah, so we already kind of talked about the Zoom pipeline thing a little bit. I think really the only action item for the moment is just, can we get that scheduled? Because I realized it was. I wanted to kind of figure out what things I want to. Put in place before. I set these up to be like, recurring with the AI companion on by default and blah, blah, blah. Just figured I’ll get. So if you’ve got time this week to help with that piece, then we got that one out.
Me: Okay? Great.
Them: Okay? And then.
Me: Do you happen to have Zoom, like admin on your side?
Them: I. So I guess I don’t know if I’m like an admin per se on the account, although I can become one if I need to. But I do know I have the fancy pants license for all the AI features because they bought like five of them and I got one.
Me: Like super, super admin. Okay. So maybe, yeah, maybe we can talk about what the scope is. And part of it is like they’re getting some stuff into Snowflake. Part of it made. Yeah, maybe you just saw their processing. So, okay, that makes sense.
Them: Yeah.
Me: Cool.
Them: Okay? Polyatomic, do we? Do we know? I mean, I guess. Should I just reach out to GALLB and ask for status on that? Because I think we were waiting for him to list it in the marketplace.
Me: Yes. I just checked on that, actually, like, 20 minutes ago. He also. I forgot where he mentioned the latest status. But, yes, we should just follow up. I can do that if you’d like me to.
Them: Yeah, if you don’t mind, if you’re already betting, but yeah, because. Yeah, I mean, I think as soon as he has it listed in the marketplace, we could at least start using it for some of the, like, low hanging fruit connectors. Like, even if we don’t necessarily model the data right away at least we can start pulling it in and landing it.
Me: Okay?
Them: Yeah. Okay? So then, last thing on my brain was just to kind of go through the asana board. Partly to start building the muscle of reviewing the board during planning, but also to get genuinely status on some of the updates or some of the items. So let me see, I can share my screen.
Me: And then I can also add some of these follow-ups to the Asana board today, Katherine. That way we’re going to keep everything centralized.
Them: Yeah. Yeah, that’d be awesome, actually. Yeah, I kind of. I went back and forth. This is another one I tie myself in. Not so. It’s like, you know, how many things should I put on here? Should I put on the things that it takes more time to add to asana than it does to just do them? And probably sometimes I should, because otherwise I walk away from my computer and I just completely forget what I was supposed to be doing. But, yeah. So right now, right now, this is mostly being powered by request form that we’re starting to socialize, having all of our colleagues use to put a like request into the team. That way versus Email or Slack. I’ve actually, I’ve been pretty impressed so far. People seem totally chill with using it. I was afraid there’d be, you know, some crankiness. I’m sure we will encounter it eventually. But. Yeah. Okay. So let’s just kind of go through these. I think the pending one. This was the one I tackled for Dave with the global impact data, and he sent this for review earlier, so we’re going to assume this is done. Okay, so then. Scanner reporting. Kyle, do you have the latest on this one? Oh, you’re.
Me: Hi. I’m here.
Them: Sorry. Thank you. Yeah, it’s done. I did it to the good to point the correct data set. But the power BI is ready and Johanna has to take a look at it, so it’s good. Okay, cool. Do you want to put it in pending or should you just move it all the way to complete? Yeah, I’ll put it in compendium or. Wow. Cup ending complete. I like it. Depending. It honestly should be a status. Oh, yeah, the exhibitor investor interest response is I did grab these and I put them in the S3 bucket. I forget the folder structure that they’re under, but I think exhibitor details or something like that. I think as far as I understand from the request, it’s mostly just this will be included somehow in member engagement report or other data for membership. I think this is the only ask that we’ve had for them so far. What is this data? This was the. For the Investor Partnership program. The, like two questions that the exhibitors could answer in their dashboard. They’re like, are you interested in funding and how much funding do you have currently? Nothing. The data lake you have going. Yeah, I think it’s under, like, exhibitor. Details is where I parked it. Okay? Okay. Yeah. So it is in there. I think, actually, Michael Brown wanted to use it as well for, like, a marketing outreach piece. But I’ll leave it as in progress. What we work on, figuring out where it goes. Would we at least have the data? Okay? Okay, report VP and above attendee accounts for bcg. Oh, this is an old one. This one’s probably done, so it’s probably you, Kyle. I looked it up a while ago, and I don’t remember anything, but I’m pretty sure it is done. Yeah. I mean, if not, it’s at least forgotten. Yeah. Complete. Sorry, Mark. That is complete. Okay, Shopify sponsorship purchases. Y. This is a good question. So, yeah, so Lindsey had reached out and said, now that we are using Shopify for selling the sponsorships, like, do you want us to put that data anywhere? 
And I’m like, I definitely don’t think I need to ask you guys to do it, but it is something to keep on.
Me: Yeah, we should just bring this in via Polyatomic. Is this a Jay request? Because we’ll need an API key created for Shopify.
Them: I don’t think it has access to Shopify.
Me: Yeah. Okay, then. Okay.
Them: I got you.
Me: Okay, cool. So then I’ll just send you. I’ll send you instructions of what we need. And we’ll at least just land this.
Them: Okay? Perfect. Yeah. Yeah, I think Jay hates it enough that I don’t think he even has access. Oh yeah, this one I know is older. The virtual press conference impact. Kyle, you might have tackled this one back in the fall. If not, I think we can just shuffle it over to complete because the idea was just, you know what it sounds like, like just virtual press conference. People also come to CES and I think it was. Just not the most exciting question. I don’t think I got. I don’t think I got shown this. So first. I’m not seeing this one. Okay, I’m going to pop it over to complete for the moment. It was like one of those curiosities that just got lost in the shuffle and is not a problem. Okay, update power bi reports.
Me: Random question, Katherine. Is there, like, an asana Slack integration?
Them: Is there one? Yes. Are we leveraging it? I don’t think.
Me: Okay. Just because I feel like I’m a culprit. I put things in Slack, and it would be nice to just create a ticket from a message.
Them: Yeah.
Me: They have to have that functionality.
Them: That’s a really good point. It does have that functionality because I have used that. Yeah.
Me: Okay? Okay?
Them: Yeah. Yeah, we should absolutely set that up. Yeah. Create Asana task. Yeah, it’s actually a great point because I tend to work in Slack way better than Asana, so, yes.
Me: I just. I like to chat, keep everything in slack and then, like, just create from there.
Them: Yep.
Me: Otherwise, I don’t like it, because then you can get in the hole of leaving comments. Oh, I left a comment in the Asana board. Oh, that’s a comment in the Google Doc. It’s like there are too many places with notifications.
Them: Actually. I mean, kyle and I were.
Me: Everything has comments as a feature. It’s like, the worst.
Them: Like. Yeah, you’ve reduced user friction up front and caused chaos on the.
Me: So, like, turn off the comments feature, like, have it sent to Slack or something.
Them: Yeah. Honestly, I feel like a unified inbox for, like, everything that needs my attention is, like, the number one tech thing I need to exist. Side note on that, I was trying to get Claude Cowork to help with my email.
Me: Yeah.
Them: Cleanup. Over the weekend as well, because again, I was determined to solve one problem and I failed. Not kidding, guys. It came back and was like, so I’m gonna be honest with you, if it takes me 30 seconds per email, this is gonna take four hours, and I just don’t think that’s a good use of my time. And I was like, you’re the robot.
Me: I actually had helped me with something because I have my personal laptop I’ve been using since 2020, and it’s starting to slow down. And I was like, I have like, you know, everybody has like a messy down. I’m like, I guess I have a messy downloads file and documents folder. And then I was like, okay, well, anything work related, can you first flag? And then I’m just going to upload all that to Google Drive so we can delete it off my laptop. Second piece. I was like, can you just scan everything and then, like, propose a new, like, organization structure for documents folder? And it did it. And it like, that’s great. It’s essentially like a Sunday mindless task that I was never going to get around to doing.
Them: Yeah.
Me: And it worked. And it’s all local, I guess. Well, yeah, I guess most of it is local or somehow works, but.
Them: Yeah. It is really good at that because I did the same thing. I like let it loose on my downloads folder and asked it to categorize like probably junk, probably should put somewhere.
Me: Yes. Exactly.
Them: Important. Yeah.
Me: And then how does it delete? And it was like, are you sure? It’s asking, like, five times. Are you sure you want to let it delete? I’m like, just delete it. Whatever. At least it’s making the decision. I’m like, don’t ask me too many times. If I’m nervous about this, then I’m going to have to go one by one anyways.
Them: Exactly. Oh, my God. All of a. My data hoarding kicks in. I actually did create an S3 bucket called Data Horde, because I was like, I just. There’s files that I have that I have no idea if we should keep or not, but I’m just going to dump them in there, and that’s. A retention policy question for later. It auto deleted after 10 years and nobody noticed. And I think that’s. That’s all the answer we needed. Okay, so this one. This is a list of power BI reports that do need probably at least someone to look at them, and then, yeah, maybe, like, putting a CSV behind them, if that’s enough to get the job done for the moment and. Or if they are things that are, like, close to ready in snowflake. Then we can just kind of see about pointing them there. I mean, I do think our mileage will vary with the different teams and, like, willingness to go to Snowflake for data. And as much as I’m totally that person who’s like, come on, we’re all adults. We can learn how to use new software. We do it in our personal lives all the time. Probably should start organizing some trainings is that kind of thing. So, like I said, kind of the comms piece for me to tackle, but. But, yeah, maybe. Kyle, if you want to take a look through these. Yeah, I started to look into them and they were pretty very basic just how many people attended, how many media, how many industry attendees. So it’s, it’s very like high level basic questions. It goes into the details more like at the maybe country territory level or whatever, some of the categories that they had to fill. Out during registration. So including, like, what type of influence are you? So it seemed very straightforward. I did take a look at that, but I can look at that and just kind of make a list of the items that we specifically need for those reports. Okay. Yeah. And maybe with a focus, too, on, like, where do they, like, kind of overlap or could overlap or, you know, could be one report with a filter for year instead of four. 
Report. Yeah. Yeah. Actually, I also learned that Claude Cowork can read PBIX files, and so you can ask it to go through them, because it’s like, oh, it’s just a JSON archive. And like, oh, okay, I wouldn’t have known that. And so I was able to get it to go through the file and extract all of the fields used, the calculations and their logic, and then the tables that the fields were sourced from. So kind of handy, honestly. Member engagement report. That was how I found out it’s 36 data sources and 125 data points. And I was like, okay, great. No big deal. That one first. Why wouldn’t we do that one first? Sure. One thing I did, I put in the chat the table name and the schema name that it’s in in Snowflake. So now that the data is in Snowflake, you can use that or you can use Postgres to answer those questions. Yeah, okay. Yeah, I admit, I have also guiltily been still using the Postgres database and then jumping to Snowflake from there, which is just funny to me because in theory there is no difference. But the behavior is so strong, my gravitation towards the DBeaver-like interface. Trying to get more familiar with Snowflake, but change is hard. You just know where everything is. Yeah, that’s what I know. Exactly. Oh, yeah, the final attendee list to More I can’t remember. Did we tackle this one? I don’t think we did. I’ll put this one in pending. I don’t think so. I can grab that one. Okay? Okay, so this was just kind of duplicative with the scanner stuff. Is it the scanner data? Yeah. It is interesting, actually, that she shared the staff person who ordered the scanner. Interesting. So we did have an issue with the Foundry. Like, I guess there were some sessions from IBM that did not have a scanner present at them. And so even though their contract says that they’ll receive the attendees from their session, if we don’t know who those attendees are, we can’t share them. So I think. I don’t know. 
I guess the sales team is kind of working on figuring out what we can do. For IBM to make them happy, it might mean giving them some data that’s not from their sessions, which I. Don’t love from a legal perspective, but, I mean. I don’t know. Probably they’re the same people that would have gone to it anyway. I don’t know anyway. But the reason that comes up is because I guess if this is the list of, like, who ordered scanners for what, in theory, we would see the gap around somebody ordering them for. IBM for the Foundry again, though. All the more reason. We just need a better way to track attendees on site. Okay? Up next. Was that a duplicate? Because I remember I saw something about the Samsung or there might have been something different, but I saw the email today in regards to. I think that was from membership. Yeah, this is a duplicate. I did this last week. So this was the. I don’t know if it came in twice from that’s the thing is that I think, like, it’s coming in twice. One came from membership, one came from Kyle, because I added the one from Kyley in, and then this may have came from membership, actually. Yeah, we’ll probably see a lot of that happening as we get folks used to, like, single point of entry. It’s also kind of informative to find out how many, like, duplicate requests we do get. Okay, Foundry. We’re not just do book a request but duplicate effort. So clearly they didn’t really communicate to that n straight to our data team. And then they also went to the membership team. Right, Right, exactly. It’s. Yeah, it’s a good point. It’s duplicate work on the request putting in and potentially the request getting answered. Yeah. Yeah. Okay, this looks pretty straightforward for Foundry sessions. I can tackle this one, Kyle, if you want. Or if you want to knock it out. Either way, that’s just from the batch games, right? Yeah. I can take these honest from Dave. Yeah. Do you want to put me as the person or assign it to me? Oh, y. Eah. 
I forgot about that. Yes, yes, it helps me a lot. Yeah. Sure I should go through the other ones and do that. Good call. Okay? And then the rest, I think. Oh, look, there you go. Shopify data Snowflake via polyatomic. The rest are just the backlog y ones that I had dropped in before. So I don’t think we need to go through those. Okay? Anything else. That either isn’t on the board and should be, or isn’t on the board. And could be or.
Me: Yeah, I was going to say I wrote down a couple things. So one is, oh, also S3 access for Ashwini. He’s got into Snowflake. He just needs the role ARN and some access for S3.
Them: Okay? Okay?
Me: And then I’m going to go create tickets just in the backlog around basically creating maybe like a one-pager on tagging attribution, like GA, and then also on the Zoom thing. And then, yeah, depending on whatever is urgent. Like, I think, Katherine, for some of those projects, maybe we should take one each week to scope out further. If Zoom is this week, then maybe we can meet for an hour or two and just get everything Zoom-related into one doc. And then if we want to do something on GA, that way we can sort of start to help the other team, Orange Spark. Is that right? If we can start to help them, maybe getting insights or understanding their roadmap, maybe that’s next week. So that’s kind of how I think we can handle some of the less defined, larger rocks, you know?
Them: Yeah. Yeah, I think that makes sense. I think probably for the. In order to, you know, be nice to the membership team. Maybe the focus for this week and next is like, we’ve got to get that engagement report out the door. I mean, to be fair, they do get these briefers kind of constantly. Like, I got two or three more this morning that they, like, need to do. All the, like, work on researching the company for. So, like, I do think it makes sense to put that above all. All the other stuff is just, let’s get that member engagement report. As much of it as possible delivered. I think if we can deliver all the pieces where we have the data and the limitation is just around that matching, then I think if we can unblock the matching and deliver the data we do have, then we can work on, you know, tracking down some of the more fringe pieces, like. Some of the engagement stuff that’s kind of nebulous and things like that. But I do want to that into a good place before we start having other adventures. My biggest thing, and maybe what we can do for this week at least, is to focus on how CES engagements tie into it and then also maybe, like, committee stuff. That’d be the two maybe areas. It’s not overstretching that could get us a lot of work done. But what? I don’t know. What do you think? I agree. I mean, I think the committee stuff, to me is a good one because it’s, it’s probably pretty easy once we’ve figured out those joins, which I think, Ashwini, you’ve made some progress there and then. Yeah, I think, like, if we can get that CES data modeled and. Put all the way through to a prodmart that’s going to come in handy in so many places. So. Yeah, I agree, Kyle. I would prioritize the CES piece. And then for this is a question for everybody in the room. When doing the, like, the CS registration stuff, is it better to. I guess this is the conversation that we’ve had last week as well. 
But is it better to just import all four years of raw data and then have a staging dbt model to put them all together, and then dive into more of like a CES master, only pulling the relevant columns you need? Or is it better just to keep everything separate and then go straight into the staging table? I actually have the same question because, yeah, when I was messing with it over the weekend, I was like,
Me: Yeah, maybe. Ashwini, you go first. Yeah, maybe I can hear your answer and can think about it.
Them: Kyle, could you. Could you repeat your question? Yeah, definitely. So. So we have the four different years of registration data. Would it be better to go from the four years straight into, like, a staging table, or is it better to combine those four tables in, like, a raw.
Me: I’ll go for the second one.
Them: Okay? So you combine the four table, then have, like, a data set, and then.
Me: To combine them, you may have to figure out unions and types and stuff like that, so I would combine them.
Them: Okay?
Me: Either way, we’re going to combine them eventually, so the further upstream we can do the combination and then basically apply the same logic.
Them: Yeah.
Me: I think this is where, if things are messy when you do that union or you need any help, let me know. But that’s because eventually we don’t want anyone to think about stitching these together.
Them: Yeah.
Me: Ever again. Like, we want to do this once and nail it. You know?
Them: Yeah. And based on that. So combine the four ingests, and then once you have the four ingests unioned, then you can start to develop, like, whatever your staging master table is to look at the different years of. Okay, cool. I will modify that a little bit, but that sounds. That’s what I was thinking about. Okay, cool.
Me: Yeah.
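The "combine the four ingests first, then stage" approach described above could be sketched roughly as follows. This is a minimal illustration in plain Python, not the team's actual dbt model: the function name `union_registration_years`, the `event_year` column, and the sample column names are all hypothetical. It assumes each year's extract may have different columns and types, so every value is cast to text and missing columns are filled with `None` before any staging logic runs.

```python
from typing import Dict, List


def union_registration_years(tables_by_year: Dict[int, List[dict]]) -> List[dict]:
    """Stack per-year registration rows into one list with a shared column set.

    Hypothetical helper: column names and the event_year field are
    illustrative assumptions, not taken from the real registration data.
    """
    # Collect every column seen in any year so no field is silently dropped.
    all_columns: List[str] = []
    for rows in tables_by_year.values():
        for row in rows:
            for col in row:
                if col not in all_columns:
                    all_columns.append(col)

    combined: List[dict] = []
    for year in sorted(tables_by_year):
        for row in tables_by_year[year]:
            # Missing columns become None; present values are cast to str so
            # the same field can be unioned even if its type changed by year.
            unified = {
                col: (None if row.get(col) is None else str(row[col]))
                for col in all_columns
            }
            # Tag each row with its source year so staging can filter on it.
            unified["event_year"] = year
            combined.append(unified)
    return combined
```

Doing the union this far upstream means the staging model only ever has to apply its business logic to one table, which is the "do this once and nail it" point made in the discussion.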
Them: And I think the biggest thing there is just tying it into the customer’s organization table, at least from my standpoint on membership engagement.
Me: Yes.
Them: And so I don’t know.
Me: So we just want to make sure that there’s at least a join, and then, yeah, we can either create a table, or you can say, like, these two tables exist. For a Power BI dash, we need to do the join. We can just do that.
Them: Yeah.
Me: Create a view or something. So once we get to that point, we can talk about the ergonomics, probably.
Them: Okay? Yeah. Okay? Cool. And then maybe I can send you. I’ll send you and Ashwini what I have so far, and then maybe we can start to think about how we can basically join customers and registration stuff. Sure. Yeah. The join piece will be interesting because that was another one where I was like, it’s definitely based on company name. Or I guess we have some exact matches here and there, which I think is good. But then it’s based on company name and domain, so domain name, whatever. And I think that does a pretty good job. I think we missed a couple of people who put their Gmails in there, but otherwise I think it does a pretty good job of matching most of them and grabbing most of the people who actually are working for those companies. So, yeah, I was thinking I wanted to kind of fall back. Like, you know, check first for this and then for this and then for this, which is, you know, a terribly unoptimized way to join tables. And so then I was like, well, how can we kind of smash these into one giant identifier column and, like, I don’t know, just match it on anything that pings? But, yeah. When we were doing the Fortune 500 matching, the best is definitely on the domain. Yeah. Yeah, I think the domains are very much the golden ticket. But again, still not a great way to figure out what is actually a distinct company, because even if they are different legal entities, sometimes they still share a website.
Me: Yes. Yes. So that’s why. I don’t know, like, maybe. Yeah, it’s going to be interesting. Once we get a first pass, we’ll have edge cases that we can think about, like whether there’s any way for us to figure that out. Maybe we just have to have more of a custom mapping sheet in between that is maintained to do splits where we can’t figure that out from the data itself.
Them: Yeah. That was. Yeah, I went that direction too, where it was like, okay, well, maybe we could create a seed, I learned this term, of, like, you know, established matches. And then we could have a way for people to add established matches. But then I realized that the data in Impexium or re:Members is basically that. It’s just been captured and stored kind of differently than we would have done it from the outset, but at least that is essentially what they’ve tried to do.
Me: Yes. Exactly. Okay, then. Yeah. I mean, if it can come from there, then I would push that we just use that. And then if something is broken, we’re like, you have to update that in the source system.
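The "check first for this, then this, then this" fallback described above might look something like the sketch below. It is only an illustration under stated assumptions: the field names (`member_id`, `email`, `company`), the free-mail list, the legal-suffix list, and the three lookup dictionaries are all hypothetical stand-ins for whatever the customers and registration tables actually contain. It tries an exact ID match, then the email domain (skipping personal addresses such as Gmail), then a normalized company name.

```python
from typing import Dict, Optional, Tuple

# Assumed list of personal-email domains that should never match a company.
FREE_MAIL = {"gmail.com", "yahoo.com", "outlook.com", "hotmail.com"}
# Assumed list of legal suffixes to ignore when comparing company names.
LEGAL_SUFFIXES = {"inc", "llc", "ltd", "corp", "co"}


def normalize_name(name: str) -> str:
    """Lowercase, strip punctuation, and drop common legal suffixes."""
    cleaned = "".join(ch.lower() if ch.isalnum() else " " for ch in name)
    tokens = [t for t in cleaned.split() if t not in LEGAL_SUFFIXES]
    return " ".join(tokens)


def email_domain(email: str) -> Optional[str]:
    """Return the lowercased domain, or None for missing or free-mail addresses."""
    if "@" not in email:
        return None
    domain = email.rsplit("@", 1)[1].strip().lower()
    if not domain or domain in FREE_MAIL:
        return None
    return domain


def match_org(
    registrant: dict,
    orgs_by_id: Dict[str, str],
    orgs_by_domain: Dict[str, str],
    orgs_by_name: Dict[str, str],
) -> Tuple[Optional[str], str]:
    """Return (org_id, rule) using the first rule that produces a match."""
    member_id = registrant.get("member_id")
    if member_id and member_id in orgs_by_id:
        return orgs_by_id[member_id], "member_id"

    domain = email_domain(registrant.get("email") or "")
    if domain and domain in orgs_by_domain:
        return orgs_by_domain[domain], "domain"

    name = normalize_name(registrant.get("company") or "")
    if name and name in orgs_by_name:
        return orgs_by_name[name], "company_name"

    return None, "unmatched"
```

Returning the rule name alongside the match makes it easy to count how many registrants fall through to each tier, which helps surface the edge cases (shared websites, personal emails) raised in the discussion.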
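The "seed of established matches" idea could be sketched as a small maintained CSV that is consulted before any automated matching. Everything here is hypothetical: the column names (`domain`, `company_name`, `org_id`), the org IDs, and the helper names are invented for illustration, and the Amazon rows simply echo the shared-website example from earlier in the conversation.

```python
import csv
import io
from typing import Dict, Optional, Tuple

# Hypothetical seed file contents; in practice this would be a maintained CSV.
SEED_CSV = """domain,company_name,org_id
amazon.com,Amazon Web Services,org-aws
amazon.com,Amazon.com,org-retail
"""


def load_seed(csv_text: str) -> Dict[Tuple[str, str], str]:
    """Parse the seed into a {(domain, company_name): org_id} lookup."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {
        (row["domain"].strip().lower(), row["company_name"].strip().lower()): row["org_id"].strip()
        for row in reader
    }


def resolve_with_seed(
    domain: str,
    company_name: str,
    seed: Dict[Tuple[str, str], str],
    fallback_org_id: Optional[str] = None,
) -> Optional[str]:
    """Prefer an established match from the seed; otherwise keep the fallback."""
    key = (domain.strip().lower(), company_name.strip().lower())
    return seed.get(key, fallback_org_id)
```

If the source system already stores these established matches, as suggested in the conversation, the same lookup could be fed from that table instead of a hand-maintained file.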
Them: Yeah. Yeah, yeah. I mean, because truthfully, the biggest thing I want to avoid, other than just, you know, bad data, is the behavior of expecting the data team to manually clean matches, because we have better things to do with our time. Much like Claude doesn’t want to clean my email, I don’t want to clean up people’s matches. And so, yeah, I think the more we can point to a system and say, like, if you enter the data here, it will come through correctly, the better. Well, that’s why I think I like your idea of using members, because I think you and I have talked about how it’s literally physically impossible for two people or three people to basically own every single data set or every single variable, and understand every single piece of business logic about everything. It’s just too much. We’re only human. What else is on everybody’s mind? I put down 90 minutes because I wasn’t sure if it would run the full time. We can also. I think you hit the nail on the head. The engagement report is by far the most important right now. And then going off that, it’s the identity resolution, or how these different data sets connect. And so I think, at least for the first part of this week, we should really focus on trying to get registration connected. And we can do that via domain and email where the Impexium ID is not included or member ID is not included. And then for committee data, Ashwini, I’ll let you take the lead on that, if that would be great. Okay?
Me: I think that’s it from my side.
Them: Yeah. Okay? Cool. All right, well, we can all take some time back then. I will look into the Asana ticket creation from Slack because, yeah, I think there’s definitely ways to do it. You probably could even have it set up to be just, like, a person that you talk to, like, send a message to that user.
Me: Yeah.
Them: Yeah.
Me: And then also, Katherine, can I get access to Glean? Or maybe we can create a separate channel with Glean, so I can just kind of test some things.
Them: Yeah, sure, sure, sure.
Me: Or we can throw them in our channel if it’s, like, not too noisy, whatever.
Them: Yeah, we. I mean, we could always throw it in and then kick it out. Although I don’t know how it would work, actually, since our channel is, like, the external thing.
Me: Oh, yeah. Well, I guess it could be, like, on your side.
Them: Yeah.
Me: Yeah. I don’t know.
Them: We can definitely get you access to it. What will be interesting is, like, glean is only as good as the things you have access to underneath it, right? And so to a certain extent, we might also need to give you access to, like, SharePoint and stuff like that in order for Glean to have your permissions. To see the content, so otherwise you’ll just kind of be talking to an empty LLM, basically.
Me: Okay?
Them: Yeah, it’s funny. I mean, this is, like, the thing that Jay, like, goes crazy trying to get people to understand. And I kind of get it, because end users will either see something in a Glean result and panic, right? Like, oh, God, anybody can see this. Or it’ll be the opposite. They’ll be like, why can’t I see any of this stuff? It doesn’t exist. And it’s like,
Me: Yeah.
Them: It only sees everything you have.
Me: What do you already have? Yeah.
Them: Yeah. Yeah. First day at CES, we got, like, a frantic email from Kinsey that was like, people can see my W4s. And I was like, it’s okay. Although Jay did infamously say in a training at one point that if you ask it for Social Security numbers, it shouldn’t find anything. And it came back with several Social Security numbers.
Me: I guess that’s positive. You can now block it out, but there’s some testing before.
Them: Yeah, right. They should be testing. Yeah. Generally speaking, this organization did not do testing very well. Yeah. No. I gotta not ask you to see if it knows my Social Security number? That’s really funny. Yeah. I was watching a YouTube video over the weekend from, like, I don’t know, an AI and security sort of channel. I watched the weirdest stuff. My brain was in a strange place all weekend. But, like, a guy was basically walking through how they do some of these, like, red teaming attacks on LLMs to get them to, like, cough up data. And it’s funny, like, I guess in my mind they were doing something really, you know, hackery and sophisticated, but as he’s walking through it, he’s literally just like, more. And then it’d be like, okay, fine, here’s all the stuff you’re not supposed to have. Like, really? And, like, feeding it text that looks like text you think it has, and then it’ll give you back the, you know, quote unquote corrected text. So, like, you could feed it, like, Kyle Wandel Social Security number is 1, 2, 3, 4, 5, 6, 7, 8. And it’ll come back with Kyle Wandel Social Security number. Right. Like that kind of thing. I was like, this is alarmingly unsophisticated. But also, yeah, there’s a lot of data that’s baked into these things that we can’t really do anything about. I don’t know. Yeah.
Me: Yeah. A friend of mine just started a startup that’s doing a lot of that. It’s, like, really crazy. As a service.
Them: Yeah, yeah. I mean. Like. Yeah. And, like, what do you even do with some of the stuff that probably shouldn’t be in there? I don’t know. Yeah. Anyway, that’s enough of Katherine’s cybersecurity paranoia. It was really funny. Somebody from Legal, the new girl from Legal, asked me about that on the CES show floor and was like, what do you think about, like, all this data privacy? And I was like, well, there’s probably more that I’m missing, quite frankly, but if somebody wanted my information, they can get it. Like, quite frankly, if somebody really, truly wanted Kyle Wandel’s information, they could get it. It’s true. There are probably other people who are protected way better than I am, where maybe it would be more difficult. But to me it’s not that big of a concern, I guess. Yeah. It’s like, I’m not a millionaire, so. Yeah. Right? It feels like it’s a big concern. I’m just like, what can I possibly do, right? I mean, I don’t know. Data brokers are going to do data broker shit. They’re evil people.
Me: Yeah. I agree. For the most part, people are just trying to get you to buy more T-shirts and stuff. That’s, like, most of the advanced data. All the e-commerce people, like, they have really great attribution, but all they’re trying to do is get you to buy more stuff.
Them: Right. Like these sneakers that follow me around forever and ever after. I’ve already bought them.
Me: Yes, yes.
Them: All right. Well, I guess if that’s all we got, then we can go back about our regularly scheduled affairs. I’ll take on my action items, and then, yeah, if anybody needs me, I’ll be around. Shoveling snow, so. I did five hours yesterday. I got a few more today, so. Good luck. Yeah. Thank you.
Me: Thank you.
Them: Thanks, everybody.