S2 Ep2: Is Your Data Ready for AI? A Practical Self-Assessment (The Data Literacy Show)


Data Literacy · 25.02.2026


Video description
In this episode of The Data Literacy Show, Ben Jones (CEO of Data Literacy) and Alli Torban (Senior Data Literacy Advocate) tackle a question they’re hearing everywhere: “Is our data ready for AI — and what does ‘ready’ really mean?” We introduce DF4AI (Data Foundations for AI), a practical 10-minute self-assessment that helps organizations understand their AI readiness without chasing perfection.

You’ll learn:
- Why you don’t need perfect data to start getting AI value
- The five key dimensions of readiness
- How weak foundations quietly derail AI initiatives

If your team is experimenting with AI (or feeling pressure to), this episode will help you move forward with clarity and avoid costly missteps.

Take the free DF4AI self-assessment: https://df4ai.com/
See full show notes here: https://dataliteracy.com/season-02-episode-02/
Subscribe to our channel: 🔗 https://www.youtube.com/channel/UCo3bzxEm0FSFZsMkAFilY8A?sub_confirmation=1

About Data Literacy: 🔗 https://dataliteracy.com
🌀 Co-founders Ben and Becky Jones started Data Literacy, LLC in 2018 with a mission to help people learn the language of data. To help our customers become more data literate, we design, implement, and continuously improve cost-effective training and certification programs that we deliver online, on-site, and on-demand. We aim to demystify data and to make the learning experience fun and enjoyable. A main tenet of our offerings is that data simply provides a lens into our world and our humanity.
Learn more about our online courses (contact directly for group rates): 🔗 https://dataliteracy.com/training
Subscribe to the Data Literacy newsletter for special discounts & offers: 🔗 https://share.hsforms.com/1ubvVCV85T2acOINZ0qWqKQ34aq6
50% off all our courses & books for students and educators: 🔗 https://dataliteracy.com/education

Find us on social:
🔵 Twitter ~ https://twitter.com/dataliteracycom
🔵 Instagram ~ https://www.instagram.com/dataliteracycom
🔵 LinkedIn ~ https://www.linkedin.com/company/data-literacy
🔵 Facebook ~ https://www.facebook.com/dataliteracycom

#DataLiteracy #Data #DataVisualization #Education

Table of Contents (5 segments)

Segment 1 (00:00 - 05:00)

Welcome to the Data Literacy Show, the podcast that helps organizations build, measure, and level up their data and AI literacy. I'm Alli Torban, the Senior Data Literacy Advocate here at Data Literacy. And I'm Ben Jones, CEO and co-founder of Data Literacy. So, as we know, data and AI are two sides of the same coin, right? We're here in this deep learning era, and all these foundational AI models have been trained on massive amounts of data, and those AI applications are connecting to our data so they can answer our questions and take action for us. So today we'd like to talk about something that's coming up a lot in conversations with our clients right now, which is: how ready is our organization's data for AI, and what does "ready" even mean? — Right? Because sometimes when you say ready, it can kind of sound like perfect. — Yeah. But you know, data is never going to be perfect, right? And that's not what this is about. — So, you know, there's an interesting argument I've been seeing out there right now, like in the data communities on LinkedIn and such: that an organization doesn't really need to worry about having pristine data, or a really robust semantic layer (that'll come up again a little later and we'll define it), or even really full, pristine metadata, if they want to start getting value from AI. In other words, they don't need those things. People are starting to say that, and in one sense I agree with part of it. — Yeah, and we're seeing the same thing here at Data Literacy: we can get value out of AI even when our data is not actually perfect. — Yeah. So I think what we're going to do in this episode is talk about how to start where you are: how you can get some value early on, and also strengthen your data foundations so your AI efforts don't turn into a confidence crisis later on. — Yeah. And when you say data foundations, what do you mean by that?
— Well, data foundations: think about laying a foundation for a building. It's the stuff that makes data usable at scale. You're going to build on top of it, and you're going to make sure that data is accessible. These are often things you don't notice when they're working well, but you're really going to notice them when they're not. — For example, if your data is inconsistent, or really hard to find, or unclear, or poorly governed, — AI might actually magnify that. So you might get fast answers with a nice tone, but those might be fast wrong answers. They might be incomplete. They might be a little bit risky. — Yeah, we don't like that. And AI is really good at saying, "Oh yeah, you're right, thank you for the correction. We were very confident a second ago." — Yeah, that's right. And you know, maybe that's okay in some low-stakes scenarios, but what you might find is that it totally falls apart the moment you try to scale to more teams, or expand to different kinds of questions, or certainly anytime you're in a regulated environment. This is where some of the risks pop up. — Right. And that's why we developed DF4AI. — Yeah. So DF4AI: what does that stand for? Data Foundations for AI. It's a pretty quick self-assessment for organizations to take, maybe about 10 to 15 minutes, and it's going to help your organization get an honest snapshot of how ready you are for AI. It's got 30 questions in five different areas, so six questions in each of the five areas we'll talk about. At the end, you get a scorecard. It gives you an overall score, it gives you scores for the individual categories or areas, and it helps shine a spotlight on what's strong, what's weak, and what might be tripping you up, — right? And the point of it is not to keep you from using AI, like, "Here's a report, you've got to come back next year."
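[Editor's aside: the scorecard structure Ben describes (30 questions, five areas, six questions each, with per-area scores) is easy to picture in code. Below is a minimal sketch; the 1-to-5 answer scale and plain averaging are our assumptions for illustration, not DF4AI's actual scoring method.]

```python
# Sketch of a DF4AI-style scorecard: 5 areas, 6 questions each.
# Assumption: answers use a 1-5 maturity scale and area scores are
# simple averages -- the real DF4AI scoring may differ.

AREAS = [
    "Data Architecture & Accessibility",
    "Data Quality & Trust",
    "Data Governance & Stewardship",
    "Documentation & Metadata",
    "Data Literacy & Culture",
]

def scorecard(answers):
    """answers maps each area name to its six 1-5 responses."""
    area_scores = {}
    for area in AREAS:
        scores = answers[area]
        if len(scores) != 6:
            raise ValueError(f"{area}: expected 6 answers, got {len(scores)}")
        area_scores[area] = sum(scores) / len(scores)
    overall = sum(area_scores.values()) / len(area_scores)
    return {
        "areas": area_scores,
        "overall": round(overall, 2),
        "weakest": min(area_scores, key=area_scores.get),
    }

# Example: solid everywhere except documentation & metadata.
answers = {area: [3] * 6 for area in AREAS}
answers["Documentation & Metadata"] = [1, 2, 1, 2, 1, 2]
result = scorecard(answers)
# result["weakest"] -> "Documentation & Metadata"; result["overall"] -> 2.7
```

The "weakest" field is the point of the exercise: shining a spotlight on what might be tripping you up.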
— No, that would be a bummer, right? It's more like a map: okay, here we are right now on the map, and what are we going to do to start making headway? You know, we need to make AI work. Okay, great. But let's find out where all the potholes are, and maybe what kind of improvements will help you get to your destination sooner. — Yeah. And there are 30 questions in five areas, so let's walk through the five areas. The first one is data architecture and accessibility. This one is about whether your data is structured, integrated, and discoverable. It's asking the question: can AI systems actually find and access the data that they need? — Yeah. And you don't need some perfect setup to start getting value here, but you do want some signals that the basics are in place. So, things like: do we even know our core systems for the data sources? Do we have a consistent way

Segment 2 (05:00 - 10:00)

that data is moving between them? Do we have a place people can go to find what's trusted? And this is where you're going to start to hear some interesting terms pop up, like data catalog or semantic layer or certified data sets. And the buzzwords aren't really the point here. What you're really trying to get to is figuring out whether the organization has a repeatable, dependable, reliable way to connect, publish, and reuse its data. And there's one thing I'd like to add here: AI gets more useful very fast when the data is accessible through a clean pathway. But if the only way to get context is some one-off extract, or a CSV download, or something that someone is updating manually, AI is going to feel impressive in a demo, but it's probably going to fall apart in real usage. — Yeah, because it doesn't have the right context when you actually need it, when you use it again, — right? And it needs that. I mean, if your data is scattered across a bunch of systems, and maybe you've got inconsistent ways of integrating between those systems, how is the AI tool going to make sense of that? It's going to struggle. And what you might find is that sometimes you get an answer, but it's only a partial answer: it'll be said in a way that sounds full and complete, but it might be missing some important pieces of the puzzle. — Yeah. So one of the important questions in this section is: how centralized is your organization's data? Is it scattered across multiple systems or spreadsheets? Or maybe you have a data lake where the critical data is stored, all the way to the other end, which is the ideal scenario: one unified, centralized data platform. — Yep. — All right. So let's go to the next dimension, which is data quality and trust. And this is where teams usually say, "Hey, our data is a mess." — Yeah.
And I mean, we've seen that over and over again, right? And it's probably true in some places, but maybe in other places it's not so bad. So what you can't do is fix all of your data quality issues everywhere, all at once, before you even start getting value. That message might be out there, but I don't think it's really true, and the problem is it's just not feasible, right? What you've got to do is find out where the data is weak but also important. So: important areas, workflows that really matter, where the quality is just not where it needs to be. That's what you're trying to identify. — Yeah. Because you really don't want AI turning bad inputs into polished outputs. — Yeah, especially not where it matters most. So instead of thinking, "Oh, we've got to get our data perfect," say, "Hey, let's manage this." Okay, who are the owners? Are we measuring the quality of the data? Do we know if it's changing? Maybe there's some kind of obvious problem we can catch before it gets fed into some AI application and out comes nonsense, right? And as you expand AI to more and more areas, you've got to roll out those same quality practices to those areas, — right? Yeah. So one of the questions we ask in this section is: how current and up-to-date is your organization's data? Maybe it's outdated, or refreshed manually, like someone has to go in there and hit refresh for you. Or maybe it refreshes automatically, but on a schedule you have to keep in mind, like, "Is this about to be refreshed with new data tomorrow, and right now it's three months old?" You have to know what schedule it's on. And of course the gold standard is data that's updated in real time or near real time across all major systems. — Right? — Okay. The third dimension — is data governance and stewardship.
And this is things like permissions, policies, decision rights, ownership: stuff that's not super glamorous but very important. — Yeah, it's not scintillating, but it's really make-or-break. So what happens is, the moment you start putting AI copilots or systems into place and implementing them within your own organization, pretty soon you're going to be connecting these applications, these models, to things like customer data. And at that point, look, governance isn't optional anymore, right? You've got to have it there. Okay, maybe it's a useful assistant. That would be awesome. Unfortunately, if you're connecting to sensitive data and you don't have any governance in place, you just can't ship it. — Mhm. What specifically do you think we're looking for here? — Well, I think you need to know who owns each data set, right? Who is the owner of it? Who is the person internally that's saying who gets access to it and who doesn't? Maybe what's allowed to be used in AI

Segment 3 (10:00 - 15:00)

tools. Maybe some things aren't allowed to be used in certain kinds of AI tools, — right? — On the other hand, you can think of some tables or some records or some kinds of attributes that need to stay out. And then, what are we going to do about all these sensitive records and fields? So, even simple answers to those questions are going to prevent a lot of avoidable drama. You've just got to think about it up front. And I do think it's really critical for everyone. But hey, if you're in a highly regulated industry, let's say financial services or health care or insurance or the public sector, anywhere in these places, look: a small data mistake can turn into a major compliance issue pretty fast. — Yeah. And one of the big questions we ask in this section is: how clearly are data ownership and stewardship roles defined and followed? So, are responsibilities very clear, or unclear? Maybe you have some sort of de facto data owner, but nothing is formally assigned; we're just hoping that person knows the responsibility and keeps doing it. Or maybe you have data owners and stewards assigned and trained across all parts of the organization. — Okay, the fourth dimension is documentation and metadata. So, things like definitions, glossaries, data set descriptions, lineage, change history: all the kinds of stuff that helps people understand what they're looking at. — Yeah. I mean, we're just inundated with acronyms and terms and metrics, and sometimes what one person means by a term is different from what another means, right? So this is where the whole semantic layer comes in. I referenced that a little bit before, and there's a big debate going on right now about the semantic layer. I did touch on it, but I think it's good to explore it a little more here in this fourth topic of documentation and metadata. So, in super simple terms, what the heck is a semantic layer?
Well, this is the place where an organization captures what these things mean. So, definitions and logic: it takes all your data tables and lets you get out of those tables shared metrics that people can use, so we can all be on the same page. — Yeah. And some teams feel like, "Hey, we have to build the perfect semantic layer before we can do anything with AI." Do you think that's right? — Well, I do think you need a semantic layer. It's really important, and some tools out there do depend on it being in place. But if it's going to take a whole year to create, and you're not going to get any business intelligence value until it's completely, 100% done, that's a showstopper. That's a long, painful project, right? The clock keeps ticking, maybe months or years, everyone's waiting, like, "What's going on?" Your data team is totally taking on water, and you still don't have any value in the real world. So I don't think that's reasonable or feasible. But that's not to say the semantic layer doesn't matter. I think the best way to do it is: don't start by boiling the ocean, documenting everything in your entire data universe. Instead, start with some data sets and metrics that are more targeted and specific than that: specifically, ones that are powering very definable workflows, ones that matter to you, and ones where it's acceptable to be experimenting with improvements. Get those definitions locked in before trying to get everything done all at once.
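[Editor's aside: to make the idea concrete, here is a deliberately tiny sketch of a semantic layer in Python: one shared registry where metric names map to agreed-on definitions, so every consumer, human or AI, resolves "net revenue" the same way. The metric names, descriptions, and SQL below are invented examples; real semantic layers typically live in dedicated tooling such as dbt or Looker, not a Python dict.]

```python
# Toy semantic layer: a single registry of blessed metric definitions.
# All names, descriptions, and SQL below are invented for illustration.

from dataclasses import dataclass

@dataclass(frozen=True)
class Metric:
    name: str
    description: str  # the agreed, human-readable definition
    sql: str          # the one sanctioned way to compute it

SEMANTIC_LAYER = {
    "active_customers": Metric(
        name="active_customers",
        description="Distinct customers with at least one order in the last 90 days.",
        sql=(
            "SELECT COUNT(DISTINCT customer_id) FROM orders "
            "WHERE order_date >= CURRENT_DATE - 90"
        ),
    ),
    "net_revenue": Metric(
        name="net_revenue",
        description="Gross order revenue minus refunds, in USD.",
        sql="SELECT SUM(amount) - SUM(refund_amount) FROM orders",
    ),
}

def lookup(metric_name):
    """Dashboards and AI tools resolve metrics here instead of guessing."""
    if metric_name not in SEMANTIC_LAYER:
        raise KeyError(
            f"{metric_name!r} is not a defined metric; "
            f"known metrics: {sorted(SEMANTIC_LAYER)}"
        )
    return SEMANTIC_LAYER[metric_name]
```

Starting with the two or three metrics that power a real workflow, and expanding table by table, is exactly the "don't boil the ocean" approach Ben describes.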
— So you're thinking of some specific use cases that are going to give you a lot of value, and that's going to help pull forward exactly what you need to make sure is correct when you're perfecting your semantic layer. — Yeah. Get those areas right. Pick a few use cases, like you said, maybe this quarter: internal Q&A, workflows, sales enablement, whatever it is. Then tighten up the definitions and the metadata around the data that those use cases touch. — And actually, what's really interesting to me is that you don't have to do this by yourself anymore. You might think, "Oh my gosh, I have to type in all these definitions by hand," like literally someone has to sit there at a keyboard entering everything, little by little. Actually, no. AI itself can help you get data ready for AI in this way, because it will suggest definitions for some of your attributes and your tables. It can infer what those are. I've seen this with a lot of great tools today. I was even just playing with this with my class at UDub (the University of Washington, where I teach a class). We were looking at Databricks and how you can import data into its Unity Catalog,

Segment 4 (15:00 - 20:00)

and it'll automatically suggest a bunch of definitions, and the fields are editable. You can go in and retype them and say, "Well, that's kind of close, but not quite right." So the bottom line is it gives you a starting point, right? You avoid this massive blank page of doom, where it's like, "Okay, someone's got to go in and just brute-force type everything in." You can actually use AI to help you get started, to give you a boost; you just want to review it carefully. The temptation is to accept everything without even looking at it, and you don't want to do that. So you do need humans in the loop, engaged, taking a look and greenlighting what actually gets accepted and becomes official, — right? So it's not "we have the perfect semantic layer," and it's not "no semantic layer at all." We can use AI to help us identify some use cases and some pieces that are useful for us, that we can define now, so we're not boiling the ocean, but we can start moving forward. — Yeah, it's totally a Goldilocks effect, right? You want to land in the middle. If you try to just ditch it, I think you're probably going to regret it. If you try to get it all done before you go anywhere, it's just not going to work. So you've got to be in the middle: build where it matters, make it as useful as you can, and then keep expanding as your adoption starts to take root, applying those approaches as you go through the organization, area by area, table by table. — Yeah, that makes sense. Start small and then expand. — All right. So one of the main questions we're asking here is: how well is metadata captured, managed, and made available across your organization's data?
So maybe you don't collect anything right now, or maybe metadata is harvested automatically for you, which would be nice, and it's already machine-readable and consistently maintained across systems. That would be ideal. — Yeah. — But you can be anywhere in between. — You probably are. — Yeah. — Yep. — Okay. The last dimension is data literacy and culture. One of our — This is our favorite one. — We're agreed on that. So even if architecture, quality, and governance are strong, you want to be asking: do your people trust and understand the data? Are they trained? Are their incentives aligned or not aligned? Do you encourage people to experiment? And this is definitely in our wheelhouse here at Data Literacy, because we focus on data and AI literacy training, because tools don't create these kinds of capabilities on their own. That's where people come in. And when teams have the shared language and the judgment, you can take AI from "oh, that's cool" to something that's a genuinely useful tool for your organization. And it's a lot safer. — Safer, too, right. Yeah. So, you can think about AI adoption from a purely technical and technological standpoint, but the truth is it's as much about behaviors and attitudes as anything. If your culture is not supporting questioning, interpreting, and improving data, then these AI tools aren't going to somehow magically create that. You still have to have the culture in place to support this type of decision-making and this type of dialogue. And you want to eventually start to see the patterns change: you no longer want to see people pulling in totally different numbers, or blindly trusting outputs they shouldn't, without having a conversation about it, right? Or, you know, sometimes in some cultures we've seen people just avoid the data because they're not sure what it's about.
So they just keep doing their thing. And with AI, you're going to see the pace speed up. Well, that means the impact of misunderstanding is going to speed up too, if you're not careful. So this last category is pretty critical. — Yeah. And one of the important questions we ask in this section is: how capable are your organization's employees at accessing, exploring, and analyzing data without intervention from data experts? So, are people mostly leaning on one or two data experts on their team? We've all been on teams like that, where there are one or two people and it's like, "I'm going to go ask Cindy over there; she knows all the data." Or maybe most team members are able to do more self-service analytics: they can get the data, explore it on their own, and get their own insights. Again, you're probably somewhere in between the ideal scenario and the most limited one, but there's a lot of space you can occupy in between. — I think so. Yeah, most people are somewhere in the middle there. And again, what this is all about is mapping out where you are on that spectrum, and figuring out whether that's a blocker for you and for your organization. — Yeah. And like we said, there are 30 questions in the assessment. We went

Segment 5 (20:00 - 23:00)

through a few of them, a few of the highlights. But if you take the assessment, Ben, what do you think they should do with the results? — Yeah. So, organizations that take the assessment, at the end of the day, are going to have a few ideas of use cases where they want to start using AI in the here and now. And then they can take that and say, "Okay, let's check out our scores and see which gaps are hurting us the most in order for us to succeed at those use cases." And that will lead them, I think, to be able to put a nice improvement plan in place. You've got to have a realistic one, with clear ownership and some measurable changes you can make, so that you're more likely to succeed in your initiatives. — Yeah. And you can take the assessment as just one person, clicking through the questions yourself, but it's even better with a small group. Like, you grab a few people in your organization and pull up the assessment in a workspace together. — Yeah, right: in a meeting room or a virtual workspace, have a conversation around each one of those 30 questions. You could probably get through it in a half-hour meeting. That's, I think, a great way to do it, too. Eventually we'll add to the platform the ability to send out multiple assessments to different people on the team; we're evaluating that right now. Then you could see the average scores, and the highs and lows, and all that good stuff. But for now, you do it together as a team, sitting around the same circle and having a conversation, and that might be more valuable than anything. — Yeah. — So then, hopefully, you get people from all across the organization who are really knowledgeable about the data systems:
you know, you've got the BI team, the data team, some other folks from IT or security, maybe even some folks from the business who are really hands-on with the data. So this is going to give you a more accurate picture, right? And if you pull more people in, you're also going to step away from this mindset of finger-pointing, of "it's someone else's problem, and someone else has to fix it." You're all in it together. So that's an important perspective to keep as you go through it. — Yeah, for sure. And if you want to take the DF4AI self-assessment, we'll put a link in the show notes for you. It's free to take, for now. And for anybody who wants to go further after you take the assessment, we're happy to dig into your results and help you get some recommendations on what to do next. There's a form at the end where you can just ping us, and we'll send you a proposal for that. — Yep, exactly. So, hey, if you're already moving forward with AI, and a lot of organizations are, well, this assessment is going to help you keep momentum, so you don't step on some of the same landmines we keep seeing: unclear definitions, shaky access controls, confusing data, broken trust, all those kinds of things. And it'll give you a way to go back to the organization and say, "Hey, let's focus on improving these areas so that we get more value out of our investment." — Yep. I hope this episode was useful for you. Make sure to subscribe to the show so you never miss an episode, and we will catch you next time. Bye, guys. — All right. Talk to you soon, everyone. Bye — bye.
