But AI actually has a lot of problems that most people don't know about, and in today's video I'll be explaining 10 problems in AI. The honest truth is that most of these problems can't actually be solved. So let's take a look at what the biggest issues in AI currently are.

The first one is hallucinations. In AI, a hallucination occurs when the model generates incorrect or misleading information, often presenting it as fact. This can happen when the model lacks sufficient training data, makes incorrect assumptions, or contains biases, and these hallucinations can range from minor factual errors to completely fabricated claims. Most people think this is just a minor problem, but it's actually a really big one, because if we can't verify whether a model accurately knows what it's saying, we can't use it in many of the applications where it's needed. If it makes mistakes in finance, it could cause someone to lose millions of dollars. If it makes mistakes in legal work, it could cause someone to go to jail.

Now, Jensen Huang talks about how this is essentially still an unsolved problem: "Today, the answers that we have are the best that we can provide, and you still have to decide whether this is hallucinated or not hallucinated, whether it's sensible or not sensible. We have to get to a point where the answer that you get, you largely trust. I think we're several years away from being able to do that, and in the meantime we have to keep increasing our computation."

There are partial solutions to this, discussed by Andrew Ng: "It's been exciting to see how AI technology improves month over month, so I think today we have much better tools for guarding against hallucinations compared to, say, six months ago. Just one example: ask the AI to use retrieval-augmented generation. Don't just generate text; ground it in a specific trusted article and give a citation. That reduces hallucinations. And further, if the AI generates something you really want to be right, it turns out you can ask the AI to check its own work: 'Dear AI, look at this thing you just wrote, look at this trusted source, read both carefully, and tell me if everything is justified based on the trusted source.' This won't squash hallucinations completely to zero, but it will massively squash them compared to just asking the AI to say whatever it had on its mind. So I think hallucinations are an issue, but not as bad an issue as people fear." (I'll show a quick sketch of both of those techniques in a moment.)

And the reason I've actually put this point into the video is that, unfortunately, the recent models we got, o3 and o4-mini, actually hallucinate more. These models hallucinate significantly more than their predecessors, o1 and the other GPT-series models. As first reported by TechCrunch, OpenAI's system card detailed the results of the PersonQA evaluation, which is designed to test for hallucinations. On this evaluation, o3's hallucination rate is 33% and o4-mini's hallucination rate is 48%, almost half the time. By comparison, o1's hallucination rate is 16%, meaning that o3 hallucinated about twice as often.
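Here's that sketch of Ng's two suggestions, grounding plus self-checking, written in Python against the OpenAI chat API. The model name, the prompts, and the trusted_article placeholder are my own illustrative assumptions rather than anything shown in the video, so treat this as a sketch of the pattern, not a production implementation.

```python
# A minimal sketch of the two mitigations Ng describes: (1) ground the answer in
# a trusted source (retrieval-augmented generation) and (2) ask the model to
# check its own draft against that source. The model name, prompts, and the
# trusted_article placeholder are illustrative assumptions.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

trusted_article = "...full text of a vetted source document..."  # retrieved beforehand
question = "According to the article, what changed in the Q3 results?"  # hypothetical

# Step 1: RAG-style grounding -- answer ONLY from the supplied source, with a citation.
draft = client.chat.completions.create(
    model="gpt-4o",  # any capable chat model works here
    messages=[
        {"role": "system",
         "content": "Answer using ONLY the article below. Quote and cite the "
                    "passage you relied on. If the article does not answer the "
                    "question, say so instead of guessing.\n\nARTICLE:\n"
                    + trusted_article},
        {"role": "user", "content": question},
    ],
).choices[0].message.content

# Step 2: self-check -- have the model verify its own draft against the source.
verdict = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {"role": "user",
         "content": "Read the article and the draft answer carefully. Is every "
                    "claim in the draft justified by the article? Reply "
                    "SUPPORTED or UNSUPPORTED, with a one-line reason.\n\n"
                    f"ARTICLE:\n{trusted_article}\n\nDRAFT:\n{draft}"},
    ],
).choices[0].message.content

if "UNSUPPORTED" in verdict:
    print("Potential hallucination flagged:", verdict)
else:
    print(draft)
```

The design point is the same one Ng makes: grounding plus a verification pass won't drive hallucinations to zero, but it cuts them dramatically compared to free-form generation.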
Now, one thing that was actually left out of this article is that PersonQA is a benchmark specifically designed to elicit hallucinations, so these aren't just average questions. When you ask o3 something ordinary, the hallucination rate isn't going to be a third of the time, so you don't need to worry about that. But on a benchmark built to elicit hallucinations, it does significantly worse. Remarkably, OpenAI has said they don't really understand why this is the case, though there are many approaches that will probably address it in the future.

Now, if you're wondering which models hallucinate the most, you can look at the grounded hallucination rates for the top 25 large language models, updated in April. This one wasn't designed to elicit hallucinations; instead, the models were given around a thousand documents and asked to make factual claims grounded in them, and the score is essentially the percentage of times those facts come out wrong. And we can see here that the reasoning series definitely has some limitations around hallucinations. While this doesn't seem like a big deal, remember that even a 1% hallucination rate across millions of users in a given industry adds up to an enormous number of errors: 1% of 10 million queries is 100,000 wrong answers. That's why this is such a big deal.

Now, crazily, we have a situation on our hands, because recently a lawyer representing Anthropic admitted to using an erroneous citation created by the company's Claude AI chatbot in its ongoing legal battle with music publishers. According to a filing made in a Northern California court on Thursday, Claude "hallucinated the citation with an inaccurate title and inaccurate authors," Anthropic said in the filing. Anthropic's lawyers explained that their manual citation check did not catch it, nor several other errors that were caused by Claude's hallucinations. So this is why you still need accuracy to be as close to 100% as possible: on the off chance a hallucination gets through, it is devastating. Entire cases can fall apart if they hinge on certain evidence or certain facts being real. And this is a real case that shows exactly that.
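As a rough illustration of how that kind of grounded evaluation works, here's a sketch of the scoring loop in Python. Everything in it is a simplified stand-in: real leaderboards use a trained judge model or human raters rather than the naive substring check below, so this shows the shape of the idea, not the actual methodology.

```python
# Rough sketch of a grounded-hallucination score: the model generates claims
# from a source document, and we count the fraction of claims the source does
# not support. The substring check is a toy stand-in for the real judge, so any
# numbers it produces are illustrative only.
from dataclasses import dataclass


@dataclass
class Example:
    document: str       # trusted source text the model was given
    claims: list[str]   # claims the model generated from that document


def claim_supported(claim: str, document: str) -> bool:
    """Toy judge: a claim counts as supported if it appears in the document."""
    return claim.strip().lower() in document.lower()


def hallucination_rate(examples: list[Example]) -> float:
    """Fraction of generated claims NOT backed by their source document."""
    pairs = [(c, ex.document) for ex in examples for c in ex.claims]
    unsupported = sum(1 for c, doc in pairs if not claim_supported(c, doc))
    return unsupported / len(pairs) if pairs else 0.0


# Tiny illustrative run: one supported claim, one fabricated one -> 50% rate.
examples = [Example(
    document="The Eiffel Tower is in Paris.",
    claims=["the Eiffel Tower is in Paris", "It opened in 1850."],
)]
print(f"{hallucination_rate(examples):.0%}")  # prints: 50%

# Why even small rates matter: 1% of 10 million queries is 100,000 bad answers.
print(f"{int(10_000_000 * 0.01):,} errors at a 1% rate over 10M queries")
```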
Now, the next problem is the deepfake problem. This isn't really a Gen AI thing, but it is a wider AI thing, and we've seen rapid recent developments in AI voices. Deepfakes are essentially any type of media, usually video, audio, or images, that uses AI to replace or mimic someone's face, voice, or actions so convincingly that it looks real. So you could make a video where Elon Musk appears to say something he never said, or even show a politician doing something they never did, and most people wouldn't know it's fake.

Now, recently the FBI has been warning that senior US officials are currently being impersonated using text messages and AI-based voice cloning, and that hackers are increasingly using more advanced software for state-backed espionage campaigns and major ransomware attacks. So we need to be more and more careful as we move into the future, because the technology is only going to get better, and the hackers and scammers trying to take you for every dime you've got are only going to get smarter.

And take a look at this: this is a deepfake tool called Deep Live Cam, and it's been on GitHub for quite some time. Honestly, if I had seen this, I would probably not believe that it's Elon Musk; I'd say there's something off about it, but I couldn't tell what. But if I showed this live stream to someone else, they'd say, "Yeah, that's Elon Musk." His body does look a bit weird, but the lighting, everything else, there's really no indication that it isn't Elon Musk here. So imagine what someone could do if they're able to look like one of your relatives and speak in their voice. Or say you're running a company with 3,000 employees: how hard would it be to impersonate one of the directors, get on a Skype call, and tell someone they need to do something? It's crazy, the kinds of things this technology is allowing.

Now, one issue that most people don't actually realize is that the over-reliance, okay, I'm