# This Free App Runs AI Offline On Your iPhone

## Metadata

- **Channel:** Matt Wolfe
- **YouTube:** https://www.youtube.com/watch?v=4dZ0VYjB8N8
- **Date:** 04.03.2026
- **Duration:** 11:52
- **Views:** 56,405
- **Source:** https://ekstraktznaniy.ru/video/11070

## Description

Trying out Locally AI to run AI models on my phone without internet.

Discover More:
🛠️ Explore AI Tools & News: https://futuretools.io/
📰 Weekly Newsletter: https://futuretools.io/newsletter
🎙️ The Next Wave Podcast: https://youtube.com/@TheNextWavePod

Socials:
❌ Twitter/X: https://x.com/mreflow
🖼️ Instagram: https://instagram.com/mr.eflow
🧵 Threads: https://www.threads.net/@mr.eflow
🟦 LinkedIn: https://www.linkedin.com/in/matt-wolfe-30841712/
👍 Facebook: https://www.facebook.com/mattrwolfe

Resources From Today's Video:
https://locallyai.app/

Let’s work together!
- Brand, sponsorship & business inquiries: mattwolfe@smoothmedia.co

#AINews #AITools #ArtificialIntelligence

## Transcript

### Segment 1 (00:00 - 05:00)

Okay, so I came across something really cool that I think everybody's going to want to know about, and it's the ability to run AI models on your phone, actually really good AI models, without needing to be connected to the internet. So you could use these models while you're on a plane, or at your house when you don't want anything sent to any sort of cloud service. You don't want to use OpenAI or Anthropic or Google or any of those companies. You could use some of the best available open-weight models on your phone. So check this out. This should be a pretty quick video. I came across this post from Adrien Grondin here, where he was showing that he was running the new Qwen 3.5 on his computer. And if you zoom in here, you can actually see he's using this in airplane mode. It's running pretty fast. And he's using the Qwen 3.5 2B model. This is a brand new model. If you're not familiar with this new Qwen 3.5 model, it literally just came out this week, on March 2nd, in four variations: an 800 million parameter model, a 2 billion parameter model, a 4 billion parameter model, and a 9 billion parameter model. And it's actually a really solid model. Most of the type of stuff you would want to do on a phone, this is going to work great for. You're probably not doing any sort of insane logic problems or having it solve some complex math; you're probably just going to have it brainstorm with you, or ask it how to deal with your kid that's having a meltdown right now, or something. But check this out. Here are the benchmarks. And as you know, I don't put a ton of weight on benchmarks, but these models are on par with the best open-source GPT model. It performs better than GPT-5 Nano in most of these benchmarks. It's a really decent model. So, jumping back to Adrien here, I wanted to figure out what the heck he's doing to actually run these models on his phone.
And it turns out that he created something called the Locally AI app. So, I went and found that app. You can find it in the App Store; it's called Locally AI. I will link to it just to make sure that you're getting to the right one if you do want to download it. Let's go ahead and grab it. I'm just going to scan the QR code on the website here. And here's what it looks like inside the App Store. Let's go ahead and click Get. As of right now, it's at 4.8 stars with 579 ratings, so people seem to like this one. And let's take a look here. When we open the app, it's going to ask us which model to choose. You've got the Apple Foundation model that's built in if you're on an iPhone. We've got Gemma 2, Qwen 3 1.7B, and Llama 3.2. Now, I want to use the new Qwen 3.5, which should be available. So, let's just go ahead and skip. I'll select a model. And when I select a model after skipping, I actually have more models available to me that weren't showing on that initial screen. Now, for running on a device like this, the smartest thinking model that we're going to be able to get on our phone is going to be this Qwen 3.5. This is the one we're going to want. So, I'm going to go ahead and select that one. And then we have a few options. You can grab the 4 billion parameter model, but it does recommend an iPhone 15 Pro or newer to be able to run that one. You have the 2 billion parameter model, which recommends an iPhone 15, not necessarily a Pro. And then there's Qwen 3.5 with the 800 million parameter model, and that one can run on an iPhone 14 or newer. I'm on a 17 Pro here, so I'm going to grab the 4 billion parameter one, but obviously grab whichever model is going to work best on the phone that you're on. It's going to take a moment to download here. Okay, so it took roughly 5 minutes on my Wi-Fi. Obviously, your results may vary depending on your own internet speeds, but I'm going to quickly download the 2 billion parameter version as well so that I can compare speeds.
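Those per-model device recommendations line up with a simple memory argument: bigger parameter counts mean a bigger weight footprint on the phone. Here's a rough back-of-the-envelope sketch. The 4-bit quantization level is an assumption, the video doesn't say what quantization Locally AI actually uses, but the scaling is the point:

```python
# Rough estimate of on-device memory for a quantized LLM:
# parameter count x bits per weight, ignoring KV cache and
# runtime overhead. 4-bit quantization is an assumption here.

def model_size_gb(num_params: float, bits_per_weight: int = 4) -> float:
    """Approximate weight footprint in gigabytes."""
    return num_params * bits_per_weight / 8 / 1e9

for name, params in [("0.8B", 0.8e9), ("2B", 2e9), ("4B", 4e9), ("9B", 9e9)]:
    print(f"{name}: ~{model_size_gb(params):.1f} GB of weights at 4-bit")
```

By this estimate the 4B model needs roughly 2 GB just for weights, which plausibly explains why it wants a newer Pro phone with more RAM, while the 800M model fits comfortably on older devices.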
All right, that took a few more minutes, but I now have both those models. Let's browse around real quick. Up in the settings, I've got Manage Models, where I can add more models. In Personalization, I can actually give it custom instructions, which is pretty cool, and I can also change the temperature. I'll leave everything on the defaults for right now. We can delete our whole conversation history. And yeah, that's about all the settings we've got. But also, you can add a Siri Shortcut where you can say, "Hey, Locally AI" and ask questions directly to the app. Pretty cool. All right, so right now it's on the Qwen 3.5 4 billion parameter model. Let's do a basic test here: how many Rs are in the word strawberry? We can see it's breaking it all down for us, and it found there are three Rs in the word strawberry. Now, I know this shows that I'm on Wi-Fi right now. Unfortunately, if I turn off Wi-Fi, my ability to share my screen to my computer also turns off. But I have tested this, and it does work with Wi-Fi turned off. You do not need to be on the internet. This is not sending any information to the cloud whatsoever. It is operating completely on your phone. I'm going to go ahead and switch it to the 2 billion parameter model here. Let's start a new chat up here, and I'll ask one of the sillier questions that's been going around right now: if there's a car wash that's 200 meters from my house, should I walk or should I drive to get there? Let's see what it says. It has some of the same logic flaws that the other models have. To decide whether to walk or drive, you need to compare the travel time under different conditions. Compare exact travel times. Consider non-time factors. You should drive if your driving time is
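The strawberry question is a classic trip-up because LLMs see tokens rather than individual letters, which is why small models often miscount. For comparison, the same check is deterministic and trivial in plain code:

```python
# The letter-counting question from the video, answered
# deterministically: code sees characters, an LLM sees tokens.
word = "strawberry"
r_count = word.lower().count("r")
print(f"There are {r_count} Rs in '{word}'")  # There are 3 Rs in 'strawberry'
```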

### Segment 2 (05:00 - 10:00)

less than walking time, and in certain scenarios where walking takes over an hour, driving is blah blah. Um, yeah. Why would I walk if I need my car with me to use the car wash? If your car wash requires your car to be present, then walking becomes a practical necessity, not a theoretical choice. So, I mean, it's not going to be the best at logic like some of the newer models are. But you can use it as a brainstorm partner: brainstorm some ideas for YouTube videos about how AI will impact our daily lives. And because I'm using this 2 billion parameter model, it's moving pretty quickly. And again, pretend I'm offline right now, because it's not connecting to the internet to give me these responses. AI will take your job: the hard truth. Could your cat control your doorbell with ChatGPT? Just trying to think through that video. Is my voice learning by listening to a podcast? Robot exoplanet robo-turtles versus human dog walking, point-of-view horror. Why AI might just be an alias for me. Five AI tools that actually work better than CDDP. I don't know what CDDP is. Is that an acronym I should know? How I'm using AI to unlearn my bad habits. The death of copy-paste. AI-generated relationships. Curating the digital supermarket. AI for caregiving. I mean, there are a lot of ideas here. And I could be sitting on an airplane getting these ideas. I also didn't even have the thinking mode turned on. You can see down in the bottom left there's a little light bulb. Let's turn the light bulb on and say, "Give me even more ideas." All right. So now it's not just immediately responding; it's actually thinking through, and we can see the chain of thought as it's doing this. So we actually have a chain-of-thought thinking model, again running completely on device. I do feel my phone getting warmer. It's definitely warming up as I do this. It's not hot, but it's using the processing power for sure.
It's actually spending quite a bit of time thinking. And now it's giving me a response: here's a list of 30-plus specific, niche, deeply engaging YouTube video ideas. What I'm telling you right now: a direct AI interaction. Five deepfakes you won't see in 2024, but you will. The interview with a human, the interview with an AI. Testing the prompt ecosystem on a job market analysis. How I caught myself being spammy: the AI filter. You get the idea. It's given me just a ton of ideas right now. As this chat gets longer, you can probably even tell from my video that it's actually starting to get a little bit choppy. So, it will slow down a little bit as there's more in the context. Just my scrolling here is getting choppy now. But it's also a vision model. So, let's go ahead and stop this and open up a new chat. I'm going to turn off thinking. Let's click the plus button here and take a photo so you can see what I'm looking at. And this is what I'm drinking right now. Let's take a picture of it. Let me go like that so it's not got that crazy effect going on. All right, let's take a picture. I'll use this photo and we'll say, "Is this a healthy option?" Yes, it's a healthy option. Zero sugar, natural flavors, blah blah blah. And it's actually running fast again because it's a new chat, so it's not trying to pull in all the context from the previous part of the discussion. And I guess just to prove that it will work without Wi-Fi, let's go ahead and record my screen. I'll hit record, and then I'm actually going to turn off Wi-Fi. From this moment on, it's going to be using my phone's recording. So, I'll turn off Wi-Fi here, and I'll also put it in airplane mode, so it's not using the 5G at all. We have zero internet. We are in airplane mode right now. Let's start a new chat. Let's go: my kid is throwing a fit because I took his iPad away. How should I calm them down? And we can see it working right now.
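That choppiness in long chats has a concrete cause: each new token must attend over a key-value cache that grows linearly with context length, eating memory and bandwidth. A rough sketch of the scaling follows; the layer and head dimensions are hypothetical placeholders (the model's actual config isn't given in the video), but the linear-growth argument holds for any standard transformer:

```python
# Why long chats get choppy: the KV cache grows linearly with
# context length. Dimensions below are hypothetical examples,
# not the real config of the model in the video.

def kv_cache_mb(context_len: int, n_layers: int = 28, n_kv_heads: int = 4,
                head_dim: int = 128, bytes_per_val: int = 2) -> float:
    """Approximate KV-cache size in MB: keys + values, all layers."""
    per_token = 2 * n_layers * n_kv_heads * head_dim * bytes_per_val
    return context_len * per_token / 1e6

for ctx in (512, 4096, 16384):
    print(f"{ctx:6d} tokens -> ~{kv_cache_mb(ctx):.0f} MB of KV cache")
```

Starting a fresh chat, as the video does next, resets that cache to near zero, which is why generation speeds right back up.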
It is streaming pretty quickly. And again, you can see up in the top right, I am in airplane mode. No internet connected at all right now. So, if you're at 30,000 feet on a plane and your kid's having a meltdown and you don't have access to the internet, but you want to ask an AI for some support, well, you've got a good tool to do that right here. And it's not giving me just a short answer either. It's giving me a super in-depth, pretty decently thought-out answer to my prompt. So, I mean, look at that. It's even got some nice design to it. Everything's looking good. Very, very cool. All right. Now, just like ChatGPT and Anthropic, it's also got a voice mode. Just to make it easier for my recording, I did switch my Wi-Fi back on so I could stream my phone to my computer, but let's go ahead and test the voice mode. It's quickly downloading the voice mode here. Apparently, it wasn't installed by default. Let's do a new chat. I'm feeling pretty hungry, but I can't decide what to eat. Can you suggest some things for dinner? — Since you're still hungry, here are a few quick options. Grilled salmon: fresh and filling. Chicken stir fry: comforting and versatile. Vegetable soup: warm and healing. Taco bar: a mix of different favorites. Pizza: simple and satisfying. Do you have any specific ingredients in the fridge? — Interesting. A taco bar. I mean, they're not bad suggestions. I'm just a little bit confused about how a taco bar is

### Segment 3 (10:00 - 11:00)

what I should eat. I guess it makes sense, right? Like, you put out all the ingredients and then you make a taco with what you have available. But anyway, I find this pretty cool, right? I think it's awesome that we can get local models that are actually pretty decent. A model that can run completely locally on your phone without internet, that's better than what we were getting out of ChatGPT, you know, a year and a half ago. It's obviously not the most state-of-the-art models like your GPT-5.3s or your Claude Opus 4.6 or even your Sonnet 4.6s, because those are just too big to be able to run on device. Those kind of require a cloud or just insane GPU power. But we're talking about models that are probably better than the most state-of-the-art model we had, you know, a year and a half, two years ago. And that's pretty cool. Anyway, this video is not sponsored. This company Locally AI, or Adrien, who made that app, has no idea I'm even making this video. I just thought it was cool that we could do this now, and it works fairly well. And there's been a lot of drama and craziness going on in both the AI world and just the world in general, and I just wanted to make a video about a cool tool that I was playing with. So that's what I did. This is a cool tool that I came across, and I thought this is something other people might want to know about and play with. It's not sending your messages to a cloud. It's staying completely on your phone. You don't need internet to use it. OpenAI, Anthropic, Google, xAI, none of those companies are getting any of your data. They're not able to train on any of your prompts, because again, it all just stays on your phone. That's cool. That's fun. I wanted to show it off. So, that's what I got for you. Pretty sweet. And yeah, I don't really know what else to say about it. Go check it out. It is free. You just need a semi-new iPhone, like an iPhone that was made within the last four or five years, and you can go do this. All right, that's what I got for you. Thanks for nerding out with me. Hopefully I'll see you in the next one.
