# Groq - New ChatGPT competitor with INSANE Speed

## Metadata

- **Channel:** Skill Leap AI
- **YouTube:** https://www.youtube.com/watch?v=fv69M1M_fXw
- **Date:** 20.02.2024
- **Duration:** 6:36
- **Views:** 23,313
- **Source:** https://ekstraktznaniy.ru/video/12694

## Description

There is a brand new AI Chatbot platform called Groq that can answer any prompt in near real-time speed.

In my testing, it was 3-4 times faster than ChatGPT.

Two important notes before I show you how it works in more detail.

Groq with a Q is not the same as Grok on Twitter.
When I looked into it, it turned out that Groq has been around for much longer, and they even hold the trademark on the word. They released a letter asking Elon Musk to change the name of his AI chatbot, Grok.

Groq is also not a large language model like Grok or ChatGPT. It's built on a language processing unit (LPU), and its purpose is to run AI models, such as large language models, on that technology at speeds other platforms haven't achieved.

The way Groq explains its speed is its hardware. They use an LPU, a new technology they created specifically for this purpose. Most other LLMs run on GPUs.
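The speed comparisons in the video are quoted in tokens per second. As a rough illustration only (this is not Groq's benchmark code, and the whitespace word count below is only a crude stand-in for a real tokenizer), throughput can be estimated like this:

```python
def estimate_throughput(text: str, elapsed_seconds: float) -> float:
    """Rough tokens-per-second estimate, treating whitespace-separated
    words as tokens (real tokenizers split text more finely)."""
    tokens = len(text.split())
    return tokens / elapsed_seconds

# A 300-word answer generated in one second is roughly 300 tokens/s,
# the ballpark figure demonstrated in the video.
rate = estimate_throughput("word " * 300, 1.0)
print(round(rate))  # 300
```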

## Transcript

### Segment 1 (00:00 - 05:00)

There's a brand new AI chatbot platform called Groq, and it can answer any prompt at almost real-time speed. This is Groq with a Q, by the way, and it's different from the one on Twitter with a K; I'll talk about that in a second. Now let me show you the speed, and then I'll show you exactly how this Groq platform works. On the left side I have GPT-3.5, the free version, because Groq is also a free website. I'll press send here, and if you look up here, it shows how many tokens per second it's processing. This is close to 300; sometimes it's closer to 450 tokens, which in this case was about 300 words or so.

Now, before I show you some of the options here and how it works, two important notes. Groq with a Q is again a different thing than the one you're using on Twitter called Grok with a K. That's the one I'm paying for right now; it's a paid upgrade on the Twitter platform. But when I looked into it, it turns out Groq, the one with a Q, is actually a much older company, and they have the trademark on the name. They just published a letter to Elon Musk saying, please change the name of your AI chatbot, we've had this for a long time. I thought it was the other way around, since this was the first I'd heard of this company, but this is actually a hardware company. So Groq, unlike ChatGPT or Gemini, is not a large language model. What it's doing is running large language models on this website, groq.com. You can run Llama 2, the open-source model from Meta, and you can also run Mixtral. This other one is grayed out right now, but I'm assuming they'll roll out a different language model from the same company. These are all open-source models. I usually don't cover open-source models on this channel because I've found ChatGPT and Gemini really outperform them, but again, this is totally free, it's available right now for you to test out, and it's extremely fast.

One of the things that really impressed me about this platform is the almost real-time speed. From time to time it's still limited on this specific website, just because it's going viral right now, so you may not get your request in right away to get the instant speed I just got. Especially when I used Mixtral, since more people are using that model on this website right now, you were basically in a wait list, but the generation was still almost real-time when your turn came up.

Let me quickly explain why Groq is so fast, because this might completely change how other large language models run in the background. Groq is a hardware company, and what they built is this thing called an LPU, which stands for language processing unit. It's hardware, and it's what's actually powering these open-source large language models to run so fast. It's the first of its kind, and they plan on running different large language models, and I'm assuming other AI models, on top of this new technology. The models we're used to, ChatGPT for example, run on GPUs; Nvidia makes those graphics cards, and those GPUs are how every single large language model and AI model is being powered. Groq has a bunch of blog posts and benchmarks, and you can see Groq is all the way over here while everybody else is over here, based on the tokens per second that I showed you it generates. Again, it's using completely different hardware in the background.

Now let me quickly show you this website and the different things you can do, and then I'll show you exactly how they make money, because using this website is really not their business model. This is a free version of a large language model that you can use on this website. You can modify any output here, and look how quick this is: if I want an educator tone out of it, I can click that, and that's pretty fast. It shows around 280 tokens per second while modifying it. You'll see right here it says "active request" with that stop sign; that's what I was talking about. Sometimes you might be standing in line, but that has nothing to do with the hardware processing; I'm assuming it's the virality causing this website to create that line. But look at that, every time I'm getting close to 300 using Llama 2. I'm using the Llama 2 model here, but you can change your model, so you're technically getting a couple of different chatbots, or large language models, inside one website. There are a couple of other settings: over here are the system prompts. If you're used to ChatGPT, where you can set custom instructions at the account level, this works in a similar way. I have different videos about custom instructions, so if you want to use them here, add them to your system prompt. For more advanced users, there are the system settings, where you can see things like your token output: Llama is set to a 4K token output, but Mixtral is set to 32,000, so you'll see
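Segment 1 describes setting a system prompt and choosing a model on groq.com. For developers, the same idea maps onto a chat-completions request; as a sketch (field names follow the OpenAI-style chat API shape; the endpoint details and the model name used here are assumptions, not taken from the video), a request body with a system prompt looks like:

```python
import json

def build_chat_request(model: str, system_prompt: str, user_prompt: str,
                       max_tokens: int = 1024) -> dict:
    """Assemble an OpenAI-style chat-completions body. The system
    message plays the role of ChatGPT's custom instructions."""
    return {
        "model": model,
        "max_tokens": max_tokens,
        "messages": [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": user_prompt},
        ],
    }

body = build_chat_request(
    "llama2-70b-4096",                     # hypothetical model identifier
    "Answer in the tone of an educator.",  # system prompt / custom instructions
    "Explain what an LPU is.",
)
print(json.dumps(body, indent=2))
```

Sending the request (and authenticating with an API key) is omitted here; the point is that the system prompt travels as the first message in the `messages` array.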

### Segment 2 (05:00 - 06:00)

that change over here. Again, these are some more advanced settings that you can tweak if you're an advanced prompt engineer. Now, how useful is this website? Well, it's very limited if you're used to things like ChatGPT and Gemini, because it has no internet access of any kind, and there are no custom GPTs or plugins. This is just a straight open-source large language model powered by this new processing hardware. But if you want speed, this one beats everything; it's not even remotely close. They're really demoing this to show you how fast it is, and that's why I wanted to show you: this could be the first time we're seeing a whole different tech behind the scenes that could really change how we move forward. Maybe down the line we don't use GPUs and we use this kind of processing instead. They also have API access, which is the real reason they're demoing this website: to show that you can access the Llama API here. They have a 10-day free trial; it's application-only, so you can go ahead and apply, but it's extremely cheap relative to most APIs I've seen out there. So if you're trying to build something using this kind of tech, it might be worth a look instead of paying, say, OpenAI, Claude, or Gemini for their API; this is just another alternative where you can get API access. I'm only making this video to demonstrate speed. As far as usability goes, obviously ChatGPT, Claude, and Gemini are going to be at a whole different level, but for speed, this is in its own category, so it's really worth a try: groq.com. Hope you found this useful, and I'll see you next time.
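The 4K vs 32K token limits mentioned in the transcript matter once conversations get long: history beyond the budget has to be dropped. A minimal sketch of trimming the oldest messages to fit a model's context budget (using a crude whitespace word count as a stand-in for a real tokenizer; actual limits and tokenization differ per model):

```python
def trim_to_budget(messages: list[str], budget: int) -> list[str]:
    """Keep the most recent messages whose combined (approximate)
    token count fits within the model's context budget."""
    kept, used = [], 0
    for msg in reversed(messages):    # walk newest-first
        cost = len(msg.split())       # crude token estimate
        if used + cost > budget:
            break                     # oldest messages get dropped
        kept.append(msg)
        used += cost
    return list(reversed(kept))       # restore chronological order

history = ["one two three", "four five", "six seven eight nine"]
print(trim_to_budget(history, 6))  # ['four five', 'six seven eight nine']
```

A model with a 32K budget would simply keep far more history before trimming kicks in, which is why the Mixtral setting shown in the video is the more forgiving one for long sessions.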
