Build an AI Assistant for Meeting Scheduling Step by Step!
Duration: 8:45

AssemblyAI · 30.07.2025 · 1,407 views · 39 likes


Video description
🔑 Get your AssemblyAI API key here: https://www.assemblyai.com/?utm_source=youtube&utm_medium=referral&utm_campaign=yt_jason_2
📁 GitHub repo: https://github.com/AssemblyAI-Community/simple-livekit-agent
📚 AssemblyAI docs: https://www.assemblyai.com/docs?utm_source=youtube&utm_medium=referral&utm_campaign=yt_jason_2

💬 Build a real-time AI voice agent using Python, LiveKit, AssemblyAI, Cerebras, and Rime. In this tutorial, you’ll create a voice assistant that transcribes speech with AssemblyAI’s speech-to-text API, generates responses using a Cerebras LLM, and speaks back using Rime’s text-to-speech. We’ll walk through setup with uv, configuring API keys, and running the agent from the terminal. Great for developers building apps with real-time transcription, LLMs, and voice AI.

Tools used:
→ AssemblyAI (STT)
→ Cerebras (LLM)
→ Rime (TTS)
→ LiveKit (audio orchestration)

CONNECT
🖥️ Website: https://www.assemblyai.com
🐦 Twitter: https://twitter.com/AssemblyAI
🦾 Discord: https://discord.gg/Cd8MyVJAXd
▶️ Subscribe: https://www.youtube.com/c/AssemblyAI?sub_confirmation=1
🔥 We're hiring! Check our open roles: https://www.assemblyai.com/careers

#machinelearning #aiassistant

Table of contents (2 segments)

Segment 1 (00:00 - 05:00)

Hey everyone. Today we'll be showing you how to build and run a LiveKit AI voice agent using AssemblyAI, Cerebras, and Rime. We'll start off by showing the demo we'll be creating.

— Hello, I'm here to help with any questions or tasks you have. What do you need assistance with?
— Hi, can you tell me what amazing apps I can make with AI voice agents?
— You can make virtual assistants, chatbots, and voice-controlled apps for tasks like smart home automation, language translation, and customer service.

Again, this demo uses LiveKit as the orchestrating framework. Under the hood, there are three key steps when interfacing with an AI voice agent. When a user speaks to an agent, that voice data is passed to a speech-to-text, or STT, component. This converts the speech to text. That text is then passed to a large language model, or LLM, which will generate an appropriate response. That response comes in the form of text, which is passed to the text-to-speech, or TTS, component to give the response a voice and speak back to the user. We will use AssemblyAI for the STT element, Cerebras for the LLM, and Rime as the TTS element.

Okay, first we're going to create our project and then go into it. Then, using uv as our virtual environment manager, we're going to create a new project, and we're going to use uv to add our dependencies with uv add. You can see here we're going to bring in livekit-agents, and specifically the AssemblyAI, OpenAI, Rime, Silero, and turn-detector plugins, and also the plugin for noise cancellation, as well as the python-dotenv package, so we can pull in environment variables from a file.

Next, using a text editor or your favorite IDE of choice (we're going to use Cursor), we're going to create a .env file. These are the environment variables that will be needed: API keys for AssemblyAI, Cerebras, and Rime, and an API key, secret, and URL for LiveKit. AssemblyAI's API key can be found on the dashboard after logging in.
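The .env file described above would look something like this. The exact variable names are an assumption based on the conventional names the LiveKit plugins read from the environment; check each plugin's docs if a key isn't picked up:

```
# .env — keep this file out of version control
ASSEMBLYAI_API_KEY=your-assemblyai-key
CEREBRAS_API_KEY=your-cerebras-key
RIME_API_KEY=your-rime-key
LIVEKIT_API_KEY=your-livekit-key
LIVEKIT_API_SECRET=your-livekit-secret
LIVEKIT_URL=wss://your-project.livekit.cloud
```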
In the left-hand column, click on API keys. There is a copy button for each existing API key, and there is a create new API key button at the bottom. Cerebras's API key can be found on their cloud dashboard. On the left-hand side, there is an API keys tab. Select it and you'll see a list of your available keys. Click on the copy icon to use an existing key, or create a new one with the generate API key button in the upper right. From Rime's dashboard, go to the left-hand column and select API tokens from the settings section. Click on the meatballs (triple-dot) button to copy an existing key, or click the new token button to auto-create a new key. Rime's keys are unnamed. The LiveKit API key can be found on their dashboard on the left-hand side. Open the settings disclosure option, then select API keys. There is a kebab (vertical-dot) menu option for regenerating tokens or deleting existing keys. Selecting an existing key brings up a dialog box with connection and credential details.

Next, we're going to rename the autogenerated main file, and we'll just call it agent.py. All right. And we're going to replace this code with a variation of LiveKit's own AI agent sample code. Here we're importing the Rime and AssemblyAI plugins as before. So we're assigning the STT to AssemblyAI, we're assigning the LLM to Cerebras (and we can define which model to use with the model argument), and for TTS we're using Rime, where you can also configure the model and which voice to use. Before you can run this app, you have to first download the plugins. To do that, enter uv run python agent.py download-files (note: files, plural). Once the plugins are loaded, you can then run the app with uv run python agent.py console.

— Thank you. Can you tell me about the company Cerebras AI?
— Cerebras AI is a company that develops large-scale AI computing systems, including the world's largest chip.
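The agent.py described above follows LiveKit's published starter pattern. A minimal sketch, assuming current livekit-agents 1.x plugin APIs; the model and voice names here are illustrative placeholders rather than the exact values from the video:

```python
# agent.py — sketch of the voice agent wiring: AssemblyAI (STT),
# Cerebras (LLM), Rime (TTS), Silero (voice activity detection).
from dotenv import load_dotenv
from livekit import agents
from livekit.agents import Agent, AgentSession, WorkerOptions
from livekit.plugins import assemblyai, openai, rime, silero

load_dotenv()  # pull the API keys from the .env file into the environment


class Assistant(Agent):
    def __init__(self) -> None:
        super().__init__(instructions="You are a helpful voice assistant.")


async def entrypoint(ctx: agents.JobContext):
    await ctx.connect()
    session = AgentSession(
        stt=assemblyai.STT(),                          # speech-to-text
        llm=openai.LLM.with_cerebras(                  # Cerebras via its
            model="llama-3.3-70b",                     # OpenAI-compatible API
        ),
        tts=rime.TTS(model="mistv2", speaker="cove"),  # text-to-speech
        vad=silero.VAD.load(),                         # detects when you speak
    )
    await session.start(room=ctx.room, agent=Assistant())


if __name__ == "__main__":
    agents.cli.run_app(WorkerOptions(entrypoint_fnc=entrypoint))
```

As in the video: run `uv run python agent.py download-files` once to fetch the plugin models, then `uv run python agent.py console` to talk to the agent in the terminal.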
— So although we can see the user's transcriptions in the console output, you can't actually make use of them in code. To do that, what you can do is modify the

Segment 2 (05:00 - 08:00)

assistant class and put in an override for the on_user_turn_completed function. When you hook onto this, you can react to the user's input message, so you can either log the transcript or trigger code from here. Now, tracking and monitoring the agent's transcription is a little more involved. What you have to do first is bring in the Rime plugin, specifically the TTS. What we're creating is a wrapper around the Rime TTS class, and we're overriding the synthesize function so that we can intercept what the agent is about to speak. So we're going to replace the TTS call with this new wrapper class. And then for AssemblyAI, we're going to modify the arguments and add format_turns=True. What this will do is format the final transcription to be more readable, such as adding commas and periods.

And just to highlight the difference, what we'll do is change the speaker for the Rime model. Here in the Rime dashboard, go to the pronunciation tab first, and on the right you'll see a list of voices. You can select any of these and press play.

— Hi, my name is Cove. That's spelled C O V E.

You can also see a list of these in the documentation. You can see here that this list has, I guess, their flagship voices. And if you click further on the online link down here at the bottom, you can see that there is actually an endpoint you can call to get a full list of the voices available, and that list will look something like this. These are all lowercase names. To contrast with what we got earlier, I'm going to look for a male voice here.

— Hi, my name is Cove.
— I'm on a mission to make local businesses shine.

All right, marsh it is. So I believe it's just marsh, lowercase, here. We'll just copy this and paste it into the code so we can run this app again.

— Hello, I'm here to help. What do you need assistance with?
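The two interception points described above can be sketched with stand-in base classes, so the pattern is runnable without LiveKit installed. In the real agent you would subclass the LiveKit Agent class and the Rime TTS plugin instead; the hook names here mirror the ones mentioned in the video:

```python
import asyncio

class BaseAgent:
    """Stand-in for the framework's Agent class."""
    async def on_user_turn_completed(self, turn_ctx, new_message):
        pass  # framework default: nothing extra

class TranscriptLoggingAgent(BaseAgent):
    """Overrides the user-turn hook to capture the final user transcript."""
    def __init__(self):
        self.user_transcripts = []

    async def on_user_turn_completed(self, turn_ctx, new_message):
        self.user_transcripts.append(new_message)  # log it or react in code
        await super().on_user_turn_completed(turn_ctx, new_message)

class BaseTTS:
    """Stand-in for the Rime TTS class; synthesize() turns text into audio."""
    def synthesize(self, text):
        return f"<audio:{text}>"

class LoggingTTS(BaseTTS):
    """Wrapper that intercepts what the agent is about to speak."""
    def __init__(self):
        self.agent_lines = []

    def synthesize(self, text):
        self.agent_lines.append(text)  # agent-side transcript
        return super().synthesize(text)

agent = TranscriptLoggingAgent()
asyncio.run(agent.on_user_turn_completed(None, "hello there"))

tts = LoggingTTS()
audio = tts.synthesize("Hi, how can I help?")
print(agent.user_transcripts)  # ['hello there']
print(tts.agent_lines)         # ['Hi, how can I help?']
```

Both wrappers simply record text before delegating to the original behavior, which is all the transcript tracking in the video amounts to.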
— Hi, can you tell me about the company Rime AI, spelled R I M E?
— Rime AI is a company that develops conversational AI models. They focus on creating humanlike chatbots and virtual assistants.

All right, now that we know this is running locally, we can actually push it up to LiveKit Cloud to run in their playground, which you can share globally. To do that, just enter uv run python agent.py dev. We can see this agent has been deployed successfully. If you go to agents-playground.livekit.io, you'll be presented with this interface after logging in. And here with LiveKit Cloud, I've already got a project set up, so I can just click on it, and it will take me to an interface that looks much nicer than the command line.

— Hello, I'm here to help with any questions or tasks you have. What do you need assistance with today?
— Hi, can you tell me about AI agents and what they can be used for?
— AI agents are autonomous programs that can perform tasks, make decisions, and interact with their environment. They can be used for various applications such as customer service, data analysis, and process automation, among others.

And that's it for this video. Don't forget to like and subscribe, and check out our other videos on voice and AI technologies on our YouTube channel.
