# Build a Virtual Assistant in Python | Speech Recognition Project 5

## Metadata

- **Channel:** AssemblyAI
- **YouTube:** https://www.youtube.com/watch?v=p-3tZH-I2xI
- **Date:** 10.06.2022
- **Duration:** 19:06
- **Views:** 11,915

## Description

In this tutorial, we will build a virtual assistant in Python using real-time speech recognition and OpenAI.

Find out more at AssemblyAI Streaming STT docs: https://www.assemblyai.com/docs/speech-to-text/streaming/?utm_source=youtube&utm_medium=referral&utm_campaign=yt_pat_41

Get your Free Token for AssemblyAI Speech-To-Text API 👇 https://www.assemblyai.com/?utm_source=youtube&utm_medium=referral&utm_campaign=yt_pat_41

Get the code here: https://github.com/AssemblyAI-Examples/python-speech-recognition-course


#MachineLearning #Python

Microphone icons created by Freepik - Flaticon: https://www.flaticon.com/free-icons/microphone
Bot icons created by Smashicons - Flaticon: https://www.flaticon.com/free-icons/bot

## Contents

### [0:00](https://www.youtube.com/watch?v=p-3tZH-I2xI) Intro

Welcome to the final project. In this one you will learn a bunch of new, exciting technologies. First of all, you will learn how to do real-time speech recognition in Python. Then you will learn how to use the OpenAI API and build a virtual assistant, or chatbot. And finally, you will learn a little bit about websockets and how to use asyncio in Python. So I think this is going to be really fun.

First of all, let me show you the final project. Now when I run the code, I can start talking to my bot and ask questions: "What's your name?" "How old are you?" "What's the best ice cream?" And you see this works, so I think this is super exciting. Now let's get started.

All right, so here I have a new project folder, and again we have our `api_secrets` file and now a new `main.py` file.

### [1:00](https://www.youtube.com/watch?v=p-3tZH-I2xI&t=60s) RealTime Speech Recognition

And the first thing we're going to do is set up real-time speech recognition. For this we have a detailed blog post on the AssemblyAI blog that will walk you through this step by step. First of all, we need PyAudio to do the microphone recording, so this is the very same thing that we learned in part one. Then we use websockets, and then we use the AssemblyAI real-time speech recognition feature that works over websockets. Then we create a function to send the data from our microphone recording and also a function to receive the data, and then we can do whatever we want with this. But rather than just copying and pasting this, let's actually code it together. So let's get started. One note here: in

### [1:45](https://www.youtube.com/watch?v=p-3tZH-I2xI&t=105s) Setup

order to use the real-time feature you need to upgrade your account, though. But anyway, let's get started.

Let's import all the things we need. We want PyAudio again. Then we need websockets, so we say `import websockets`. This is a third-party library that I showed you in the beginning that makes it easy to work with websockets, and it is built on top of asyncio, so now we're going to build async code. Then we also import `asyncio`. We also import `base64`, because we need to encode the data to a base64 string before we send it. Then we import `json` to receive the JSON result, and then from `api_secrets` we import our API key for AssemblyAI.

The first thing we set up is our microphone recording. For this we use the exact same code that we learned in part one, so I simply copy and paste this part from here. We set up our parameters, then our PyAudio instance, and then we create our stream.

Now we need to define the URL for the websocket, and we can find this in the blog post, so here I can copy and paste the URL. The URL is `wss`, then `assemblyai.com`, then `realtime`, and the last part is also important: here we say `?sample_rate=16000`. This is the same rate that we use here, so make sure to align this with what you have. And now we create one
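The setup described so far could be sketched roughly as follows. This is a minimal sketch, not the code from the video: only the 16000 Hz rate and the `sample_rate` query parameter come from the transcript. The buffer size, the mono 16-bit format, and the full endpoint path in `BASE_URL` are assumptions, and `build_realtime_url` and `open_microphone` are hypothetical helper names. PyAudio is imported lazily so the rest of the snippet runs on a machine without it.

```python
from urllib.parse import urlencode

# Recording parameters. Only RATE is stated in the video; the other
# values are assumed, typical settings for mono speech audio.
FRAMES_PER_BUFFER = 3200
CHANNELS = 1
RATE = 16000

# Assumed endpoint path; check the AssemblyAI docs for the current URL.
BASE_URL = "wss://api.assemblyai.com/v2/realtime/ws"


def build_realtime_url(sample_rate: int) -> str:
    """Append the sample rate so the server knows how to decode the audio."""
    return f"{BASE_URL}?{urlencode({'sample_rate': sample_rate})}"


def open_microphone():
    """Open a PyAudio input stream that matches the parameters above."""
    import pyaudio  # third-party; imported here so the rest runs without it

    p = pyaudio.PyAudio()
    return p.open(
        format=pyaudio.paInt16,  # assumed 16-bit samples
        channels=CHANNELS,
        rate=RATE,
        input=True,
        frames_per_buffer=FRAMES_PER_BUFFER,
    )


URL = build_realtime_url(RATE)
print(URL)
```

Deriving the URL from `RATE` keeps the recording rate and the rate announced to the server from drifting apart, which is the alignment the video warns about.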

### [3:45](https://www.youtube.com/watch?v=p-3tZH-I2xI&t=225s) Send Receive

function to send and receive the data. This is an async function, so we say `async def` and we call it `send_receive`. This is responsible for both sending and receiving the data.

Now we connect to the websocket, and we do this with an async context manager. So again we say `async with` and then `websockets.connect`, and now we specify the parameters: the URL, then we set a ping timeout, and we can set this to 20, for example. Then we want a ping interval, and this should be 5. Then we also need to send our authorization token. The parameter for this is `extra_headers`, and this is a dictionary with the key `Authorization`, and the value is our token. Then we say `as`, and we can call this what we want, so I say `_ws` for websocket.

First we wait to let this connect, so here we say `await asyncio.sleep(0.1)`. Be careful here: we cannot use `time.sleep`. We are inside an async function, so we have to use the async sleep function. Then we try to connect and wait for the result, so we say `session_begins` equals, and then again `await _ws.recv()`. This is called `recv`, for receive, I guess. Then we can print the data and see how this looks. Let's also print "sending messages".

Now we need two inner functions, again async functions. So we say `async def send`, and for now we simply say `pass`, and then we say `async def receive`, and here we also pass. Actually, both of these will have an infinite `while True` loop, so they will run infinitely and listen for incoming data. So here we say `while True`, and for now let's just print "sending", and here we also say `while True`, and here we simply pass, because I don't want to spoil our output.

After this we need to combine them in an asyncio way. In order to do this we call the gather function, so it's called `asyncio.gather`, and here we gather send and receive. This will return two things, the send result and the receive result. Actually we don't need this, but just in case we have it here.

After defining this function, of course, we also have to run the code, and we have to run this in an infinite loop. In order to do this we call `asyncio.run` and then our `send_receive` function. So now this should connect and then print "sending" all the time. Let's run this and hope that this works. Yeah, it's already connected, and sending works. You see, that's why I didn't put a print in receive as well: we get a lot of output, and I can't even scroll to the top anymore. But basically it should have printed this once, and now this is working so far, so we can continue implementing these two functions. So now let's implement the
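To make the structure of `send_receive` easier to see, here is a small offline sketch of the same pattern: two inner coroutines that run concurrently and are combined with `asyncio.gather`. It is an illustration under assumptions, not the video's code. The websocket and microphone are replaced by stand-ins, and the loops are bounded so the example terminates. Awaiting the `asyncio.gather` call is what keeps `send_receive` alive until both inner coroutines finish.

```python
import asyncio


async def send_receive(iterations: int = 3):
    sent, received = [], []

    async def send():
        for i in range(iterations):      # the video uses `while True`
            sent.append(f"chunk {i}")    # stand-in for sending audio
            await asyncio.sleep(0.01)    # hands control back to the event loop
        return sent

    async def receive():
        for i in range(iterations):
            await asyncio.sleep(0.01)    # stand-in for waiting on the socket
            received.append(f"message {i}")
        return received

    # Run both coroutines concurrently and wait until both are done.
    send_result, receive_result = await asyncio.gather(send(), receive())
    return send_result, receive_result


send_result, receive_result = asyncio.run(send_receive())
print(send_result, receive_result)
```

Because each loop awaits `asyncio.sleep`, neither coroutine blocks the other; a blocking `time.sleep` in either one would stall both.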

### [8:33](https://www.youtube.com/watch?v=p-3tZH-I2xI&t=513s) Send

send function first. We wrap this in a try/except block. Now we read the microphone input, so we say `stream.read` and then we specify the frames per buffer. I also want to say `exception_on_overflow=False`. Sometimes when the websocket connection is too slow there might be an overflow, and then we get an exception, but I don't want this; it should still work. Then we need to encode the data in base64, so we say `base64.b64encode` with our data, and then we decode it again as UTF-8. This is what AssemblyAI expects. Then we need to convert it to a JSON object, so we say `json.dumps`, and this is a dictionary with the key `audio_data`, so again this is what AssemblyAI needs, and here we put in the data. Then we send this, and we also have to await it, so `await _ws.send` with the JSON data.

Then we have to catch a few errors, so let's copy this from our blog post. We catch a `websockets.exceptions.ConnectionClosedError`, then we print the error, then we make sure it's of this code, and then we also break. Then we catch every other error. It's not best practice to do it like this, but it's fine for this simple tutorial. Then we assert here, and after each `while True` iteration we also sleep again.

Now we can copy this whole code and paste it into the receive function. The code is very similar here, so we have the same try/except, but now here, of course, we have to wait for the
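The encoding step inside the send function can be tried on its own, without a microphone or a connection. The sketch below assumes a hypothetical helper name, `build_audio_message`, and uses a few zero bytes in place of the chunk that `stream.read(...)` would return; the `audio_data` key is the one named in the video.

```python
import base64
import json


def build_audio_message(chunk: bytes) -> str:
    """Turn raw audio bytes into the JSON text that is sent over the websocket."""
    # base64 turns arbitrary bytes into ASCII; decoding gives a str for JSON.
    encoded = base64.b64encode(chunk).decode("utf-8")
    return json.dumps({"audio_data": encoded})


# Stand-in for one chunk of microphone data (8 bytes of silence).
chunk = bytes(8)
message = build_audio_message(chunk)
print(message)

# Round trip: the receiver can recover the exact bytes that were recorded.
recovered = base64.b64decode(json.loads(message)["audio_data"])
```

Raw audio bytes cannot be placed in a JSON string directly, which is why the base64 step comes before `json.dumps`.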

### [10:52](https://www.youtube.com/watch?v=p-3tZH-I2xI&t=652s) Dictionary

transcription result from AssemblyAI. So we say the result string equals, and then again we await `_ws.recv()`. Then we can convert this to a dictionary by saying the result equals `json.loads`, so load from a string, and here we pass the result string. This is a JSON object, or now in Python it's a dictionary, so we can check a few keys. We can get the prompt, or actually this is the transcription of what we said, so we say `prompt` equals the result with the key `text`. It also has a key that is called `message_type`. So now we check if we have a prompt and if the result with the key `message_type` is `FinalTranscript`.

What AssemblyAI is doing: while we are talking, it will already start sending the transcript, and once we have finished our sentence, it will do another pass and make a few small corrections if necessary, and then we get the final transcript. So we want only the final transcripts.

For now, let's print "Me" and then the prompt. Now we want to use our chatbot, so let's print "Bot" and, for now, simply print a random text, and then we set this up in the next step. But first I want to test this, so let's say "this is my answer". And this is all that we need for the receive function.

Let's clear this, run it, and test it. Oh, we get an error: await wasn't used with future, `asyncio.gather`. Oh, this is a classic mistake. Of course, here I have to say `await asyncio.gather`. So let's run this again, and now it's working. "What's your name?" And you see the transcript is working. Now I stopped this, but if I scroll up: "What's your name?", and each time we get "this is my answer". So this is working, and now of course
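The filtering logic in the receive function can be checked offline with hand-written messages. In this sketch, `extract_final_prompt` is a hypothetical helper name, the sample messages are invented for illustration, and the `PartialTranscript` value for in-progress results is an assumption about the service; only `text`, `message_type`, and `FinalTranscript` are taken from the video.

```python
import json


def extract_final_prompt(result_str: str):
    """Return the transcript text for a final, non-empty result, else None."""
    result = json.loads(result_str)
    # .get() is a defensive choice here, in case a message lacks a key.
    prompt = result.get("text")
    if prompt and result.get("message_type") == "FinalTranscript":
        return prompt
    return None


# Invented sample messages, shaped like the ones described in the video.
partial = json.dumps({"message_type": "PartialTranscript", "text": "what's your"})
final = json.dumps({"message_type": "FinalTranscript", "text": "What's your name?"})
empty = json.dumps({"message_type": "FinalTranscript", "text": ""})

for raw in (partial, final, empty):
    prompt = extract_final_prompt(raw)
    if prompt is not None:
        print("Me:", prompt)
        print("Bot:", "this is my answer")  # placeholder reply, as in the video
```

Skipping partial results means the bot answers once per finished sentence rather than once per intermediate guess.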

### [13:45](https://www.youtube.com/watch?v=p-3tZH-I2xI&t=825s) OpenAI

here we want to do a clever thing with our prompt and use our virtual assistant. For this we now set up OpenAI. They have an API that provides access to GPT-3, and this can perform a wide variety of natural language tasks. In order to use this you have to sign up, but you can do this for free and you get free credits, so this will be more than enough to play around with this. And it's actually super simple to set this up.

Let's create a new file, and let's call this `openai_helper.py`. Then we also have to install the library, so we say `pip install openai`. After signing up you get an API token, so we have to copy this into `api_secrets`, and then we can use it. Now we can `import openai`, and we also need to import our secret, so from `api_secrets` we import our OpenAI API key. Then we have to set this, so we say `openai.api_key` equals our API key.

Now we want to do question answering. The OpenAI API is actually super simple to use. We can click on Examples, and then we see a bunch of different examples. OpenAI can do a lot of things, for example Q&A, grammar correction, text to command, classification, a lot of different stuff. Let's click on Q&A, and if we scroll down, here we find the code example. We already set our API key, so now we need to grab this part. Let's copy this and create a helper function: `def`, and let's call this `ask_computer`, and this gets the prompt as input. Now I paste this in here, so we say `response` equals `openai.Completion.create`. Here we specify an engine, and now we specify the prompt, and in our case the prompt is going to be the prompt that we put in, so `prompt=prompt` from the parameter. There are a lot of other parameters that you could check out in the documentation. In my case I only want to keep `max_tokens`, which specifies how long the result can be, and let's say 100 is fine for this.

This is all that we need. Now of course we need to return the response. This is actually a JSON object again, or now a dictionary, and we only want to extract the first possible response. It can also send more if you specify this here, but in our case we only get one. So we say `response`, then the key `choices`, then index 0, so the first choice, and then the key `text`. This will be the actual response from GPT-3.

Now in the main file, the only thing we have to do is say from `openai_helper` import `ask_computer`, and then down here in the receive function we say `response` equals `ask_computer`, and we put in the prompt, and then this will be our response. This should be everything that we need. So let's again clear this and run `main.py`, and let's hope this works. "What's your name?" "How old are you?" "Where are you from?" All right, let's stop this again, and you see this works.

And this is how you can build a virtual assistant that works with real-time speech recognition together with OpenAI. I really hope you enjoyed this last project. If you've watched all the way through, thank you so much, and I hope to see you in the next video. Bye.
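The helper described above might look roughly like the sketch below. It is written so it runs offline, which means two departures from the video that should be read as assumptions: the completion call is passed in as a `create` argument instead of calling the OpenAI library directly, and the engine is a parameter because the transcript does not name the one used. `fake_create` and the engine string `"placeholder-engine"` are invented stand-ins.

```python
def ask_computer(prompt: str, engine: str, create) -> str:
    """Send the prompt to a completion endpoint and return the first answer."""
    response = create(engine=engine, prompt=prompt, max_tokens=100)
    # The response behaves like a dictionary; take the text of the first choice.
    return response["choices"][0]["text"]


def fake_create(engine: str, prompt: str, max_tokens: int) -> dict:
    """Offline stand-in that mimics the shape of a completion response."""
    return {"choices": [{"text": f"You asked: {prompt}"}]}


answer = ask_computer(
    "What's your name?",
    engine="placeholder-engine",  # hypothetical value, not from the video
    create=fake_create,
)
print(answer)
```

With the legacy `openai` package (versions before 1.0) that the video uses, `openai.Completion.create` could be passed as `create` after setting `openai.api_key`; newer versions of the package expose a different client interface, so the call would need adapting there.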

---
*Source: https://ekstraktznaniy.ru/video/13091*