# How to make a web app that transcribes YouTube videos with Streamlit | Part 1

## Metadata

- **Channel:** AssemblyAI
- **YouTube:** https://www.youtube.com/watch?v=CrLmgrGiVVY
- **Date:** 30.10.2021
- **Duration:** 16:54
- **Views:** 7,781

## Description

Let's build an interactive web app that can transcribe YouTube videos in minutes! Streamlit is a great Python library that makes web development a piece of cake. And on top of Streamlit's powerful framework we will plug in AssemblyAI's easy-to-use API to quickly upload and transcribe audio files.

Sign up for a free AssemblyAI API token here 👇 
https://www.assemblyai.com/?utm_source=youtube&utm_medium=referral&utm_campaign=yt_mis_2

In the first part of the tutorial we will create a project structure, install all necessary dependencies, create a base Streamlit app with which we can test our code and develop the Python function that starts the transcription process.

We will use three main libraries: Streamlit, youtube_dl and FFmpeg. These libraries will enable us to create a front-end for our project, download YouTube videos and extract audio files from YouTube videos, respectively.

All the code in this tutorial is shared through a public GitHub repository. (https://github.com/misraturp/YouTube-transcriber)

This is all possible with only the free API token provided by AssemblyAI. Sign up to get your own free API token here: https://www.assemblyai.com/?utm_source=youtube&utm_medium=referral&utm_campaign=channel_assemblyai

👩‍💻 Grab the code: https://github.com/misraturp/YouTube-transcriber
📥 Download FFmpeg: https://ffbinaries.com/downloads
✍️ See this tutorial in written format: https://www.assemblyai.com/blog/how-to-get-the-transcript-of-a-youtube-video/

00:00 Introduction
00:19 App overview
00:31 Get your own free API token
01:01 Let's get started!
01:25 Creating the project structure
02:35 Installing dependencies
05:26 Starting the Streamlit application
06:40 Setting up youtube_dl constants
07:50 Setting up the AssemblyAI API options
08:33 Getting the free API token
09:39 Extracting audio from the YouTube video
12:38 Uploading the audio to AssemblyAI
13:15 Starting the transcription
15:22 Checking that the code works correctly
16:31 Like and subscribe!

## Contents

### [0:00](https://www.youtube.com/watch?v=CrLmgrGiVVY) Introduction

Hey, and welcome! In this video we will build a web app using Streamlit and AssemblyAI that has the power to transcribe YouTube videos just from their link. This will be a two-part tutorial: in this video we will write the code that transcribes a YouTube video, and in the next video we will go into the details of a Streamlit application and make it more user-friendly. So once we're...

### [0:19](https://www.youtube.com/watch?v=CrLmgrGiVVY&t=19s) App overview

...done with it, this is what the app will look like. It will have the power to get a YouTube link as input, show the video in a player, and also display the transcription of the video to the user below the video.

### [0:31](https://www.youtube.com/watch?v=CrLmgrGiVVY&t=31s) Get your own free API token

And we will be able to create the transcript thanks to AssemblyAI, which provides an easy-to-use API to automatically transcribe audio and video files with human-level accuracy. You can sign up for AssemblyAI for free, too: I am using their free API token for this tutorial, and once you sign up for a free account you get three hours of audio transcription per month. To follow along with the tutorial and build your own app, don't forget to get the free API token through the link in the description. All right, so now we...

### [1:01](https://www.youtube.com/watch?v=CrLmgrGiVVY&t=61s) Let's get started!

...can get started with the project. I will show you how to do this project in a step-by-step manner. First I will show you the project structure, and then we will install all the libraries and dependencies that we need. After that we will create a Streamlit application, a kind of base application that will be an empty web app, and later we will fill it with the code that we need to transcribe the file. So let's get started. The first thing that I want...

### [1:25](https://www.youtube.com/watch?v=CrLmgrGiVVY&t=85s) Creating the project structure

...to do is to create a project folder. I will do that on my desktop and call the project YouTube transcriber. Once I've done that, I can add the files that I need in there. The first one is, of course, the main web application file, a Python file. For now I'm only importing Streamlit in it, just to create the project. I will call this one YouTube transcriber again; this will be the main file that we will run the project on. But I also need one other file in there, which will be the configuration file. In there I'm going to keep my free authentication key that I got from AssemblyAI. For now I'll leave it empty, but your API key will go in there. This one I'm also going to save in my folder, under the name configure.py.
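The layout described above can be sketched as a small script that creates the folder and its two files. This is only an illustration: the main file name `youtube_transcriber.py`, the folder name and the `auth_key` variable are assumptions, since the video only names `configure.py`.

```python
from pathlib import Path

# Hypothetical names: the video only spells out "configure.py".
# The main app file name and the folder name below are assumptions.
MAIN_FILE = "youtube_transcriber.py"
CONFIG_FILE = "configure.py"


def create_project(root: Path) -> Path:
    """Create the project folder with an app file and a config file."""
    project = root / "youtube_transcriber"
    project.mkdir(parents=True, exist_ok=True)

    # The app file starts out with nothing but the Streamlit import.
    (project / MAIN_FILE).write_text("import streamlit as st\n")

    # The config file will hold the AssemblyAI key; it is left empty for now.
    # `auth_key` is an assumed variable name.
    (project / CONFIG_FILE).write_text('auth_key = ""\n')
    return project
```

Calling `create_project(Path.home() / "Desktop")` would mirror what is done by hand in the video; creating the two files manually in an editor works just as well.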

### [2:35](https://www.youtube.com/watch?v=CrLmgrGiVVY&t=155s) Installing dependencies

Once we're done with this, the next thing is to download the dependencies, of course. The first dependency I have is Streamlit, so I will install Streamlit using pip. It will take a second. The next thing that I want to install is a library called youtube_dl. This library helps us download videos from YouTube, so I will also quickly install it using pip. I already had it on my computer, so it didn't take long; it might take a tad longer for you.

The next thing that we want to install is a library called FFmpeg. This library helps us deal with video and audio files, specifically video files in this case. It's a little bit trickier to download and install than the other libraries, and it's different for Windows and macOS, but still very simple. What we need to do is go to their downloads web page; I will leave the link and everything that you need in the description. For Windows, you just need to choose the correct build, download the files and put them anywhere: on your desktop, in your project folder, wherever you want. Later, as we go further in the project, I will show you where you need to specify where you put them, so that your app can find and run them when needed.

For macOS it's a bit different. We need to download them again, and after downloading I need to unzip them. The next thing that I want to do is put them in a place where my project can find them when it needs to run them. So I'm going to put them somewhere and then add that place to the PATH. The PATH is basically a list of places where your computer checks for executable files. For that I'm going to go to where they are, which is the desktop, and copy them to a specific place on my computer, so that I can add that place to the PATH later and my computer will know where to find them. This could be different for you: if you want, you can just add them to your project folder and then point the PATH to that place. It's up to you. Once this is done, I of course need to specify the PATH and tell the computer that this is the place where it needs to look. So open this file in any editor that you like, add one extra line at the end specifying where it needs to look for these executables, then save and quit. All right, this basically finalizes everything that we need to do in terms of dependencies. The next...
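To check that the PATH setup described above worked, a short lookup with the standard library can help. This is a minimal sketch, not something shown in the video; the helper name `find_ffmpeg` and its `extra_dir` parameter are made up for illustration.

```python
import os
import shutil
from typing import Optional


def find_ffmpeg(extra_dir: Optional[str] = None) -> Optional[str]:
    """Return the full path of the ffmpeg executable, or None if it is not found.

    `extra_dir` is an optional folder (for example the project folder)
    that is searched before the directories listed in PATH.
    """
    search_path = os.environ.get("PATH", "")
    if extra_dir:
        # Put the extra folder in front, the same way a PATH entry would work.
        search_path = extra_dir + os.pathsep + search_path
    return shutil.which("ffmpeg", path=search_path)
```

If `find_ffmpeg()` returns `None` in a fresh terminal, the folder holding the executables is probably not on the PATH yet.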

### [5:26](https://www.youtube.com/watch?v=CrLmgrGiVVY&t=326s) Starting the Streamlit application

...thing that I want to do is create my Streamlit application and start filling it with things. One thing I can do first is create a title, of course; I will call it "An easy way to transcribe YouTube videos". We can already have a text input where people can enter the YouTube links that they want to transcribe. All right, let's start with this for now and see how our application looks. To run the Streamlit application, all you need to do is run `streamlit run` followed by the name of your application. But we're not in the folder yet, so I'll go there first. All right, so this is what our application looks like. For now it doesn't do anything, I can fill it with whatever, but we have a nice title and a way of collecting the input. Next, let's look into how to start developing the code to download the YouTube video, turn it into an audio file and then transcribe it. To do this we're...
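A base app along these lines might look like the sketch below. `st.title` and `st.text_input` are real Streamlit calls; the input label and the `render` helper are assumptions made for illustration, and the import is guarded only so the helper can be tried without Streamlit installed.

```python
try:
    import streamlit as st  # third-party: pip install streamlit
except ImportError:  # lets the helper below be exercised without Streamlit
    st = None


def render(ui) -> str:
    """Draw the title and the link input, and return what the user typed."""
    ui.title("An easy way to transcribe YouTube videos")
    # The input label is an assumption; the video does not spell it out.
    return ui.text_input("Enter your YouTube video link")


if st is not None:
    # Streamlit reruns the whole script on every interaction,
    # so `link` always holds the current content of the text box.
    link = render(st)
```

Saved under a (hypothetical) name such as `youtube_transcriber.py`, this would be started from inside the project folder with `streamlit run youtube_transcriber.py`.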

### [6:40](https://www.youtube.com/watch?v=CrLmgrGiVVY&t=400s) Setting up youtube_dl constants

...going to need to set up some more things in our code. As you know, we downloaded some of the libraries that we want to use, so we need to set up some options and constants to be able to use these libraries and the AssemblyAI API. The first thing that we want to set up is the youtube_dl library. As I said, we're first going to turn these YouTube videos into audio files and then save them to our local computer. For that we are using the best available audio format, we are going to convert the audio to MP3, and then we are going to save it on our computer using this file name: the ID of the YouTube video plus the extension, which is mp3. One other thing here: as I mentioned, there are different ways to set up the FFmpeg library, so if you are on a Windows computer, this is where you need to specify where you keep the executable files that we downloaded. Don't forget to change that here. So this is the setup for using the youtube_dl and FFmpeg libraries.
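The options described here could be written roughly as follows. The keys (`format`, `postprocessors`, `ffmpeg_location`, `outtmpl`) are standard youtube_dl options, but the exact values are not readable from the transcript, so the quality setting and the `"./"` location are assumptions, as is the `build_ydl_opts` wrapper.

```python
def build_ydl_opts(ffmpeg_location: str = "./") -> dict:
    """Options for youtube_dl: best audio, converted to MP3, named by video ID."""
    return {
        # Prefer the best audio-only stream, fall back to the best overall one.
        "format": "bestaudio/best",
        "postprocessors": [
            {
                "key": "FFmpegExtractAudio",  # needs the FFmpeg executables
                "preferredcodec": "mp3",
                "preferredquality": "192",  # bitrate in kbit/s; assumed value
            }
        ],
        # Folder holding the FFmpeg executables (mainly relevant on Windows);
        # "./" is only a placeholder.
        "ffmpeg_location": ffmpeg_location,
        # Output template; after conversion the file ends up as <video id>.mp3.
        "outtmpl": "./%(id)s.%(ext)s",
    }
```

On Windows one would pass the folder where the downloaded executables live, for example `build_ydl_opts("C:/path/to/ffmpeg")`, where the path is a placeholder.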

### [7:50](https://www.youtube.com/watch?v=CrLmgrGiVVY&t=470s) Setting up the AssemblyAI API options

The next thing that we want to set up is how to use the AssemblyAI API, of course. From AssemblyAI we're going to use two endpoints. One of them is the upload endpoint, which is going to help us upload the audio file that we have locally to their servers. The second one is the transcript endpoint; as is easy to understand from the name, it's going to transcribe the audio file that we upload to their system. For that, of course, we need to set up a header, specifically for authentication. If you remember, we have a separate file, configure.py, where we keep our authentication key. So now that we're setting this up, let's...
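As a sketch of this setup: the two URLs below are the v2 upload and transcript endpoints of AssemblyAI's public API (the video does not read them out), and `build_headers` is a hypothetical helper. In the actual project the key would come from the config file, for example via `from configure import auth_key`, where `auth_key` is an assumed variable name.

```python
# AssemblyAI v2 endpoints (not spelled out in the video).
UPLOAD_ENDPOINT = "https://api.assemblyai.com/v2/upload"
TRANSCRIPT_ENDPOINT = "https://api.assemblyai.com/v2/transcript"


def build_headers(auth_key: str) -> dict:
    """Headers sent with every request; the key authenticates the account."""
    return {
        "authorization": auth_key,
        "content-type": "application/json",
    }
```

Keeping the key in a separate file means the main script can be shared without exposing it.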

### [8:33](https://www.youtube.com/watch?v=CrLmgrGiVVY&t=513s) Getting the free API token

...go and get our free API token from AssemblyAI. It's very easy to get one: you can either go to assemblyai.com or use the link in the description to end up at this page, where you can sign up for a free API token. Just click "Start now for free", create an account, activate your account, and then you can reach your own API key. At the end, once you have your API key, you're going to end up at a page like this, where you will be able to see your API key. I'm just going to click to copy this one and add it here, and that's all I need to do. Now when I want to use AssemblyAI I have everything set up, and the authorization will work perfectly. The last constant that I want to specify is the chunk size. This is something that we're going to use while we are uploading the file: in case the audio file is very big, we want to upload it in chunks and not all in one go. So now for the...
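Reading a file in chunks, as mentioned above, is commonly done with a generator. The sketch below assumes a chunk size of 5 MiB, which is not stated in the video, and the function name `read_file` is likewise an assumption.

```python
CHUNK_SIZE = 5_242_880  # 5 MiB per chunk; an assumed value


def read_file(filename: str, chunk_size: int = CHUNK_SIZE):
    """Yield the file's bytes piece by piece instead of loading it all at once."""
    with open(filename, "rb") as f:
        while True:
            data = f.read(chunk_size)
            if not data:  # an empty read means the end of the file
                break
            yield data
```

Because the generator only holds one chunk in memory at a time, the size of the audio file does not matter much for the uploading script.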

### [9:39](https://www.youtube.com/watch?v=CrLmgrGiVVY&t=579s) Extracting audio from the YouTube video

...actual transcription part of this project. I will keep everything in the same function as much as I can, so it's easier to call from the Streamlit application: it will be just one line where we read the link from the user input and start transcribing using AssemblyAI's API. I'll create this function now, and of course it needs the link as an argument.

The first part of this function is downloading the video. What I do here is take the link, strip any blank spaces that might be before or after it, and then feed it to the youtube_dl library and have it save the result to my desktop, or wherever I want to save it, using the output naming that we specified before. Just to check that this actually works, I want to integrate this function into my Streamlit application already. So `link` is going to be something that I get from the text input of the user, and then I'm going to call this function and see what happens. I will run this one and see that my application is rerun. Of course, we now get an error saying that zero, or an empty string, is not a valid URL, because we don't have a default URL. So I'm just going to go here and add a default URL for before the user has entered anything, and rerun the application.

Now nothing is happening on the Streamlit side, of course, because we are downloading the video. If you come back to your terminal, you will actually see that the application has started downloading the video, and you can see a percentage here. This is going to take some time, because you're downloading a video and saving it as audio on your local file system, so we have to wait for the video to be downloaded. Once this is done, we will see the "saved mp3 to" line here, with the name of the YouTube video's audio MP3. And when you go to your folder, you will see that it is downloaded as an audio file. You can even start listening to it ("every single day"), and that is actually the correct file. So that's perfect. Now that we have the audio file on our laptop, all we need to do is upload it to AssemblyAI's servers and ask for it to be transcribed. So from now on, once I...
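The download step might be sketched like this. `youtube_dl.YoutubeDL` and `extract_info` are real youtube_dl calls; the helper names are hypothetical, and the returned path assumes options that convert to MP3 and use an output template of `./%(id)s.%(ext)s`.

```python
def clean_link(link: str) -> str:
    """Strip blank spaces before and after the pasted link."""
    return link.strip()


def mp3_path_for(meta: dict) -> str:
    """File name the audio ends up under: the video ID plus '.mp3'."""
    return meta["id"] + ".mp3"


def download_audio(link: str, ydl_opts: dict) -> str:
    """Download the video's audio and return where the MP3 was saved."""
    import youtube_dl  # third-party: pip install youtube_dl

    with youtube_dl.YoutubeDL(ydl_opts) as ydl:
        # extract_info downloads by default and returns the video's metadata.
        meta = ydl.extract_info(clean_link(link))
    save_location = mp3_path_for(meta)
    print("Saved mp3 to", save_location)
    return save_location
```

An empty link makes youtube_dl raise an error about an invalid URL, which is why the text input in the app needs a default URL.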

### [12:38](https://www.youtube.com/watch?v=CrLmgrGiVVY&t=758s) Uploading the audio to AssemblyAI

...have the audio file downloaded and saved on my system, what we need to do is upload it to AssemblyAI. As you remember, we already set up the headers and the authentication, and we specified the upload endpoint. This is very simple: using the requests library of Python, you can post to AssemblyAI's upload endpoint, passing the header with the authentication and specifying the data. We know where the save location is, because we just set it, and we are starting the upload right now. So, um...
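The upload described above could look roughly like this. It is a sketch under assumptions: the function name and the optional `post` parameter (which exists only so the function can be tried without network access) are not from the video, while `requests.post` and the `upload_url` field of the response are part of the real libraries and API.

```python
from typing import Callable, Iterable, Optional

UPLOAD_ENDPOINT = "https://api.assemblyai.com/v2/upload"


def upload_audio(
    chunks: Iterable[bytes],
    auth_key: str,
    post: Optional[Callable] = None,
) -> str:
    """Send the audio bytes to AssemblyAI and return the URL of the upload."""
    if post is None:
        import requests  # third-party: pip install requests

        post = requests.post
    # Passing an iterable as `data` makes requests stream it chunk by chunk.
    response = post(
        UPLOAD_ENDPOINT,
        headers={"authorization": auth_key},
        data=chunks,
    )
    # The response carries the URL under which AssemblyAI stored the audio.
    return response.json()["upload_url"]
```

The `chunks` argument can be any iterable of bytes, for instance a generator that reads the saved MP3 piece by piece.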

### [13:15](https://www.youtube.com/watch?v=CrLmgrGiVVY&t=795s) Starting the transcription

...after we've done this, the next thing that we want to do is start the transcription, and it is very similar. Here's how we do that: basically, we are sending a transcription request to AssemblyAI. One of the things that we pass to it is the audio URL that we got as a response to uploading the audio to the AssemblyAI servers. In addition to that, we also pass whether or not we are interested in learning what category this audio file belongs to. For that I will create another argument, so we can decide on the go later; depending on that argument, we will decide whether we want categories in return or not. Again using the requests library of Python, we use the transcript endpoint, the JSON variable we just created and the headers for authentication to create the transcript request. Once this request is sent, AssemblyAI starts transcribing the file. But what we want is a way of checking whether the audio file has been transcribed already or not. That's why we get an ID in the transcript response: this is the ID that belongs to the audio file that we want to transcribe. Once we get that ID, we can create a polling endpoint variable, made up of the transcript endpoint URL that we specified earlier in this code and the ID that AssemblyAI returned to us. Using this, we can check whether the file has been transcribed already or not. To end this function, I'm going to add a little message here saying that the audio file has started being transcribed at AssemblyAI, and I'm going to return the polling endpoint to my Streamlit application, so that I can check whenever I want whether the transcription is complete and show the transcription once it is. So let's just save...
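Starting the transcription and building the polling endpoint might be sketched as below. `audio_url` and `iab_categories` are real request fields of the AssemblyAI transcript endpoint, but whether the video uses exactly `iab_categories` for the category option is an assumption, as are the function name and the injectable `post` parameter.

```python
from typing import Callable, Optional

TRANSCRIPT_ENDPOINT = "https://api.assemblyai.com/v2/transcript"


def start_transcription(
    audio_url: str,
    categories: bool,
    auth_key: str,
    post: Optional[Callable] = None,
) -> str:
    """Request a transcript and return the URL to poll for the result."""
    if post is None:
        import requests  # third-party: pip install requests

        post = requests.post
    payload = {
        "audio_url": audio_url,  # URL returned by the upload step
        "iab_categories": bool(categories),  # assumed name of the category option
    }
    headers = {"authorization": auth_key, "content-type": "application/json"}
    response = post(TRANSCRIPT_ENDPOINT, json=payload, headers=headers)

    # The ID identifies this transcription job; appending it to the
    # transcript endpoint gives the address to check for completion.
    transcript_id = response.json()["id"]
    polling_endpoint = TRANSCRIPT_ENDPOINT + "/" + transcript_id
    print("Transcribing at", polling_endpoint)
    return polling_endpoint
```

Returning only the polling endpoint keeps the function short and leaves it to the caller to decide when, and how often, to check whether the transcript is ready.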

### [15:22](https://www.youtube.com/watch?v=CrLmgrGiVVY&t=922s) Checking that the code works correctly

...this and run the Streamlit application again to see that everything is working correctly. I will just go and say "Rerun" here. And of course I needed to update my "transcribe from link" call (oh, a little typo: "from link") so that its arguments include the category. I'll just say False for now; I don't really want categories. Now I can come and see here again that the video is being downloaded. Once it's downloaded, we're able to follow along with the audio being saved to my local system, then uploaded, and then the transcription starting. Nice. So now we can see that our file was saved locally, which is here, then it was uploaded to AssemblyAI, and now it's being transcribed at AssemblyAI. If you come here, of course, you're not going to be able to see anything yet, but we know that at least the code is working. From now on, what I want to do is more on the Streamlit side of things, but we...

### [16:31](https://www.youtube.com/watch?v=CrLmgrGiVVY&t=991s) Like and subscribe!

...will get to that in the next part of this tutorial. If you liked this video, don't forget to give it a like and maybe even subscribe to our channel. If you have any questions or comments about this video, you can leave them in the comment section. But before you leave, definitely don't forget to go and grab your free API token from AssemblyAI using the link in the description. I'll see you in the second part!

---
*Source: https://ekstraktznaniy.ru/video/13344*