# Sentiment Classification for Audio Files in Python | Speech Recognition Project 3

## Metadata

- **Channel:** AssemblyAI
- **YouTube:** https://www.youtube.com/watch?v=RpzdQmwCJsc
- **Date:** 03.06.2022
- **Duration:** 16:52
- **Views:** 11,619

## Description

Learn how to do sentiment classification for audio files in Python.

Get your Free Token for AssemblyAI Speech-To-Text API 👇 https://www.assemblyai.com/?utm_source=youtube&utm_medium=referral&utm_campaign=yt_pat_39

Code: https://github.com/AssemblyAI-Examples/python-speech-recognition-course

Feelings icons created by photo3idea_studio - Flaticon: https://www.flaticon.com/free-icons/feelings
Microphone icons created by Freepik - Flaticon: https://www.flaticon.com/free-icons/microphone

## Contents

### [0:00](https://www.youtube.com/watch?v=RpzdQmwCJsc) Segment 1 (00:00 - 05:00)

Hi everyone. In this video I teach you how to apply sentiment analysis to YouTube videos. I will show you how we can use the `youtube_dl` package to automatically download a video or extract information from a video, and then we apply sentiment analysis. This is, for example, one result we get; in this case I applied it to iPhone 13 review videos on YouTube. We get the text of each sentence and then we get the sentiment. For example, if we read this: "The new iPhone display is brighter than before, the battery life is longer", and the sentiment is positive. Then here we get "Still, there are some flaws", and the sentiment is negative. So this works pretty well, and now let's have a look at how you can do this.

Here I've created a new project folder, and we already have our API secrets and the `api.py` file with the helper functions to work with the AssemblyAI API. Now let's create two more files: the `main.py` file that will combine everything, and the `youtube_extractor.py` file. This is another helper file to extract the infos from the YouTube video, and for this we are going to use the youtube-dl package. This is a very popular command-line program to download videos from YouTube and other sites. We can use it as a command-line program, but we can also use it in Python. For this we say `pip install youtube_dl`, and then we can import it, so we say `import youtube_dl`. Then we set up an instance, so we say `ydl = youtube_dl.YoutubeDL()`.

Now I'm going to show you how you can download a video file and also how you can extract the infos from a video. Let's create a helper function that I call `get_video_infos`, and this takes a URL. We use the `ydl` object as a context manager, so we say `with ydl`, then we say `result = ydl.extract_info(...)`, and this gets the URL. By default it has `download=True`, so this would also download the file. In my case I say `download=False`, because of course we could download the file and then upload it to AssemblyAI, but we can actually skip this step and just extract the URL of the hosted file, and then we can pass this to the transcribe endpoint in AssemblyAI. So we can set `download=False` here.

Then we do one more check: if the `entries` key is in the result, this means we have a playlist URL, and then we want to return only the first video of this playlist. So we say `return result["entries"][0]`, and otherwise we simply return the result. This is the whole video info object.

Then let's create another helper function that I call `get_audio_url`, and this gets the video infos. First of all, let's simply print all the video infos to see what this looks like. Now let's say `if __name__ == "__main__"`, and then let's first extract the video info, so `video_info = get_video_infos(...)`, which needs a URL. Then we say `audio_url = get_audio_url(...)`, and then we want to print the audio URL. Right now this is `None` because we don't return anything.

So let's get an example URL. For this I went to YouTube and searched for "iPhone 13 review", and I chose this video: "iPhone 13 Review: Pros and Cons". We can click on this, and then we have to watch an ad, but we can actually copy the URL right away and put it in here as a string. Now if we run this, so we run `python youtube_extractor.py`, then it should print the whole info. Actually, I have to pass `video_info` here.
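The helper described in this segment could look roughly like the sketch below. It assumes the `youtube_dl` package is installed; `get_video_infos` follows the name used in the video, while `first_video` is a hypothetical helper introduced here only so that the playlist check can be tried offline with made-up dictionaries.

```python
def first_video(result):
    """Return the first entry if `result` describes a playlist, else `result` itself."""
    if "entries" in result:
        return result["entries"][0]
    return result


def get_video_infos(url):
    """Extract the info dictionary for `url` without downloading the video."""
    # Imported lazily so the offline demo below runs without youtube_dl installed.
    import youtube_dl

    with youtube_dl.YoutubeDL() as ydl:
        # download=False: fetch only the metadata, not the media file.
        result = ydl.extract_info(url, download=False)
    return first_video(result)


if __name__ == "__main__":
    # Made-up dictionaries standing in for real extract_info() results.
    playlist = {"entries": [{"title": "first"}, {"title": "second"}]}
    single = {"title": "only"}
    print(first_video(playlist)["title"])  # first
    print(first_video(single)["title"])    # only
```

Calling `get_video_infos` with a real video URL requires network access and returns a much larger dictionary than the stand-ins used here.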

### [5:00](https://www.youtube.com/watch?v=RpzdQmwCJsc&t=300s) Segment 2 (05:00 - 10:00)

And let's try this again. Here I printed the whole info, and this is actually a very long object, a very long dictionary. I can tell you that it has a key in it that is called `formats`, so let's print only the formats. If we run this, then this is still very large, and it is actually a list, so now we can iterate over it. Here we say `for f in video_info["formats"]`, and then we can print `f`. It has the key `ext` for extension, and it also has a `url`, so we also want to print `f["url"]`. Now if we run this, let's see what happens. Let's actually comment out the URL because this is super long, and print only the extension.

Now we see we have a lot of different extensions, because the video is actually stored in a lot of different formats and with a lot of different resolutions and so on. What we want is this one, the m4a; this is an audio format extension. So we now check if the extension equals `m4a`, and then we return `f["url"]`. This is the audio URL. If we save this and then print it at the very end, we should get the URL to this hosted file. You can see it is at this URL, so this is not related to youtube.com. Now let's, for example, click on this, and then we have it in our browser, so we could listen to the audio file.

So this is the first part: how to work with the youtube-dl package to extract the infos. Now let's combine this in `main.py`. In `main.py` we combine the YouTube extractor infos with AssemblyAI and extract the transcript of the video and also the sentiment classification results. Sentiment classification is usually a pretty difficult task, but AssemblyAI makes it super simple to apply. If we go to the website assemblyai.com and have a look at the features, we see they provide core transcription, which is basically the speech recognition we've seen in the last part, but they also offer audio intelligence features, and they are pretty cool. There are a lot of features you can use, for example detecting important phrases and words, topic detection, auto chapters (so auto summaries), and much more. If we scroll down, here we find sentiment analysis. If we click on this, we see a short description: "With sentiment analysis, AssemblyAI can detect the sentiment of each sentence of speech spoken in your audio files. Sentiment analysis returns a result of positive, negative, or neutral for each sentence in the transcript."

This is exactly what we need here, and it's actually super simple to use. The only thing we have to change is that when we call the transcript endpoint, we also have to send `sentiment_analysis` set to true as JSON data. This is all we need to do, so let's go to our code and implement this. Let's import all the files we need: we want `json`, then from `youtube_extractor` we import `get_audio_url` and `get_video_infos`, and from our API helper file we import `save_transcript`. Then I create one helper function that I call `save_video_sentiments`, and this gets the URL. Here we get the video
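The format selection described in this segment can be sketched as follows. The keys `formats`, `ext`, and `url` are the ones named in the video; the sample dictionary and its example.com URLs are made up for illustration, and returning `None` when no m4a format exists is an assumption of this sketch.

```python
def get_audio_url(video_info):
    """Return the URL of the first m4a (audio) format, or None if there is none."""
    for f in video_info.get("formats", []):
        if f.get("ext") == "m4a":
            return f["url"]
    return None


if __name__ == "__main__":
    # Made-up stand-in for the info dictionary returned by youtube_dl.
    video_info = {
        "title": "Example review",
        "formats": [
            {"ext": "webm", "url": "https://example.com/video.webm"},
            {"ext": "m4a", "url": "https://example.com/audio.m4a"},
            {"ext": "mp4", "url": "https://example.com/video.mp4"},
        ],
    }
    print(get_audio_url(video_info))  # https://example.com/audio.m4a
```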

### [10:00](https://www.youtube.com/watch?v=RpzdQmwCJsc&t=600s) Segment 3 (10:00 - 15:00)

infos by calling `get_video_infos` with the URL. Then we get the audio URL by calling `get_audio_url`, and this gets the video infos. Then I simply call the `save_transcript` function, and this gets the audio URL, and it also gets a title. For the title I want to use the title of the video, and we can get this from the video infos, which has a key that is called `title`. Then I want to slightly modify this, so I say `title = title.strip()` to remove all leading and trailing whitespace, and then I want to replace all spaces with an underscore. I also say `title = "data/" + title`, because I want to store this in a separate folder. So here we create this folder and call it `data`.

Now we have to modify this slightly. If we have a look back, we see this needs the additional argument `sentiment_analysis`. So in the `save_transcript` function I will add this as an additional argument and give it a default of `False`, and then here we say `sentiment_analysis=True`. Now we have to pass this through, so `get_transcription_result_url` also needs this parameter, then `transcribe` needs the parameter, and here, in the JSON data that we send, we put `sentiment_analysis` set to true or false. This is all that we need.

Now of course I also want to save this. Here we check if the parameter is true, and then I create a separate file. Again I say `filename = title +`, and then let's call this `"_sentiments.json"`. Now I say `with open(filename, "w") as f`, and then I import `json` at the top. Here we simply say `json.dump`, and first we have to extract the infos, of course. So we say `sentiments = data[...]`, and for the key, if we have a look at the documentation, we see that the JSON response now has this additional key `sentiment_analysis_results`. We use this, and then we dump the sentiments into the file. I also want to say `indent=4` to make this a little bit more readable.

Now in `main.py` we call this function. We say `if __name__ == "__main__"`, and then I want to call `save_video_sentiments`, and the URL is this one, so let's copy and paste it in here. Now let's run the `main.py` file and hope that everything works. The webpage is downloaded and transcription starts, so this looks good. Let's wait.

All right, this was successful and the transcript was saved. Now if we have a look at the `data` folder, we get the transcript of the video and we also see our JSON file with all the sentiments. For each sentiment we get the text of the sentence, for example this one: "With the exception of a smaller notch, the iPhone 13 doesn't seem very new at first glance, but when you start using this flagship, you start to appreciate a bunch of welcome upgrades." Then we get the start and end time, then we get the sentiment, which is positive, and we also get the confidence, which is
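The title handling, the extra request flag, and the saving step from this segment can be sketched without calling the API. The function names `make_title`, `build_transcript_request`, and `save_sentiments` are hypothetical (the video puts this logic inside its own `save_transcript` and `transcribe` helpers), the response dictionary is made up, and creating the folder with `os.makedirs` is an addition of this sketch.

```python
import json
import os
import tempfile


def make_title(video_title, folder="data"):
    """Strip whitespace, replace spaces with underscores, and prefix the folder."""
    return folder + "/" + video_title.strip().replace(" ", "_")


def build_transcript_request(audio_url, sentiment_analysis=False):
    """JSON body for the transcript request; the flag defaults to False."""
    return {"audio_url": audio_url, "sentiment_analysis": sentiment_analysis}


def save_sentiments(data, title):
    """Write data["sentiment_analysis_results"] to <title>_sentiments.json."""
    filename = title + "_sentiments.json"
    # Assumption of this sketch: create the target folder if it is missing.
    os.makedirs(os.path.dirname(filename) or ".", exist_ok=True)
    with open(filename, "w") as f:
        json.dump(data["sentiment_analysis_results"], f, indent=4)
    return filename


if __name__ == "__main__":
    # Made-up response; a real one comes back from the transcript endpoint.
    data = {
        "sentiment_analysis_results": [
            {"text": "Still, there are some flaws.", "start": 0, "end": 1500,
             "sentiment": "NEGATIVE", "confidence": 0.9},
        ]
    }
    print(build_transcript_request("https://example.com/audio.m4a", True))
    with tempfile.TemporaryDirectory() as tmp:
        path = save_sentiments(data, make_title(" iPhone 13 Review ", folder=tmp))
        print(os.path.basename(path))  # iPhone_13_Review_sentiments.json
```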

### [15:00](https://www.youtube.com/watch?v=RpzdQmwCJsc&t=900s) Segment 4 (15:00 - 16:00)

pretty high. Then the next example: "The new iPhone display is brighter than before, the battery life is longer, and Apple has improved..." and so on, so here also the sentiment is positive. Then we have "Still, there are some flaws", and now the sentiment is negative. So this works pretty well, and this is how you can apply sentiment analysis with AssemblyAI.

Now I want to show you a little bit more code for how we could analyze this, for example. We can comment this out so we don't need to download it again. Then we can read our JSON file, and here we store the positives, negatives, and neutrals. We iterate over the data, and then we extract the text and also the sentiment. Then we check if this is positive, negative, or neutral and append it to the corresponding list. Then we can calculate the length of each list and print the number of positives, negatives, and neutrals. We can also, for example, calculate the positive ratio: here we ignore the neutrals and simply divide the number of positives by the number of positives plus the number of negatives.

Now if we save this and run it again, we get the number of positives, 38, and only four negatives, so the overall positive ratio is 90 percent. With this you can get a pretty quick overview of a review, for example. I think the sentiment classification feature can be applied to so many different use cases; it's so cool. I hope you enjoyed this project, and I hope to see you in the next video. Bye!
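The counting step described above can be sketched like this. The function name `summarize_sentiments` is hypothetical, the sample data is made up, and the sketch assumes each entry has a `text` key and a `sentiment` key whose value is `POSITIVE`, `NEGATIVE`, or `NEUTRAL`.

```python
def summarize_sentiments(results):
    """Count sentences per sentiment and compute the positive ratio."""
    positives, negatives, neutrals = [], [], []
    for result in results:
        text = result["text"]
        sentiment = result["sentiment"].upper()
        if sentiment == "POSITIVE":
            positives.append(text)
        elif sentiment == "NEGATIVE":
            negatives.append(text)
        else:
            neutrals.append(text)

    n_pos, n_neg, n_neu = len(positives), len(negatives), len(neutrals)
    # Neutral sentences are ignored in the ratio, as in the video.
    rated = n_pos + n_neg
    ratio = n_pos / rated if rated else None
    return n_pos, n_neg, n_neu, ratio


if __name__ == "__main__":
    # Made-up sample; in the video the list is read from the saved JSON file.
    sample = [
        {"text": "The display is brighter.", "sentiment": "POSITIVE"},
        {"text": "The battery lasts longer.", "sentiment": "POSITIVE"},
        {"text": "The camera is great.", "sentiment": "POSITIVE"},
        {"text": "Still, there are some flaws.", "sentiment": "NEGATIVE"},
        {"text": "It comes in five colors.", "sentiment": "NEUTRAL"},
    ]
    n_pos, n_neg, n_neu, ratio = summarize_sentiments(sample)
    print(n_pos, n_neg, n_neu)  # 3 1 1
    print(f"{ratio:.0%}")       # 75%
```

With the counts reported in the video, 38 positives and 4 negatives, the same formula gives 38 / (38 + 4) ≈ 0.905, which matches the 90 percent mentioned.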

---
*Source: https://ekstraktznaniy.ru/video/13101*