How to Use Machine Learning in Bioinformatics with @DataProfessor
11:47

AssemblyAI · 25.12.2022 · 1,034 views · 26 likes

Video description
The end of the year is coming close but this doesn't mean that learning should end! In the last series of the year, we are counting down to the end of the year with 15 creators. Each day a new creator will answer a community question in a quick and informative video. In this video, the @Data Professor will give you a high-level overview on how to apply machine learning to make sense of bioinformatics data. He'll also share how you can learn bioinformatics quicker by using AssemblyAI and Streamlit.

Links mentioned in the video:
📖 How to Build a Regression Model in Python: https://towardsdatascience.com/how-to-build-a-regression-model-in-python-9a10685c7f09
🎬 Playlist of AssemblyAI tutorial videos: https://www.youtube.com/watch?v=NNq_XBVk30w&list=PLtqF5YXg7GLmgxBk-MyyTHjBjlGUqfBOX
🎈 Streamlit: https://streamlit.io/
📦 Streamlit App Starter Kit - Blog: https://blog.streamlit.io/streamlit-app-starter-kit-how-to-build-apps-faster/
📦 Streamlit App Starter Kit - Template: https://github.com/streamlit/app-starter-kit
💎 Streamlit Quests: https://blog.streamlit.io/streamlit-quests-getting-started-with-streamlit/

Check out Chanin's YouTube channel: https://www.youtube.com/@UCV8e2g4IWQqK71bbzGDEI4Q
Connect with Chanin on Twitter: https://twitter.com/thedataprof

CONNECT
🖥️ Website: https://www.assemblyai.com
🐦 Twitter: https://twitter.com/AssemblyAI
🦾 Discord: https://discord.gg/Cd8MyVJAXd
▶️ Subscribe: https://www.youtube.com/c/AssemblyAI?sub_confirmation=1
🔥 We're hiring! Check our open roles: https://www.assemblyai.com/careers

#MachineLearning #DeepLearning

Table of contents (3 segments)

Segment 1 (00:00 - 05:00)

Happy holidays, tech people! Firstly, I would like to thank both Patrick and Misra, my friends over at AssemblyAI, for allowing me to make this short video in the 15-day countdown to the new year. In this video, I'm going to talk about AssemblyAI, Streamlit, and bioinformatics. So what do all of these terms have in common? Well, firstly, AssemblyAI is a speech-to-text service, Streamlit is the fastest way for you to build data apps, and bioinformatics is a data-intensive field of study where computational approaches are applied to extract knowledge from biological data that are high-dimensional and very difficult to analyze via traditional means.

Let me start by talking about bioinformatics in the context of computational drug discovery. As already mentioned, bioinformatics essentially makes use of computational approaches to make sense of big biological data. The data could pertain to proteins, DNA, RNA, or carbohydrates. These biological macromolecules are very small, but they are very vast and present in huge numbers, and the presence or absence of these proteins is a great indicator of the metabolic status or the disease state of a human being. So being able to figure out the patterns hidden inside these biological data will allow us to diagnose patients and find therapeutic cures for cancer or other diseases. In a nutshell, this is made possible by analyzing big biological data through data analytics and machine learning.

To better illustrate this, let me show you a blog post that I've written that best summarizes this. One moment. Okay, so I found the particular blog article, and it is called "How to Build a Regression Model in Python". In this blog post, I summarize at a high level how you could go about building a machine learning model, and the model uses a data set from the realm of bioinformatics or cheminformatics.

So let's have a look at this infographic, which I drew on an iPad, in case you're wondering. Here we can see that there are two molecules, and they look almost exactly alike except for this functional group here, called the methyl group. You can see that the methyl group is located at different positions on the molecule: in the one on the left, it is located to the right of this particular ring, and in the other, the same functional group is located to the left, at this particular position. These might look almost the same, but because the functional group is located at slightly different positions, they give rise to different molecular descriptors or molecular fingerprints, which are essentially the features that uniquely describe each of these two molecules. So you can see here that we've uniquely identified each molecule with a numerical fingerprint: molecule one represents the first row here, and molecule two, in yellow, represents the second row.

So essentially we start from the molecules, of which we could have several hundred to several thousand, and then we use a program to convert each molecule from a molecular structure, in SMILES notation or in MOL format, into numerical form. That could be fingerprints, it could be molecular descriptors, or, in recent years, even molecular graphs, which essentially describe the connectivity of the atoms that make up the molecule. The atoms are connected in different ways, and no two different molecules are connected in the same manner: if they are connected in the same manner, they are the same molecule, but if they are connected in different ways, they are different molecules.

These fingerprints constitute the data set, and the data set is used as usual for the development of machine learning models, as you can see here in the gray box. Then we have a new molecule, shown here in purple. We convert it into molecular fingerprints, apply the model to it, and the model makes a prediction of the biological activity of this third molecule. We get the predicted value: whether it is active or inactive in terms of biological activity. And another great thing from building the machine

Segment 2 (05:00 - 10:00)

learning model is that we could extract insights into which particular features of the molecule contribute to the biological activity. For example, if we use random forest, we could analyze the Gini index, or the feature importance, and figure out which particular molecular features give rise to the biological activity. Once we know that, we could relay the information to biologists and chemists so that they could go back and redesign the molecule, perform the experiments again, test the biological activity, and see whether there is an improvement in the biological activity (the inhibition) of the molecule in binding to the target protein. So in a nutshell, that is how we apply machine learning to make sense of biological data.

Now, let's say that you're coming from a field other than biology or bioinformatics. Let's say you already know machine learning and you would like to find some interesting data sets to work with, and bioinformatics looks fascinating to you. How exactly do you get started? Well, a great way is to go on YouTube, where there are several lectures available on the topic of bioinformatics, or even on machine learning for analyzing biological data sets. But then again, there are so many videos. How can you assimilate all of the knowledge from all of them? If you were to listen to all of them, it might take you forever. If you listen at 2x speed, it will save you time, but you'll still have to cover a lot of videos.

However, there is a better way. What if, theoretically, you could go to YouTube, search for topics about bioinformatics and machine learning, and then use AssemblyAI to transcribe whatever is being said in the videos? Then you could use AI, particularly Transformers or GPT, to summarize the transcribed text. Once the videos are summarized, you have less to cover, because not all videos might be helpful to you. If, through reading the summaries, you figure out that some of the videos are worth exploring in more detail while some, or the majority, are not, then you can pinpoint the videos that will be helpful to you. Other than that, you could use AssemblyAI to automatically create timestamps, or a timeline, for the videos. It will tell you exactly which topics are being discussed at which time point, so if you want to take a deeper look at particular videos, that is what you could do.

Okay, so now we've covered bioinformatics and we've covered AssemblyAI, and the third technology that we're going to talk about is Streamlit. We've already talked about how you could use AssemblyAI to transcribe the speech from YouTube videos, but to do that, it would be easier if you had a front end or a user interface, such as a web application where you could graphically enter the links of the YouTube videos you would like transcribed. That will help you breeze through your lecture transcription endeavor.

So let me show you. I've created three videos on the use of AssemblyAI together with Streamlit for transcription. The first video is on how you could use AssemblyAI to perform speech-to-text transcription, and there we essentially build a Streamlit app where you enter the URL of the YouTube video. You could just use this video as a template; all of the code and the demo app are provided in the video description of that video, and I'll provide the link in the video description. Or here, you could use this particular app that we built to perform transcription of lectures that you happen to listen to live. Perhaps you drop by an in-person lecture on bioinformatics and machine learning, and you could use this to transcribe what is being said in real time. It could also be a webinar, which would be even better, right? And the third one is for content moderation, so if this is helpful to you, you could also check it out.

I've already mentioned Streamlit, and at a high level it is a Python library and a low-code tool. In a few lines of code, you could build a very simple or complex web application to analyze data sets, to process data, or to process the transcribed text that I've mentioned, and there are several widgets for you to use. To get you started, there is a blog post called the Streamlit App Starter Kit that will allow you to save time every time you build a Streamlit app. The code for this is provided in the GitHub repo here, and this will allow you to get up to speed and build your first app in only a few seconds. So you

Segment 3 (10:00 - 11:00)

could just use this template for your Streamlit app. Another way to get started is to head over to this blog post on Streamlit Quests. It gives you a learning experience as if you were playing a role-playing game: you learn and progress by going through a checklist of tasks while you're learning Streamlit. So this is a great way to follow a guided path for learning Streamlit. Another is 30 Days of Streamlit, a learning challenge that will get you up to speed on Streamlit and how you could use it to build data-driven apps.

Finally, the Streamlit Components Hub web application was developed by Johannes Rieke, who is a product manager at Streamlit, over at Snowflake. This web app essentially aggregates all of the Streamlit components in one place. For example, if you want to use audio features, you could use WebRTC, which allows you to use your webcam and audio, and you could also use the Streamlit audio recorder. Fanilo made an awesome video on the first day of the 15-day countdown on the AssemblyAI channel, so check out that video as well.

And so that's essentially it. That is how we could integrate AssemblyAI and Streamlit into bioinformatics, by learning and doing. Thanks again to AssemblyAI for allowing me to give this short demo talk on how it could be integrated with Streamlit and used to analyze bioinformatics data. Happy holidays!
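
The workflow described in the talk (molecule, then numerical fingerprint, then a check of which features relate to activity) can be sketched in plain Python. This is only a toy illustration under stated assumptions: `toy_fingerprint` is a hypothetical helper that hashes raw SMILES substrings and does not understand chemistry, the activity labels are made up, and `gini_decrease` computes the single-split impurity decrease that a random forest's Gini-based feature importance is built from. A real project would more likely use a cheminformatics toolkit for the fingerprints and a library implementation of random forest.

```python
import hashlib

N_BITS = 64  # toy fingerprint length (assumption; real fingerprints are much longer)


def toy_fingerprint(smiles: str, n_bits: int = N_BITS, max_len: int = 3) -> list[int]:
    """Hash every substring of up to max_len characters into a fixed-length bit vector.

    This only mimics the idea of a hashed fingerprint; it does not parse the
    chemistry encoded in the SMILES string.
    """
    bits = [0] * n_bits
    for size in range(1, max_len + 1):
        for start in range(len(smiles) - size + 1):
            fragment = smiles[start:start + size]
            digest = hashlib.md5(fragment.encode("utf-8")).digest()
            bits[int.from_bytes(digest[:4], "big") % n_bits] = 1
    return bits


def gini(labels: list[int]) -> float:
    """Gini impurity of a list of binary labels (1 = active, 0 = inactive)."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)


def gini_decrease(column: list[int], labels: list[int]) -> float:
    """Impurity decrease obtained by splitting the samples on one fingerprint bit."""
    on = [y for x, y in zip(column, labels) if x == 1]
    off = [y for x, y in zip(column, labels) if x == 0]
    n = len(labels)
    weighted = (len(on) / n) * gini(on) + (len(off) / n) * gini(off)
    return gini(labels) - weighted


if __name__ == "__main__":
    # Hypothetical data set: SMILES strings with made-up activity labels.
    molecules = {"Cc1ccccc1O": 1, "Cc1ccc(O)cc1": 0, "CCO": 0, "c1ccccc1O": 1}
    fingerprints = [toy_fingerprint(smiles) for smiles in molecules]
    labels = list(molecules.values())
    # Score each bit (column of the data set) by how well it separates the labels.
    scores = [
        gini_decrease([fp[i] for fp in fingerprints], labels) for i in range(N_BITS)
    ]
    best = max(range(N_BITS), key=scores.__getitem__)
    print(f"most informative bit: {best} (impurity decrease {scores[best]:.3f})")
```

Each row of `fingerprints` plays the role of one row in the infographic's data set, and the bit with the largest impurity decrease is the feature one would report back to the chemists.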

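The transcription and timestamp idea from Segment 2 can be sketched as follows. This is a minimal sketch, not the code from the videos mentioned in the talk. It assumes the AssemblyAI v2 REST interface as I understand it (a `POST` to `/v2/transcript` with `audio_url` and `auto_chapters`, then polling the transcript by `id` until `status` is `completed` or `error`), and it assumes the lecture audio is already reachable at a direct URL. The helper names `build_request`, `format_chapters`, and `transcribe` are hypothetical; check the current API documentation before relying on any field name.

```python
import json
import time
import urllib.request

API_BASE = "https://api.assemblyai.com/v2"  # assumed base URL of the REST API


def build_request(audio_url: str, api_key: str) -> urllib.request.Request:
    """Build the POST request that submits an audio file for transcription."""
    payload = {"audio_url": audio_url, "auto_chapters": True}
    return urllib.request.Request(
        f"{API_BASE}/transcript",
        data=json.dumps(payload).encode("utf-8"),
        headers={"authorization": api_key, "content-type": "application/json"},
        method="POST",
    )


def format_chapters(chapters: list[dict]) -> list[str]:
    """Turn chapter dicts (start time in milliseconds) into 'MM:SS headline' lines."""
    lines = []
    for chapter in chapters:
        minutes, seconds = divmod(chapter["start"] // 1000, 60)
        lines.append(f"{minutes:02d}:{seconds:02d} {chapter['headline']}")
    return lines


def transcribe(audio_url: str, api_key: str, poll_seconds: float = 5.0) -> dict:
    """Submit the audio, then poll until the transcript is completed or failed."""
    with urllib.request.urlopen(build_request(audio_url, api_key)) as response:
        job = json.load(response)
    status_request = urllib.request.Request(
        f"{API_BASE}/transcript/{job['id']}", headers={"authorization": api_key}
    )
    while True:
        with urllib.request.urlopen(status_request) as response:
            result = json.load(response)
        if result["status"] == "completed":
            return result
        if result["status"] == "error":
            raise RuntimeError(result.get("error", "transcription failed"))
        time.sleep(poll_seconds)
```

With a valid key, `format_chapters(transcribe(url, key)["chapters"])` would give the kind of topic timeline described in the talk, and the returned `text` field could be passed to a summarization model of your choice.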