# Gradient Clipping for Neural Networks | Deep Learning Fundamentals

## Metadata

- **Channel:** AssemblyAI
- **YouTube:** https://www.youtube.com/watch?v=KrQp1TxTCUY
- **Date:** 21.02.2022
- **Duration:** 3:35
- **Views:** 12,305

## Description

Unstable gradients are one of the main problems of Neural Networks. And when it comes to Recurrent Neural Networks, we need a bit of a different approach due to their recurrent nature. In this video we will learn about Gradient Clipping, a technique to tackle the exploding gradients problem.


## Contents

### [0:00](https://www.youtube.com/watch?v=KrQp1TxTCUY) Segment 1 (00:00 - 03:00)

Unstable gradients are one of the main problems of deep neural networks, and most of the time batch normalization is the answer to this problem. But when you're dealing with recurrent neural networks, batch normalization is a little bit tricky to implement, so instead we might use something else called gradient clipping. In this video, let's learn what gradient clipping is and how we can apply it.

Gradient clipping is used for the exploding gradients problem, and what you do is quite simple; it's just that there are a bunch of approaches to it, which we will cover in this video. With gradient clipping, you very simply clip the gradients to stay within a certain threshold. For example, if you determine that you want your gradients to be between minus one and one, you can set the parameter to one, and from then on, whatever your gradients are calculated to be, that is, whatever your weights were going to be updated with, they will be clipped to fall between minus one and one.

The tricky part is that when you clip some of the gradients and not others, the direction of your gradient is going to change. Let's take an example: say we have a graph of how we can change weight 1 and weight 2, and how the cost changes as a result. If you have a gradient vector with the values 0.9, 3.2, 150, -2.1 and 0.23, applying gradient clipping here means that you clip the values higher than one or lower than minus one, bringing them to one and minus one respectively. So while the original vector was going to point one way on the graph, the clipped vector is going to end up pointing in a completely different direction.

Something you can do to maintain the direction of the gradient is what we call clipping by norm. Instead of only clipping the values that are outside of the range you're aiming for, you scale down all of the values in your gradient vector so that the vector's norm falls within the threshold. This way you keep the proportions of the numbers in your gradient vector, and thus the direction of your original gradient vector. The main problem with this approach is that some of your gradient values can become very, very small, and in the end they might not actually be effective in updating the parameters of your network.

Unfortunately, there are no hard rules when it comes to gradient clipping. You're going to have to try gradient clipping by value and gradient clipping by norm and see which one works better for you. You might also need to try different threshold values to see which one gives you a better result. But at the end of the day, this is a very effective way of solving the exploding gradients problem, and luckily it is very simple to implement using the Keras deep learning library.

That is all you need to know about gradient clipping. I hope you enjoyed this video; if you liked it, don't forget to give us a like and maybe even subscribe to show your support. We would also love to hear any of your comments or questions in the comment section below. But for now, thanks for watching, and I will see you in the next video.
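The two approaches described in the transcript can be sketched in a few lines of NumPy; this is a minimal illustration (the vector values follow the example from the video, and a threshold of 1.0 is assumed):

```python
import numpy as np

threshold = 1.0
grad = np.array([0.9, 3.2, 150.0, -2.1, 0.23])  # example gradient vector from the video

# Clip by value: clamp each component to [-threshold, threshold].
# Out-of-range components are changed independently of the others,
# so the vector's direction can shift.
clipped_by_value = np.clip(grad, -threshold, threshold)

# Clip by norm: if the L2 norm exceeds the threshold, rescale the whole
# vector so its norm equals the threshold. The proportions between the
# components (and therefore the direction) are preserved.
norm = np.linalg.norm(grad)
clipped_by_norm = grad * (threshold / norm) if norm > threshold else grad
```

In Keras, both variants are exposed as optimizer arguments: every built-in optimizer accepts `clipvalue` and `clipnorm`, e.g. `tf.keras.optimizers.SGD(learning_rate=0.01, clipvalue=1.0)` for clipping by value or `clipnorm=1.0` for clipping by norm.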

---
*Source: https://ekstraktznaniy.ru/video/13199*