K Means Algorithm Solved Example K Means Clustering Algorithm in Machine Learning by Vidya Mahesh Huddar
Consider the set of data points given in the table below.
Cluster it using the K-means algorithm, taking objects 2 and 5, with the coordinate values (4, 6) and (12, 4), as the initial seeds.
The distance function used is Euclidean distance.
Object   X    Y
--------------------
1        2    4
2        4    6
3        6    8
4        10   4
5        12   4
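For reference, the Euclidean distance named in the problem statement, and the centroid update that K-means applies after each assignment step, can be written as below. The symbols S_j (the set of objects currently assigned to cluster j) and c_j (its centroid) are notation chosen here for illustration; they do not appear in the original problem.

```latex
% Euclidean distance between two points (x_1, y_1) and (x_2, y_2)
d = \sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2}

% Centroid of cluster j: the mean of the coordinates of its members
c_j = \left( \frac{1}{|S_j|} \sum_{i \in S_j} x_i ,\;
             \frac{1}{|S_j|} \sum_{i \in S_j} y_i \right)

% Example: object 1 at (2, 4) and the seed at (4, 6)
d = \sqrt{(4 - 2)^2 + (6 - 4)^2} = \sqrt{8} \approx 2.83
```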
The following concepts are discussed:
______________________________
k means clustering algorithm,
k means clustering,
k means algorithm in machine learning,
kmeans clustering,
kmeans in machine learning
Table of contents (1 segment)
Segment 1 (00:00 - 04:00)
Welcome back. In this video I will discuss how to use the K-means clustering algorithm to divide a given data set into clusters. I have already solved a few examples based on K-means clustering; links to those videos are given in the description below.
This is the given data set. We need to apply the K-means clustering algorithm and divide the data set into clusters. As per the problem definition, objects 2 and 5, with the coordinate values (4, 6) and (12, 4), are the initial seeds, and we use the Euclidean distance formula to find the distances. These are the initial centroids, C1 and C2.
Now we need to calculate the distance from each data point to these centroids. If two points have the coordinates (X1, Y1) and (X2, Y2), the Euclidean distance is the square root of (X2 - X1)² + (Y2 - Y1)². In this case, this one is (X1, Y1) and this is (X2, Y2). As per the formula, for the first data point we get the square root of (2 - 4)² + (4 - 6)², which is equal to 2.83. Similarly, for the second data point, the square root of (4 - 4)² + (6 - 6)² is equal to 0. In the same way, we calculate the remaining distances from each data point to the first centroid.
Next, we calculate the distance from each data point to the second centroid, that is (12, 4). Here the square root of (2 - 12)² + (4 - 4)² is equal to 10, and the square root of (4 - 12)² + (6 - 4)² is equal to 8.25. Similarly, we find the remaining distances.
Once the distances are found, we assign a cluster to each data point. Each data point is assigned to the cluster whose centroid is at the minimum distance. For the first data point, the distances are 2.83 and 10; of these, 2.83 is the minimum value.
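The first distance and assignment step described above can be sketched in Python as follows. This is only an illustrative sketch, not the presenter's code: the names `points`, `centroids`, `nearest`, and `assignment` are chosen here for the example, and `math.dist` (Python 3.8+) is used for the Euclidean distance.

```python
import math

# Data set from the table: object number -> (x, y)
points = {1: (2, 4), 2: (4, 6), 3: (6, 8), 4: (10, 4), 5: (12, 4)}

# Initial seeds given in the problem: objects 2 and 5
centroids = {"C1": points[2], "C2": points[5]}


def nearest(point, centroids):
    """Return the label of the closest centroid and all distances to it."""
    distances = {label: math.dist(point, c) for label, c in centroids.items()}
    return min(distances, key=distances.get), distances


# Assign every object to the cluster whose centroid is nearest
assignment = {}
for obj, p in points.items():
    label, distances = nearest(p, centroids)
    assignment[obj] = label
    rounded = {k: round(d, 2) for k, d in distances.items()}
    print(obj, p, rounded, "->", label)
```

For object 1 the rounded distances come out as 2.83 and 10.0, matching the values worked out by hand.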
So the first data point is assigned to cluster C1. For the second data point, the distances are 0 and 8.25; here also 0 is the minimum value, so the second data point is assigned to cluster C1. Similarly, we assign clusters to the remaining data points. Once the clusters are assigned, data points 1, 2 and 3 belong to cluster C1, and 4 and 5 belong to cluster C2.
Now we need to calculate the new centroids. For C1, we add the X coordinates and divide by the number of data points, that is 3, and we add the Y coordinates and divide by 3, which gives (4, 6). For C2 we have two data points, so (10 + 12) / 2 and (4 + 4) / 2, which is equal to (11, 4).
Once the new centroids are calculated, we again calculate the distance from each data point to the new centroids. First, I will consider the first data point: this is (X1, Y1) and this is (X2, Y2). As per the Euclidean distance formula, the square root of (2 - 4)² + (4 - 6)² gives the value 2.83. Once the distances are found, we again assign a cluster to each data point. If you observe here, the distances are 2.83 and 9; of these two values, 2.83 is the minimum, so the first data point is assigned to cluster C1. Similarly, we assign clusters to the second and the remaining data points.
If you compare the previous assignment and the new assignment, both are exactly the same. This means that the algorithm has converged. So finally we have the final clusters: data points 1, 2 and 3 belong to cluster C1, and 4 and 5 belong to C2. The final centroids are C1 = (4, 6) and C2 = (11, 4). This is how we can apply the K-means clustering algorithm to a given data set and assign each data point to one of the clusters.
I hope the concept is clear. If you like the video, do like and share with your friends. Press the subscribe button for more videos.
Press the bell icon for regular updates. Thank you for watching.
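The whole walkthrough (assign, recompute centroids, repeat until the assignment stops changing) can be sketched in Python as below. This is a minimal illustration under stated assumptions, not the presenter's implementation: the function names `assign`, `update`, and `kmeans`, the `max_iter` safety limit, and the choice to keep an empty cluster's old centroid are all introduced here for the example.

```python
import math

# Data set from the table: object number -> (x, y)
points = {1: (2, 4), 2: (4, 6), 3: (6, 8), 4: (10, 4), 5: (12, 4)}


def assign(points, centroids):
    """Map each object to the label of its nearest centroid."""
    return {
        obj: min(centroids, key=lambda label: math.dist(p, centroids[label]))
        for obj, p in points.items()
    }


def update(points, assignment, centroids):
    """Recompute each centroid as the mean of the points assigned to it."""
    new_centroids = {}
    for label, old in centroids.items():
        members = [points[obj] for obj, a in assignment.items() if a == label]
        if members:
            new_centroids[label] = (
                sum(x for x, _ in members) / len(members),
                sum(y for _, y in members) / len(members),
            )
        else:
            # Assumption for this sketch: an empty cluster keeps its centroid
            new_centroids[label] = old
    return new_centroids


def kmeans(points, centroids, max_iter=100):
    """Alternate update and assignment until the assignment is unchanged."""
    assignment = assign(points, centroids)
    for _ in range(max_iter):
        centroids = update(points, assignment, centroids)
        new_assignment = assign(points, centroids)
        if new_assignment == assignment:
            break  # converged: same clusters as the previous iteration
        assignment = new_assignment
    return assignment, centroids


# Initial seeds from the problem statement: objects 2 and 5
assignment, centroids = kmeans(points, {"C1": points[2], "C2": points[5]})
print(assignment)
print(centroids)
```

On this data set the loop stops after the first centroid update, because the second assignment is identical to the first, which is the convergence condition used in the video.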