Hello Python programmers, this is AK. In this video, we're going to look at one of the most popular and important algorithms in machine learning: k-means clustering.
STEP1: Installing Dependencies
* pip install pandas
* pip install matplotlib
* pip install scikit-learn
STEP2: Importing the libraries
import pandas as pd
from pandas import DataFrame
import matplotlib.pyplot as plt
from sklearn.cluster import KMeans
STEP3: Read the CSV
data=pd.read_csv('Sample scores.csv')
data
Dataset link: http://bitly.ws/dIvR
STEP4: Plot the data
plt.scatter(data['Overs'],data['Scores'])
plt.xlabel('Overs')
plt.ylabel('Scores')
plt.show()
STEP5: Create dataframe
df=DataFrame(data,columns=['Scores','Overs'])
df
STEP6: Create Clusters
kmeans=KMeans(n_clusters=3).fit(df)
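K-means starts from randomly chosen centroids, so the cluster numbering can change between runs. The sketch below shows one way to make the fit repeatable. It assumes a small made-up table in place of 'Sample scores.csv' (the values are hypothetical, only the column names come from the steps above) and sets `random_state` and `n_init` explicitly.

```python
import pandas as pd
from sklearn.cluster import KMeans

# Hypothetical stand-in for 'Sample scores.csv'; the values are made up.
df = pd.DataFrame({
    "Scores": [20, 25, 30, 80, 85, 90, 150, 155, 160],
    "Overs": [5, 6, 7, 10, 11, 12, 18, 19, 20],
})

# random_state fixes the random initialisation so reruns give the same labels.
# n_init=10 repeats the fit from 10 starting points and keeps the best one.
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0).fit(df)

labels = kmeans.labels_    # one cluster index (0, 1 or 2) per row of df
print(labels)
print(kmeans.inertia_)     # sum of squared distances to the nearest centroid
```

The cluster indices themselves carry no meaning; only which rows share an index matters.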
STEP7: Get the centroid points
centroids=kmeans.cluster_centers_
print(centroids)
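The printed array has one row per cluster and one column per feature, in the same column order as the DataFrame passed to `fit`. The sketch below labels those columns so they are easier to read. It is a minimal illustration on made-up data (the values and the `centroid_df` name are assumptions, not part of the original dataset).

```python
import pandas as pd
from sklearn.cluster import KMeans

# Hypothetical stand-in for 'Sample scores.csv'; the values are made up.
df = pd.DataFrame({
    "Scores": [20, 25, 30, 80, 85, 90, 150, 155, 160],
    "Overs": [5, 6, 7, 10, 11, 12, 18, 19, 20],
})

kmeans = KMeans(n_clusters=3, n_init=10, random_state=0).fit(df)

# cluster_centers_ is a plain NumPy array of shape (n_clusters, n_features).
# Wrapping it in a DataFrame with df's column names shows which column is which.
centroid_df = pd.DataFrame(kmeans.cluster_centers_, columns=df.columns)
print(centroid_df)

# Once the fit has converged, each centroid is the mean of the rows in its cluster.
group_means = df.groupby(kmeans.labels_).mean()
print(group_means)
```

With the column order used in STEP5, column 0 holds Scores and column 1 holds Overs, which matters when the centroids are plotted.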
STEP8: Plot the data points
plt.scatter(df['Overs'],df['Scores'],c=kmeans.labels_.astype(float),s=50,alpha=1)
# centroid columns follow df's column order (Scores, Overs), so Overs is column 1
plt.scatter(centroids[:,1],centroids[:,0],c='red',s=50)
plt.xlabel('Overs')
plt.ylabel('Scores')
plt.show()
Github repository : https://github.com/akpythonyt/ML-algorithms
Thanks for reading this article!