STEP1: Installing Dependencies
* pip install pandas
* pip install matplotlib
* pip install scikit-learn
STEP2: Importing the libraries
import pandas as pd
from pandas import DataFrame
import matplotlib.pyplot as plt
from sklearn.cluster import KMeans
STEP3: Read the CSV
data=pd.read_csv('Sample scores.csv')
data
Dataset link: http://bitly.ws/dIvR
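Before plotting, it can help to confirm that the file really contains the two columns the later steps use. The snippet below is a minimal sketch of that check. It does not use the real dataset: the in-memory CSV text and the names `sample_csv`, `sample`, `required` and `missing` are made up for illustration.

```python
import io
import pandas as pd

# Stand-in for the real file: a few made-up rows with the two
# columns that the later steps rely on.
sample_csv = io.StringIO("Overs,Scores\n2,10\n10,62\n19,120\n")
sample = pd.read_csv(sample_csv)

# List any expected column that is absent from the loaded data.
required = ['Overs', 'Scores']
missing = [col for col in required if col not in sample.columns]

print(missing)        # an empty list means both columns are present
print(sample.dtypes)  # both columns should be numeric for plotting and clustering
```

With the real file, the same check would run on `data` instead of `sample`.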
STEP4: Plot the data
plt.scatter(data['Overs'],data['Scores'])
plt.xlabel('Overs')
plt.ylabel('Scores')
plt.show()
STEP5: Create dataframe
df=DataFrame(data,columns=['Scores','Overs'])
df
STEP6: Create Clusters
kmeans=KMeans(n_clusters=3).fit(df)
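KMeans starts from randomly chosen centroids, so the cluster numbers it assigns can differ from one run to the next. The sketch below shows one way to make the result repeatable by fixing `random_state`. The data is made up for illustration (it is not the article's dataset), and `toy`, `first` and `second` are hypothetical names.

```python
import pandas as pd
from sklearn.cluster import KMeans

# Made-up values for illustration only; not the article's dataset.
toy = pd.DataFrame({
    'Scores': [10, 12, 14, 60, 62, 64, 120, 122, 124],
    'Overs':  [2, 3, 2, 10, 11, 10, 19, 20, 19],
})

# Same seed means the same starting centroids, so both fits agree.
first = KMeans(n_clusters=3, n_init=10, random_state=42).fit(toy)
second = KMeans(n_clusters=3, n_init=10, random_state=42).fit(toy)

print(first.labels_)                            # one cluster index per row
print((first.labels_ == second.labels_).all())  # True when the two runs match
```

The seed value 42 is arbitrary; any fixed integer gives the same effect.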
STEP7: Place the centroid points
centroids=kmeans.cluster_centers_
print(centroids)
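The printed centroids are a plain array, so it is easy to lose track of which column is which. The columns follow the order of the DataFrame that was fitted (here Scores first, then Overs). The sketch below wraps the array in a labelled DataFrame to make that explicit. It uses made-up data, and `toy`, `model` and `centroid_table` are hypothetical names.

```python
import pandas as pd
from sklearn.cluster import KMeans

# Made-up values for illustration only; not the article's dataset.
toy = pd.DataFrame({
    'Scores': [10, 12, 14, 60, 62, 64, 120, 122, 124],
    'Overs':  [2, 3, 2, 10, 11, 10, 19, 20, 19],
})

model = KMeans(n_clusters=3, n_init=10, random_state=0).fit(toy)

# Reuse the fitted column names so each centroid coordinate is labelled.
centroid_table = pd.DataFrame(model.cluster_centers_, columns=toy.columns)
print(centroid_table)
```

When plotting Overs on the x-axis, the x values of the centroids then come from `centroid_table['Overs']`, not from the first column of the array.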
STEP8: Plot the data points
plt.scatter(df['Overs'],df['Scores'],c=kmeans.labels_.astype(float),s=50,alpha=1)
# centroid columns follow df's column order (Scores, Overs), so Overs is column 1
plt.scatter(centroids[:,1],centroids[:,0],c='red',s=50)
plt.xlabel('Overs')
plt.ylabel('Scores')
plt.show()
GitHub repository: https://github.com/akpythonyt/ML-algorithms
Thanks for reading this article!