Implementation of K-means++ — Know the smarter brother of K-means

Dhaval Thakur
3 min read · Aug 15, 2022

The K-means clustering algorithm is one of the best-known algorithms among ML enthusiasts, and one of the first taught in college machine learning classes.

K-Means++ Algorithm Tutorial & Implementation (Image credits: StackExchange)

As a quick recap: K-means clustering is a method of vector quantization, originally from signal processing, that aims to partition n observations into k clusters. Each observation belongs to the cluster with the nearest mean (the centroid), which serves as a prototype of the cluster.
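To make the recap concrete, here is a minimal NumPy sketch of the standard K-means loop: assign every point to its nearest centroid, then move each centroid to the mean of its points. The names `nearest_centroid`, `k_means`, and `n_iters` are illustrative choices, not taken from the original article, and the initial centroids are simply passed in by the caller.

```python
import numpy as np

def nearest_centroid(points, centroids):
    # Euclidean distance from every point to every centroid: shape (n, k).
    dists = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
    return dists.argmin(axis=1)

def k_means(points, init_centroids, n_iters=100):
    points = np.asarray(points, dtype=float)
    centroids = np.array(init_centroids, dtype=float)
    for _ in range(n_iters):
        # Assignment step: label each point with its nearest centroid.
        labels = nearest_centroid(points, centroids)
        # Update step: move each centroid to the mean of its members.
        new_centroids = centroids.copy()
        for j in range(len(centroids)):
            members = points[labels == j]
            if len(members) > 0:  # an empty cluster keeps its old centroid
                new_centroids[j] = members.mean(axis=0)
        if np.allclose(new_centroids, centroids):
            break  # converged: centroids stopped moving
        centroids = new_centroids
    return centroids, nearest_centroid(points, centroids)

# Toy data (made up for illustration): two obvious groups.
data = np.array([[0.0, 0.0], [0.0, 1.0], [10.0, 10.0], [10.0, 11.0]])
centroids, labels = k_means(data, init_centroids=[[0.0, 0.0], [10.0, 10.0]])
print(centroids)  # [[ 0.   0.5] [10.  10.5]]
print(labels)     # [0 0 1 1]
```

Note that the result depends entirely on `init_centroids`; the loop itself never questions where it started.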

So what’s wrong with our beloved K-means algorithm?

The algorithm is sensitive to the initialization of the centroids (the mean points). For instance, if a centroid is initialized at a far-off point, it might end up with no points associated with it, while more than one natural cluster ends up attached to a single centroid.
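This failure mode is easy to reproduce on a tiny, made-up 1-D dataset. In the sketch below (the data, the helper `assign_labels`, and the starting centroids are all assumptions chosen for illustration), one centroid starts far away from everything, so the first assignment step gives it no points and both natural groups fall to the other centroid.

```python
import numpy as np

def assign_labels(points, centroids):
    # Index of the nearest centroid for every point.
    dists = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
    return dists.argmin(axis=1)

# Two natural groups: around 1 and around 11.
points = np.array([[0.0], [1.0], [2.0], [10.0], [11.0], [12.0]])

# Poor initialization: the second centroid is far from all the data.
bad_init = np.array([[1.0], [100.0]])

labels = assign_labels(points, bad_init)
counts = np.bincount(labels, minlength=len(bad_init))
print(counts)  # [6 0]: one centroid owns both groups, the other owns nothing
```

Because the far-off centroid has no members, a plain mean update has nothing to move it with, so it stays stranded unless the implementation handles empty clusters explicitly.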

Here comes the smarter version of K-means: K-means++!

K-means++ is the same algorithm as K-means apart from one thing: how the initial centroids are chosen. Instead of picking all of them at random, it spreads them out across the data, which usually improves the quality of the clustering.
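A minimal NumPy sketch of the usual K-means++ seeding rule follows; it is not necessarily the author's code. The first centroid is drawn uniformly from the data, and each later one is drawn with probability proportional to its squared distance from the nearest centroid chosen so far. The function name `kmeans_pp_init` and the `seed` parameter are illustrative, and the sketch assumes the data contains at least `k` distinct points.

```python
import numpy as np

def kmeans_pp_init(points, k, seed=None):
    rng = np.random.default_rng(seed)
    points = np.asarray(points, dtype=float)
    n = len(points)

    # Step 1: pick the first centroid uniformly at random from the data.
    first = rng.integers(n)
    centroids = [points[first]]

    # Squared distance from each point to its nearest chosen centroid.
    d2 = np.sum((points - points[first]) ** 2, axis=1)

    for _ in range(1, k):
        # Step 2: sample the next centroid with probability proportional
        # to d2, so far-away points are more likely to be picked.
        probs = d2 / d2.sum()
        idx = rng.choice(n, p=probs)
        centroids.append(points[idx])

        # Step 3: refresh the nearest-centroid squared distances.
        d2 = np.minimum(d2, np.sum((points - points[idx]) ** 2, axis=1))

    return np.array(centroids)

# Toy data (made up for illustration): three well-separated groups.
data = np.array([[0.0, 0.0], [0.0, 1.0],
                 [10.0, 10.0], [10.0, 11.0],
                 [20.0, 0.0], [20.0, 1.0]])
init = kmeans_pp_init(data, k=3, seed=0)
print(init)  # three distinct rows of `data`, usable as starting centroids
```

A point that has already been chosen has squared distance zero, so it can never be drawn twice. The returned array only replaces the random starting centroids; the assignment and update steps of K-means run unchanged afterwards.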

Implementation of the K-means++ Algorithm

# importing

