# What is k-Means Cluster Analysis?

Cluster analysis is a method for automatically grouping data into a smaller number of subsets, or clusters, so that the records in each cluster are as statistically similar to one another as possible, based on the attributes being compared.

“In statistics and data mining, k-means clustering is a method of cluster analysis which aims to partition $n$ observations into $k$ clusters in which each observation belongs to the cluster with the nearest mean. Given a set of observations ($x_1, x_2, \ldots, x_n$), where each observation is a $d$-dimensional real vector, k-means clustering aims to partition the $n$ observations into $k$ sets ($k \le n$) $S = \{S_1, S_2, \ldots, S_k\}$ so as to minimize the within-cluster sum of squares:

$$\underset{S}{\arg\min} \sum_{i=1}^{k} \sum_{x_j \in S_i} \lVert x_j - \mu_i \rVert^2$$

where $\mu_i$ is the mean of points in $S_i$.” – Wikipedia
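To make the quoted objective concrete, here is a minimal Python sketch (separate from the VBA macro) that computes the within-cluster sum of squares for a given grouping. The function names `centroid` and `wcss` and the sample points are hypothetical, chosen only for illustration.

```python
from statistics import fmean

def centroid(points):
    """Mean of a non-empty list of equal-length points, coordinate by coordinate."""
    return [fmean(coords) for coords in zip(*points)]

def wcss(clusters):
    """Within-cluster sum of squares for a list of clusters (each a list of points)."""
    total = 0.0
    for points in clusters:
        mu = centroid(points)
        for x in points:
            # Squared Euclidean distance from the point to its cluster mean.
            total += sum((a - b) ** 2 for a, b in zip(x, mu))
    return total

# Hypothetical 2-D data: the same four points grouped two different ways.
good = [[(1.0, 1.0), (1.0, 3.0)], [(9.0, 1.0), (9.0, 3.0)]]
bad = [[(1.0, 1.0), (9.0, 1.0)], [(1.0, 3.0), (9.0, 3.0)]]
print(wcss(good))  # 4.0
print(wcss(bad))   # 64.0
```

The grouping that keeps nearby points together gives the smaller value, which is the quantity k-means tries to minimize.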

k-Means cluster analysis achieves this by partitioning the data into the required number of clusters, grouping records so that the Euclidean distance between each record's dimensions and its cluster's centroid (the point whose coordinates are the averages of the points in the cluster) is as small as possible.
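One common way to carry out this partitioning is to alternate two steps: assign each record to its nearest centroid, then move each centroid to the mean of its records. The Python sketch below illustrates that idea under stated assumptions. It is not the VBA macro, the names `nearest` and `k_means` are hypothetical, and the starting centroids are simply supplied by the caller.

```python
import math

def nearest(point, centroids):
    """Index of the centroid with the smallest Euclidean distance to point."""
    return min(range(len(centroids)), key=lambda i: math.dist(point, centroids[i]))

def k_means(points, centroids, max_iter=100):
    """Alternate assignment and centroid update until assignments stop changing."""
    centroids = [tuple(c) for c in centroids]
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point joins the cluster of its nearest centroid.
        new_labels = [nearest(p, centroids) for p in points]
        if new_labels == labels:
            break
        labels = new_labels
        # Update step: each centroid moves to the mean of its members.
        for i in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == i]
            if members:  # assumption: an empty cluster keeps its previous centroid
                centroids[i] = tuple(sum(c) / len(members) for c in zip(*members))
    return labels, centroids

# Hypothetical data and starting centroids, for illustration only.
points = [(1.0, 1.0), (1.0, 3.0), (9.0, 1.0), (9.0, 3.0)]
labels, centroids = k_means(points, centroids=[(0.0, 0.0), (10.0, 0.0)])
print(labels)     # [0, 0, 1, 1]
print(centroids)  # [(1.0, 2.0), (9.0, 2.0)]
```

The result depends on the starting centroids, so a practical implementation usually needs some rule for choosing them; this sketch leaves that choice to the caller.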

The following is a macro I wrote in VBA for Microsoft Excel that performs k-Means Cluster Analysis on the table selected.