k-Means Clustering

Usage

kmeans(x, centers, iter.max=100, verbose=FALSE, method=0)

Arguments

x Data matrix
centers Number of clusters or initial values for cluster centers
iter.max Maximum number of iterations
verbose If TRUE, print progress information during clustering
method If 0, the mean squared error is used; otherwise the mean absolute error is used

Description

The data given by x is clustered by the k-means algorithm. The algorithm works by repeatedly moving each cluster center to the mean of its Voronoi set, that is, the set of data points closer to that center than to any other.
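The update described above can be sketched in a few lines of R. This is only an illustration of the idea, not the code behind kmeans; the helper name kmeans_step is hypothetical, and the sketch assumes squared Euclidean distance and leaves a center in place if its Voronoi set is empty.

```r
# Hypothetical helper: one k-means update step (not the package's own code).
kmeans_step <- function(x, centers) {
  k <- nrow(centers)
  # Squared Euclidean distance from every point (rows) to every center (columns)
  d <- matrix(vapply(seq_len(k),
                     function(j) rowSums(sweep(x, 2, centers[j, ])^2),
                     numeric(nrow(x))),
              nrow = nrow(x))
  # Voronoi set membership: index of the nearest center for each point
  cluster <- apply(d, 1, which.min)
  # Move each center to the mean of its Voronoi set (assumption: an empty set
  # leaves the center where it is)
  for (j in seq_len(k)) {
    members <- cluster == j
    if (any(members)) centers[j, ] <- colMeans(x[members, , drop = FALSE])
  }
  list(centers = centers, cluster = cluster)
}

x <- rbind(c(0, 0), c(0, 1), c(4, 4), c(4, 5))
start <- rbind(c(0, 0), c(4, 4))
step <- kmeans_step(x, start)
step$centers   # rows (0, 0.5) and (4, 4.5)
step$cluster   # 1 1 2 2
```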

If centers is a matrix, its rows are taken as the initial cluster centers. If centers is an integer, centers rows of x are randomly chosen as initial values.
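The two forms of centers could be handled roughly as follows. The helper init_centers is hypothetical, and drawing the rows without replacement is an assumption; the text above only says that the rows are chosen at random.

```r
# Hypothetical helper mirroring the two forms of `centers` described above.
init_centers <- function(x, centers) {
  if (is.matrix(centers)) {
    stopifnot(ncol(centers) == ncol(x))
    return(centers)                  # rows are used as the initial centers
  }
  # Assumption: `centers` rows of x are drawn without replacement
  x[sample(nrow(x), centers), , drop = FALSE]
}

set.seed(1)
x <- matrix(rnorm(40), ncol = 2)
given <- rbind(c(0, 0), c(1, 1))
from_count  <- init_centers(x, 3)      # three randomly chosen rows of x
from_matrix <- init_centers(x, given)  # returned unchanged
```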

The algorithm stops when no cluster center has changed during the last iteration or when the maximum number of iterations (given by iter.max) is reached.

If verbose is TRUE, the iteration number and the number of cluster indices that have changed since the previous iteration are printed for each iteration.
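A minimal sketch of the iteration loop with both stopping rules and the per-iteration report might look like this. The function name simple_kmeans and the output format are assumptions made for illustration. The sketch stops as soon as no cluster index changes, which implies that no center would move in the next update.

```r
# Hypothetical sketch of the loop; not the package's implementation.
simple_kmeans <- function(x, centers, iter.max = 100, verbose = FALSE) {
  k <- nrow(centers)
  cluster <- integer(nrow(x))        # 0 means "not yet assigned"
  changes <- integer(0)
  iter <- 0
  while (iter < iter.max) {
    iter <- iter + 1
    # Assign every point to its nearest center (squared Euclidean distance)
    d <- matrix(vapply(seq_len(k),
                       function(j) rowSums(sweep(x, 2, centers[j, ])^2),
                       numeric(nrow(x))),
                nrow = nrow(x))
    new_cluster <- apply(d, 1, which.min)
    changes[iter] <- sum(new_cluster != cluster)
    cluster <- new_cluster
    if (verbose) cat("Iteration:", iter, " Changes:", changes[iter], "\n")
    if (changes[iter] == 0) break    # nothing moved, so the centers are stable
    # Move each non-empty center to the mean of its points
    for (j in seq_len(k)) {
      members <- cluster == j
      if (any(members)) centers[j, ] <- colMeans(x[members, , drop = FALSE])
    }
  }
  list(centers = centers, cluster = cluster, changes = changes, iter = iter)
}

x <- rbind(c(0, 0), c(0, 2), c(1, 1), c(8, 8), c(9, 9), c(8, 10))
fit <- simple_kmeans(x, x[1:2, ], iter.max = 20, verbose = TRUE)
fit$changes   # 6 1 0
```

On this small data set the loop stops after three iterations because the third pass changes no cluster index; with iter.max = 1 it would stop after the first pass regardless.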

If method is 0, the distance between a cluster center and a data point is the Euclidean distance (the ordinary k-means algorithm). Otherwise, the distance is the sum of the absolute differences of the coordinates.
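The two distance measures can be compared with a short sketch. The helper point_dist is hypothetical and only illustrates the formulas; it is not part of the documented interface.

```r
# Hypothetical helper: distance from each row of x to a single center.
point_dist <- function(x, center, method = 0) {
  diffs <- sweep(x, 2, center)       # coordinate-wise differences
  if (method == 0) {
    sqrt(rowSums(diffs^2))           # Euclidean distance
  } else {
    rowSums(abs(diffs))              # sum of absolute coordinate differences
  }
}

x <- rbind(c(3, 4), c(-1, 1))
d_euclid <- point_dist(x, c(0, 0), method = 0)  # 5 and sqrt(2)
d_abs    <- point_dist(x, c(0, 0), method = 1)  # 7 and 2
```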

Value

kmeans returns an object of class "cluster" with the following components:
centers The cluster centers.
changes The number of changes performed in each iteration step.
cluster Vector giving, for each data point, the index of the cluster it is assigned to.
error The error made when mapping the data points onto their cluster centers.
initcenters The initial cluster centers.
iter The number of iterations performed.
method The method applied.
ncenters The number of cluster centers.
size The number of data points in each cluster.

Author(s)

Friedrich Leisch and Andreas Weingessel

See Also

plot.cluster, predict.cluster

Examples

# a 2-dimensional example
x<-rbind(matrix(rnorm(100,sd=0.3),ncol=2),
         matrix(rnorm(100,mean=1,sd=0.3),ncol=2))
cl<-kmeans(x,2,20,verbose=TRUE)
plot(cl,x)   

# a 3-dimensional example
x<-rbind(matrix(rnorm(150,sd=0.3),ncol=3),
         matrix(rnorm(150,mean=1,sd=0.3),ncol=3),
         matrix(rnorm(150,mean=2,sd=0.3),ncol=3))
cl<-kmeans(x,6,20,verbose=TRUE)
plot(cl,x)

# assign classes to some new data
y<-rbind(matrix(rnorm(33,sd=0.3),ncol=3),
         matrix(rnorm(33,mean=1,sd=0.3),ncol=3),
         matrix(rnorm(3,mean=2,sd=0.3),ncol=3))
ycl<-predict(cl, y)
plot(cl,y)