
Silhouette Plot Of Nonhierarchical Clusterings

Usage

plot.partition(x, ...)

Arguments

x an object of class "partition", e.g. created by the functions pam, clara, or fanny.

... graphical parameters (see par) may also be supplied as arguments to this function.

Description

A silhouette plot of the nonhierarchical clustering is created. The silhouette plot is fully described in Rousseeuw (1987) and in chapter 2 of Kaufman and Rousseeuw (1990). For each object i, a bar is drawn whose length represents the silhouette width s(i) of that object. Objects are grouped per cluster, starting with cluster 1 at the top. Objects with a large s(i) (close to 1) are very well clustered, a small s(i) (around 0) means that the object lies between two clusters, and objects with a negative s(i) are probably placed in the wrong cluster.

The plot can also guide the choice of k (the number of clusters): perform the clustering for several values of k, then choose the value of k with the largest overall average silhouette width.
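The selection of k by average silhouette width can be sketched as follows. This is a minimal illustration, not a prescribed procedure: it assumes the cluster package is installed and that the object returned by pam carries a silinfo component with an avg.width element, as it does in current versions of the package. The names ks, avg_width and best_k are chosen here for illustration only.

```r
library(cluster)  # assumed to be installed; provides pam()

set.seed(1)  # fixed seed so the sketch is reproducible

# Same kind of data as in the Examples section: 25 objects in 2 groups.
x <- rbind(cbind(rnorm(10, 0, 0.5), rnorm(10, 0, 0.5)),
           cbind(rnorm(15, 5, 0.5), rnorm(15, 5, 0.5)))

# Candidate numbers of clusters (an arbitrary range for illustration).
ks <- 2:5

# Overall average silhouette width for each candidate k.
avg_width <- sapply(ks, function(k) pam(x, k)$silinfo$avg.width)
names(avg_width) <- ks

# Keep the k with the largest overall average silhouette width.
best_k <- ks[which.max(avg_width)]
```

Printing avg_width shows how the average width changes with k; calling plot on pam(x, best_k) then displays the clustering that was selected.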

The silhouette width is computed as follows. Put a(i) = average dissimilarity between i and all other objects of the cluster to which i belongs. For every other cluster C (one to which i does not belong), put d(i,C) = average dissimilarity of i to all objects of C. The smallest of these d(i,C) is denoted b(i), and can be seen as the dissimilarity between i and its neighbor cluster. Finally, put s(i) = ( b(i) - a(i) ) / max( a(i), b(i) ). The overall average silhouette width is then simply the average of s(i) over all objects i.
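The computation above can be written out directly in base R. The function silhouette_widths below is a hypothetical helper written for this illustration and is not part of the cluster package. It takes a dissimilarity object and a vector of cluster memberships. One assumption goes beyond the description above: an object that is alone in its cluster has no a(i), so the sketch assigns it s(i) = 0.

```r
# Hypothetical helper (not part of the cluster package): silhouette widths
# computed directly from the definitions of a(i), b(i) and s(i).
silhouette_widths <- function(d, clustering) {
  d <- as.matrix(d)                 # full symmetric dissimilarity matrix
  n <- nrow(d)
  clusters <- unique(clustering)
  stopifnot(length(clustering) == n, length(clusters) >= 2)

  s <- numeric(n)
  for (i in seq_len(n)) {
    own <- clustering == clustering[i]
    own[i] <- FALSE                 # a(i) excludes object i itself
    if (!any(own)) {
      s[i] <- 0                     # assumption: singleton cluster gets s(i) = 0
      next
    }
    a <- mean(d[i, own])            # a(i): average dissimilarity within own cluster

    # d(i, C) for every other cluster C; b(i) is the smallest of them.
    others <- setdiff(clusters, clustering[i])
    b <- min(sapply(others, function(cl) mean(d[i, clustering == cl])))

    s[i] <- (b - a) / max(a, b)
  }
  s
}

# Small worked case: four points on a line, two clusters of two.
pts <- c(0, 1, 10, 11)
s <- silhouette_widths(dist(pts), c(1, 1, 2, 2))
overall <- mean(s)                  # overall average silhouette width
```

For the first point, a(i) = 1 and b(i) = (10 + 11) / 2 = 10.5, so s(i) = 9.5 / 10.5, which is about 0.905. The double loop over objects and clusters is kept for readability rather than speed.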

Value

A NULL value is returned.

Side Effects

A silhouette plot is plotted on the current graphics device.

NOTE

For readability, object labels are printed only when there are fewer than 40 objects.

References

Kaufman, L. and Rousseeuw, P.J. (1990). Finding Groups in Data: An Introduction to Cluster Analysis. Wiley, New York.

Rousseeuw, P.J. (1987). Silhouettes: A graphical aid to the interpretation and validation of cluster analysis. J. Comput. Appl. Math., 20, 53-65.

See Also

partition.object, pam, pam.object, clara, clara.object, fanny, fanny.object, par.

Examples

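library(cluster)  # provides pam() and the plot method for "partition" objects

# Generate 25 objects, divided into 2 clusters.
x <- rbind(cbind(rnorm(10, 0, 0.5), rnorm(10, 0, 0.5)),
           cbind(rnorm(15, 5, 0.5), rnorm(15, 5, 0.5)))

# Cluster into 2 groups with pam() and plot the result.
plot(pam(x, 2))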