xcluster {ctc} | R Documentation |
Performs a hierarchical cluster analysis on a set of dissimilarities.
xcluster(data,distance="euclidean",clean=FALSE,tmp.in="tmp.txt",tmp.out="tmp.gtr")
data |
a matrix (or data frame) which provides the data to analyze |
distance |
The distance measure used with Xcluster. This must be one of
"euclidean", "pearson" or "notcenteredpearson".
Any unambiguous substring can be given. |
clean |
a logical value indicating whether you want the true
distances (clean=FALSE), or a clean dendrogram (clean=TRUE) |
tmp.in, tmp.out |
temporary files for Xcluster |
Available distance measures are (written for two vectors x and y):
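As a rough sketch of what these three measures conventionally denote: the code below assumes the usual definitions (Euclidean norm of the difference, one minus the Pearson correlation, and one minus the uncentered correlation). The helper names are hypothetical, and the exact formulas and scaling used inside Xcluster may differ.

```r
# Hypothetical helpers; the definitions are assumed, not taken from Xcluster's source.

# "euclidean": 2-norm of the difference between the two vectors
dist_euclidean <- function(x, y) sqrt(sum((x - y)^2))

# "pearson": one minus the (centered) Pearson correlation
dist_pearson <- function(x, y) 1 - cor(x, y)

# "notcenteredpearson": one minus the correlation computed without
# subtracting the means (the cosine of the angle between x and y)
dist_notcentered_pearson <- function(x, y) {
  1 - sum(x * y) / sqrt(sum(x^2) * sum(y^2))
}

x <- c(1, 2, 3, 4)
y <- c(2, 4, 6, 9)
dist_euclidean(x, y)
dist_pearson(x, y)
dist_notcentered_pearson(x, y)
```

Under these assumed definitions, the two Pearson variants ignore the overall scale of the vectors, while the Euclidean distance does not.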
Xcluster does not use the usual agglomerative methods (single, average, complete); instead, it takes the distance between two groups to be the distance between their barycenters.
This causes a problem for data such as:
A | 0 | 0 |
B | 0 | 1 |
C | 0.9 | 0.5 |
That is, for a triangle in $\mathbf{R}^2$, the distance between A and B (1) is larger than the distance between the barycenter of the group {A, B} and C (0.9), using the Euclidean distance.
In that case it can be useful to use clean=TRUE,
which means
that you must not consider A and B as a group without C.
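The arithmetic behind this triangle example can be checked directly. The following is a minimal sketch in base R that only reproduces the barycenter computation described above; it does not call Xcluster, and the variable names are illustrative.

```r
# The three points of the example, one row per point
pts <- rbind(A = c(0, 0),
             B = c(0, 1),
             C = c(0.9, 0.5))

# Euclidean distance between A and B
d_AB <- sqrt(sum((pts["A", ] - pts["B", ])^2))

# Barycenter (mean point) of the group {A, B}
bary_AB <- colMeans(pts[c("A", "B"), ])

# Euclidean distance between that barycenter and C
d_AB_C <- sqrt(sum((bary_AB - pts["C", ])^2))

# The second merge would happen at a smaller distance than the first
d_AB_C < d_AB
```

Here `d_AB` is 1 and `d_AB_C` is 0.9, so the height of the second merge is lower than that of the first.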
An object of class hclust which describes the tree produced by the clustering process. The object is a list with components:
merge |
an n-1 by 2 matrix.
Row i of merge describes the merging of clusters
at step i of the clustering.
If an element j in the row is negative,
then observation -j was merged at this stage.
If j is positive then the merge
was with the cluster formed at the (earlier) stage j
of the algorithm.
Thus negative entries in merge indicate agglomerations
of singletons, and positive entries indicate agglomerations
of non-singletons. |
height |
a set of n-1 non-decreasing real values.
The clustering height: that is, the value of
the criterion associated with the clustering
method for the particular agglomeration. |
order |
a vector giving the permutation of the original
observations suitable for plotting, in the sense that a cluster
plot using this ordering and matrix merge will not have
crossings of the branches. |
labels |
labels for each of the objects being clustered. |
call |
the call which produced the result. |
method |
the cluster method that has been used. |
dist.method |
the distance that has been used to create d
(only returned if the distance object has a "method"
attribute). |
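To illustrate the components listed above without requiring the external Xcluster program, the sketch below uses `hclust` from the stats package as a stand-in. This is an assumption made for illustration only: `hclust` with average linkage is not the barycenter method used by Xcluster, but it returns an object of the same class with the same components.

```r
# Four observations in the plane: two tight pairs, far apart
x <- rbind(c(0, 0), c(0, 1), c(5, 5), c(5, 7))

# Stand-in for xcluster(): standard average-linkage clustering
h <- hclust(dist(x), method = "average")

class(h)       # "hclust"
h$merge        # n-1 by 2 matrix; negative entries are single observations,
               # positive entries refer to clusters formed at earlier steps
h$height       # one height per merge step
h$order        # ordering of the observations suitable for plotting
h$method       # clustering method used
h$dist.method  # distance used, taken from the "method" attribute of dist(x)
```

In this example the first two rows of `h$merge` contain only negative entries (the pairs of single observations), and the last row contains the positive entries 1 and 2, joining the two clusters formed earlier.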
Xcluster is a C program made by Gavin Sherlock that performs hierarchical clustering, K-means and SOM.
Xcluster is copyrighted. To obtain Xcluster, or for more information about it, see http://genome-www.stanford.edu/~sherlock/cluster.html
Antoine Lucas, http://genopole.toulouse.inra.fr/~lucas/R
# Create data
set.seed(1)                        # fixed seed for reproducibility
m <- matrix(rep(1,3*24),ncol=3)
m[9:16,3] <- 3 ; m[17:24,] <- 3    # create 3 groups
m <- m+rnorm(24*3,0,0.5)           # add noise
m <- floor(10*m)/10                # keep just one decimal digit

# And once you have the Xcluster program:
#
#h <- xcluster(m)
#
#library(mva)   # only needed in old versions of R; mva is now part of stats
#plot(h)