dupcor.series {limma} | R Documentation |
Estimate the correlation between duplicate spots (regularly spaced replicate spots on the same array) or between technical replicates from a series of arrays.
duplicateCorrelation(object,design=rep(1,ncol(M)),ndups=2,spacing=1,block=NULL,trim=0.15,weights=NULL)
dupcor.series(M,design=rep(1,ncol(M)),ndups=2,spacing=1,initial=0.8,trim=0.15,weights=NULL)
object |
a numeric matrix of log-ratios or an MAList object from which the log-ratios can be extracted.
If object is an MAList then the arguments design, ndups, spacing and weights will be extracted from it if available and do not have to be specified as arguments. |
M |
a numeric matrix. Usually the log-ratios of expression for a series of cDNA microarrays, with rows corresponding to genes and columns to arrays. |
design |
the design matrix of the microarray experiment, with rows corresponding to arrays and columns to comparisons to be estimated. The number of rows must match the number of columns of M. Defaults to the unit vector, meaning that the arrays are treated as replicates. |
ndups |
a positive integer giving the number of times each gene is printed on an array. nrow(M) must be divisible by ndups.
Will be ignored if block is specified. |
spacing |
the spacing between the rows of M corresponding to duplicate spots. Use spacing=1 for consecutive spots. |
block |
vector or factor specifying a blocking variable |
initial |
a numeric value between -1 and 1 giving an initial estimate for the correlation. Not currently used. |
trim |
the fraction of observations to be trimmed from each end of atanh(all.correlations) when computing the trimmed mean. |
weights |
an optional numeric matrix of the same dimension as M containing weights for each spot. If smaller than M then it will be filled out to the same size. |
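The way ndups and spacing pick out duplicate spots can be sketched in base R. This is an illustration only: duplicate_rows is a hypothetical helper, not a limma function, and it assumes the regular layout described above, in which the duplicates of a gene sit spacing rows apart and nrow(M) is a multiple of ndups*spacing.

```r
# Hypothetical helper (not part of limma): returns a matrix with one row per
# gene and one column per duplicate, holding the row indices of M that are
# assumed to belong to that gene.
duplicate_rows <- function(nspots, ndups = 2, spacing = 1) {
  # Assumption: the spots come in complete groups of ndups * spacing rows
  stopifnot(nspots %% (ndups * spacing) == 0)
  ngroups <- nspots / (ndups * spacing)
  idx <- array(seq_len(nspots), dim = c(spacing, ndups, ngroups))
  # Move the duplicate index to the last dimension, then flatten so that
  # each row of the result collects the duplicates of one gene
  idx <- aperm(idx, c(1, 3, 2))
  matrix(as.vector(idx), ncol = ndups)
}

# Consecutive duplicates: rows (1,2), (3,4), ...
consecutive <- duplicate_rows(8, ndups = 2, spacing = 1)
print(consecutive)

# Duplicates three rows apart: rows (1,4), (2,5), (3,6), (7,10), ...
spaced <- duplicate_rows(12, ndups = 2, spacing = 3)
print(spaced)
```

With spacing=1 each gene occupies adjacent rows. With a larger spacing the duplicates are interleaved, which is typical when an array is printed in two halves.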
When block=NULL, this function estimates the correlation between duplicate spots (regularly spaced within-array replicate spots).
If block is not null, this function estimates the correlation between repeated observations on the blocking variable.
Typically the blocks are biological replicates and the repeated observations are technical replicates.
In either case, the correlation is estimated by fitting a mixed linear model by REML individually for each gene.
The function also returns a consensus correlation, a robust average of the individual correlations, which can be used as input for the functions lmFit or gls.series.
At this time it is not possible to estimate correlations between duplicate spots and between technical replicates simultaneously.
If block is not null, then the function will set ndups=1.
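The blocked case can be sketched as follows. The data are simulated and purely hypothetical (100 genes, three biological blocks of two technical replicates each), and the limma calls run only if the limma and statmod packages are installed. The sketch assumes that lmFit accepts the block and correlation arguments as in current limma releases.

```r
# Simulated log-ratios: a shared effect per block plus independent noise,
# so arrays in the same block are positively correlated (hypothetical data)
set.seed(1)
block <- rep(1:3, each = 2)
design <- matrix(1, nrow = 6, ncol = 1)
block_effect <- matrix(rnorm(100 * 3), 100, 3)[, block]
M <- block_effect + matrix(rnorm(100 * 6, sd = 0.5), 100, 6)

if (requireNamespace("limma", quietly = TRUE) &&
    requireNamespace("statmod", quietly = TRUE)) {
  # Estimate the correlation between technical replicates within a block
  dc <- limma::duplicateCorrelation(M, design = design, block = block)
  print(dc$consensus.correlation)

  # Pass the consensus correlation on to the linear model fit
  fit <- limma::lmFit(M, design = design, block = block,
                      correlation = dc$consensus.correlation)
}
```

Because block is supplied, ndups is not given here. The within-array duplicate case would instead pass ndups and spacing and leave block as NULL.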
The function may take a long time to execute as it fits a mixed linear model for each gene.
dupcor.series produces the same values as duplicateCorrelation and is retained for compatibility with earlier releases of the software.
A list with components
consensus.correlation |
the average estimated inter-duplicate correlation. The average is the trimmed mean of the correlations for individual genes on the atanh-transformed scale, with the trimming fraction given by trim. |
cor |
same as consensus.correlation, for compatibility with earlier versions of the software |
all.correlations |
a numeric vector of length nrow(M)/ndups giving the individual genewise correlations. |
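The consensus value can be reproduced by hand from a vector of genewise correlations. The sketch below is a base R illustration, not the limma source. It assumes that the transformed scale is the Fisher z (atanh) scale and that the trimming fraction is the trim argument. The input values and the helper name consensus are hypothetical.

```r
# Hypothetical genewise correlations; NA stands for a gene whose fit failed
genewise <- c(0.92, 0.85, 0.10, 0.88, -0.30, 0.90, 0.87, NA, 0.95, 0.89)

# Trimmed mean on the atanh scale, transformed back to a correlation
consensus <- function(r, trim = 0.15) {
  z <- atanh(r)                              # Fisher z-transform
  tanh(mean(z, trim = trim, na.rm = TRUE))   # trim both tails, back-transform
}

print(consensus(genewise))
print(mean(genewise, na.rm = TRUE))  # untrimmed mean, for comparison
```

Trimming makes the summary insensitive to the few genes with atypical correlations, and averaging on the atanh scale keeps the result strictly between -1 and 1.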
Gordon Smyth
Smyth, G. K., Michaud, J., and Scott, H. (2003). The use of within-array duplicate spots for assessing differential expression in microarray experiments. http://www.statsci.org/smyth/pubs/dupcor.pdf
These functions use randomizedBlockFit from the statmod package.
An overview of linear model functions in limma is given by 5.LinearModels.
# See gls.series for an example