ebam.wilc {siggenes} | R Documentation |
Performs an Empirical Bayes Analysis of Microarrays by using Wilcoxon Rank Sums as expression scores for the genes.
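As a rough illustration of what a Wilcoxon rank sum expression score looks like in the two class unpaired case, the following base R sketch computes one rank sum per gene. The names toy_data, toy_cl, rank_sum and toy_scores are hypothetical and the values are made up; this is not the internal code of ebam.wilc.

```r
# Toy expression matrix: 4 genes (rows) x 6 samples (columns); values are made up
toy_data <- rbind(
  g1 = c(1.2, 0.8, 1.0, 2.9, 3.1, 3.4),
  g2 = c(2.0, 2.1, 1.9, 2.2, 1.8, 2.05),
  g3 = c(5.0, 4.7, 5.2, 1.1, 0.9, 1.3),
  g4 = c(0.5, 0.6, 0.7, 0.6, 0.8, 0.9)
)
# Class labels: 0 = control group, 1 = case group
toy_cl <- c(0, 0, 0, 1, 1, 1)

# Rank sum of the samples labelled 1 for a single gene;
# rank() assigns mid-ranks to ties, so the sum can be a non-integer
rank_sum <- function(x, cl) sum(rank(x)[cl == 1])

# One score per gene (row)
toy_scores <- apply(toy_data, 1, rank_sum, cl = toy_cl)
print(toy_scores)
```

Gene g4 has a tie between a control and a case sample, so its score is 13.5. Non-integer scores of this kind are what the ties.rand argument controls.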
ebam.wilc(data, cl, delta = 0.9, p0 = NA, ties.rand = TRUE, zero.rand = TRUE, gene.names = NULL, R.fold = TRUE, R.unlog = TRUE, file.out = NA, na.rm = FALSE, rand = NA)
data |
the data set that should be analyzed. Every row of this data set must correspond to a gene, and each column to a biological sample. |
cl |
a vector containing the class labels of the samples. In the two class unpaired case,
the label of a sample is either 0 (e.g., control group) or 1 (e.g., case group).
In the two class paired case, the labels are the integers between 1 and n/2
(e.g., before treatment group) and between -1 and -n/2 (e.g., after treatment
group), where n is the length of cl and k is paired with -k. |
delta |
a gene will be called significant if its posterior probability of
being differentially expressed is larger than or equal to delta. |
p0 |
prior probability that a gene is not differentially expressed. If not specified, it will be computed automatically. |
ties.rand |
if TRUE (default), non-integer expression scores will be randomly
assigned to the next lower or upper integer. Otherwise, they are assigned to
the integer that is closer to the mean. |
zero.rand |
if TRUE (default), the sign of each zero in the computation of
the Wilcoxon signed rank sums will be randomly assigned. If FALSE, the
sign of the zeros will be set to '-'. |
gene.names |
a vector containing the names of the genes. |
R.fold |
if TRUE (default), the fold change for each differentially
expressed gene will be computed. |
R.unlog |
if TRUE (default), 2^data will be used in the computation of the
fold change. This is recommended if data consists of log2 transformed gene expression
data. |
file.out |
if specified, general information (such as the number of significant genes and the estimated FDR) and gene-specific information on the differentially expressed genes (such as their expression scores, q-values and fold changes) are stored in this file. |
na.rm |
if FALSE (default), the fold change of genes with at least one
missing value will be set to NA. If TRUE, missing values will be
replaced by the genewise mean. |
rand |
if specified, the random number generator will be set to a reproducible state. |
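The following sketch illustrates some of the arguments above: the two forms of cl, and what na.rm = TRUE and R.unlog = TRUE might amount to for a single gene. All object names are hypothetical. The fold change is taken here as the ratio of the group means on the raw scale, which is an assumption, since this page does not state the exact definition used by the package.

```r
# Two class unpaired design: 3 control samples (0) followed by 3 case samples (1)
cl_unpaired <- c(0, 0, 0, 1, 1, 1)

# Two class paired design with n = 6 samples:
# pair k is labelled k (e.g., before treatment) and -k (e.g., after treatment)
cl_paired <- c(1, 2, 3, -1, -2, -3)

# Hypothetical log2 expression values of one gene, with one missing value
log2_expr <- c(1, 2, NA, 3, 4, 5)

# Counterpart of na.rm = TRUE: replace missing values by the genewise mean
filled <- ifelse(is.na(log2_expr), mean(log2_expr, na.rm = TRUE), log2_expr)

# Counterpart of R.unlog = TRUE: return to the raw scale via 2^data
raw <- 2^filled

# ASSUMPTION: fold change = mean of the case group / mean of the control group
fold_change <- mean(raw[cl_unpaired == 1]) / mean(raw[cl_unpaired == 0])
print(fold_change)
```

With na.rm = FALSE, the missing value would instead propagate and the fold change of this gene would be NA.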
A plot of the expression scores vs. their posterior probability of being differentially expressed and, optionally, a file containing general information (such as the FDR and the number of differentially expressed genes) and gene-specific information on the differentially expressed genes (such as their names, q-values and fold changes).
nsig |
number of significant genes. |
fdr |
estimated FDR. |
ebam.output |
table containing gene-specific information on the differentially expressed genes. |
row.sig.genes |
vector containing the row numbers that belong to the differentially expressed genes. |
... |
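To make the relation between delta, nsig and row.sig.genes concrete, here is a minimal sketch that applies the significance rule to a made-up vector of posterior probabilities. The names posterior, sig_rows and n_sig are hypothetical and are not objects returned by ebam.wilc.

```r
# Hypothetical posterior probabilities of being differentially expressed (one per gene)
posterior <- c(0.97, 0.42, 0.90, 0.10, 0.89, 0.99)
delta <- 0.9

# A gene is called significant if its posterior probability is >= delta
sig_rows <- which(posterior >= delta)  # analogous to row.sig.genes
n_sig <- length(sig_rows)              # analogous to nsig

print(sig_rows)
print(n_sig)
```

Raising delta gives a shorter, more conservative list of genes; lowering it gives a longer list.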
Holger Schwender, holger.schw@gmx.de
Efron, B., Storey, J.D., Tibshirani, R. (2001). Microarrays, empirical Bayes methods, and the false discovery rate, Technical Report, Department of Statistics, Stanford University.
Storey, J.D., and Tibshirani, R. (2003). Statistical significance for genome-wide experiments, Technical Report, Department of Statistics, Stanford University.
Schwender, H. (2003). Assessing the false discovery rate in a statistical analysis of gene expression data, Chapter 8, Diploma thesis, Department of Statistics, University of Dortmund, http://de.geocities.com/holgerschw/thesis.pdf.
if (interactive()) {
  library(multtest)

  # Load the data of Golub et al. (1999). data(golub) contains a 3051x38 gene expression
  # matrix called golub, a vector of length 38 called golub.cl that consists of the 38 class labels,
  # and a matrix called golub.gnames whose third column contains the gene names.
  data(golub)

  # An EBAM-Wilc analysis of the Golub data is performed by
  ebam.wilc.out <- ebam.wilc(golub, golub.cl, gene.names = golub.gnames[, 3], rand = 123)

  # For further analyses, the row numbers of the differentially expressed genes are obtained by
  ebam.wilc.out$row.sig.genes
}