Sample size for gene expression microarray experiments

Bioinformatics. 2005 Apr 15;21(8):1502-8. doi: 10.1093/bioinformatics/bti162. Epub 2004 Nov 25.

Abstract

Motivation: Microarray experiments often involve hundreds or thousands of genes. In a typical experiment, only a fraction of the genes are expected to be differentially expressed; in addition, the measured intensities of different genes may be correlated. Depending on the experimental objectives, sample size calculations can be based on one of three measures: the sensitivity, true discovery, or accuracy rate. The sample size problem is formulated as follows: find the number of arrays needed to achieve the desired fraction of the specified measure with the desired family-wise power, given the type I error and the (standardized) effect size.
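
For the sensitivity measure, the formulation can be sketched as below. The notation is an assumption made for illustration and does not come from the abstract: m_1 is the number of truly differentially expressed genes, R_n is how many of them are declared significant with n arrays per group, \lambda is the desired fraction, and 1 - \beta_0 is the desired family-wise power.

```latex
n^{*} \;=\; \min\Bigl\{\, n \;:\; \Pr\bigl( R_n \ge \lceil \lambda\, m_1 \rceil \bigr) \;\ge\; 1 - \beta_0 \Bigr\}
```

The true discovery and accuracy rates would presumably lead to analogous conditions with a different count in place of R_n.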

Results: We present a general approach for estimating sample size under independent and equally correlated models, using binomial and beta-binomial models, respectively. The sample sizes needed for a two-sample z-test are computed; the theoretical numbers agree well with Monte Carlo simulation results. Under more general correlation structures, however, the beta-binomial model can underestimate the required sample size by about 1-5 arrays.
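
A minimal sketch of the independent (binomial) case for the sensitivity measure follows; it is not the authors' code. It assumes equal group sizes, a two-sided two-sample z-test with unit variance, a per-gene significance level alpha, a common standardized effect size delta for all truly differentially expressed genes, and sensitivity defined as the fraction of those genes that are detected. All function and parameter names are hypothetical.

```python
from math import ceil, exp, lgamma, log, log1p, sqrt
from statistics import NormalDist

def per_gene_power(n, delta, alpha):
    """Power of a two-sided two-sample z-test with n arrays per group."""
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)
    shift = delta * sqrt(n / 2)  # mean of the z statistic under the alternative
    return z.cdf(shift - z_crit) + z.cdf(-shift - z_crit)

def binom_tail(m1, k, p):
    """P(X >= k) for X ~ Binomial(m1, p), summed in log space for stability."""
    if p >= 1.0:
        return 1.0
    if p <= 0.0:
        return 1.0 if k <= 0 else 0.0
    total = 0.0
    for j in range(max(k, 0), m1 + 1):
        log_pmf = (lgamma(m1 + 1) - lgamma(j + 1) - lgamma(m1 - j + 1)
                   + j * log(p) + (m1 - j) * log1p(-p))
        total += exp(log_pmf)
    return min(total, 1.0)

def arrays_per_group(m1, lam, delta, alpha, family_power, n_max=1000):
    """Smallest n with P(at least lam*m1 of the m1 true genes detected) >= family_power.

    Assumption: the m1 tests are independent, so the number of detections
    is Binomial(m1, per-gene power).
    """
    k = ceil(lam * m1 - 1e-9)  # small tolerance guards against float round-off
    for n in range(2, n_max + 1):
        if binom_tail(m1, k, per_gene_power(n, delta, alpha)) >= family_power:
            return n
    raise ValueError("no n <= n_max meets the requested family-wise power")
```

With a single gene (m1 = 1, lam = 1) the calculation reduces to the ordinary two-sample z-test, so `arrays_per_group(1, 1.0, 1.0, 0.05, 0.8)` returns 16 arrays per group. The equally correlated case would replace the binomial tail with a beta-binomial tail, which is not sketched here because the abstract does not give its parameterization.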

Contact: jchen@nctr.fda.gov.

Publication types

  • Evaluation Study

MeSH terms

  • Algorithms*
  • Computer Simulation
  • DNA / genetics*
  • Gene Expression Profiling / methods*
  • Models, Genetic*
  • Models, Statistical
  • Oligonucleotide Array Sequence Analysis / methods*
  • Sample Size
  • Sequence Analysis, DNA / methods*

Substances

  • DNA