Not applicable; primary human cells were processed immediately after collection from subjects.
Growth protocol
Not applicable; primary human cells were processed immediately after collection from subjects.
Extracted molecule
total RNA
Extraction protocol
Venous blood was collected directly into BD-Vacutainer CPT tubes (Becton Dickinson, Franklin Lakes, NJ). Peripheral blood mononuclear cells (PBMCs) were isolated by centrifugation, according to the manufacturer’s protocol, no later than three hours after blood collection. PBMCs were washed with Hank’s Balanced Salt Solution without calcium, magnesium or phenol red (Gibco-BRL, Grand Island, NY), and RNA was isolated immediately thereafter under RNase-free conditions using the Purescript total RNA isolation kit (Gentra, Minneapolis, MN) or the Ambion ToTALLY RNA isolation kit (Life Technologies, Grand Island, NY), according to the manufacturers’ instructions. Contaminating DNA was removed using the DNA-free kit (Ambion, Austin, TX). RNA was eluted in 20 µl RNase/DNase-free water and stored at -80°C after the addition of 32 U of RNase inhibitor (Promega, Madison, WI). RNA integrity was assessed by electrophoresis using an Agilent Bioanalyzer 2100 (Agilent, Palo Alto, CA) prior to cDNA synthesis for microarray hybridization. Samples having an RNA integrity number below 6 were excluded from further analysis.
Label
biotin
Label protocol
Between 5 and 20 ng of total RNA from each PBMC sample was used to generate high-fidelity cDNA using the Ovation RNA amplification system (NuGEN Technologies, Inc., San Carlos, CA), according to the manufacturer’s protocol. The amplified cDNA was fragmented to 50-100 nucleotides and labeled with biotin using standard Affymetrix protocols.
Hybridization protocol
cDNA was hybridized to the Affymetrix GeneChip HG-U219 high-density oligonucleotide array (Affymetrix, Santa Clara, CA). Following hybridization, arrays were stained with streptavidin-phycoerythrin and washed in an Affymetrix fluidics module using standard Affymetrix protocols.
Scan protocol
Detection and quantitation of target hybridization were performed using a GeneArray Scanner 3000 (Affymetrix).
Description
Gene expression data from healthy adult PBMCs
Data processing
Microarray data were analyzed using GeneSpring GX 14.9 software (Agilent Technologies, Santa Clara, CA). Raw expression values in CEL file format were normalized by robust multi-array analysis (RMA) and quantile normalization, filtered to retain only expression values above the 20th percentile of intensity, and baseline transformed to the median of all samples. Statistical analysis was performed using a one-way ANOVA with Benjamini-Hochberg multiple testing correction to reduce false positives. Differentially expressed transcripts, defined as those having a P-value of <0.05 and a fold change of at least 2 relative to the healthy donor group, were subjected to hierarchical clustering and principal component analysis. A generic predictive modeling framework was developed and applied to two comparisons: acute LD (n = 28) versus healthy donors (n = 21), and acute LD versus 6-month convalescent LD (n = 10). In the first step, the distribution of gene expression variance across all experimental groups was computed, and genes with variance at or above the 90th percentile were identified. This threshold is a parameter of the framework and can be set according to the variance distribution in the cohort of samples under consideration. In the second step, expression data for the genes in the top 10% of variance in each experimental group were subjected to 50 iterations of random forest analysis. An importance value for each gene was generated in each iteration, and a final importance value for each gene was computed by averaging the importance values across all 50 iterations. The averaged importance values were used to rank the selected genes. Finally, for each experiment, leave-one-out predictive modeling was performed and tested using incrementally expanding sets of the most significant genes (top 20 through top 2004) to assess how prediction accuracy changed across different sets of predictors.
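As an illustration of the differential expression criteria described above, the following minimal Python sketch applies the Benjamini-Hochberg step-up adjustment to a list of P-values and then selects transcripts by P-value and fold change. It is not the GeneSpring implementation. The function names, the use of log2 fold changes, and the choice to apply the 0.05 cutoff to the adjusted P-values are assumptions made for the example.

```python
import math


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P-values in the original order."""
    m = len(p_values)
    # Indices of the P-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down so that adjusted values
    # never increase as the raw P-value decreases.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def select_transcripts(p_values, log2_fold_changes, alpha=0.05, min_fold=2.0):
    """Return indices with adjusted P < alpha and |fold change| >= min_fold.

    Assumption: fold changes are supplied on the log2 scale, relative to
    the reference group, and the cutoff is applied to adjusted P-values.
    """
    adjusted = benjamini_hochberg(p_values)
    log2_cutoff = math.log2(min_fold)
    return [
        i
        for i, (q, lfc) in enumerate(zip(adjusted, log2_fold_changes))
        if q < alpha and abs(lfc) >= log2_cutoff
    ]
```

For example, select_transcripts([0.01, 0.04, 0.03, 0.005], [1.5, 2.0, 0.5, -1.2]) keeps the transcripts at positions 0, 1 and 3; the third transcript passes the P-value cutoff but its fold change is below 2.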
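The bookkeeping of the predictive modeling framework described above (variance filtering, averaging of importance values across iterations, and leave-one-out testing over incrementally expanding gene sets) can be sketched in plain Python as follows. This is a sketch under stated assumptions, not the code used for the analysis: the random forest step itself is not implemented, and the per-iteration importance values are assumed to be supplied by the caller. Because the classifier used for the leave-one-out step is not specified here, a nearest-centroid classifier is used purely as a stand-in. All function names and the nearest-rank percentile rule are hypothetical.

```python
import math
import statistics


def high_variance_genes(expr, percentile=90):
    """Return genes whose variance is at or above the given percentile.

    expr maps gene name -> list of expression values, one per sample.
    The cutoff uses the nearest-rank percentile rule (an assumption).
    """
    variances = {g: statistics.pvariance(v) for g, v in expr.items()}
    ordered = sorted(variances.values())
    k = max(0, math.ceil(percentile / 100 * len(ordered)) - 1)
    cutoff = ordered[k]
    return [g for g, var in variances.items() if var >= cutoff]


def average_importance(runs):
    """Average per-gene importance over iterations.

    runs is a list of dicts (gene -> importance), one dict per iteration,
    e.g. from 50 random forest fits performed elsewhere.
    """
    return {g: sum(r[g] for r in runs) / len(runs) for g in runs[0]}


def rank_genes(mean_importance):
    """Return gene names ordered from most to least important."""
    return sorted(mean_importance, key=mean_importance.get, reverse=True)


def loo_accuracy(expr, labels, genes):
    """Leave-one-out accuracy of a nearest-centroid stand-in classifier.

    Assumes every class has at least two samples, so that a centroid
    can still be computed when one sample is held out.
    """
    n = len(labels)
    correct = 0
    for held in range(n):
        centroids = {}
        for cls in set(labels):
            members = [i for i in range(n) if i != held and labels[i] == cls]
            centroids[cls] = [
                sum(expr[g][i] for i in members) / len(members) for g in genes
            ]
        sample = [expr[g][held] for g in genes]
        # Predict the class whose centroid is closest (squared Euclidean).
        predicted = min(
            centroids,
            key=lambda c: sum((a - b) ** 2 for a, b in zip(sample, centroids[c])),
        )
        if predicted == labels[held]:
            correct += 1
    return correct / n


def accuracy_curve(expr, labels, ranked_genes, sizes):
    """Leave-one-out accuracy for each top-k gene set, k taken from sizes."""
    return {k: loo_accuracy(expr, labels, ranked_genes[:k]) for k in sizes}
```

In use, the ranked gene list from rank_genes would be passed to accuracy_curve with sizes such as range(20, len(ranked) + 1), giving one accuracy value per gene-set size so that the effect of adding predictors can be inspected.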