
GDS cluster analysis


The GEO dataset (GDS) cluster analysis program is a visualization tool for displaying cluster heat maps. Cluster regions of interest may be selected, enlarged, charted as line plots, and viewed in Entrez GEO Profiles, and the original data may be downloaded. The cluster analysis tool may be accessed from GDS records under the "analysis" pull-down menu, or by clicking the cluster thumbnail.

Precomputed hierarchical clusters (single linkage, complete linkage, and average linkage/UPGMA) and user-defined K-means/K-median clustering (where K = 2 through 15) are available. Clusters are calculated using one of several distance metrics (Euclidean distance, Pearson correlation, or uncentered correlation coefficient).
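As a rough illustration of how the distance metrics and linkage rules named above differ, the sketch below computes each of them on toy expression profiles. It is not the code used by the GDS cluster tool. The function names, the toy values, and the conversion of a correlation r into a distance as 1 - r are all assumptions made for this example.

```python
import math

def euclidean(x, y):
    """Straight-line distance between two expression profiles."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def uncentered_correlation(x, y):
    """Correlation computed without subtracting the means (cosine similarity)."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    return dot / (norm_x * norm_y)

def pearson_correlation(x, y):
    """Pearson correlation: the uncentered correlation of mean-centered profiles."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    return uncentered_correlation([a - mean_x for a in x],
                                  [b - mean_y for b in y])

def linkage_distance(cluster_a, cluster_b, metric, method):
    """Distance between two clusters under a given linkage rule."""
    pairwise = [metric(p, q) for p in cluster_a for q in cluster_b]
    if method == "single":      # closest pair of members
        return min(pairwise)
    if method == "complete":    # farthest pair of members
        return max(pairwise)
    if method == "average":     # mean over all pairs (UPGMA)
        return sum(pairwise) / len(pairwise)
    raise ValueError("unknown linkage method: " + method)

# Two hypothetical profiles with the same shape but a different baseline.
profile_a = [1.0, 2.0, 3.0, 4.0]
profile_b = [11.0, 12.0, 13.0, 14.0]

d_euclid = euclidean(profile_a, profile_b)
# Assumption: a correlation r is turned into a distance as 1 - r.
d_pearson = 1.0 - pearson_correlation(profile_a, profile_b)
d_uncentered = 1.0 - uncentered_correlation(profile_a, profile_b)

# Two hypothetical clusters of two-sample profiles.
cluster_1 = [[0.0, 0.0], [0.0, 1.0]]
cluster_2 = [[3.0, 0.0], [4.0, 0.0]]
d_single = linkage_distance(cluster_1, cluster_2, euclidean, "single")
d_complete = linkage_distance(cluster_1, cluster_2, euclidean, "complete")
d_average = linkage_distance(cluster_1, cluster_2, euclidean, "average")
```

In this toy case the two profiles are far apart by Euclidean distance but essentially identical by Pearson correlation, while the uncentered coefficient falls in between because it remains sensitive to the baseline offset. This is one reason the choice of metric changes the resulting clusters.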

Cluster analyses help provide insight into the relationships between data. Take care when drawing biological interpretations from cluster results: classifications are based on basic clustering algorithms applied across a variety of dataset types, with no prior assumptions about the distribution or range of the original data. Alternative algorithms, normalization procedures, and distance metrics will generate different cluster outputs. For K-clustering, the initialization procedure involves random assignment of genes to each partition, so K-cluster results may differ from run to run.
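The run-to-run variation in K-clustering can be seen in a minimal K-means sketch that starts, as described above, by assigning each gene to a random partition. This is an illustrative implementation only, not the GDS tool's code. The function names, the toy profiles, the use of squared Euclidean distance, and the handling of empty partitions are assumptions.

```python
import random

def squared_euclidean(x, y):
    return sum((a - b) ** 2 for a, b in zip(x, y))

def mean_profile(rows):
    """Column-wise mean of a non-empty list of profiles."""
    return [sum(col) / len(rows) for col in zip(*rows)]

def k_means(profiles, k, seed=None, max_iter=100):
    """Return one partition label (0..k-1) per profile."""
    rng = random.Random(seed)
    # Initialization: random assignment of genes to partitions.
    labels = [rng.randrange(k) for _ in profiles]
    for _ in range(max_iter):
        centroids = []
        for c in range(k):
            members = [p for p, lab in zip(profiles, labels) if lab == c]
            if members:
                centroids.append(mean_profile(members))
            else:
                # Assumption: an empty partition is re-seeded with a random gene.
                centroids.append(list(rng.choice(profiles)))
        # Reassign each gene to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: squared_euclidean(p, centroids[c]))
            for p in profiles
        ]
        if new_labels == labels:  # converged: no gene changed partition
            break
        labels = new_labels
    return labels

# Hypothetical expression profiles (rows = genes, columns = samples).
profiles = [
    [0.1, 0.2, 0.1],
    [0.2, 0.1, 0.3],
    [5.0, 5.2, 4.9],
    [5.1, 4.8, 5.3],
    [9.9, 10.1, 10.0],
]

run_a = k_means(profiles, k=3, seed=1)
run_b = k_means(profiles, k=3, seed=2)
# run_a and run_b may differ in their label numbering or even in which
# genes are grouped together; fixing the seed makes a run repeatable.
```

K-median clustering follows the same loop but would replace the mean in `mean_profile` with a per-column median.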

Cluster region selection and visualization options

Once a hierarchical or K-cluster image of interest has been identified, specific cluster regions may be selected for further analysis as follows.

  • The red box is the image cropper. To move the image cropper box, drag it across the image, or click on any region of the image.
  • To alter the height of the image cropper, drag the top or bottom borders of the box.
  • The image cropper can also be moved using the arrow keys on the keyboard. Holding down the shift key along with the arrow keys moves the box faster; holding down the control key moves the box more slowly.
  • Select additional regions of interest by clicking the "+" icon in the top right corner of the active image cropper box, or by pressing "a" on the keyboard. Each selected region is numbered.
  • The "Stack selections" button opens a new window to view multiple, stacked selections. The image cropper box can again be used to specify region(s) of interest on the stacked image.
  • Double-click the active image cropper box or press the space bar to view an enlarged image of the selected cluster region with gene and sample annotation.
  • Use the "Get selected data" button to download values and sample information in SOFT format for the chosen cluster region(s).
  • Use the "Plot selected gene profiles" button to view profile line plots for the chosen cluster region(s).
  • Use the "Get profiles in Entrez-GEO" button to retrieve individual gene profile charts and accompanying information from Entrez GEO Profiles for the chosen cluster region(s).
  • Please read the GEO Info page for details of cluster methods and references.

