Using association rule mining to determine promising secondary phenotyping hypotheses

Bioinformatics. 2014 Jun 15;30(12):i52-59. doi: 10.1093/bioinformatics/btu260.

Abstract

Motivation: Large-scale phenotyping projects such as the Sanger Mouse Genetics project are ongoing efforts to help identify the influences of genes and their modification on phenotypes. Gene-phenotype relations are crucial to the improvement of our understanding of human heritable diseases as well as the development of drugs. However, given that there are ∼20 000 genes in higher vertebrate genomes and the experimental verification of gene-phenotype relations requires a lot of resources, methods are needed that determine good candidates for testing.

Results: In this study, we applied association rule mining to the identification of promising secondary phenotype candidates. The predictions rely on a large gene-phenotype annotation set that is used to find occurrence patterns of phenotypes. With this approach, we identified 1967 secondary phenotype hypotheses that cover 244 genes and 136 phenotypes. Using two automated evaluation strategies and one manual evaluation strategy, we demonstrate that the secondary phenotype candidates possess biological relevance to the genes they are predicted for. From the results we conclude that the predicted secondary phenotypes constitute good candidates to be experimentally tested and confirmed.
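
To illustrate the general idea of mining phenotype occurrence patterns, the following is a minimal sketch of pairwise association rule mining over gene-phenotype annotations. It is not the pipeline used in the study: the gene names, the annotations, the restriction to single-phenotype rules, and the support and confidence thresholds are all assumptions chosen for illustration. A rule "phenotype X implies phenotype Y" is kept when enough genes carry both phenotypes; any gene annotated with X but not yet with Y then yields Y as a secondary phenotype hypothesis.

```python
from itertools import permutations

# Hypothetical toy annotations (gene -> set of phenotypes); not real data.
ANNOTATIONS = {
    "geneA": {"ataxia", "muscle weakness"},
    "geneB": {"ataxia", "muscle weakness"},
    "geneC": {"ataxia", "muscle weakness", "seizures"},
    "geneD": {"ataxia"},
    "geneE": {"seizures"},
}


def mine_pair_rules(annotations, min_support=0.4, min_confidence=0.7):
    """Return rules (lhs, rhs, support, confidence) for phenotype pairs.

    support    = fraction of all genes annotated with both lhs and rhs
    confidence = fraction of genes with lhs that also have rhs
    The thresholds are illustrative assumptions.
    """
    n_genes = len(annotations)
    phenotypes = sorted(set().union(*annotations.values()))
    rules = []
    for lhs, rhs in permutations(phenotypes, 2):
        with_lhs = [g for g, phenos in annotations.items() if lhs in phenos]
        if not with_lhs:
            continue
        with_both = [g for g in with_lhs if rhs in annotations[g]]
        support = len(with_both) / n_genes
        confidence = len(with_both) / len(with_lhs)
        if support >= min_support and confidence >= min_confidence:
            rules.append((lhs, rhs, support, confidence))
    return rules


def propose_candidates(annotations, rules):
    """For each rule lhs -> rhs, suggest rhs for genes that have lhs only."""
    candidates = set()
    for lhs, rhs, _support, _confidence in rules:
        for gene, phenos in annotations.items():
            if lhs in phenos and rhs not in phenos:
                candidates.add((gene, rhs))
    return sorted(candidates)


if __name__ == "__main__":
    rules = mine_pair_rules(ANNOTATIONS)
    for gene, phenotype in propose_candidates(ANNOTATIONS, rules):
        print(f"{gene}: candidate secondary phenotype '{phenotype}'")
```

On this toy input, the only hypothesis produced is "muscle weakness" for geneD, because three of the four ataxia-annotated genes also show muscle weakness. A realistic application would need rules with multi-phenotype antecedents, an ontology-aware treatment of phenotype terms, and thresholds tuned to the size of the annotation set.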

Availability: The secondary phenotype candidates can be browsed at http://www.sanger.ac.uk/resources/databases/phenodigm/gene/secondaryphenotype/list.

Contact: ao5@sanger.ac.uk or ds5@sanger.ac.uk

Supplementary information: Supplementary data are available at Bioinformatics online.

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Animals
  • Ataxia / genetics
  • Data Mining / methods*
  • Disease / genetics*
  • Genes
  • Humans
  • Mice
  • Mitochondrial Diseases / genetics
  • Muscle Weakness / genetics
  • Phenotype*
  • Ubiquinone / deficiency
  • Ubiquinone / genetics

Substances

  • Ubiquinone

Supplementary concepts

  • Coenzyme Q10 Deficiency