Incomplete multi-view gene clustering with data regeneration using Shape Boltzmann Machine

Comput Biol Med. 2020 Oct:125:103965. doi: 10.1016/j.compbiomed.2020.103965. Epub 2020 Sep 8.

Abstract

Deciphering patterns in the structural and functional anatomy of genes can be very helpful in understanding genetic biology and genomics. The availability of multi-omics data, along with the advent of machine learning techniques, also helps medical professionals gain insights into various biological regulations. Gene clustering is one of many such computational techniques that can help in understanding gene behavior. More comprehensive and reliable insights can be gained when different modalities/views of biomedical data are considered. In most multi-view cases, however, each view contains some missing data, which leads to the problem of incomplete multi-view clustering. In this study, we present a deep Boltzmann machine-based incomplete multi-view clustering framework for gene clustering. We regenerate the data in the incomplete modalities of three NCBI datasets using Shape Boltzmann Machines. The overall performance of the proposed multi-view clustering technique is evaluated using the Silhouette index and the Davies-Bouldin index, and a comparative analysis shows an improvement over state-of-the-art methods. Finally, to show that the improvement attained by the proposed incomplete multi-view clustering is statistically significant, we perform Welch's t-test. AVAILABILITY OF DATA AND MATERIALS: https://github.com/piyushmishra12/IMC.
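
The evaluation measures named in the abstract can be illustrated with a minimal pure-Python sketch. This is not the authors' code (their implementation is in the linked repository). The function names, the toy points, and the use of Euclidean distance are assumptions made for illustration only. The sketch computes the Silhouette index (higher is better), the Davies-Bouldin index (lower is better), and Welch's t statistic with its Welch-Satterthwaite degrees of freedom.

```python
import math
from statistics import mean, variance


def silhouette_index(points, labels):
    """Mean silhouette over all points; ranges over [-1, 1], higher is better."""
    clusters = sorted(set(labels))
    scores = []
    for i, p in enumerate(points):
        # Distances to the other members of the point's own cluster.
        own = [math.dist(p, q) for j, q in enumerate(points)
               if labels[j] == labels[i] and j != i]
        if not own:  # singleton cluster: silhouette is taken as 0
            scores.append(0.0)
            continue
        a = mean(own)
        # Smallest mean distance to any other cluster.
        b = min(
            mean(math.dist(p, q) for j, q in enumerate(points) if labels[j] == c)
            for c in clusters if c != labels[i]
        )
        scores.append((b - a) / max(a, b))
    return mean(scores)


def davies_bouldin_index(points, labels):
    """Mean over clusters of the worst scatter-to-separation ratio; lower is better."""
    clusters = sorted(set(labels))
    members = {c: [p for p, l in zip(points, labels) if l == c] for c in clusters}
    centroids = {c: tuple(mean(dim) for dim in zip(*members[c])) for c in clusters}
    scatter = {c: mean(math.dist(p, centroids[c]) for p in members[c])
               for c in clusters}
    return mean(
        max((scatter[i] + scatter[j]) / math.dist(centroids[i], centroids[j])
            for j in clusters if j != i)
        for i in clusters
    )


def welch_t(x, y):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    vx = variance(x) / len(x)  # sample variance (n - 1 denominator) over n
    vy = variance(y) / len(y)
    t = (mean(x) - mean(y)) / math.sqrt(vx + vy)
    df = (vx + vy) ** 2 / (vx ** 2 / (len(x) - 1) + vy ** 2 / (len(y) - 1))
    return t, df


# Hypothetical toy data: two well-separated clusters in two dimensions.
toy_points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
toy_labels = [0, 0, 1, 1]
sil = silhouette_index(toy_points, toy_labels)
dbi = davies_bouldin_index(toy_points, toy_labels)
```

In practice, library routines such as `sklearn.metrics.silhouette_score`, `sklearn.metrics.davies_bouldin_score`, and `scipy.stats.ttest_ind(..., equal_var=False)` would normally be used instead; the last of these also returns the p-value, which the sketch above does not compute.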

Keywords: Boltzmann machine; Gene clustering; Incomplete multi-view clustering; Multi-modality.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms*
  • Cluster Analysis
  • Genomics
  • Machine Learning*