Identification of Prognostic Candidate Genes in Breast Cancer by Integrated Bioinformatic Analysis

J Clin Med. 2019 Aug 2;8(8):1160. doi: 10.3390/jcm8081160.

Abstract

Breast cancer is one of the most common malignancies, yet the molecular mechanisms underlying its pathogenesis remain to be elucidated. The present study aimed to identify potential prognostic marker genes associated with the progression of breast cancer. Weighted gene coexpression network analysis was used to construct scale-free gene coexpression networks, evaluate the associations between gene sets and clinical features, and identify candidate biomarkers. The gene expression profiles of GSE48213 were selected from the Gene Expression Omnibus database. RNA-seq data and clinical information on breast cancer from The Cancer Genome Atlas were used for validation. Four modules were identified from the gene coexpression network, one of which was found to be significantly associated with patient survival time. The black module (basal) comprised 28 genes; the dark red module (claudin-low), 18 genes; the brown module (luminal), nine genes; and the midnight blue module (nonmalignant), seven genes. These modules were clustered into two groups according to the significant difference in survival time between the groups. Based on betweenness centrality, we identified TXN and ANXA2 in the nonmalignant module, TPM4 and LOXL2 in the luminal module, TPRN and ADCY6 in the claudin-low module, and TUBA1C and CMIP in the basal module as the genes with the highest betweenness, suggesting that they play a central role in information transfer in the network. In the present study, coexpression network analysis thus identified eight candidate biomarkers that may further the understanding of the molecular pathogenesis of breast cancer.
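The selection of the two highest-betweenness genes per module can be illustrated with a minimal sketch. It is not the authors' pipeline: it assumes an unweighted, undirected module graph, uses Brandes' algorithm in plain Python, and runs on a hypothetical six-gene toy module (G1 to G6) whose edges are invented for illustration only.

```python
from collections import deque


def betweenness_centrality(adjacency):
    """Brandes' algorithm for an unweighted, undirected graph.

    adjacency: dict mapping each node to the set of its neighbours.
    Returns a dict of unnormalized betweenness scores, with each
    unordered pair of endpoints counted once.
    """
    score = {node: 0.0 for node in adjacency}
    for source in adjacency:
        # Breadth-first search from `source`, counting shortest paths.
        order = []
        predecessors = {node: [] for node in adjacency}
        n_paths = {node: 0 for node in adjacency}
        distance = {node: -1 for node in adjacency}
        n_paths[source] = 1
        distance[source] = 0
        queue = deque([source])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adjacency[v]:
                if distance[w] < 0:  # first time w is reached
                    distance[w] = distance[v] + 1
                    queue.append(w)
                if distance[w] == distance[v] + 1:  # v lies on a shortest path to w
                    n_paths[w] += n_paths[v]
                    predecessors[w].append(v)
        # Accumulate pair dependencies, farthest nodes first.
        dependency = {node: 0.0 for node in adjacency}
        for w in reversed(order):
            for v in predecessors[w]:
                dependency[v] += n_paths[v] / n_paths[w] * (1.0 + dependency[w])
            if w != source:
                score[w] += dependency[w]
    # Every unordered pair was visited from both endpoints, so halve the totals.
    return {node: value / 2.0 for node, value in score.items()}


# Hypothetical toy module: two triangles joined by the bridge G3-G4.
# Gene names and edges are illustrative assumptions, not data from the study.
edges = [("G1", "G2"), ("G1", "G3"), ("G2", "G3"),
         ("G3", "G4"),
         ("G4", "G5"), ("G4", "G6"), ("G5", "G6")]

adjacency = {}
for a, b in edges:
    adjacency.setdefault(a, set()).add(b)
    adjacency.setdefault(b, set()).add(a)

scores = betweenness_centrality(adjacency)

# Rank by descending betweenness, breaking ties by name, and keep the top two.
top_two = sorted(scores, key=lambda gene: (-scores[gene], gene))[:2]
print(top_two)
```

In this toy module the two bridge genes carry every shortest path between the triangles, which is the sense in which high-betweenness genes are described as central to information transfer. A real analysis would derive the edges from the weighted coexpression network rather than from a hand-written list.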

Keywords: GEO; TCGA; breast cancer; prognosis; weighted gene coexpression network analysis.