
Genome-Wide Canonical Correlation Analysis-Based Computational Methods for Mining Information from Microbiome and Gene Expression Data

Multi-omics datasets are very high-dimensional and have relatively few samples compared to the number of features. Canonical correlation analysis (CCA)-based methods are commonly used to reduce the dimensionality of such multi-view (multi-omics) datasets, both to test associations among features from different views and to make the data suitable for downstream analyses (classification, clustering, etc.). However, most CCA approaches suffer from a lack of interpretability and perform poorly in downstream analyses. At present, there is no well-explored comparison study of CCA methods applied to multi-omics datasets (such as microbiome and gene expression datasets). In this study, we address this gap by providing a detailed comparison of three popular CCA approaches: regularized canonical correlation analysis (RCC), deep canonical correlation analysis (DCCA), and sparse canonical correlation analysis (SCCA), using a multi-omics dataset consisting of microbiome and gene expression profiles. We evaluated the methods in terms of total correlation score and classification performance. We found that SCCA provides reasonable correlation scores in the reduced space, enables interpretability, and provides the best classification performance among the three methods.
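
To make the evaluation idea in the abstract concrete, the following is a minimal sketch of ridge-regularized CCA on two views with more features than samples. It is not the authors' implementation: the function names (`regularized_cca`, `total_correlation`), the regularization values, and the synthetic data are illustrative assumptions, and "total correlation score" is assumed here to mean the sum of per-component correlations between the paired canonical variates.

```python
import numpy as np


def regularized_cca(X, Y, n_components=2, reg_x=0.1, reg_y=0.1):
    """Ridge-regularized CCA via whitening and SVD (illustrative sketch)."""
    n = X.shape[0]
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)

    # Regularized covariance matrices; the ridge term keeps them invertible
    # even when there are more features than samples.
    Sxx = Xc.T @ Xc / (n - 1) + reg_x * np.eye(X.shape[1])
    Syy = Yc.T @ Yc / (n - 1) + reg_y * np.eye(Y.shape[1])
    Sxy = Xc.T @ Yc / (n - 1)

    def inv_sqrt(S):
        # Inverse matrix square root of a symmetric positive-definite matrix.
        w, V = np.linalg.eigh(S)
        return V @ np.diag(1.0 / np.sqrt(w)) @ V.T

    Kx, Ky = inv_sqrt(Sxx), inv_sqrt(Syy)
    U, s, Vt = np.linalg.svd(Kx @ Sxy @ Ky, full_matrices=False)

    A = Kx @ U[:, :n_components]      # projection weights for view X
    B = Ky @ Vt.T[:, :n_components]   # projection weights for view Y
    return A, B, s[:n_components]


def total_correlation(X, Y, A, B):
    """Sum of correlations between paired canonical variates (assumed definition)."""
    Zx = (X - X.mean(axis=0)) @ A
    Zy = (Y - Y.mean(axis=0)) @ B
    corrs = np.array(
        [np.corrcoef(Zx[:, k], Zy[:, k])[0, 1] for k in range(A.shape[1])]
    )
    return corrs.sum(), corrs


# Synthetic stand-ins for two omics views sharing one latent signal
# (hypothetical data, 40 samples, 100 and 80 features).
rng = np.random.default_rng(0)
latent = rng.normal(size=(40, 1))
X = latent @ rng.normal(size=(1, 100)) + rng.normal(size=(40, 100))
Y = latent @ rng.normal(size=(1, 80)) + rng.normal(size=(40, 80))

A, B, s = regularized_cca(X, Y, n_components=2)
total, corrs = total_correlation(X, Y, A, B)
print("per-component correlations:", corrs)
print("total correlation score:", total)
```

The correlations printed here are measured on the same samples used for fitting, so they are optimistic when features outnumber samples; a held-out split would give a fairer score.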

https://doi-org.uml.idm.oclc.org/10.1007/978-3-030-18305-9_53

Shikder R., Irani P., Hu P. (2019) Genome-Wide Canonical Correlation Analysis-Based Computational Methods for Mining Information from Microbiome and Gene Expression Data. In: Meurs MJ., Rudzicz F. (eds) Advances in Artificial Intelligence. Canadian AI 2019. Lecture Notes in Computer Science, vol 11489. Springer, Cham

BibTeX Entry

@inproceedings{scopus2-s2.0-85066147961,
author = "Shikder, R. and Irani, P. and Hu, P.",
copyright = "Copyright 2019 Elsevier B.V., All rights reserved.",
isbn = "9783030183042",
issn = "03029743",
journal = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",
keywords = "Canonical Correlation Analysis (CCA) ; Comparison Study ; DCCA ; Microbiome and Gene Expression Data ; Multi-Omics Data ; RCC ; SCCA",
pages = "511--517",
publisher = "Springer Verlag",
title = "Genome-Wide Canonical Correlation Analysis-Based Computational Methods for Mining Information from Microbiome and Gene Expression Data",
volume = "11489",
year = "2019",
}

Authors

Pourang Irani

Professor
Canada Research Chair
at University of British Columbia Okanagan Campus

As well as: Rayhan Shikder, Pingzhao Hu