SLU publication database (SLUpub)

Research article, 2023, Peer reviewed, Open access

Identifying potential circulating miRNA biomarkers for the diagnosis and prediction of ovarian cancer using machine-learning approach: application of Boruta

Hamidi, Farzaneh; Gilani, Neda; Arabi Belaghi, Reza; et al.

Abstract

Introduction: Ovarian cancer is a major clinical challenge in gynecologic oncology. Because typical symptoms and effective biomarkers for noninvasive screening are lacking, most patients already have advanced-stage disease at the time of diagnosis. MicroRNAs (miRNAs) are non-coding RNA molecules that have been linked to human cancers. Identifying diagnostic biomarkers that distinguish cancer samples from non-cancer samples remains difficult.

Methods: Using Boruta, a novel random forest-based feature selection method, we aimed to identify biomarkers associated with ovarian cancer from cancer and non-cancer samples in the Gene Expression Omnibus (GEO) data set GSE106817. Two independent GEO data sets, GSE113486 and GSE113740, served as external validation. We used five state-of-the-art machine-learning algorithms for classification: logistic regression, random forest, decision trees, artificial neural networks, and XGBoost.
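The shadow-feature idea behind Boruta can be illustrated with a minimal sketch. This is not the authors' pipeline: Boruta proper uses random forest importances and a statistical test over iterations, whereas the sketch below swaps in a simple univariate score (standardized difference in class means) and a fixed hit-rate cutoff so that it runs with the standard library alone. The function names, the cutoff, and the toy data are all assumptions made for illustration.

```python
import random
from statistics import mean, pstdev


def score(values, labels):
    """Toy importance: absolute difference in class means, in units of the
    overall standard deviation. (Assumption: stands in for the random forest
    importance that Boruta actually uses.)"""
    pos = [v for v, y in zip(values, labels) if y == 1]
    neg = [v for v, y in zip(values, labels) if y == 0]
    spread = pstdev(values) or 1.0  # avoid dividing by zero on constant columns
    return abs(mean(pos) - mean(neg)) / spread


def shadow_select(columns, labels, n_rounds=50, min_hit_rate=0.9, seed=0):
    """Keep features that beat the best shuffled ("shadow") copy in at least
    min_hit_rate of the rounds. (Assumption: a fixed cutoff replaces the
    statistical test used by Boruta.)"""
    rng = random.Random(seed)
    hits = {name: 0 for name in columns}
    for _ in range(n_rounds):
        # Shadow features: each column shuffled, which breaks any link to the labels.
        shadow_scores = []
        for values in columns.values():
            shuffled = list(values)
            rng.shuffle(shuffled)
            shadow_scores.append(score(shuffled, labels))
        threshold = max(shadow_scores)
        # A real feature scores a "hit" when it beats the best shadow.
        for name, values in columns.items():
            if score(values, labels) > threshold:
                hits[name] += 1
    return [name for name, h in hits.items() if h / n_rounds >= min_hit_rate]


# Hypothetical toy data: 20 controls (label 0) followed by 20 cases (label 1).
labels = [0] * 20 + [1] * 20
columns = {
    "feature_a": [i % 5 for i in range(20)] + [10 + i % 5 for i in range(20)],
    "noise_b": [i % 5 for i in range(40)],        # identical mean in both classes
    "noise_c": [(i * 3) % 4 for i in range(40)],  # identical mean in both classes
}
selected = shadow_select(columns, labels)
print(selected)
```

On this toy data only `feature_a` separates the two classes, so it is the only column that consistently outscores its shuffled counterparts.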

Results: Four models discovered in GSE113486 had an AUC of 100%, three in GSE113740 had an AUC of over 94%, and four in GSE113486 had an AUC of over 94%. We identified 10 miRNAs that distinguish ovarian cancer cases from normal controls: hsa-miR-1290, hsa-miR-1233-5p, hsa-miR-1914-5p, hsa-miR-1469, hsa-miR-4675, hsa-miR-1228-5p, hsa-miR-3184-5p, hsa-miR-6784-5p, hsa-miR-6800-5p, and hsa-miR-5100. Our findings suggest that miRNAs could serve as potential biomarkers for ovarian cancer screening and possible intervention.
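For readers unfamiliar with the AUC figures reported above, the following minimal sketch shows how the area under the ROC curve can be computed as the probability that a randomly chosen case receives a higher score than a randomly chosen control, with ties counted as one half. The function name `roc_auc` and the example scores are illustrative assumptions and are not taken from the study.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of case and control scores."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # case ranked above control
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(pos) * len(neg))


# Hypothetical classifier scores for two controls (0) and two cases (1).
print(roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # 0.75
```

An AUC of 1.0 (reported as 100%) means every case was scored above every control; 0.5 corresponds to random ranking.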

Keywords

artificial intelligence; Boruta; biomarker; feature selection; Gene Expression Omnibus; ovarian cancer; oncology

Published in

Frontiers in Digital Health
2023, Volume: 5, article number: 1187578

UKÄ Subject classification

Cancer and Oncology
Computer Science

Publication identifier

DOI: https://doi.org/10.3389/fdgth.2023.1187578

Permanent link to this page (URI)

https://res.slu.se/id/publ/126913