A two phase ensembled deep learning approach of prominent gene extraction and disease risk prediction

Prajna Paramita DEBATA

prajnaparamitaa@gmail.com

Alakananda TRIPATHY

alakanandatripathy@soa.ac.in

Pournamasi PARHI

pournamasiparhi@soa.ac.in

Smruti Rekha DAS

sdas5@gitam.edu

Abstract

By unlocking novel insights from the gene expression profiles of individual patients, clinicians and researchers can discern patterns, biomarkers, and therapies. Moreover, accurate classification enables the development of predictive models for prognosis and treatment response, facilitating personalized medicine. Determining the optimal classification model, however, remains a time-consuming NP-hard (nondeterministic polynomial-time hard) problem, and the volume of available gene expression data is too large for traditional data analysis approaches to handle. A two-phase ensembled deep learning approach can therefore be considered a dependable framework for root-level investigation of genomic data. In this experimental model, a kernel-applied Fisher score (KFScore) gene extraction method is presented to select the prominent genes, and a CNN (Convolutional Neural Network) strategy optimized by a sine-cosine ensembled Monarch Butterfly Optimization algorithm (SC-MBO) is implemented for genomic data classification. Here, the SC-MBO ensembled approach is used to obtain optimal values for the CNN hyperparameters. The effectiveness of the presented model is evaluated by classification accuracy (%), number of extracted prominent genomic features, sensitivity, specificity, and the ROC (Receiver Operating Characteristic) curve. The suggested method was tested on the GSE13159, GSE15061, GSE13204, Breast Cancer, and Ovarian Cancer gene expression datasets, achieving 91.6%, 90.22%, 91.9%, 97.93%, and 99.6% accuracy, respectively. The suggested model is also compared with other existing models. According to the experimental evaluation, the suggested strategy is accurate, reliable, and robust. Consequently, the presented method can be treated as a trustworthy foundation for disease risk prediction.
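To make the gene selection step concrete, the following is a minimal sketch of ranking genes by the classical (non-kernel) Fisher score. It is not the authors' KFScore method, whose kernel formulation is not given in the abstract. The function names `fisher_scores` and `top_k_genes`, the synthetic data, and the small constant added to the denominator are all assumptions made for illustration.

```python
import numpy as np

def fisher_scores(X, y):
    """Classical Fisher score per feature (assumed stand-in for KFScore).

    X: (n_samples, n_genes) expression matrix; y: (n_samples,) class labels.
    Score = between-class scatter / within-class scatter, computed per gene.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    overall_mean = X.mean(axis=0)
    between = np.zeros(X.shape[1])
    within = np.zeros(X.shape[1])
    for label in np.unique(y):
        Xc = X[y == label]
        n_c = Xc.shape[0]
        between += n_c * (Xc.mean(axis=0) - overall_mean) ** 2
        within += n_c * Xc.var(axis=0)
    # Small constant (an assumption) guards against zero within-class variance.
    return between / (within + 1e-12)

def top_k_genes(X, y, k):
    """Indices of the k highest-scoring genes, best first."""
    scores = fisher_scores(X, y)
    return np.argsort(scores)[::-1][:k]

# Synthetic illustration only: 40 samples, 6 genes, two classes.
rng = np.random.default_rng(0)
X = rng.normal(size=(40, 6))
y = np.array([0] * 20 + [1] * 20)
X[y == 1, 2] += 5.0  # make gene 2 strongly class-dependent
selected = top_k_genes(X, y, k=2)
```

In a pipeline of the kind described above, the selected gene indices would be used to subset the expression matrix before it is passed to the classifier.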

Keywords:

Genomic data, Kernel Fisher Score, Prominent gene selection, Sine-cosine ensembled Monarch Butterfly Optimization, Two-phase ensembled deep learning, Classification accuracy

Article Details

DEBATA, P. P., TRIPATHY, A., PARHI, P., & DAS, S. R. (2025). A two phase ensembled deep learning approach of prominent gene extraction and disease risk prediction. Applied Computer Science, 21(2), 111–127. https://doi.org/10.35784/acs_6958