BREAST CANCER DIAGNOSIS USING WRAPPER-BASED FEATURE SELECTION AND ARTIFICIAL NEURAL NETWORK
Nawazish NAVEED
nawazish.ibr@cas.edu.om
University of Technology and Applied Sciences, CAS-Ibri, Dept. of IT (Oman)
Hayan T. MADHLOOM
University of Technology and Applied Sciences, CAS-Ibri, Dept. of IT (Oman)
Mohd Shahid HUSAIN
University of Technology and Applied Sciences, CAS-Ibri, Dept. of IT (Oman)
Abstract
Breast cancer is the most common type of cancer among women. Early diagnosis plays a significant role in reducing the fatality rate. The main objective of this study is to propose an efficient approach for classifying breast tumors as either benign or malignant, based on digitized images of a fine needle aspirate (FNA) of a breast mass, as represented by the Wisconsin Breast Cancer Dataset. Two wrapper-based feature selection methods, sequential forward selection (SFS) and sequential backward selection (SBS), are used to identify the most discriminative features, which can help improve classification performance. A feed-forward neural network (FFNN) is used as the classification algorithm, and its hyper-parameters are optimized using grid search. After the optimal classification model was selected, the data were divided into a training set and a testing set, and performance was evaluated. The feature space was reduced from nine features to seven with SFS and to six with SBS. The highest classification accuracy, 99.03%, was recorded with the FFNN using the seven SFS-selected features, while the six SBS-selected features yielded an accuracy of 98.54%. The results indicate that the proposed approach is effective at reducing the feature space, leading to better accuracy and a more efficient classification model.
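The two wrapper strategies named in the abstract can be illustrated with a minimal, dependency-free sketch. It is not the authors' implementation. The function names (`sfs`, `sbs`, `centroid_accuracy`) are hypothetical, a nearest-centroid classifier stands in for the FFNN to keep the example short, the target subset size `k` is fixed by hand rather than chosen by score, and the demo data are synthetic rather than the Wisconsin Breast Cancer Dataset.

```python
from typing import Callable, List, Sequence, Tuple

Row = Tuple[List[float], int]          # (feature vector, label: 0 = benign, 1 = malignant)
Scorer = Callable[[List[int]], float]  # maps a feature subset to a validation score


def sfs(n_features: int, k: int, score: Scorer) -> List[int]:
    """Sequential forward selection: start empty, greedily add the best feature."""
    selected: List[int] = []
    while len(selected) < k:
        candidates = [f for f in range(n_features) if f not in selected]
        best = max(candidates, key=lambda f: score(sorted(selected + [f])))
        selected.append(best)
    return sorted(selected)


def sbs(n_features: int, k: int, score: Scorer) -> List[int]:
    """Sequential backward selection: start full, greedily drop the least useful feature."""
    selected = list(range(n_features))
    while len(selected) > k:
        # Remove the feature whose absence leaves the highest score.
        drop = max(selected, key=lambda f: score([g for g in selected if g != f]))
        selected.remove(drop)
    return selected


def centroid_accuracy(train: Sequence[Row], valid: Sequence[Row], feats: List[int]) -> float:
    """Accuracy of a nearest-centroid classifier restricted to the given features.

    Assumption: this simple classifier is only a stand-in for the FFNN in the text.
    """
    centroids = {}
    for label in (0, 1):
        rows = [x for x, y in train if y == label]
        centroids[label] = [sum(r[f] for r in rows) / len(rows) for f in feats]

    def predict(x: List[float]) -> int:
        dist = {
            label: sum((x[f] - c[i]) ** 2 for i, f in enumerate(feats))
            for label, c in centroids.items()
        }
        return min(dist, key=dist.get)

    return sum(predict(x) == y for x, y in valid) / len(valid)


if __name__ == "__main__":
    import random

    rng = random.Random(0)

    def sample(label: int) -> Row:
        # Synthetic toy data: features 0 and 2 carry signal, 1 and 3 are pure noise.
        return ([label * 3 + rng.gauss(0, 1), rng.gauss(0, 1),
                 label * 2 + rng.gauss(0, 1), rng.gauss(0, 1)], label)

    data = [sample(i % 2) for i in range(200)]
    train, valid = data[:140], data[140:]

    def scorer(feats: List[int]) -> float:
        return centroid_accuracy(train, valid, feats)

    print("SFS subset:", sfs(4, 2, scorer))
    print("SBS subset:", sbs(4, 2, scorer))
```

Because the scorer is passed in as a function, any classifier (including a neural network tuned by grid search) could be substituted without changing the selection loops. A separate held-out test set would still be needed for the final accuracy estimate, since the validation split here is used to choose the features.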
Keywords:
Breast Cancer Diagnosis, Feature Selection, Neural Network, Grid Search, Machine Learning
License
This work is licensed under a Creative Commons Attribution 4.0 International License.
All articles published in Applied Computer Science are open-access and distributed under the terms of the Creative Commons Attribution 4.0 International License.