BREAST CANCER DIAGNOSIS USING WRAPPER-BASED FEATURE SELECTION AND ARTIFICIAL NEURAL NETWORK
Nawazish NAVEED
nawazish.ibr@cas.edu.om
University of Technology and Applied Sciences, CAS-Ibri, Dept. of IT (Oman)
Hayan T. MADHLOOM
University of Technology and Applied Sciences, CAS-Ibri, Dept. of IT (Oman)
Mohd Shahid HUSAIN
University of Technology and Applied Sciences, CAS-Ibri, Dept. of IT (Oman)
Abstract
Breast cancer is the most common type of cancer among women. Early diagnosis plays a significant role in reducing the fatality rate. The main objective of this study is to propose an efficient approach for classifying breast tumors as either benign or malignant based on digitized images of fine needle aspirates (FNA) of breast masses, as represented in the Wisconsin Breast Cancer Dataset. Two wrapper-based feature selection methods, namely sequential forward selection (SFS) and sequential backward selection (SBS), are used to identify the most discriminant features that can improve classification performance. A feed-forward neural network (FFNN) is used as the classification algorithm. The hyper-parameters of the learning algorithm are optimized using grid search. After the optimal classification model was selected, the data were divided into a training set and a testing set, and performance was evaluated. The feature space was reduced from nine features to seven and six features using SFS and SBS, respectively. The highest classification accuracy, 99.03%, was recorded with the FFNN using the seven SFS-selected features, while the accuracy recorded with the six SBS-selected features was 98.54%. The results indicate that the proposed approach is effective at reducing the feature space, leading to better accuracy and a more efficient classification model.
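The two wrapper strategies named in the abstract can be sketched in a few lines of Python. This is a minimal illustration and not the authors' implementation. The function names, the fixed target subset size `k`, and the toy scoring function are all assumptions made for the example. In an actual wrapper, `score` would train the classifier (here, the FFNN) on the candidate subset and return its validation accuracy, and the stopping rule might differ from a fixed `k`.

```python
from typing import Callable, List, Tuple

Subset = Tuple[int, ...]
ScoreFn = Callable[[Subset], float]


def sequential_forward_selection(n_features: int, k: int, score: ScoreFn) -> Subset:
    """Start empty; repeatedly add the feature whose inclusion scores best."""
    selected: List[int] = []
    remaining = list(range(n_features))
    while len(selected) < k:
        best = max(remaining, key=lambda f: score(tuple(sorted(selected + [f]))))
        selected.append(best)
        remaining.remove(best)
    return tuple(sorted(selected))


def sequential_backward_selection(n_features: int, k: int, score: ScoreFn) -> Subset:
    """Start with all features; repeatedly drop the one whose removal scores best."""
    selected = list(range(n_features))
    while len(selected) > k:
        drop = max(selected, key=lambda f: score(tuple(x for x in selected if x != f)))
        selected.remove(drop)
    return tuple(selected)


# Hypothetical per-feature usefulness values (NOT taken from the paper).
# A real wrapper would replace toy_score with cross-validated classifier accuracy.
TOY_WEIGHTS = [0.30, 0.02, 0.25, 0.01, 0.20, 0.15, 0.10, 0.05, 0.03]


def toy_score(subset: Subset) -> float:
    """Stand-in score: sum of the hypothetical weights of the chosen features."""
    return sum(TOY_WEIGHTS[i] for i in subset)


# Nine input features reduced to seven (SFS) and six (SBS), mirroring the
# subset sizes reported in the abstract.
sfs_subset = sequential_forward_selection(9, 7, toy_score)
sbs_subset = sequential_backward_selection(9, 6, toy_score)
print("SFS keeps features:", sfs_subset)
print("SBS keeps features:", sbs_subset)
```

Because the scoring function is passed in as a parameter, the same two search routines can wrap any classifier. Only `score` needs to change, which is also where most of the computational cost of a wrapper method lies.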
Keywords:
Breast Cancer Diagnosis, Feature Selection, Neural Network, Grid Search, Machine Learning
License
This work is licensed under a Creative Commons Attribution 4.0 International License.
All articles published in Applied Computer Science are open-access and distributed under the terms of the Creative Commons Attribution 4.0 International License.