BREAST CANCER DIAGNOSIS USING WRAPPER-BASED FEATURE SELECTION AND ARTIFICIAL NEURAL NETWORK

Nawazish NAVEED

nawazish.ibr@cas.edu.om
University of Technology and Applied Sciences, CAS-Ibri, Dept. of IT (Oman)

Hayan T. MADHLOOM


University of Technology and Applied Sciences, CAS-Ibri, Dept. of IT (Oman)

Mohd Shahid HUSAIN


University of Technology and Applied Sciences, CAS-Ibri, Dept. of IT (Oman)

Abstract

Breast cancer is the most common type of cancer among women, and early diagnosis plays a significant role in reducing the fatality rate. The main objective of this study is to propose an efficient approach for classifying breast tumors as either benign or malignant, based on digitized images of a fine needle aspirate (FNA) of a breast mass as represented in the Wisconsin Breast Cancer Dataset. Two wrapper-based feature selection methods, sequential forward selection (SFS) and sequential backward selection (SBS), are used to identify the most discriminant features, which can help improve classification performance. A feed-forward neural network (FFNN) is used as the classification algorithm, and its hyper-parameters are optimized using grid search. After the optimal classification model was selected, the data were divided into a training set and a testing set and the performance was evaluated. The feature space was reduced from nine features to seven with SFS and to six with SBS. The highest classification accuracy recorded was 99.03%, obtained with the FFNN using the seven SFS-selected features, while the accuracy recorded with the six SBS-selected features was 98.54%. These results indicate that the proposed approach is effective at reducing the feature space, leading to better accuracy and a more efficient classification model.
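The wrapper idea behind SFS and SBS can be illustrated with a minimal Python sketch. It is not the authors' implementation: the study wraps a grid-search-tuned FFNN, whereas this sketch substitutes a nearest-centroid classifier scored on its own training data so that it runs with the standard library alone. The function names and the eight-sample toy data are hypothetical and are not drawn from the Wisconsin Breast Cancer Dataset.

```python
from typing import Callable, FrozenSet, Sequence, Tuple

Subset = FrozenSet[int]
ScoreFn = Callable[[Subset], float]


def sequential_forward_selection(n_features: int, score: ScoreFn) -> Tuple[Subset, float]:
    """Start empty; greedily add the feature that raises the score most."""
    selected: Subset = frozenset()
    best = float("-inf")
    while len(selected) < n_features:
        candidates = [selected | {f} for f in range(n_features) if f not in selected]
        top = max(candidates, key=score)
        top_score = score(top)
        if top_score <= best:  # no strict improvement: stop
            break
        selected, best = top, top_score
    return selected, best


def sequential_backward_selection(n_features: int, score: ScoreFn) -> Tuple[Subset, float]:
    """Start with all features; greedily drop one while the score does not fall."""
    selected: Subset = frozenset(range(n_features))
    best = score(selected)
    while len(selected) > 1:
        candidates = [selected - {f} for f in selected]
        top = max(candidates, key=score)
        top_score = score(top)
        if top_score < best:  # every removal hurts: stop
            break
        selected, best = top, top_score
    return selected, best


def make_centroid_scorer(X: Sequence[Sequence[float]], y: Sequence[int]) -> ScoreFn:
    """Stand-in wrapper score: training accuracy of a nearest-centroid classifier.

    Assumption: the paper scores subsets with an FFNN; this simpler model is
    used here only to keep the sketch dependency-free.
    """
    def score(subset: Subset) -> float:
        cols = sorted(subset)
        centroids = {}
        for label in set(y):
            rows = [x for x, t in zip(X, y) if t == label]
            centroids[label] = [sum(r[c] for r in rows) / len(rows) for c in cols]
        correct = 0
        for x, t in zip(X, y):
            pred = min(
                centroids,
                key=lambda lab: sum((x[c] - m) ** 2 for c, m in zip(cols, centroids[lab])),
            )
            correct += pred == t
        return correct / len(y)
    return score


# Hypothetical toy data: feature 0 separates the classes, feature 1 is noise,
# feature 2 is only partly informative. Labels: 0 = benign, 1 = malignant.
X = [[1, 5, 1], [2, 1, 2], [1, 9, 1], [2, 4, 3],
     [8, 5, 2], [9, 2, 9], [7, 8, 8], [8, 3, 1]]
y = [0, 0, 0, 0, 1, 1, 1, 1]

scorer = make_centroid_scorer(X, y)
sfs_subset, sfs_score = sequential_forward_selection(3, scorer)
sbs_subset, sbs_score = sequential_backward_selection(3, scorer)
print(sorted(sfs_subset), sfs_score)
print(sorted(sbs_subset), sbs_score)
```

On this toy data both searches settle on feature 0 alone. On real data the two directions can end at different subsets, as the seven-feature SFS result and six-feature SBS result reported above show. A faithful reproduction would score each candidate subset with cross-validated FFNN accuracy rather than training accuracy.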


Keywords:

Breast Cancer Diagnosis, Feature Selection, Neural Network, Grid Search, Machine Learning

Published
2021-09-30

Cited by

NAVEED, N., MADHLOOM, H. T., & HUSAIN, M. S. (2021). BREAST CANCER DIAGNOSIS USING WRAPPER-BASED FEATURE SELECTION AND ARTIFICIAL NEURAL NETWORK. Applied Computer Science, 17(3), 19–30. https://doi.org/10.23743/acs-2021-18

License

This work is licensed under a Creative Commons Attribution 4.0 International License.

All articles published in Applied Computer Science are open-access and distributed under the terms of the Creative Commons Attribution 4.0 International License.