FEW-SHOT LEARNING WITH PRE-TRAINED LAYERS INTEGRATION APPLIED TO HAND GESTURE RECOGNITION FOR DISABLED PEOPLE

Mohamed ELBAHRI

elbahri82_m@yahoo.fr
Djillali Liabes University (Algeria)
https://orcid.org/0000-0001-5361-1567

Nasreddine TALEB


(Algeria)

Sid Ahmed El Mehdi ARDJOUN


(Algeria)

Chakib Mustapha Anouar ZOUAOUI


(Algeria)

Abstract

Vision-based hand gesture recognition can greatly help disabled individuals interact and communicate. The hands and gestures of these users are often distinctive, so a deep learning vision-based system must be adapted to each individual using a dedicated dataset. To this end, the paper presents a novel approach for training gesture classifiers from few-shot samples. More specifically, the gesture classifiers are fine-tuned segments of a pre-trained deep network. The overall framework consists of two modules. The first is a base feature learner and hand detector trained on images of the hands of non-disabled people; it yields an ad hoc hand detector model. The second is a sub-classifier learner that leverages the convolutional layers of the hand detector's feature extractor to build a shallow CNN, which is trained with few-shot samples for gesture classification. The proposed approach thus enables segments of a pre-trained feature extractor to be reused to build a new sub-classification model. Results obtained by varying the size of the training dataset demonstrate the efficiency of the method compared with those reported in the literature.
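The idea of reusing a segment of a pre-trained feature extractor and training only a small classifier on a handful of samples can be illustrated with a toy sketch. This is not the authors' implementation: the hand-written filters in `PRETRAINED_STAGES` are hypothetical stand-ins for convolution layers a hand detector would have learned, the bar images are stand-ins for gesture images, and the trainable part is reduced to a bias-free softmax head for brevity.

```python
import math

# Hand-written filters standing in for kernels learned by a hand detector
# (assumption: in practice these would come from the pre-trained network).
H_EDGE = [[-1, -1, -1], [0, 0, 0], [1, 1, 1]]
V_EDGE = [[-1, 0, 1], [-1, 0, 1], [-1, 0, 1]]
SMOOTH = [[0.25, 0.25], [0.25, 0.25]]
PRETRAINED_STAGES = [[H_EDGE, V_EDGE], [SMOOTH]]  # hypothetical backbone


def conv2d_relu(image, kernel):
    """Valid 2D cross-correlation followed by ReLU."""
    kh, kw = len(kernel), len(kernel[0])
    return [
        [
            max(0.0, sum(image[i + a][j + b] * kernel[a][b]
                         for a in range(kh) for b in range(kw)))
            for j in range(len(image[0]) - kw + 1)
        ]
        for i in range(len(image) - kh + 1)
    ]


def extract(image, stages):
    """Run the frozen stages, then global-average-pool each channel."""
    channels = [image]
    for kernels in stages:
        channels = [conv2d_relu(ch, k) for ch in channels for k in kernels]
    return [sum(map(sum, ch)) / (len(ch) * len(ch[0])) for ch in channels]


def softmax(z):
    m = max(z)
    e = [math.exp(v - m) for v in z]
    return [v / sum(e) for v in e]


def logits(W, x):
    return [sum(w * v for w, v in zip(row, x)) for row in W]


def train_head(features, labels, n_classes, lr=0.5, epochs=100):
    """Fit only the softmax head with SGD; the reused stages never change."""
    W = [[0.0] * len(features[0]) for _ in range(n_classes)]
    for _ in range(epochs):
        for x, y in zip(features, labels):
            p = softmax(logits(W, x))
            for c in range(n_classes):
                grad = p[c] - (1.0 if c == y else 0.0)
                W[c] = [w - lr * grad * v for w, v in zip(W[c], x)]
    return W


def predict(W, x):
    z = logits(W, x)
    return z.index(max(z))


def h_bar(r, n=6):
    """Toy 'gesture' of class 0: a horizontal bar on row r."""
    return [[1.0 if i == r else 0.0 for _ in range(n)] for i in range(n)]


def v_bar(c, n=6):
    """Toy 'gesture' of class 1: a vertical bar on column c."""
    return [[1.0 if j == c else 0.0 for j in range(n)] for _ in range(n)]


segment = PRETRAINED_STAGES[:1]  # reuse only the first stage of the backbone
train_images = [h_bar(2), h_bar(3), h_bar(4), v_bar(2), v_bar(3), v_bar(4)]
train_labels = [0, 0, 0, 1, 1, 1]  # three shots per class
features = [extract(img, segment) for img in train_images]
W = train_head(features, train_labels, n_classes=2)

print(predict(W, extract(h_bar(5), segment)))  # held-out horizontal bar
print(predict(W, extract(v_bar(5), segment)))  # held-out vertical bar
```

Slicing `PRETRAINED_STAGES` is what selects the reused segment: a shorter slice gives a shallower classifier, and only the head weights `W` are fitted to the few-shot samples.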


Keywords:

CNN segmentation, few-shot learning, hand gesture, disabled people




Published
2024-06-30


ELBAHRI, M., TALEB, N., ARDJOUN, S. A. E. M., & ZOUAOUI, C. M. A. (2024). FEW-SHOT LEARNING WITH PRE-TRAINED LAYERS INTEGRATION APPLIED TO HAND GESTURE RECOGNITION FOR DISABLED PEOPLE. Applied Computer Science, 20(2), 1–23. https://doi.org/10.35784/acs-2024-13




License

This work is licensed under a Creative Commons Attribution 4.0 International License.

All articles published in Applied Computer Science are open-access and distributed under the terms of the Creative Commons Attribution 4.0 International License.

