OPTIMIZING PEDESTRIAN TRACKING FOR ROBUST PERCEPTION WITH YOLOv8 AND DEEPSORT
Ghania Zidani
University of Batna2 (Algeria)
Djalal DJARAH
d.djarah@gmail.com, University of Ouargla (Algeria)
Abdslam BENMAKHLOUF
University of Ouargla (Algeria)
Laid KHETTACHE
(Algeria)
Abstract
Multi-object tracking is a crucial perception task in computer vision, widely used in autonomous driving, behavior recognition, and other applications. Complex and dynamic environments, the constantly changing visual appearance of people, and frequent occlusions all limit the effectiveness of existing pedestrian tracking algorithms, resulting in suboptimal tracking precision and stability. To address this, this article proposes an integrated detector-tracker framework for pedestrian tracking. The framework includes a pedestrian detector built on the YOLOv8 network, which is regarded as the latest state-of-the-art detector and provides a strong detection base for addressing these limitations. Combining YOLOv8 with the DeepSORT tracking algorithm improves the ability to track pedestrians in dynamic scenarios. Experiments on the publicly available MOT17 and MOT20 datasets demonstrate a clear improvement in accuracy and consistency, with MOTA scores of 63.82 and 58.95 and HOTA scores of 43.15 and 41.36, respectively. Our research highlights the importance of optimizing object detection to unlock the potential of tracking for critical applications such as autonomous driving.
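For readers unfamiliar with the MOTA figures quoted above, the following is a minimal Python sketch of the standard CLEAR MOT definition, MOTA = 1 - (FN + FP + IDSW) / GT, expressed as a percentage. The function name `mota` and the counts passed to it are hypothetical and chosen only for illustration; they are not taken from the article's experiments, and the article's own evaluation code is not shown here.

```python
def mota(false_negatives: int, false_positives: int,
         id_switches: int, num_gt: int) -> float:
    """Return MOTA as a percentage: 100 * (1 - (FN + FP + IDSW) / GT).

    All counts are summed over every frame of a sequence; num_gt is the
    total number of ground-truth boxes. The value can be negative when
    the tracker makes more errors than there are ground-truth boxes.
    """
    if num_gt <= 0:
        raise ValueError("num_gt must be positive")
    errors = false_negatives + false_positives + id_switches
    return 100.0 * (1.0 - errors / num_gt)


# Hypothetical counts, for illustration only (not results from the article).
score = mota(false_negatives=300, false_positives=50,
             id_switches=10, num_gt=1000)
print(f"MOTA = {score:.2f}")
```

Because missed detections and false positives enter the numerator directly, MOTA is dominated by detector quality, which is consistent with the article's emphasis on optimizing the detection stage.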
Keywords:
Object Detection, Tracking by Detection, Pedestrian Tracking, YOLOv8, Deep SORT
License
This work is licensed under a Creative Commons Attribution 4.0 International License.
All articles published in Applied Computer Science are open-access and distributed under the terms of the Creative Commons Attribution 4.0 International License.