OPTIMIZING PEDESTRIAN TRACKING FOR ROBUST PERCEPTION WITH YOLOv8 AND DEEPSORT

Ghania Zidani


University of Batna2 (Algeria)

Djalal DJARAH

d.djarah@gmail.com
University of Ouargla (Algeria)

Abdslam BENMAKHLOUF


University of Ouargla (Algeria)

Laid KHETTACHE


(Algeria)

Abstract

Multi-object tracking is a core perception task in computer vision, widely used in autonomous driving, behavior recognition, and related applications. Complex, dynamic environments, the constantly changing appearance of pedestrians, and frequent occlusions limit the effectiveness of existing pedestrian tracking algorithms, resulting in suboptimal tracking precision and stability. To address this, this article proposes an integrated detector-tracker framework for pedestrian tracking. The framework uses a pedestrian detector built on the YOLOv8 network, a recent state-of-the-art detector, to provide a strong detection base. Combining YOLOv8 with the DeepSort tracking algorithm improves the ability to track pedestrians in dynamic scenarios. Experiments on the publicly available MOT17 and MOT20 datasets demonstrate a clear improvement in accuracy and consistency, with MOTA scores of 63.82 and 58.95 and HOTA scores of 43.15 and 41.36, respectively. These results highlight the importance of optimizing object detection to unlock the potential of tracking in critical applications such as autonomous driving.
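For readers unfamiliar with the MOTA scores reported above, the following minimal sketch shows how the standard MOTA definition is commonly computed: one minus the ratio of false negatives, false positives, and identity switches to the number of ground-truth boxes, with counts summed over all sequences. The `SequenceCounts` container, the `mota` function name, and the toy counts are assumptions made for illustration; they are not taken from the article or from any evaluation toolkit.

```python
from dataclasses import dataclass


@dataclass
class SequenceCounts:
    """Per-sequence error counts (hypothetical container, not from the article)."""
    false_negatives: int   # ground-truth boxes with no matched track
    false_positives: int   # tracker boxes with no matched ground truth
    id_switches: int       # times a ground-truth identity changes track ID
    gt_detections: int     # total ground-truth boxes over all frames


def mota(sequences):
    """MOTA = 1 - (FN + FP + IDSW) / GT, with counts summed over all sequences."""
    fn = sum(s.false_negatives for s in sequences)
    fp = sum(s.false_positives for s in sequences)
    idsw = sum(s.id_switches for s in sequences)
    gt = sum(s.gt_detections for s in sequences)
    if gt == 0:
        raise ValueError("MOTA is undefined without ground-truth detections")
    return 1.0 - (fn + fp + idsw) / gt


# Made-up counts for illustration only; these are not results from the article.
toy = [SequenceCounts(300, 50, 10, 1000), SequenceCounts(150, 40, 5, 500)]
print(f"MOTA = {100 * mota(toy):.2f}")
```

Because errors are pooled before dividing, long sequences weigh more than short ones, and the score can become negative when the tracker makes more errors than there are ground-truth boxes.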


Keywords:

Object Detection, Tracking by Detection, Pedestrian Tracking, YOLOv8, Deep SORT

Published
2024-03-30

Cited by

Zidani, G., DJARAH, D., BENMAKHLOUF, A., & KHETTACHE, L. (2024). OPTIMIZING PEDESTRIAN TRACKING FOR ROBUST PERCEPTION WITH YOLOv8 AND DEEPSORT. Applied Computer Science, 20(1), 72–84. https://doi.org/10.35784/acs-2024-05

License

This work is licensed under a Creative Commons Attribution 4.0 International License.

All articles published in Applied Computer Science are open-access and distributed under the terms of the Creative Commons Attribution 4.0 International License.

