OPTIMIZING PEDESTRIAN TRACKING FOR ROBUST PERCEPTION WITH YOLOv8 AND DEEPSORT
Ghania Zidani
University of Batna2 (Algeria)
Djalal DJARAH
d.djarah@gmail.com, University of Ouargla (Algeria)
Abdslam BENMAKHLOUF
University of Ouargla (Algeria)
Laid KHETTACHE
(Algeria)
Abstract
Multi-object tracking is a crucial component of perception in computer vision, widely used in autonomous driving, behavior recognition, and other areas. The complex and dynamic nature of real-world environments, the constantly changing appearance of people, and frequent occlusions all limit the effectiveness of existing pedestrian tracking algorithms, resulting in suboptimal tracking precision and stability. To address this, this article proposes an integrated detector-tracker framework for pedestrian tracking. The framework uses a pedestrian detector built on the YOLOv8 network, regarded as the latest state-of-the-art detector, which provides a strong detection base for the tracker. By combining YOLOv8 with the DeepSort tracking algorithm, we improve the ability to track pedestrians in dynamic scenarios. Experiments on the publicly available MOT17 and MOT20 datasets show a clear improvement in accuracy and consistency, with MOTA scores of 63.82 and 58.95 and HOTA scores of 43.15 and 41.36, respectively. Our research highlights the significance of optimizing object detection to unlock the potential of tracking for critical applications such as autonomous driving.
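For readers unfamiliar with the MOTA figures quoted above, the following is a minimal sketch of how that score is conventionally computed under the CLEAR MOT definition: one minus the ratio of accumulated errors (misses, false positives, identity switches) to the total number of ground-truth objects. The function name, the dictionary keys, and the example counts are assumptions made for illustration; they are not taken from the paper or from its evaluation code.

```python
def mota(frames):
    """Compute MOTA from per-frame error counts (CLEAR MOT definition).

    Each frame is a dict with hypothetical keys:
      'gt'   - number of ground-truth objects in the frame
      'fn'   - missed ground-truth objects (false negatives)
      'fp'   - false-positive detections
      'idsw' - identity switches
    """
    total_gt = sum(f["gt"] for f in frames)
    if total_gt == 0:
        raise ValueError("MOTA is undefined without ground-truth objects")
    errors = sum(f["fn"] + f["fp"] + f["idsw"] for f in frames)
    # MOTA can be negative when errors outnumber ground-truth objects.
    return 1.0 - errors / total_gt


# Made-up counts for illustration only; not data from the paper.
example_frames = [
    {"gt": 10, "fn": 2, "fp": 1, "idsw": 0},
    {"gt": 10, "fn": 1, "fp": 0, "idsw": 1},
]
score = mota(example_frames)
print(f"MOTA = {100 * score:.2f}")
```

Benchmark scores such as those reported for MOT17 and MOT20 are usually this quantity expressed as a percentage and accumulated over all frames of all sequences.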
Keywords:
Object Detection, Tracking by Detection, Pedestrian Tracking, YOLOv8, Deep SORT
License
This work is licensed under a Creative Commons Attribution 4.0 International License.
All articles published in Applied Computer Science are open-access and distributed under the terms of the Creative Commons Attribution 4.0 International License.