A text-guided vision model for enhanced recognition of small instances
Issue: Vol. 22 No. 1 (2026)
Author: Hyun-Ki JUNG (pp. 35-46)
Abstract
As drone-based object detection technology evolves, demand is shifting from simply detecting objects to allowing users to accurately identify specific targets, for example by entering a target as a text prompt so that only the desired objects are detected. To address this need, an efficient text-guided object recognition model is developed to improve the recognition of small objects. Specifically, an improved version of the existing YOLO-World model is presented. The proposed method replaces the C2f layer in the YOLOv8 backbone with a C3k2 layer, allowing a more accurate representation of local features, especially for small objects or objects with well-defined boundaries. The proposed architecture also improves processing speed and efficiency by optimizing parallel processing, while contributing to a more lightweight model design. Comparative experiments on the VisDrone dataset show that the proposed model outperforms the original YOLO-World model, with precision increasing from 40.6% to 41.6%, recall from 30.8% to 31.0%, F1 score from 35.0% to 35.5%, and mAP@0.5 from 30.4% to 30.7%, confirming its improved accuracy. The model is also lighter, with the number of parameters reduced from 4.0 million to 3.8 million and FLOPs reduced from 15.7 billion to 15.2 billion. These results indicate that the proposed approach provides a practical and effective solution for accurate object detection in drone-based applications.
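The metric improvements quoted above are internally consistent: F1 is the harmonic mean of precision and recall, which can be checked with a few lines of Python (an illustrative sanity check, not part of the published evaluation code):

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

# Baseline YOLO-World on VisDrone (figures from the abstract).
baseline = f1_score(0.406, 0.308)  # ~0.350
# Proposed C3k2-based variant.
proposed = f1_score(0.416, 0.310)  # ~0.355

print(f"baseline F1: {baseline:.3f}, proposed F1: {proposed:.3f}")

# Relative reduction in model size: 4.0M -> 3.8M parameters.
param_reduction = (4.0 - 3.8) / 4.0  # 5% fewer parameters
```

Both computed values round to the F1 scores reported in the abstract (35.0% and 35.5%), and the parameter change corresponds to a 5% lighter model.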
References
Abu-Khadrah, A., Al-Qerem, A., Hassan, M. R., Ali, A. M., & Jarrah, M. (2025). Drone-assisted adaptive object detection and privacy-preserving surveillance in smart cities using whale-optimized deep reinforcement learning techniques. Scientific Reports, 15, 9931. https://doi.org/10.1038/s41598-025-94796-3
AISkyEye. (n.d.). AISkyEye – low altitude intelligent platform. http://aiskyeye.com
Bochkovskiy, A., Wang, C. Y., & Liao, H. Y. M. (2020). YOLOv4: Optimal speed and accuracy of object detection. ArXiv, abs/2004.10934. https://doi.org/10.48550/arXiv.2004.10934
Chen, J., Zhang, T., Zheng, W. S., & Wang, R. (2024). TagFog: Textual anchor guidance and fake outlier generation for visual out-of-distribution detection. AAAI Technical Track on Computer Vision I, 38(2), 1100-1109. https://doi.org/10.1609/aaai.v38i2.27871
Cheng, T., Song, L., Ge, Y., Liu, W., Wang, X., & Shan, Y. (2024). YOLO-World: Real-time open-vocabulary object detection. IEEE/CVF Conference on Computer Vision and Pattern Recognition (pp. 16901-16911). IEEE. https://doi.org/10.1109/CVPR52733.2024.01599
Colpaert, A., Raes, M., & Vinogradov, E. (2022). Drone delivery: Reliable cellular UAV Communication using multi-operator diversity. ICC 2022-IEEE International Conference on Communications. (pp. 1-6). IEEE. https://doi.org/10.1109/ICC45855.2022.9839125
Girshick, R. (2015). Fast R-CNN. 2015 IEEE International Conference on Computer Vision (ICCV) (pp. 1440–1448). IEEE. https://doi.org/10.1109/ICCV.2015.169
Girshick, R., Donahue, J., Darrell, T., & Malik, J. (2014). Rich feature hierarchies for accurate object detection and semantic segmentation. IEEE Conference on Computer Vision and Pattern Recognition (pp. 580–587). IEEE. https://doi.org/10.1109/CVPR.2014.81
Hasan, M. J., Nalwan, A., Ong, K. L., Jahani, H., Boo, Y. L., Nguyen, K. C., & Hasan, M. (2024). GroundingCarDD: text-guided multimodal phrase grounding for car damage detection. IEEE Access, 12, 179464-179477. https://doi.org/10.1109/ACCESS.2024.3506563
He, K., Gkioxari, G., Dollár, P., & Girshick, R. (2017). Mask R-CNN. IEEE International Conference on Computer Vision (pp. 2961–2969). IEEE. https://doi.org/10.1109/ICCV.2017.322
Huang, P. H., Lee, H. H., Chen, H. T., & Liu, T. L. (2021). Text-guided graph neural networks for referring 3d instance segmentation. AAAI Technical Track on Computer Vision I, 35(2), 1610-1618. https://doi.org/10.1609/aaai.v35i2.16253
Jocher, G. (2022, November 22). Ultralytics YOLOv5. Retrieved June 18, 2025 from https://github.com/ultralytics/yolov5
Jocher, G. (2023, November 12). Explore Ultralytics YOLOv8. https://docs.ultralytics.com/models/yolov8
Jocher, G. (2024). Ultralytics YOLO11. Retrieved September 30, 2024, from https://docs.ultralytics.com/ko/models/yolo11
Jung, H. K. (2025). YOLO-Drone: An efficient object detection approach using the ghosthead network for drone images. Journal of Information Systems Engineering and Management, 10(26s), 2468-4376. https://doi.org/10.52783/jisem.v10i26s.4216
Li, C., Li, L., Jiang, H., Weng, K., Geng, Y., Li, L., Ke, Z., Li, Q., Cheng, M., Nie, W., Li, Y., Zhang, B., Liang, Y., Zhou, L., Xu, X., Chu, X., Wei, X., & Wei, X. (2022). YOLOv6: A single-stage object detection framework for industrial applications. ArXiv, abs/2209.02976. https://doi.org/10.48550/ARXIV.2209.02976
Liang, M., Su, J. C., Schulter, S., Garg, S., Zhao, S., Wu, Y., & Chandraker, M. (2024). AIDE: An automatic data engine for object detection in autonomous driving. IEEE/CVF Conference on Computer Vision and Pattern Recognition (pp. 14695-14706). IEEE. https://doi.org/10.1109/CVPR52733.2024.01392
Lin, T. Y., Goyal, P., Girshick, R., He, K., & Dollár, P. (2017). Focal loss for dense object detection. 2017 IEEE International Conference on Computer Vision (ICCV) (pp. 2999-3007). IEEE. https://doi.org/10.1109/ICCV.2017.324
Liu, W., Anguelov, D., Erhan, D., Szegedy, C., Reed, S., Fu, C.-Y., & Berg, A. C. (2016). SSD: Single shot multiBox detector. In B. Leibe, J. Matas, N. Sebe, & M. Welling (Eds), Computer Vision – ECCV 2016 (Vol. 9905, pp. 21–37). Springer International Publishing. https://doi.org/10.1007/978-3-319-46448-0_2
Niu, Y., Lin, C., Jiang, X., & Qu, Z. (2025). VSTDet: A lightweight small object detection network inspired by the ventral visual pathway. Applied Soft Computing, 171, 112775. https://doi.org/10.1016/j.asoc.2025.112775
Radford, A., Kim, J. W., Hallacy, C., Ramesh, A., Goh, G., Agarwal, S., Sastry, G., Askell, A., Mishkin, P., Clark, J., Krueger, G., & Sutskever, I. (2021). Learning transferable visual models from natural language supervision. ArXiv, abs/2103.00020. https://doi.org/10.48550/arXiv.2103.00020
Redmon, J., & Farhadi, A. (2018). YOLOv3: An incremental improvement. ArXiv, abs/1804.02767. https://doi.org/10.48550/arXiv.1804.02767
Redmon, J., & Farhadi, A. (2017). YOLO9000: Better, faster, stronger. IEEE/CVF Conference on Computer Vision and Pattern Recognition (pp. 6517-6525). IEEE. https://doi.org/10.1109/CVPR.2017.690
Redmon, J., Divvala, S., Girshick, R., & Farhadi, A. (2016). You only look once: Unified real-time object detection. IEEE/CVF Conference on computer vision and pattern recognition (pp. 779–788). IEEE. https://doi.org/10.1109/CVPR.2016.91
Ren, S., He, K., Girshick, R., & Sun, J. (2016). Faster R-CNN: Towards real-time object detection with region proposal networks. ArXiv, abs/1506.01497. https://doi.org/10.48550/arXiv.1506.01497
Shah, I. A., Jhanjhi, N. Z., & Ujjan, R. M. (2024). Use of AI Applications for the Drone Industry. In I. Shah & N. Jhanjhi (Eds.), Cybersecurity Issues and Challenges in the Drone Industry (pp. 27-41). IGI Global Scientific Publishing. https://doi.org/10.4018/979-8-3693-0774-8.ch002
Shen, R., Inoue, N., & Shinoda, K. (2023). Text-guided object detector for multi-modal video question answering. 2023 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV) (pp. 1032-1042). https://doi.org/10.1109/WACV56688.2023.00109
Song, Y., Chen, Z., Yang, H., & Liao, J. (2025). GS-LinYOLOv10: A drone-based model for real-time construction site safety monitoring. Alexandria Engineering Journal, 120, 62-73. https://doi.org/10.1016/j.aej.2025.01.021
Tao, S., Shengqi, Y., Haiying, L., Jason, G., Lixia, D., & Lida, L. (2025). MIS-YOLOv8: An algorithm for detecting small objects in UAV aerial photography based on YOLOv8. IEEE Transactions on Instrumentation and Measurement, 74, 5020212. https://doi.org/10.1109/TIM.2025.3551917
Vuong, T., Chang, M., Palaparthi, M., Howell, L. G., Bonti, A., Abdelrazek, M., & Nguyen, D. T. (2025). An empirical study of automatic wildlife detection using drone-derived imagery and object detection. Multimedia Tools and Applications, 84, 24487–24514. https://doi.org/10.1007/s11042-024-20522-2
Wang, A., Chen, H., Liu, L., Chen, K., Lin, Z., Han, J., & Ding, G. (2024). YOLOv10: Real-time end-to-end object detection. ArXiv, abs/2405.14458. https://doi.org/10.48550/arXiv.2405.14458
Wang, C. Y., Bochkovskiy, A., & Liao, H. Y. M. (2023). YOLOv7: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors. IEEE/CVF Conference on computer vision and pattern recognition (pp. 7464–7475). IEEE. https://doi.org/10.1109/CVPR52729.2023.00721
Wang, C. Y., Liao, H. Y. M., Wu, Y. H., Chen, P. Y., Hsieh, J. W., & Yeh, I. H. (2020). CSPNet: A new backbone that can enhance learning capability of CNN. IEEE/CVF Conference on computer vision and pattern recognition workshops (pp. 390–391). IEEE. https://doi.org/10.1109/CVPRW50498.2020.00203
Wang, C.-Y., Yeh, I.-H., & Mark Liao, H.-Y. (2025). YOLOv9: Learning what you want to learn using programmable gradient information. In A. Leonardis, E. Ricci, S. Roth, O. Russakovsky, T. Sattler, & G. Varol (Eds), Computer Vision – ECCV 2024 (Vol. 15089, pp. 1–21). Springer Nature Switzerland. https://doi.org/10.1007/978-3-031-72751-1_1
Wei, G., Yuan, X., Liu, Y., Shang, Z., Yao, K., Li, C., & Xiao, R. (2024). OVA-Det: Open vocabulary aerial object detection with image-text collaboration. ArXiv, abs/2408.12246v2.
Xie, J., & Zheng, S. (2022). Zero-shot object detection through vision-language embedding alignment. IEEE international conference on data mining workshops (pp. 1-15). IEEE. https://doi.org/10.1109/ICDMW58026.2022.00121
Xu, L., Zhao, Y., Zhai, Y., Huang, L., & Ruan, C. (2024). Small object detection in UAV images based on YOLOv8n. International Journal of Computational Intelligence Systems, 17, 223. https://doi.org/10.1007/s44196-024-00632-3
Yang, C., Cao, Y., & Lu, X. (2024). Towards better small object detection in UAV scenes: Aggregating more object-oriented information. Pattern Recognition Letters, 182, 24-30. https://doi.org/10.1016/j.patrec.2024.04.002
Yi, X., Xu, H., Zhang, H., Tang, L., & Ma, J. (2024). Text-IF: Leveraging semantic text guidance for degradation-aware and interactive image fusion. IEEE/CVF Conference on Computer Vision and Pattern Recognition (pp. 27026-27035). IEEE. https://doi.org/10.1109/CVPR52733.2024.02552
Yuan, Y., Wu, Y., Zhao, L., Liu, Y., & Pang, Y. (2025). TLSH-MOT: Drone-view video multiple object tracking via transformer-based locally sensitive hash. IEEE Transactions on Geoscience and Remote Sensing, 63, 1-16. https://doi.org/10.1109/TGRS.2025.3545081
Zhang, J., Yang, X., He, W., Ren, J., Zhang, Q., & Zhao, Y. (2024). Scale optimization using evolutionary reinforcement learning for object detection on drone imagery. AAAI Technical Track on Application Domains, 38(1), 410-418. https://doi.org/10.1609/aaai.v38i1.27795
Zhang, S., Wen, L., Bian, X., Lei, Z., & Li, S. Z. (2018). Single-shot refinement neural network for object detection. IEEE/CVF Conference on Computer Vision and Pattern Recognition (pp. 4203–4212). IEEE. https://doi.org/10.1109/CVPR.2018.00442
License

This work is licensed under a Creative Commons Attribution 4.0 International License.
All articles published in Applied Computer Science are open-access and distributed under the terms of the Creative Commons Attribution 4.0 International License.
