[1]
G. Elshamy, M. Alfonse, I. Hegazy, and M. Aref, “A multi-modal transformer-based model for generative visual dialog system,” Appl. Comput. Sci., vol. 21, no. 1, pp. 1–17, Mar. 2025.