Elshamy, G., Alfonse, M., Hegazy, I., & Aref, M. (2025). A multi-modal transformer-based model for generative visual dialog system. Applied Computer Science, 21(1), 1–17. https://doi.org/10.35784/acs_6856