A vision transformer model for multilingual image-based text recognition

doi:10.12785/ijcds/xxxxxx



dc.date.accessioned	2024-01-07T22:38:24Z
dc.date.available	2024-01-07T22:38:24Z
dc.date.issued	2024-01-08
dc.identifier.uri	https://journal.uob.edu.bh:443/handle/123456789/5309
dc.description.abstract	Multilingual image-based text recognition is a challenging problem with several practical applications. This work proposes a ViT-YOLO model that combines the strengths of the Vision Transformer (ViT) and You Only Look Once (YOLO) techniques to address it. The goal of the model is to correctly recognize text in images that contain multiple languages. The ViT-YOLO model uses YOLO to locate text regions in images through patch extraction. The ViT model, with its robust image-understanding capabilities, then processes the extracted patches for text recognition. To enhance the model's performance and robustness, a Generative Adversarial Network (GAN) is integrated for data augmentation. Experimental results show that the ViT-YOLO model outperforms traditional methods and other deep learning models, achieving an accuracy of 93.49%. These findings indicate that the proposed ViT-YOLO model holds promise for multilingual text recognition and paves the way for future advances in the field.	en_US
dc.language.iso	en	en_US
dc.publisher	University of Bahrain	en_US
dc.subject	Text recognition, Vision Transformer (ViT), You Only Look Once (YOLO), Generative Adversarial Network (GAN), multilingual-text recognition.	en_US
dc.title	A vision transformer model for multilingual image-based text recognition	en_US
dc.identifier.doi	10.12785/ijcds/xxxxxx
dc.volume	15	en_US
dc.issue	1	en_US
dc.pagestart	1	en_US
dc.pageend	13	en_US
dc.source.title	International Journal of Computing and Digital Systems	en_US
dc.abbreviatedsourcetitle	IJCDS	en_US
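
The abstract describes a two-stage flow: a detector locates text regions, and the regions are cut into patches for a transformer to process. The sketch below illustrates only that data flow on a toy grayscale image, under stated assumptions: the bounding boxes are hand-written stand-ins for detector output, the patch size is arbitrary, and the helper names (`crop`, `to_patches`, `regions_to_patch_sequences`) are hypothetical. It does not run YOLO or a ViT and is not the authors' implementation.

```python
from typing import List, Tuple

Image = List[List[int]]          # grayscale image as rows of pixel values
Box = Tuple[int, int, int, int]  # (x0, y0, x1, y1), upper bounds exclusive


def crop(image: Image, box: Box) -> Image:
    """Cut one text region out of the image."""
    x0, y0, x1, y1 = box
    return [row[x0:x1] for row in image[y0:y1]]


def to_patches(region: Image, size: int) -> List[List[int]]:
    """Split a region into non-overlapping size x size patches, each flattened.

    Rows and columns that do not fill a whole patch are dropped.
    """
    height = len(region)
    width = len(region[0]) if region else 0
    patches = []
    for top in range(0, height - size + 1, size):
        for left in range(0, width - size + 1, size):
            patches.append([region[r][c]
                            for r in range(top, top + size)
                            for c in range(left, left + size)])
    return patches


def regions_to_patch_sequences(image: Image, boxes: List[Box],
                               size: int) -> List[List[List[int]]]:
    """Produce one patch sequence per region (the input a ViT-style model expects)."""
    return [to_patches(crop(image, box), size) for box in boxes]


# Toy 4x6 image whose pixel value encodes its position (row * 6 + column).
image = [[r * 6 + c for c in range(6)] for r in range(4)]
# Assumed detector output: two hand-written boxes, not real YOLO predictions.
boxes = [(0, 0, 4, 4), (2, 0, 6, 2)]
sequences = regions_to_patch_sequences(image, boxes, size=2)
```

In a full system each flattened patch would be linearly embedded and passed through the transformer; here the sequences are left as raw pixel lists so the cropping and patching steps stay easy to inspect.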