- Title
Improving Text Recognition in Natural Images Using Contextual Modules and Transformer-based Decoding.
- Authors
Kumar, Kapil; Mishra, Abhishek Kumar
- Abstract
This study proposes a comprehensive approach to improving text recognition in natural images. The approach addresses challenges such as curved text, low-resolution images, and the need for efficient recognition algorithms by integrating rectification, visual feature extraction, semantic context modeling, and global context modeling. The model comprises several modules. The rectification module normalizes irregular text regions using a spatial transformation network so that curved text can be recognized accurately. The visual feature extraction module uses ResNet50 to capture complex patterns and improve feature discrimination. The global context module models dependencies within the text sequence, whereas the semantic context module gathers semantic information. The Transformer-based text decoder uses masked multi-head attention. Evaluation on benchmark datasets (TotalText, CTW1500, and ICDAR15) shows that the model with the ResNet50 backbone achieves F-measures of 89.3%, 86.4%, and 93.7%, respectively. Together, the proposed modules contribute to the model's overall improvement in text recognition for natural images.
- Subjects
TEXT recognition; IMAGE recognition (Computer vision); FEATURE extraction
- Publication
International Journal of Intelligent Engineering & Systems, 2024, Vol 17, Issue 1, p73
- ISSN
2185-310X
- Publication type
Article
- DOI
10.22266/ijies2024.0229.08