Title: Improvement of the end-to-end scene text recognition method for "text-to-speech" conversion.
Authors: Makhmudov, Fazliddin; Mukhiddinov, Mukhriddin; Abdusalomov, Akmalbek; Avazov, Kuldoshbay; Khamdamov, Utkir; Cho, Young Im
Abstract: Methods for text detection and recognition in images of natural scenes have become an active research topic in computer vision and have obtained encouraging achievements over several benchmarks. In this paper, we introduce a robust yet simple pipeline that produces accurate and fast text detection and recognition for the Uzbek language in natural scene images using a fully convolutional network and the Tesseract OCR engine. First, the text detection step quickly predicts text in random orientations in full-color images with a single fully convolutional neural network, discarding redundant intermediate stages. Then, the text recognition step recognizes the Uzbek language, including both the Latin and Cyrillic alphabets, using a trained Tesseract OCR engine. Finally, the recognized text can be pronounced using the Uzbek language text-to-speech synthesizer. The proposed method was tested on the ICDAR 2013, ICDAR 2015 and MSRA-TD500 datasets, and it showed an advantage in efficiently detecting and recognizing text from natural scene images for assisting the visually impaired.
Subjects: TEXT recognition; CONVOLUTIONAL neural networks; VOCODER; COMPUTER vision; IMAGE recognition (Computer vision); OPTICAL character recognition
Publication: International Journal of Wavelets, Multiresolution & Information Processing, 2020, Vol 18, Issue 6, pN.PAG
ISSN: 0219-6913
Publication type: Article
DOI: 10.1142/S0219691320500526
