- Title
A Multi-Layer Holistic Approach for Cursive Text Recognition.
- Authors
Umair, Muhammad; Zubair, Muhammad; Dawood, Farhan; Ashfaq, Sarim; Bhatti, Muhammad Shahid; Hijji, Mohammad; Sohail, Abid
- Abstract
Urdu is widely spoken in several South Asian countries and in communities worldwide. Urdu text is harder to recognize than text in many other languages because of its cursive writing style. The Urdu script belongs to a family of non-Latin cursive scripts, like Arabic, Hindi and Chinese. Urdu is written in several styles, among which 'Nastaleeq' is the most popular and widely used. Localization/detection and recognition of Urdu Nastaleeq text remains an open challenge because it follows a modified version of the Arabic script. This study presents a methodology to recognize and classify Urdu text in the Nastaleeq font, regardless of the text's position in the image. The proposed solution is a two-step methodology. In the first step, text detection is performed using Connected Component Analysis (CCA) and a Long Short-Term Memory (LSTM) neural network. In the second step, a hybrid Convolutional Neural Network and Recurrent Neural Network (CNN-RNN) architecture is used to recognize the detected text. The image containing Urdu text is binarized and segmented to produce single-line text images, each of which is fed to the hybrid CNN-RNN model, which recognizes the text and saves it to a text file. The proposed technique outperforms existing ones, achieving an overall accuracy of 97.47%.
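The binarization and segmentation stage described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the fixed threshold of 128, the 8-connectivity, the rule for grouping components into lines, and all function names are assumptions made for illustration. The LSTM detection stage and the CNN-RNN recognizer are not shown.

```python
from collections import deque


def binarize(gray, threshold=128):
    """Map a grayscale image (rows of 0-255 ints) to 1 for ink, 0 for background.

    The fixed threshold is an assumption; the abstract does not state one.
    """
    return [[1 if px < threshold else 0 for px in row] for row in gray]


def connected_components(binary):
    """Return bounding boxes (top, left, bottom, right) of 8-connected ink regions."""
    height = len(binary)
    width = len(binary[0]) if height else 0
    seen = [[False] * width for _ in range(height)]
    boxes = []
    for r in range(height):
        for c in range(width):
            if binary[r][c] == 0 or seen[r][c]:
                continue
            # Breadth-first flood fill from this unvisited ink pixel.
            seen[r][c] = True
            queue = deque([(r, c)])
            top, left, bottom, right = r, c, r, c
            while queue:
                y, x = queue.popleft()
                top, bottom = min(top, y), max(bottom, y)
                left, right = min(left, x), max(right, x)
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < height and 0 <= nx < width
                                and binary[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            boxes.append((top, left, bottom, right))
    return boxes


def group_into_lines(boxes):
    """Merge boxes whose vertical extents overlap into line bands (top, bottom)."""
    lines = []
    for top, _, bottom, _ in sorted(boxes):
        if lines and top <= lines[-1][1]:
            lines[-1][1] = max(lines[-1][1], bottom)
        else:
            lines.append([top, bottom])
    return [tuple(band) for band in lines]
```

Each resulting band marks the rows of one candidate text line, which could then be cropped and passed to a recognizer. Real Nastaleeq text has diacritics and overlapping ligatures, so a practical system would need a more careful grouping rule than this overlap test.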
- Subjects
CONVOLUTIONAL neural networks; RECURRENT neural networks; NATURAL language processing; HANDWRITING; TEXT recognition; TEXT files
- Publication
Applied Sciences (2076-3417), 2022, Vol 12, Issue 24, p12652
- ISSN
2076-3417
- Publication type
Article
- DOI
10.3390/app122412652