Chakraborty, Neelotpal; Mitra, Arkoprobho; Choudhury, Ayush; Mollah, Ayatullah Faruk; Basu, Subhadip; Sarkar, Ram

doi:10.1007/s11042-022-12596-7

Title: How to handle bi/tri-lingual Indic texts in a single image? A new dataset of natural scene and born-digital images.
Authors: Chakraborty, Neelotpal; Mitra, Arkoprobho; Choudhury, Ayush; Mollah, Ayatullah Faruk; Basu, Subhadip; Sarkar, Ram
Abstract: Detection and language identification of multi-lingual texts in natural scene images (NSI) and born-digital images (BDI) are popular research problems in the domain of information retrieval. Over the years, several methods addressing these problems have been evaluated mostly on standard NSI-based datasets. However, datasets highlighting bi/tri-lingual Indic texts in a single image are few, and datasets containing BDIs with multi-lingual texts are hardly available. To this end, a new dataset called Mixed-lingual Indic Texts in Digital Images (MITDI), comprising 500 NSIs and 500 BDIs, is introduced, where each image contains texts written in at least two of English, Bangla and Hindi, languages that are commonly used in India. Overall, the NSI pool contains 360 images with bi-lingual texts and 140 with tri-lingual texts, whereas the BDI pool contains 489 images with bi-lingual texts and 11 with tri-lingual texts. To benchmark performance on MITDI, a deep learning based Connectionist-DenseNet framework is built and evaluated on each data pool (NSI, BDI) and on the combined set. The proposed dataset can serve as an important resource for evaluating state-of-the-art methods in this domain. The dataset is publicly available at: https://github.com/NCJUCSE/MITDI
Subjects: INDIA; HINDI language; DIGITAL images; BENGALI language; INFORMATION retrieval; DEEP learning
Publication: Multimedia Tools & Applications, 2022, Vol 81, Issue 11, p15367
ISSN: 1380-7501
Publication type: Article
DOI: 10.1007/s11042-022-12596-7
