- Title
TextMatcher: cross-attentional neural network to compare image and text.
- Authors
Arrigoni, Valentina; Repele, Luisa; Saccavino, Dario Marino
- Abstract
We study a multimodal-learning problem where, given an image containing a single-line (printed or handwritten) text and a candidate text transcription, the goal is to assess whether the text represented in the image corresponds to the candidate text. This problem, which we dub text matching, is primarily motivated by a real industrial application scenario of automated cheque processing, whose goal is to automatically assess whether the information in a bank cheque (e.g., issue date) matches the data that have been entered by the customer while depositing the cheque at an automated teller machine (ATM). The problem finds more general application in several other scenarios too, e.g., personal-identity-document processing in user-registration procedures. We devise a machine-learning model specifically designed for the text-matching problem. The proposed model, termed TextMatcher, compares the two inputs by applying a novel cross-attention mechanism over the embedding representations of image and text, and it is trained in an end-to-end fashion on the desired distribution of errors to be detected. We demonstrate the effectiveness of TextMatcher on the automated-cheque-processing use case, where TextMatcher is shown to generalize well to future unseen dates, unlike existing models designed for related problems. We further assess the performance of TextMatcher on different distributions of errors on the public IAM dataset. Results attest that TextMatcher achieves higher performance on a variety of configurations than three baselines: a naïve model, a variant with fully-connected layers instead of the cross-attention module, and existing models for related problems.
- Subjects
AUTOMATED teller machines; MACHINE learning; IMAGE representation; TEXT recognition
- Publication
Machine Learning, 2024, Vol 113, Issue 4, p2045
- ISSN
0885-6125
- Publication type
Article
- DOI
10.1007/s10994-023-06418-6
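
The abstract describes comparing image and text by applying cross-attention over their embedding representations. This record does not give the architecture, so the following is only a minimal sketch of generic scaled dot-product cross-attention, not the authors' model. It assumes the text embeddings act as queries and the image embeddings as keys and values; the toy embedding values and the `match_score` head (mean cosine similarity) are hypothetical and chosen purely for illustration.

```python
import math


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys, values):
    """Generic scaled dot-product cross-attention.

    Each query (here: a text embedding) attends over all keys
    (here: image embeddings) and returns a weighted sum of the values.
    """
    scale = math.sqrt(len(keys[0]))
    attended, weights = [], []
    for q in queries:
        w = softmax([dot(q, k) / scale for k in keys])
        weights.append(w)
        attended.append(
            [sum(wi * v[j] for wi, v in zip(w, values))
             for j in range(len(values[0]))]
        )
    return attended, weights


def match_score(text_emb, image_emb):
    """Hypothetical scoring head (an assumption, not from the paper):
    mean cosine similarity between each text embedding and the image
    summary it attends to."""
    attended, _ = cross_attention(text_emb, image_emb, image_emb)
    sims = []
    for t, a in zip(text_emb, attended):
        norm = math.sqrt(dot(t, t)) * math.sqrt(dot(a, a))
        sims.append(dot(t, a) / norm if norm else 0.0)
    return sum(sims) / len(sims)


# Toy embeddings (made up): three image positions, two text characters.
image_emb = [[4.0, 0.0, 0.0], [0.0, 4.0, 0.0], [0.0, 0.0, 4.0]]
matching_text = [[4.0, 0.0, 0.0], [0.0, 4.0, 0.0]]
mismatched_text = [[-4.0, 0.0, 0.0], [0.0, -4.0, 0.0]]

print(match_score(matching_text, image_emb))
print(match_score(mismatched_text, image_emb))
```

In this toy setup the matching transcription scores higher than the mismatched one because its embeddings align with specific image positions. A trained model would learn both the embeddings and the decision layer end to end, as the abstract states.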