- Title
An automatic approach to identify word sense changes in text media across timescales.
- Authors
Mitra, Sunny; Mitra, Ritwik; Maity, Suman Kalyan; Riedl, Martin; Biemann, Chris; Goyal, Pawan; Mukherjee, Animesh; Kozareva, Zornitsa; Nastase, Vivi; Mihalcea, Rada
- Abstract
In this paper, we propose an unsupervised and automated method to identify noun sense changes based on rigorous analysis of time-varying text data available in the form of millions of digitized books and millions of tweets posted per day. We construct distributional-thesauri-based networks from data at different time points and cluster each of them separately to obtain word-centric sense clusters corresponding to the different time points. Subsequently, we propose a split/join based approach that compares the sense clusters at two different time points to determine whether there is a ‘birth’ of a new sense. The approach also helps determine whether an older sense has ‘split’ into more than one sense, whether a newer sense has formed from the ‘join’ of older senses, or whether a particular sense has undergone ‘death’. We use this completely unsupervised approach (a) within the Google books data to identify word sense differences within a single medium, and (b) across Google books and Twitter data to identify differences in word sense distribution across different media. We conduct a thorough evaluation of the proposed methodology both manually and through comparison with WordNet.
- Subjects
VOCABULARY; DIGITIZATION; NOUNS; COMPUTER networks; TIME-varying systems; GOOGLE Books (Web resource)
- Publication
Natural Language Engineering, 2015, Vol 21, Issue 5, p773
- ISSN
1351-3249
- Publication type
Article
- DOI
10.1017/S135132491500011X
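
The split/join comparison described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`overlap_fraction`, `compare_senses`), the thresholds (`novelty`, `share`), and the toy clusters are all assumptions made for illustration. Each sense cluster is modeled as a set of neighbouring words for one target noun, and clusters from an older and a newer time point are compared by word overlap.

```python
from typing import Dict, List, Set


def overlap_fraction(a: Set[str], b: Set[str]) -> float:
    """Fraction of the words in `a` that also appear in `b`."""
    return len(a & b) / len(a) if a else 0.0


def compare_senses(
    old: List[Set[str]],
    new: List[Set[str]],
    novelty: float = 0.8,  # assumed threshold: share of unseen words for birth/death
    share: float = 0.3,    # assumed threshold: overlap that counts as "substantial"
) -> Dict[str, list]:
    """Label sense clusters of one word as birth, death, split, or join."""
    events: Dict[str, list] = {"birth": [], "death": [], "split": [], "join": []}
    old_vocab = set().union(*old)
    new_vocab = set().union(*new)

    for j, cluster in enumerate(new):
        # Birth: almost none of the new cluster's words occur in any old cluster.
        if overlap_fraction(cluster, old_vocab) <= 1 - novelty:
            events["birth"].append(j)
            continue
        # Join: the new cluster overlaps substantially with two or more old clusters.
        sources = [i for i, o in enumerate(old) if overlap_fraction(cluster, o) >= share]
        if len(sources) >= 2:
            events["join"].append((j, sources))

    for i, cluster in enumerate(old):
        # Death: almost none of the old cluster's words survive in any new cluster.
        if overlap_fraction(cluster, new_vocab) <= 1 - novelty:
            events["death"].append(i)
            continue
        # Split: the old cluster's words are spread over two or more new clusters.
        targets = [j for j, n in enumerate(new) if overlap_fraction(cluster, n) >= share]
        if len(targets) >= 2:
            events["split"].append((i, targets))

    return events


# Hypothetical clusters for the noun "compiler" at two time points.
old_clusters = [{"author", "editor", "writer", "translator"}]
new_clusters = [
    {"author", "editor", "writer", "translator"},
    {"interpreter", "debugger", "linker", "parser"},
]
result = compare_senses(old_clusters, new_clusters)
```

In this toy input the second newer cluster shares no words with the older data, so the sketch reports it as a birth; the threshold values would need tuning on real distributional-thesaurus clusters.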