- Title
Considerations about learning Word2Vec.
- Authors
Di Gennaro, Giovanni; Buonanno, Amedeo; Palmieri, Francesco A. N.
- Abstract
Despite the wide diffusion and use of embeddings generated through Word2Vec, there are still many open questions about the reasons for its results and about its real capabilities. In particular, to our knowledge, no author seems to have analysed in detail how learning may be affected by the various choices of hyperparameters. In this work, we try to shed some light on various issues, focusing on a typical dataset. It is shown that the learning rate prevents the exact mapping of the co-occurrence matrix, that Word2Vec is unable to learn syntactic relationships, and that it does not suffer from the problem of overfitting. Furthermore, through the creation of an ad-hoc network, it is also shown how it is possible to improve Word2Vec directly on the analogies, obtaining very high accuracy without damaging the pre-existing embedding. This analogy-enhanced Word2Vec may be convenient in various NLP scenarios, but it is used here as an optimal starting point to evaluate the limits of Word2Vec.
- Subjects
Natural language processing
- Publication
Journal of Supercomputing, 2021, Vol 77, Issue 11, p12320
- ISSN
0920-8542
- Publication type
Article
- DOI
10.1007/s11227-021-03743-2