- Title
Clustering of Variables for Mixed Data
- Authors
Saracco, J.; Chavent, M.
- Abstract
This chapter presents clustering of variables, whose aim is to group together strongly related variables. The proposed approach works on mixed data sets, i.e. data sets that contain both numerical and categorical variables. Two algorithms for clustering variables are described: a hierarchical clustering and a k-means type clustering. A brief description of the PCAmix method (a principal component analysis for mixed data) is provided, since the computation of the synthetic variables that summarize the obtained clusters of variables is based on this multivariate method. Finally, the R packages ClustOfVar and PCAmixdata are illustrated on real mixed data. The PCAmix and ClustOfVar approaches are first used for dimension reduction (step 1), before a standard clustering method is applied in step 2 to obtain groups of individuals.
- Subjects
MATHEMATICAL variables; BIG data; PRINCIPAL components analysis; STATISTICAL correlation; CALCULUS
- Publication
EAS Publications Series, 2016, Vol 77, p121
- ISSN
1633-4760
- Publication type
Article
- DOI
10.1051/eas/1677007