- Title
Outlier Detection and Data Cleaning in Multivariate Non-Normal Samples: The PAELLA Algorithm.
- Authors
Limas, Manuel Castejón; Ordieres Meré, Joaquín B.; Martínez Pisón Ascacibar, Francisco J.; Vergara González, Eliseo P.
- Abstract
A new method of outlier detection and data cleaning for both normal and non-normal multivariate data sets is proposed. It is based on an iterated local fit without a priori metric assumptions. We propose a new approach supported by finite mixture clustering, which provides good results with large data sets. A multi-step structure, consisting of three phases, is developed. The importance of outlier detection in industrial modeling for open-loop control prediction is also described. The described algorithm gives good results both in simulation runs with artificial data sets and with experimental data sets recorded in a rubber factory. Finally, the methodology is discussed.
- Subjects
OUTLIERS (Statistics); STATISTICS; STATISTICAL sampling; DATA editing; ALGORITHMS; MULTIVARIATE analysis; CLUSTER analysis (Statistics)
- Publication
Data Mining & Knowledge Discovery, 2004, Vol 9, Issue 2, p171
- ISSN
1384-5810
- Publication type
Article
- DOI
10.1023/B:DAMI.0000031630.50685.7c