- Title
Subspace sums for extracting non-random data from massive noise.
- Authors
Denton, Anne M.
- Abstract
An algorithm is introduced that distinguishes relevant data points from randomly distributed noise. The algorithm is related to subspace clustering based on axis-parallel projections, but considers membership in any projected cluster of a given side length, as opposed to a particular cluster. An aggregate measure is introduced that is based on the total number of points that are close to the given point in all possible 2^d projections of a d-dimensional hypercube. No explicit summation over subspaces is required for evaluating this measure. Attribute values are normalized based on rank order to avoid making assumptions on the distribution of random data. Effectiveness of the algorithm is demonstrated through comparison with conventional outlier detection on a real microarray data set as well as on time series subsequence data.
- Subjects
OUTLIERS (Statistics); NOISE; DOCUMENT clustering; CLUSTER analysis (Statistics); ELECTRONIC file management
- Publication
Knowledge & Information Systems, 2009, Vol 20, Issue 1, p35
- ISSN
0219-1377
- Publication type
Article
- DOI
10.1007/s10115-008-0176-9
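- Illustration
The abstract states that the aggregate measure needs no explicit summation over the 2^d subspaces. The sketch below shows one way this can work; it is an illustration under stated assumptions, not the paper's implementation. It assumes that "close" means within half the side length in every dimension of a subspace, and that the empty subspace is counted. Under those assumptions a neighbor that is close in m single dimensions is close in exactly 2^m subspaces, so the sum over subspaces collapses to a sum of 2^m over neighbors. The function names (rank_normalize, subspace_sum_scores, brute_force_scores) are hypothetical.

```python
from itertools import combinations


def rank_normalize(points):
    """Replace each attribute value by its rank scaled to [0, 1).

    Ties are broken by input order (an assumption of this sketch).
    """
    n, d = len(points), len(points[0])
    normalized = [[0.0] * d for _ in range(n)]
    for k in range(d):
        order = sorted(range(n), key=lambda i: points[i][k])
        for rank, i in enumerate(order):
            normalized[i][k] = rank / n
    return normalized


def subspace_sum_scores(points, side):
    """Closed form: each neighbor contributes 2**m, where m is the number
    of single dimensions in which it lies within side/2 of the point."""
    n, d = len(points), len(points[0])
    half = side / 2
    scores = []
    for i in range(n):
        total = 0
        for j in range(n):
            if j == i:
                continue
            m = sum(1 for k in range(d)
                    if abs(points[i][k] - points[j][k]) <= half)
            total += 2 ** m
        scores.append(total)
    return scores


def brute_force_scores(points, side):
    """Reference version: enumerate all 2**d axis-parallel subspaces
    (including the empty one) and count close neighbors in each."""
    n, d = len(points), len(points[0])
    half = side / 2
    scores = []
    for i in range(n):
        total = 0
        for size in range(d + 1):
            for dims in combinations(range(d), size):
                total += sum(
                    1 for j in range(n)
                    if j != i and all(
                        abs(points[i][k] - points[j][k]) <= half
                        for k in dims)
                )
        scores.append(total)
    return scores
```

In this sketch, a point embedded in random noise would be expected to receive a low score and a point with many neighbors in some subspaces a high one. The closed form costs O(n^2 d) rather than O(n^2 2^d), and brute_force_scores is included only to check it on small inputs.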