- Title
On data efficiency of univariate time series anomaly detection models.
- Authors
Sun, Wu; Li, Hui; Liang, Qingqing; Zou, Xiaofeng; Chen, Mei; Wang, Yanhao
- Abstract
In machine learning (ML) problems, it is widely believed that more training samples lead to improved predictive accuracy but incur higher computational costs. Consequently, achieving better data efficiency, that is, the trade-off between the size of the training set and the accuracy of the output model, becomes a key problem in ML applications. In this research, we systematically investigate the data efficiency of Univariate Time Series Anomaly Detection (UTS-AD) models. We first experimentally examine the performance of nine popular UTS-AD algorithms as a function of the training sample size on several benchmark datasets. Our findings confirm that most algorithms become more accurate when more training samples are used, whereas the marginal gain from adding more samples gradually decreases. Based on these observations, we propose a novel framework called FastUTS-AD that achieves improved data efficiency and reduced computational overhead compared to existing UTS-AD models with little loss of accuracy. Specifically, FastUTS-AD is compatible with different UTS-AD models, utilizing a sampling- and scaling law-based heuristic method to automatically determine the number of training samples a UTS-AD model needs to achieve predictive performance close to that when all samples in the training set are used. Comprehensive experimental results show that, for the nine popular UTS-AD algorithms tested, FastUTS-AD reduces the number of training samples and the training time by 91.09–91.49% and 93.49–93.82% on average, respectively, without significant decreases in accuracy.
- Subjects
TIME series analysis; MACHINE learning; HEURISTIC; SAMPLE size (Statistics); ALGORITHMS
- Publication
Journal of Big Data, 2024, Vol 11, Issue 1, p1
- ISSN
2196-1115
- Publication type
Article
- DOI
10.1186/s40537-024-00940-7
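
- Illustrative sketch
The abstract describes a sampling- and scaling law-based heuristic for choosing how many training samples a model needs. The record does not give the algorithm itself, so the following is only a minimal sketch of the general idea, not the FastUTS-AD method. It assumes the model's error follows a power law, error ≈ a · n^(−b), in the training-set size n; the function names, the power-law form, and the `tolerance` parameter are all hypothetical choices made for illustration.

```python
import math


def fit_power_law(sizes, errors):
    """Fit error ~ a * n**(-b) by least squares in log-log space.

    Assumption for illustration: errors are positive and at least two
    distinct sample sizes are given. Returns the pair (a, b).
    """
    xs = [math.log(n) for n in sizes]
    ys = [math.log(e) for e in errors]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    a = math.exp(mean_y - slope * mean_x)
    return a, -slope


def estimate_sample_size(sizes, errors, full_size, tolerance=0.01):
    """Estimate the smallest n whose predicted error is within
    `tolerance` of the predicted error at `full_size` samples.

    `sizes` and `errors` come from a few cheap pilot runs on small
    subsamples of the training set (hypothetical workflow).
    """
    a, b = fit_power_law(sizes, errors)
    if b <= 0:
        # The fitted curve does not improve with more data, so the
        # extrapolation is not usable; fall back to the full set.
        return full_size
    target = a * full_size ** (-b) + tolerance
    # Solve a * n**(-b) <= target for n.
    n = math.ceil((a / target) ** (1.0 / b))
    return min(max(n, 1), full_size)
```

In use, one would train the chosen UTS-AD model on a handful of small subsamples, record the validation error for each, and pass those pairs to `estimate_sample_size` together with the full training-set size. The quality of the estimate depends entirely on how well the assumed power law matches the model's actual learning curve.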