- Title
TSP: Mining top-k closed sequential patterns.
- Authors
Tzvetkov, Petre; Yan, Xifeng; Han, Jiawei
- Abstract
Sequential pattern mining has been studied extensively in the data mining community. Most previous studies require the specification of a min_support threshold for mining a complete set of sequential patterns satisfying the threshold. However, in practice, it is difficult for users to provide an appropriate min_support threshold. To overcome this difficulty, we propose an alternative mining task: mining top-k frequent closed sequential patterns of length no less than min_l, where k is the desired number of closed sequential patterns to be mined and min_l is the minimal length of each pattern. We mine the set of closed patterns because it is a compact representation of the complete set of frequent patterns. An efficient algorithm, called TSP, is developed for mining such patterns without min_support. Starting at (absolute) min_support = 1, the algorithm makes use of the length constraint and the properties of top-k closed sequential patterns to perform dynamic support raising and projected database pruning. Our extensive performance study shows that TSP has high performance. In most cases, it outperforms the efficient closed sequential pattern-mining algorithm, CloSpan, even when the latter is running with the best tuned min_support threshold. Thus, we conclude that, for sequential pattern mining, mining top-k frequent closed sequential patterns without min_support is more preferable than the traditional min_support-based mining.
- Subjects
DATA mining; ALGORITHMS; NUCLEOTIDE sequence
- Publication
Knowledge & Information Systems, 2005, Vol 7, Issue 4, p438
- ISSN
0219-1377
- Publication type
Article
- DOI
10.1007/s10115-004-0175-4
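
The mining task defined in the abstract can be made concrete with a small brute-force sketch. This is not the TSP algorithm: it omits dynamic support raising and projected database pruning, and simply enumerates every pattern at min_support = 1 before filtering. It also assumes each sequence is a tuple of single items rather than itemsets, and breaks ties in support by pattern order. The function names (`frequent_patterns`, `top_k_closed`, `is_subsequence`) are hypothetical and chosen for illustration only.

```python
from collections import defaultdict


def is_subsequence(small, big):
    """Return True if `small` occurs in `big` in order (gaps allowed)."""
    it = iter(big)
    return all(item in it for item in small)


def frequent_patterns(db, min_support=1):
    """Enumerate every sequential pattern with support >= min_support.

    Patterns are grown one item at a time from projected suffixes
    (a simplified prefix-projection scheme, assumed for this sketch).
    """
    results = {}

    def grow(prefix, projected):
        # projected: list of (sequence index, start of remaining suffix)
        counts = defaultdict(set)
        for sid, start in projected:
            for item in db[sid][start:]:
                counts[item].add(sid)
        for item, sids in counts.items():
            if len(sids) < min_support:
                continue
            pattern = prefix + (item,)
            results[pattern] = len(sids)
            # Project each sequence past the first occurrence of `item`.
            next_projected = []
            for sid, start in projected:
                try:
                    pos = db[sid].index(item, start)
                except ValueError:
                    continue
                next_projected.append((sid, pos + 1))
            grow(pattern, next_projected)

    grow((), [(sid, 0) for sid in range(len(db))])
    return results


def top_k_closed(db, k, min_l):
    """Return the k highest-support closed patterns of length >= min_l."""
    patterns = frequent_patterns(db, min_support=1)
    closed = [
        (p, s)
        for p, s in patterns.items()
        if len(p) >= min_l
        # Closed: no proper super-sequence has the same support.
        and not any(
            len(q) > len(p) and t == s and is_subsequence(p, q)
            for q, t in patterns.items()
        )
    ]
    # Highest support first; ties broken by pattern order (an assumption).
    closed.sort(key=lambda ps: (-ps[1], ps[0]))
    return closed[:k]
```

For a toy database of four sequences, `("a","b","c")`, `("a","c")`, `("a","b","c")` and `("b","c")`, calling `top_k_closed(db, k=2, min_l=2)` returns the patterns `("a","c")` and `("b","c")`, each with support 3. The user supplies only k and min_l, never a support threshold, which is the point of the task the abstract proposes.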