
Computer Engineering, 2008, Vol. 34, Issue (10): 61-63. doi: 10.3969/j.issn.1000-3428.2008.10.022

• Software Technology and Database •

Frequent Subsequence Mining in Time Series Based on Symbolic Representation

HU Xiao-lin, CHEN Xiao-yun

  1. (Department of Mathematics and Computer Science, Fuzhou University, Fuzhou 350002)
  • Online: 2008-05-20 Published: 2008-05-20

Abstract: This paper proposes a new algorithm for mining frequent subsequences in time series based on symbolic representation. A PAA-based (Piecewise Aggregate Approximation) piecewise linear representation is used to reduce dimensionality. Breakpoints set under a Gaussian distribution then map the reduced series to a symbolic representation, and a projected database is built to mine the frequent subsequences. The algorithm is simple and novel, runs quickly, and simplifies the computation of subsequence support counts.

Key words: data mining, frequent subsequence, time series, symbolization
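
The three steps named in the abstract (PAA reduction, symbolization with Gaussian breakpoints, and mining over a projected database) can be illustrated with a minimal Python sketch. This is not the authors' implementation, which the abstract does not specify. The function names, the z-normalization step, the equal-width PAA frames, the alphabet, and the definition of support (the number of symbol strings that contain a pattern as a contiguous run) are all assumptions made for illustration.

```python
import bisect
from collections import defaultdict
from statistics import NormalDist, mean, pstdev


def znormalize(series):
    """Rescale to zero mean and unit variance (assumed preprocessing step)."""
    mu, sigma = mean(series), pstdev(series)
    if sigma == 0:
        return [0.0 for _ in series]
    return [(x - mu) / sigma for x in series]


def paa(series, segments):
    """Piecewise Aggregate Approximation: the mean of each equal-width frame."""
    if len(series) % segments != 0:
        raise ValueError("this sketch assumes len(series) is a multiple of segments")
    width = len(series) // segments
    return [mean(series[i * width:(i + 1) * width]) for i in range(segments)]


def gaussian_breakpoints(alphabet_size):
    """Cut points that split the standard normal into equiprobable regions."""
    nd = NormalDist()
    return [nd.inv_cdf(k / alphabet_size) for k in range(1, alphabet_size)]


def symbolize(series, segments, alphabet="abcd"):
    """Map a numeric series to a symbol string: one symbol per PAA frame."""
    cuts = gaussian_breakpoints(len(alphabet))
    reduced = paa(znormalize(series), segments)
    return "".join(alphabet[bisect.bisect_right(cuts, v)] for v in reduced)


def frequent_subsequences(strings, min_support):
    """Pattern growth over projected occurrence lists.

    Support (an assumption of this sketch) is the number of strings that
    contain the pattern as a contiguous run of symbols.
    """
    results = {}

    def grow(prefix, projected):
        # projected holds (string index, position just after an occurrence of prefix)
        buckets = defaultdict(list)
        for sid, pos in projected:
            if pos < len(strings[sid]):
                buckets[strings[sid][pos]].append((sid, pos + 1))
        for symbol, occurrences in sorted(buckets.items()):
            support = len({sid for sid, _ in occurrences})
            if support >= min_support:
                pattern = prefix + symbol
                results[pattern] = support
                grow(pattern, occurrences)  # only frequent patterns are extended

    start = [(sid, pos) for sid, s in enumerate(strings) for pos in range(len(s))]
    grow("", start)
    return results
```

In this sketch, each raw series would first be passed through `symbolize`, and the resulting strings would then be handed to `frequent_subsequences` with a chosen `min_support`. Because each projected list already records where a pattern occurs, the support of an extended pattern is counted from that list alone, without rescanning the full strings.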
