
Computer Engineering ›› 2011, Vol. 37 ›› Issue (4): 55-57. doi: 10.3969/j.issn.1000-3428.2011.04.020

• Software Technology and Database •

Vector Symbolization Algorithm for Time Series Based on Segmentation Mode

CHEN Xiang-tao 1,2, LI Ming-liang 1, CHEN Yu-juan 1

  1. School of Computer and Communication, Hunan University, Changsha 410082, China; 2. School of Information Science and Engineering, Central South University, Changsha 410083, China
  • Online: 2011-02-20 Published: 2011-02-17
  • About the authors: CHEN Xiang-tao (born 1973), male, associate professor, Ph.D.; main research interests: database technology, data mining, process modeling and control. LI Ming-liang and CHEN Yu-juan, master's students
  • Supported by:
    National Natural Science Foundation of China (60634020)

Abstract: The symbolic aggregate approximation algorithm (SAX) requires a time series to be split into equal-length segments. To address this limitation, a vector symbolization algorithm for time series based on segmentation mode (SMSAX) is presented. A triangular threshold method is used to extract features from randomly sampled time series. The maximum compression ratio of the time series is calculated and used as the time window width for extracting segmentation points, from which the Segment Mode (SM) of the time series is derived. The resulting segmentation mode is used to segment the time series and reduce its dimensionality, and each sub-sequence is symbolized as a vector of its mean and volatility. The algorithm thus segments a time series into unequal-length pieces according to its features, and adds volatility to eliminate the influence of singular points. Experimental results indicate that SMSAX obtains more accurate results than SAX.

Key words: Segment Mode (SM), time series, dimensionality reduction, sub-sequence symbolization
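The symbolization step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say how volatility is defined or how segmentation points are extracted. The sketch therefore assumes that the segmentation points are already given (`cut_points`), uses the per-segment population standard deviation as a stand-in for volatility, and discretizes it with hypothetical thresholds (`vol_thresholds`). Only the Gaussian breakpoints for the mean follow standard SAX. The function and parameter names are illustrative.

```python
from bisect import bisect_right
from statistics import NormalDist, fmean, pstdev


def znormalize(series):
    """Rescale a series to zero mean and unit variance, as standard SAX does."""
    mu = fmean(series)
    sigma = pstdev(series)
    if sigma == 0:
        return [0.0 for _ in series]
    return [(x - mu) / sigma for x in series]


def gaussian_breakpoints(alphabet_size):
    """Equiprobable cut points of N(0, 1), as used by standard SAX."""
    nd = NormalDist()
    return [nd.inv_cdf(k / alphabet_size) for k in range(1, alphabet_size)]


def vector_symbolize(series, cut_points, mean_alphabet="abcd",
                     vol_thresholds=(0.25, 0.75)):
    """Map each unequal-length segment to a (mean symbol, volatility level) pair.

    cut_points: sorted interior indices where a new segment starts
        (assumed to be given; the paper derives them from a segmentation mode).
    vol_thresholds: hypothetical cut-offs on the segment standard deviation,
        which stands in here for the paper's volatility measure.
    """
    z = znormalize(series)
    breakpoints = gaussian_breakpoints(len(mean_alphabet))
    thresholds = list(vol_thresholds)
    bounds = [0, *cut_points, len(z)]
    word = []
    for start, end in zip(bounds, bounds[1:]):
        segment = z[start:end]
        # Symbol for the segment mean, chosen by the Gaussian breakpoints.
        mean_symbol = mean_alphabet[bisect_right(breakpoints, fmean(segment))]
        # Integer level for the segment's spread (0 = flat, higher = more volatile).
        vol_level = bisect_right(thresholds, pstdev(segment))
        word.append((mean_symbol, vol_level))
    return word
```

For example, `vector_symbolize([1, 1, 1, 1, 0, 2, 0, 2], [4])` yields two segments with the same mean symbol but different volatility levels. A mean-only symbolization such as SAX cannot make that distinction.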
