Clustering Algorithm for Time Series Based on Locally Extreme Point

doi:10.3969/j.issn.1000-3428.2015.05.006

Computer Engineering

SUN Ya ^a,c, LI Zhihua ^a,b,c

(a. Key Laboratory of Advanced Process Control for Light Industry, Ministry of Education, College of Internet of Things Engineering; b. Engineering Research Center of Internet of Things Technology Application, Ministry of Education; c. Department of Computer Science and Technology, College of Internet of Things Engineering, Jiangnan University, Wuxi 214122, China)

Received: 2014-06-05 Online: 2015-05-15 Published: 2015-05-15
About the authors: SUN Ya (born 1990), female, master's student; main research interests: Internet of Things technology and data mining. LI Zhihua, associate professor, Ph.D.
Funding: Fundamental Research Funds for the Central Universities (JUSRP211A41); Jiangsu Province Industry-University-Research Prospective Fund (BY2013015-23).

Abstract

Measuring dissimilarity or similarity is a fundamental problem in data mining, and time series data are hard to measure because of their original structure. To address the dissimilarity measurement of time series, this paper defines the region radius, the locally extreme point, and the region of a time series, and proposes a strategy for extracting locally extreme points. The extracted representative extreme points re-describe the original time series, reflecting its main features while reducing and compressing the data. A dynamic time warping distance is then defined on time series to measure their dissimilarity; measuring the extreme series after equal-length treatment makes the algorithm more flexible and less restrictive. On this basis, a new hierarchical clustering algorithm for time series is proposed. Simulation results show that, compared with algorithms such as time series trend feature extraction, the proposed algorithm clearly improves both data compression and clustering accuracy.

Key words: time series, locally extreme point, re-description, data compression, similarity measure, hierarchical clustering
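The pipeline outlined in the abstract (extract locally extreme points, compare the reduced series with dynamic time warping, then cluster hierarchically) can be illustrated with a minimal Python sketch. The abstract does not give the paper's formal definitions, so everything below is an assumption made for illustration: a point counts as a locally extreme point if it is the maximum or minimum of the window of `radius` samples on each side, endpoints are always kept, the DTW local cost is the absolute difference, and the clustering uses average linkage. The function names are hypothetical and are not taken from the paper.

```python
from itertools import combinations


def regional_extreme_points(series, radius):
    """Return (index, value) pairs for the endpoints and window extremes.

    Assumption: the region of index i is series[i - radius : i + radius + 1].
    """
    n = len(series)
    kept = []
    for i, value in enumerate(series):
        window = series[max(0, i - radius): min(n, i + radius + 1)]
        lo, hi = min(window), max(window)
        is_endpoint = i == 0 or i == n - 1
        # A flat window (lo == hi) carries no extreme, so it is skipped.
        is_extreme = lo != hi and (value == lo or value == hi)
        if is_endpoint or is_extreme:
            kept.append((i, value))
    return kept


def dtw_distance(a, b):
    """Classic dynamic time warping distance with |x - y| as local cost."""
    inf = float("inf")
    prev = [0.0] + [inf] * len(b)  # row 0 of the cumulative cost table
    for x in a:
        curr = [inf]
        for j, y in enumerate(b, start=1):
            cost = abs(x - y)
            curr.append(cost + min(prev[j], curr[j - 1], prev[j - 1]))
        prev = curr
    return prev[-1]


def hierarchical_clusters(items, k, distance):
    """Agglomerative clustering (average linkage) down to k clusters."""
    pairs = combinations(range(len(items)), 2)
    dist = {(i, j): distance(items[i], items[j]) for i, j in pairs}
    clusters = [[i] for i in range(len(items))]

    def linkage(c1, c2):
        total = sum(dist[min(i, j), max(i, j)] for i in c1 for j in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > k:
        # Merge the closest pair of clusters; p < q, so popping q keeps p valid.
        p, q = min(
            combinations(range(len(clusters)), 2),
            key=lambda pq: linkage(clusters[pq[0]], clusters[pq[1]]),
        )
        merged = clusters.pop(q)
        clusters[p].extend(merged)
    return [sorted(c) for c in clusters]


def cluster_time_series(series_list, radius, k):
    """Reduce each series to its extreme values, then cluster by DTW."""
    reduced = [
        [value for _, value in regional_extreme_points(s, radius)]
        for s in series_list
    ]
    return hierarchical_clusters(reduced, k, dtw_distance)


# Two peak-shaped and two valley-shaped series of different lengths.
data = [
    [0, 1, 2, 3, 4, 3, 2, 1, 0],
    [0, 1, 3, 4, 3, 1, 0],
    [4, 3, 2, 1, 0, 1, 2, 3, 4],
    [4, 2, 0, 2, 4],
]
print(cluster_time_series(data, radius=2, k=2))  # [[0, 1], [2, 3]]
```

In this toy input each series shrinks to three points (for example `[0, 4, 0]`), which is the compression effect the abstract refers to, and DTW tolerates the different original lengths. A larger `radius` keeps fewer points, so under these assumptions it trades detail for compression.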

CLC number: TP391

SUN Ya, LI Zhihua. Clustering Algorithm for Time Series Based on Locally Extreme Point[J]. Computer Engineering, doi: 10.3969/j.issn.1000-3428.2015.05.006.

http://www.ecice06.com/CN/Y2015/V41/I5/33

Editor: SUO Shuzhi

