
Computer Engineering ›› 2010, Vol. 36 ›› Issue (7): 61-62,6. doi: 10.3969/j.issn.1000-3428.2010.07.021

• Software Technology and Database •

Data Stream Mining Based on Time and Space Partitioning

YUAN Zheng-wu (1,2), YUAN Song-biao (1)

  1. Sino-Korea Chongqing GIS Research Center, Chongqing University of Posts & Telecommunications, Chongqing 400065
  2. Postdoctoral Research Station of Civil Engineering, Chongqing University, Chongqing 400045
  • Received: 1900-01-01 Revised: 1900-01-01 Online: 2010-04-05 Published: 2010-04-05

Abstract: Based on the idea of time and space partitioning, this paper designs an online algorithm for generating a synopsis data structure. The synopsis data structure stores the distribution state of the stream data at different moments, so that it can support offline data mining operations such as classification, clustering, and association rule discovery. Because space partitioning can lead to a very large memory requirement, three memory-control strategies are studied: time granularity, quantization vector adjustment, and subregion indexing. These strategies balance the memory requirement of the synopsis data structure against the number of I/O operations between main memory and external storage.

Key words: data stream, time and space partitioning, synopsis data structure, clustering
