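Abstract: Clustering algorithms from data mining are used to analyze the large volume of historical data held by process enterprises. A weighted K-means algorithm based on Euclidean distance is used to build clustering models of the parameters. The proportion of parameter points within a cluster at different similarity levels is analyzed, which yields a definition of the factor of scatter of a parameter point. For abnormal points detected in real time, a new definition of the factor of outlier is proposed, drawing on the concept of CBLOF(t). On this basis, the running state of the equipment can be monitored effectively, supporting operation optimization and early fault warning.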
Keywords:
Clustering analysis,
Weighted K-means algorithm,
Factor of scatter,
Factor of outlier,
Process enterprise
Abstract: To monitor production in process industries, large volumes of historical data are analyzed with a clustering algorithm. Clustering models of the equipment parameters are built with a feature-weighted K-means algorithm. The proportion of parameter points within a cluster at different similarity levels is analyzed, and a factor of scatter is defined from it. Based on the concept of CBLOF(t), a new definition of the factor of outlier is proposed for abnormal points detected in real time while the equipment is running. On this basis, equipment operation can be monitored effectively, which supports operation optimization and early fault warning.
Key words:
Clustering analysis,
Feature-weighted K-means algorithm,
Factor of scatter,
Factor of outlier,
Process industry
CLC number:
闫 伟;张 浩;陆剑峰;袁 磊. 聚类分析理论研究及在流程企业中的应用[J]. 计算机工程, 2006, 32(17): 19-21,2.
YAN Wei; ZHANG Hao; LU Jianfeng; YUAN Lei. Study of Clustering Analysis and Its Application in Process Industry[J]. Computer Engineering, 2006, 32(17): 19-21,2.
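
The abstract describes a K-means model built on a weighted Euclidean distance. The following is a minimal sketch of that general idea, not the authors' implementation: the feature weights, the choice of the first k points as initial centroids, and the sample `readings` are all hypothetical, and the paper's factor of scatter and factor of outlier are not reproduced because the abstract does not give their formulas.

```python
import math


def weighted_distance(a, b, weights):
    """Euclidean distance with a per-feature weight on each squared difference."""
    return math.sqrt(sum(w * (x - y) ** 2 for x, y, w in zip(a, b, weights)))


def weighted_kmeans(points, k, weights, max_iter=100):
    """Lloyd-style K-means using weighted_distance; returns (centroids, labels)."""
    # Assumption: the first k points seed the centroids, which keeps the run deterministic.
    centroids = [list(p) for p in points[:k]]
    labels = [-1] * len(points)
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: weighted_distance(p, centroids[j], weights))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Update step: each centroid becomes the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centroid if a cluster becomes empty
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return centroids, labels


# Hypothetical two-parameter readings; the second parameter has a much larger scale.
readings = [(1.0, 100.0), (9.0, 105.0), (1.2, 300.0),
            (8.8, 310.0), (1.1, 200.0), (9.1, 205.0)]
# Hypothetical weights that down-weight the large-scale second parameter.
centroids, labels = weighted_kmeans(readings, k=2, weights=(1.0, 0.0001))
```

With these illustrative weights the grouping follows the first parameter. With equal weights the second, larger-scale parameter dominates the distance instead, which is the usual motivation for weighting features.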