
Computer Engineering ›› 2021, Vol. 47 ›› Issue (4): 313-320. doi: 10.19678/j.issn.1000-3428.0058187

• Development Research and Engineering Application •

Research on Unsupervised Discretization Method for Multi-Magnitudes Emergency Data

GAO Tianyu, WANG Qingrong, YANG Yan, MA Chenkun   

  1. School of Electronic and Information Engineering, Lanzhou Jiaotong University, Lanzhou 730070, China
  • Received: 2020-04-28 Revised: 2020-06-02 Published: 2020-06-05

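  • About the authors: GAO Tianyu (高天宇, born 1993), male, master's student, whose main research interests are data mining and analysis; WANG Qingrong (王庆荣), professor, Ph.D.; YANG Yan (杨妍) and MA Chenkun (马辰坤), master's students.
  • Supported by:
    National Natural Science Foundation of China (71961016); Humanities and Social Sciences Research Planning Fund of the Ministry of Education (15XJAZH002, 18YJAZH148); Natural Science Foundation of Gansu Province (18JR3RA125).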

Abstract: When continuous multi-magnitude emergency data are discretized, traditional unsupervised discretization methods often fail to locate the magnitude change points needed to complete the discretization. This paper proposes an unsupervised discretization method for multi-magnitude emergency data. Based on the differences between magnitudes, the data to be discretized are sorted in descending order. A fitting function is combined with a second-derivative calculation to locate the magnitude change point accurately, and this point is used as the data truncation point. The discrete class formed by the larger values above the truncation point is then removed from the dataset to be discretized. These steps are repeated until the remaining data meet the preset discretization coefficient threshold, at which point all the data have been discretized. Experimental results show that the proposed method discretizes earthquake-related multi-magnitude emergency data uniformly, and that its discretization coefficient is lower than that of traditional methods such as equal-frequency discretization and hierarchical clustering discretization. The method can effectively discretize emergency data with hidden differences in magnitude.
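
The loop described in the abstract (sort, locate a change point, split off the larger class, repeat until a threshold is met) can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several details are assumptions because the abstract does not specify them: the fitting function is taken to be a local quadratic through three consecutive ranks, the change point is taken to be the rank where that quadratic's second derivative is largest, and the discretization coefficient is taken to be the coefficient of variation (standard deviation divided by mean). The function names, the 0.5 threshold, and the sample values are hypothetical.

```python
from statistics import mean, pstdev


def coefficient_of_variation(values):
    """Population std / mean: an assumed stand-in for the discretization coefficient."""
    m = mean(values)
    return pstdev(values) / m if m != 0 else 0.0


def find_cut(sorted_desc):
    """Return k so that sorted_desc[:k] is split off as the larger class.

    A quadratic fitted through three consecutive ranks (i-1, i, i+1) has the
    constant second derivative y[i-1] - 2*y[i] + y[i+1]. The rank where that
    value is largest is taken as the magnitude change point (assumed criterion).
    Requires at least three values.
    """
    n = len(sorted_desc)

    def second_derivative(i):
        return sorted_desc[i - 1] - 2 * sorted_desc[i] + sorted_desc[i + 1]

    return max(range(1, n - 1), key=second_derivative)


def discretize(data, cv_threshold=0.5):
    """Repeatedly split off the largest-magnitude class until the remaining
    data fall at or below the (hypothetical) coefficient threshold."""
    remaining = sorted(data, reverse=True)  # descending order, as in the abstract
    classes = []
    while len(remaining) >= 3 and coefficient_of_variation(remaining) > cv_threshold:
        k = find_cut(remaining)
        classes.append(remaining[:k])       # larger values form one discrete class
        remaining = remaining[k:]           # the rest is discretized again
    classes.append(remaining)               # whatever is left is the last class
    return classes


# Made-up toy values spanning three magnitudes (not data from the paper).
sample = [900, 10, 100000, 8, 1000, 80000, 9, 800, 90000]
print(discretize(sample))
# [[100000, 90000, 80000], [1000, 900, 800], [10, 9, 8]]
```

On the toy values, the largest second derivative falls at the first value of each lower magnitude, so each pass removes one magnitude group. A global fitting function, such as a polynomial or spline fitted over all ranks, would slot into `find_cut` without changing the outer loop.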

Key words: multi-magnitudes difference, emergency data, discretization, discretization coefficient, earthquake information

