
Computer Engineering ›› 2022, Vol. 48 ›› Issue (1): 75-84. doi: 10.19678/j.issn.1000-3428.0060307

• Artificial Intelligence and Pattern Recognition •

Redundant Contrast Pattern Filtering Algorithm for Permutation Testing

WU Jun, OUYANG Aijia, ZHANG Lin

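  1. School of Information Engineering, Zunyi Normal University, Zunyi, Guizhou 563000, China
  • Received: 2020-12-16 Revised: 2021-02-14 Published: 2021-01-26
  • About the authors: WU Jun (born 1990), male, lecturer, whose main research interests are data mining and deep learning; OUYANG Aijia, professor, Ph.D.; ZHANG Lin, associate professor, master's degree.
  • Supported by:
    National Natural Science Foundation of China (61662090); Youth Science and Technology Talent Growth Project of the Guizhou Provincial Department of Education (黔教合KY字[2017]250); Engineering Research Center Project of the Guizhou Provincial Department of Education (黔教合KY字[2016]018).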

Redundant Contrast Pattern Filtering Algorithm for Permutation Testing

WU Jun, OUYANG Aijia, ZHANG Lin   

  1. School of Information Engineering, Zunyi Normal University, Zunyi, Guizhou 563000, China
  • Received:2020-12-16 Revised:2021-02-14 Published:2021-01-26

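Abstract: When permutation testing is used for contrast pattern mining, the returned results contain many redundant contrast patterns. The Charm method is used to mine the contrast patterns in the sample set, and two algorithms based on fixed attribute permutation, FSPRP and FEPRP, are proposed. They construct null distributions in turn for contrast patterns of different lengths and thereby filter out redundant contrast patterns. The FSPRP algorithm constructs the null distribution by generating a certain number of permuted sample sets, whereas the FEPRP algorithm computes the distribution of the contrast measure for each pattern and merges these distributions into the null distribution. Experimental results show that FSPRP and FEPRP filter out more redundant contrast patterns than the comparison constraint method, and that the null distribution generated by FEPRP is closer to the exact null distribution.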

Keywords: data mining, contrast pattern mining, permutation testing, redundant contrast pattern filtering, fixed attribute permutation

Abstract: Existing contrast pattern mining algorithms based on permutation testing return many redundant contrast patterns. To address this problem, two algorithms based on a fixed attribute permutation procedure, FSPRP and FEPRP, are proposed for filtering redundant contrast patterns. The Charm method is used to mine the contrast patterns in the sample set, and null distributions are then constructed in turn for contrast patterns of different lengths. The FSPRP algorithm constructs the null distributions by generating a number of permuted data sets, whereas the FEPRP algorithm constructs them by calculating the contrast measure distribution of each pattern and merging these distributions. The experimental results show that the FSPRP and FEPRP algorithms filter out more redundant contrast patterns than the comparison constraint method. Additionally, the null distributions generated by the FEPRP algorithm are closer to the exact null distributions.

Key words: data mining, contrast pattern mining, permutation testing, redundant contrast pattern filtering, fixed attribute permutation
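
The difference between a sampled and an exact null distribution, which the abstract draws between FSPRP and FEPRP, can be illustrated with a minimal Python sketch. This is not the paper's implementation: it permutes class labels rather than applying fixed attribute permutation, it assumes absolute support difference as the contrast measure, and all function names are hypothetical. The exact variant relies on a standard fact: if a pattern covers k of n rows, then under random label permutation its count in a class of size n1 follows a hypergeometric distribution.

```python
import random
from math import comb


def support_diff(rows, labels, pattern):
    """Assumed contrast measure: |support in class 1 - support in class 0|.

    rows are sets of items, labels are 0/1; both classes must be non-empty.
    """
    n1 = sum(labels)
    n0 = len(labels) - n1
    c1 = sum(1 for r, y in zip(rows, labels) if y == 1 and pattern <= r)
    c0 = sum(1 for r, y in zip(rows, labels) if y == 0 and pattern <= r)
    return abs(c1 / n1 - c0 / n0)


def sampled_null(rows, labels, pattern, n_perm=1000, seed=0):
    """Null distribution estimated from n_perm random label permutations."""
    rng = random.Random(seed)
    shuffled = list(labels)
    stats = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # class sizes stay fixed, assignment changes
        stats.append(support_diff(rows, shuffled, pattern))
    return stats


def exact_null(n, n1, k):
    """Exact null distribution of the measure for a pattern covering k of n rows.

    Returns a dict mapping each attainable measure value to its probability,
    using the hypergeometric distribution of the class-1 count.
    """
    n0 = n - n1
    dist = {}
    for c1 in range(max(0, k - n0), min(k, n1) + 1):
        prob = comb(n1, c1) * comb(n0, k - c1) / comb(n, k)
        stat = abs(c1 / n1 - (k - c1) / n0)
        dist[stat] = dist.get(stat, 0.0) + prob
    return dist


def p_value_sampled(observed, null_stats):
    """Permutation p-value with the usual add-one correction."""
    hits = sum(1 for s in null_stats if s >= observed)
    return (1 + hits) / (1 + len(null_stats))


def p_value_exact(observed, dist):
    """Exact p-value: probability mass at or above the observed value."""
    return sum(p for s, p in dist.items() if s >= observed - 1e-12)


# Toy data (hypothetical): item "a" occurs mostly in class 1.
rows = [frozenset("ab"), frozenset("a"), frozenset("ac"), frozenset("b"),
        frozenset("bc"), frozenset("c"), frozenset("ac"), frozenset("b")]
labels = [1, 1, 1, 1, 0, 0, 0, 0]
pattern = frozenset("a")

observed = support_diff(rows, labels, pattern)
k = sum(1 for r in rows if pattern <= r)
print("sampled p:", p_value_sampled(observed, sampled_null(rows, labels, pattern)))
print("exact p:  ", p_value_exact(observed, exact_null(len(rows), sum(labels), k)))
```

With enough permutations the sampled p-value should approach the exact one; the exact route avoids sampling noise but is only this simple when the measure depends on the pattern's class counts alone.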
