作者投稿和查稿 主编审稿 专家审稿 编委审稿 远程编辑

计算机工程 ›› 2023, Vol. 49 ›› Issue (4): 68-76. doi: 10.19678/j.issn.1000-3428.0064192

• 人工智能与模式识别 • 上一篇    下一篇

一种高效的周期团挖掘方法

杜明, 郝燕, 周军锋, 谭玉婷   

  1. 东华大学 计算机科学与技术学院, 上海 201620
  • 收稿日期:2022-03-16 修回日期:2022-05-19 发布日期:2022-08-08
  • 作者简介:杜明(1975-),男,副教授、博士,主研方向为自然语言处理、信息查询、数据分析;郝燕(通信作者),硕士研究生;周军锋,教授、博士、博士生导师、CCF会员;谭玉婷,博士。
  • 基金资助:
    国家自然科学基金(61472339,61873337);上海市自然科学基金(20ZR1402700)。

An Efficient Method for Mining Periodic Cliques

DU Ming, HAO Yan, ZHOU Junfeng, TAN Yuting   

  1. School of Computer Science and Technology, DongHua University, Shanghai 201620, China
  • Received:2022-03-16 Revised:2022-05-19 Published:2022-08-08

摘要: 周期团是在时态网络上出现时机满足特定周期要求的完全子图,周期团挖掘用于挖掘时态图中具有周期性的团。针对现有周期团挖掘方法效率低的问题,设计三种高效的剪枝策略EMP-FlagVex、EMP-FlagEdge和EMP-FlagEdge+,并提出一种基于边上时间戳序列的求解方法EMP。枚举满足要求的极大团,并对枚举出的极大团进行周期验证。验证操作是提取极大团每条边上的时间戳集合,并对集合中出现的时间点进行计数。若某个时间点出现的次数等于提取的集合个数,则将其放入新集合。在此基础上,判断新集合中的序列是否具有周期性。实验结果表明,相比基础方法EMP,将EMP与EMP-FlagEdge+剪枝策略相结合的方法在PS、Lkml、Enron等数据集上的运行时间加快了15倍以上。相比MPC算法,基于顶点度数的EMP-FlagVex剪枝策略的挖掘效率提高约1倍,基于边上时间戳序列长度的EMP-FlagEdge剪枝策略的挖掘效率提高10倍,基于周期子序列长度的EMP-FlagEdge+剪枝策略的挖掘效率提高约30倍。

关键词: 时态网络, 周期团, 时间戳, 剪枝策略, 最长周期序列, 周期长度

Abstract: Periodic cliques are complete subgraphs that meet specific periodic requirements when they appear on the temporal network.Periodic cliques mining is used to mine periodic cliques in the temporal graph.Considering the low efficiency of the existing periodic cliques mining methods, three efficient pruning strategies EMP-FlagVex, EMP-FlagEdge, and EMP-FlagEdge+ are designed, and a solution method based on the edge timestamp sequence EMP is proposed.Enumerate the maximum cliques that meet the requirements, and periodically verify the maximum cliques listed.The verification operation involves extracting the timestamp set on each side of the maximum cliques and counting the time points that appear in the set.If the number of occurrences of a time point is equal to the number of extracted sets, it will be placed in a new set.On this basis, it can be judged whether the sequence in the new set is periodic.The experimental results show that the running time of the method combining EMP with EMP-FlagEdge+ pruning strategy on PS, Lkml, Enron and other data sets is approximately 15 times faster than the basic method EMP.Compared with the MPC algorithm, the mining efficiency of the EMP-FlagVex pruning strategy based on vertex degree is approximately 1 time higher, the EMP-FlagEdge pruning strategy based on the length of the timestamp sequence on the edge is 10 times higher, and the EMP-FlagEdge+ pruning strategy based on the length of the periodic subsequence is approximately 30 times higher.

Key words: temporal network, periodic cliques, timestamp, pruning strategy, longest period sequence, period length

中图分类号: