Computer Engineering ›› 2023, Vol. 49 ›› Issue (11): 94-105, 114. doi: 10.19678/j.issn.1000-3428.0066258

• Artificial Intelligence and Pattern Recognition •

Process Model Mining Method Based on Decomposed Cycle Structure

Kang WANG1, Cong LIU1,2, Lu WANG3,*, Qingtian ZENG1

  1. School of Electronic Information Engineering, Shandong University of Science and Technology, Qingdao 266590, Shandong, China
  2. School of Computer Science and Technology, Shandong University of Technology, Zibo 255000, Shandong, China
  3. College of Computer Science and Engineering, Shandong University of Science and Technology, Qingdao 266590, Shandong, China
  • Received: 2022-11-14 Online: 2023-11-15 Published: 2023-11-08
  • Contact: Lu WANG
  • About the authors:

    Kang WANG (born 1998), male, master's student; main research interest: process mining

    Cong LIU, professor, Ph.D.

    Qingtian ZENG, professor, Ph.D.

  • Funding:
    National Natural Science Foundation of China (61902222); Taishan Scholars Program Special Fund of Shandong Province (ts20190936); Taishan Scholars Program Special Fund of Shandong Province (tsqn201909109); Excellent Young Scientists Fund of the Shandong Provincial Natural Science Foundation (ZR2021YQ45); Innovation Team Project of the Youth Innovation Science and Technology Program of Shandong Provincial Higher Education Institutions (QC2021948080); Youth Fund Project of Humanities and Social Sciences Research of the Ministry of Education (20YJCZH159); Youth Fund of the Shandong Provincial Natural Science Foundation (ZR2022QF020)

Abstract:

Model mining, one of the active areas of process mining, aims to generate models that describe business processes from event logs. Event logs may contain activities with decomposable cyclic dependencies. Such activities can neither be removed by filtering infrequent activities nor be treated as chaotic activities, and they lower the precision of the resulting process models. Existing methods cannot divide a noisy event log according to the presence or absence of cyclic structures, so they cannot correctly identify activities with decomposable cyclic dependencies in the sub-log without cyclic structures; they also depend on activity attributes. To overcome these shortcomings and improve the quality of mined models, a decomposed process model mining framework that separates the cyclic structure from decomposable cyclic dependencies is proposed. First, a heuristic method divides the event log into two parts according to whether a cyclic structure is present, and the activities with decomposable cyclic dependencies are identified in the part without cyclic structures according to the frequencies of the reachability relations and the directly-follows relations between activities. Then, these activities are filtered out of the part with cyclic structures so that the cyclic structure of the event log can be identified and a set of sub-logs obtained by projection. Finally, existing process model mining techniques are used to mine sub-models, which are merged according to the branch-structure relations of their boundary activities. The proposed method is implemented on the ProM platform and is compared quantitatively, on public event logs, with the maximal-partition-based framework, the stage-based business process model discovery method, and the direct use of Inductive Miner. The experiments indicate that, compared with these methods, the proposed method improves precision by 0.08-0.42 and reduces complexity by 3.86-45.92.

Key words: decomposed process mining, model mining, heuristic mining, decomposable cyclic dependencies, model quality
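
The first two steps summarized in the abstract (dividing the log by whether a cyclic structure is present, then counting reachability and directly-follows frequencies on the loop-free part) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the abstract does not give the heuristic, any thresholds, or the decision rule, so `has_repeat` uses a repeated activity as a crude stand-in for "contains a cyclic structure", noise is ignored, and the function names are hypothetical rather than taken from the paper or from ProM.

```python
from collections import Counter


def has_repeat(trace):
    """Assumed loop signal: some activity occurs more than once in the trace."""
    return len(set(trace)) < len(trace)


def split_log(log):
    """Partition traces into (traces without repeats, traces with repeats)."""
    acyclic = [t for t in log if not has_repeat(t)]
    cyclic = [t for t in log if has_repeat(t)]
    return acyclic, cyclic


def directly_follows(log):
    """Count how often activity b occurs immediately after activity a."""
    counts = Counter()
    for trace in log:
        counts.update(zip(trace, trace[1:]))
    return counts


def eventually_follows(log):
    """Count how often activity b occurs anywhere after activity a (reachability)."""
    counts = Counter()
    for trace in log:
        for i, a in enumerate(trace):
            for b in trace[i + 1:]:
                counts[(a, b)] += 1
    return counts


# Toy event log: each trace is a sequence of activity labels (illustrative data only).
log = [
    ["a", "b", "c", "d"],
    ["a", "c", "b", "d"],
    ["a", "b", "c", "b", "c", "d"],
]

acyclic, cyclic = split_log(log)
df = directly_follows(acyclic)
ef = eventually_follows(acyclic)

print("loop-free traces:", len(acyclic), "| traces with repeats:", len(cyclic))
print("directly-follows a->d:", df[("a", "d")], "| reachability a->d:", ef[("a", "d")])
```

The sketch stops at the two frequency tables. How the method combines them to flag activities with decomposable cyclic dependencies is not stated in the abstract, so no decision rule is shown.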