
Computer Engineering (计算机工程), 2021, Vol. 47, Issue (5): 65-72. doi: 10.19678/j.issn.1000-3428.0058042

• Artificial Intelligence and Pattern Recognition •

Research on Behavioral Similarity among Online Homologous Users in Virtual Space

MA Manfu1,2, ZHANG Kaixuan1,2, LI Yong1,2, WANG Changqing3, ZHANG Qiang1,2

  1. College of Computer Science and Engineering, Northwest Normal University, Lanzhou 730070, China;
  2. Gansu Provincial IoT Engineering Research Center, Lanzhou 730070, China;
  3. Domain Named System Laboratory, China Internet Network Information Center, Beijing 100190, China
  • Received: 2020-04-13  Revised: 2020-06-01  Published: 2020-06-05
  • About the authors: MA Manfu (b. 1968), professor, Ph.D., main research interests: big data and machine learning; ZHANG Kaixuan, master's student; LI Yong (corresponding author), associate professor, Ph.D.; WANG Changqing, senior engineer, Ph.D.; ZHANG Qiang, professor, Ph.D.
  • Funding:
    National Natural Science Foundation of China (71764025, 61863032, 61662070); Scientific Research Project of Higher Education Institutions of Gansu Province (2018A-001); Gansu Provincial Education Science Planning Project (GS[2018]GHBBKZ021, GS[2018]GHBBKW007).

Abstract: Online homologous users in virtual space exhibit similar behavioral characteristics, but existing similarity measurement algorithms have difficulty identifying them effectively. To address this problem, this paper proposes an online homologous user identification algorithm based on sequence alignment. Click stream data is extracted from online user behavior logs, and the sequence alignment method is used to calculate the behavioral similarity of online users, which is represented as a behavioral similarity matrix. Hierarchical clustering is then performed on the users to identify online homologous users in virtual space. The influence of user feature attributes along different dimensions on behavioral similarity is also analyzed. Experimental results show that the proposed algorithm can accurately identify online homologous users, and that user behavioral similarity is strongly influenced by gender, registered residence, and education level, and less affected by age, social class, and income level.

Key words: behavior characteristic, online homologous users, sequence alignment, behavioral similarity, feature attribute
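
The pipeline outlined in the abstract (click sequences, pairwise alignment, similarity matrix, hierarchical clustering) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not state the alignment variant, scoring scheme, linkage criterion, or stopping rule, so the Needleman-Wunsch global alignment, the +1/-1/-1 scores, the average linkage, and the 0.7 threshold below are all assumptions chosen for illustration, as are the function names and the toy click sequences.

```python
from itertools import combinations


def alignment_score(a, b, match=1, mismatch=-1, gap=-1):
    """Global (Needleman-Wunsch) alignment score of two click sequences.

    The scoring values are illustrative assumptions, not taken from the paper.
    """
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            curr.append(max(diag, prev[j] + gap, curr[j - 1] + gap))
        prev = curr
    return prev[-1]


def similarity(a, b):
    """Rescale the alignment score to [0, 1] (valid for the default scores)."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    # With +1/-1/-1 scoring, the score always lies in [-longest, longest].
    return (alignment_score(a, b) + longest) / (2 * longest)


def similarity_matrix(sequences):
    """Symmetric matrix of pairwise behavioral similarities."""
    n = len(sequences)
    sim = [[1.0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        sim[i][j] = sim[j][i] = similarity(sequences[i], sequences[j])
    return sim


def cluster(sim, threshold=0.7):
    """Average-linkage agglomerative clustering on a similarity matrix.

    Merging stops once no pair of clusters reaches the (assumed) threshold.
    """
    clusters = [[i] for i in range(len(sim))]
    while len(clusters) > 1:
        best, pair = -1.0, None
        for x, y in combinations(range(len(clusters)), 2):
            total = sum(sim[i][j] for i in clusters[x] for j in clusters[y])
            avg = total / (len(clusters[x]) * len(clusters[y]))
            if avg > best:
                best, pair = avg, (x, y)
        if best < threshold:
            break
        x, y = pair  # x < y, so index x stays valid after popping y
        merged = clusters.pop(y)
        clusters[x].extend(merged)
    return clusters


# Hypothetical click streams, one list of page labels per user.
sequences = [
    ["home", "search", "item", "cart", "pay"],
    ["home", "search", "item", "pay"],
    ["news", "video", "forum", "logout"],
    ["news", "video", "video", "forum", "logout"],
]
groups = cluster(similarity_matrix(sequences))
print(groups)
```

In this toy input the first two users and the last two users follow nearly identical paths, so each pair ends up in its own group. For real logs, a library routine such as SciPy's hierarchical clustering would likely replace the quadratic loop in `cluster`.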

CLC number: