
Computer Engineering ›› 2012, Vol. 38 ›› Issue (24): 53-56. doi: 10.3969/j.issn.1000-3428.2012.24.013

• Software Technology and Database •

Homework Similarity Detection System Based on Sequence Matching

WANG Xiao-ying 1, JIN Li 2, WANG Xiao-qing 1, HUANG Wei-tong 3

  (1. Department of Computer Technology and Application, Qinghai University, Xining 810016, China; 2. Information Center of Qinghai Province, Xining 810008, China; 3. Department of Computer Science and Technology, Tsinghua University, Beijing 100084, China)
  • Received: 2011-09-05 Revised: 2011-11-03 Online: 2012-12-20 Published: 2012-10-18
  • About the authors: WANG Xiao-ying (born 1982), female, associate professor, Ph.D., main research interests: high-performance computing, cloud computing, parallel computing; JIN Li, Ph.D.; WANG Xiao-qing, professor; HUANG Wei-tong, associate professor, Ph.D.
  • Supported by:
    National "Quality Engineering" Fund project "Innovative Experimental Zone for Training Application-Oriented Information Technology Talent for the Western Regions"; Qinghai University Education and Teaching Reform Research Fund project (JY1011003)

Abstract: To help teachers grade electronic homework and identify plagiarism, this paper presents the design and implementation of a homework similarity detection system based on sequence matching. A similarity measurement model is established with each class as a group. A sequence matching algorithm computes the length of the common subsequence, from which the similarity between each pair of student documents in the same group is obtained. The similarity matrix is then normalized, taking the impact of document templates into account, and clustered into groups. The comparison results are visualized so that teachers can intuitively grasp the similarity distribution across the whole class. Experimental results show the feasibility and practicality of the system, which can help teachers detect suspected plagiarism quickly and efficiently when grading.

Key words: electronic homework, similarity detection, plagiarism detection, sequence matching, similarity clustering, common subsequence

CLC number:
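
The pairwise comparison step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the similarity formula, so dividing the common subsequence length by the length of the longer text is an assumption, and the names `lcs_length`, `similarity`, and `pairwise_similarity` are hypothetical. Template handling, clustering, and visualization are omitted.

```python
from __future__ import annotations

from itertools import combinations


def lcs_length(a: str, b: str) -> int:
    """Length of the longest common subsequence of a and b (row-by-row DP)."""
    prev = [0] * (len(b) + 1)
    for ch_a in a:
        curr = [0]
        for j, ch_b in enumerate(b, start=1):
            if ch_a == ch_b:
                # Extend the common subsequence ending before both characters.
                curr.append(prev[j - 1] + 1)
            else:
                # Skip one character from either text, keep the better result.
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def similarity(a: str, b: str) -> float:
    """LCS length over the longer text's length (assumed normalization)."""
    longest = max(len(a), len(b))
    return lcs_length(a, b) / longest if longest else 0.0


def pairwise_similarity(docs: dict[str, str]) -> dict[tuple[str, str], float]:
    """Similarity for every unordered pair of submissions in one group."""
    return {
        (x, y): similarity(docs[x], docs[y])
        for x, y in combinations(sorted(docs), 2)
    }
```

Calling `pairwise_similarity` on a dictionary that maps a student identifier to the submitted text returns one score per pair of students in the group; pairs scoring close to 1.0 would be the ones a teacher might inspect first.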