Abstract:
This paper presents a plagiarism-detection algorithm for scientific papers based on local word-frequency fingerprints. The sentence is treated as the basic unit of a document: effective keywords are extracted from each sentence, then sorted and reassembled. A sentence fingerprint is obtained by combining an encoding with word frequency, and these fingerprints are used to compute the similarity between texts. Experiments on SOGOU-T, a reduced collection of news web pages, show that the algorithm partly overcomes the low detection precision of existing plagiarism-detection algorithms for scientific papers and achieves better detection precision and speed.
Key words:
plagiarism-detection,
digital fingerprint,
local word-frequency,
similarity
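The abstract gives only the outline of the method (sentence as unit, keyword extraction, sorting, a fingerprint built from an encoding plus word frequency, then a similarity score). The following is a minimal sketch of that outline, not the authors' implementation. The stop-word list, the regex tokenizer, the SHA-256 encoding, and the Jaccard similarity are all assumptions introduced for illustration; Chinese text would additionally need a word segmenter, which is not shown.

```python
import hashlib
import re
from collections import Counter

# Hypothetical stop-word list; the paper's actual keyword filter is not given.
STOP_WORDS = {"a", "an", "the", "of", "is", "are", "and", "or", "in", "on", "to"}


def split_sentences(text):
    """Split on common English and Chinese sentence terminators."""
    parts = re.split(r"[.!?。!?]+", text)
    return [p.strip() for p in parts if p.strip()]


def sentence_fingerprint(sentence):
    """Return a fingerprint for one sentence, or None if it has no keywords."""
    words = [w for w in re.findall(r"\w+", sentence.lower()) if w not in STOP_WORDS]
    if not words:
        return None
    counts = Counter(words)  # local (per-sentence) word frequency
    # Sorting the keywords makes the fingerprint independent of word order.
    canonical = "|".join(f"{w}:{counts[w]}" for w in sorted(counts))
    # SHA-256 stands in for the paper's unspecified encoding (assumption).
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def document_fingerprints(text):
    """Collect the set of sentence fingerprints for a whole document."""
    fps = (sentence_fingerprint(s) for s in split_sentences(text))
    return {fp for fp in fps if fp is not None}


def similarity(text_a, text_b):
    """Jaccard overlap of the two fingerprint sets (an assumed measure)."""
    a, b = document_fingerprints(text_a), document_fingerprints(text_b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)
```

Because keywords are sorted before hashing, a sentence whose words were merely reordered yields the same fingerprint as the original, which is one plausible reading of the "sorting and reconstructing" step.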
QIN Yu-ping, LENG Qiang-kui, WANG Xiu-kun, WANG Chun-li. Plagiarism-detection Algorithm for Scientific Papers Based on Local Word-frequency Fingerprint[J]. Computer Engineering, 2011, 37(6): 193-194.