Abstract:
This paper proposes a fast spam detection method for high-speed networks. A portion of the message body is extracted and hashed into a fingerprint, and the resulting fingerprint values reveal repeated body content. The method needs neither to decode the mail nor to process its entire content, and the amount of content hashed is independent of the mail size. It is therefore suitable as a fast spam detection technique on high-speed networks.
Key words:
Spam; Rabin fingerprint; High-speed network
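Abstract (translated from Chinese): A real-time spam detection method for high-speed network environments is proposed. A portion of the message body is extracted and hashed into fingerprints, and the resulting fingerprint values reveal repeated body content. The method requires neither decoding nor processing of the entire mail content, and the amount of hashed content is independent of the mail size. It works especially well for binary-type spam, which ordinary text classification methods cannot handle, and is suitable as a fast spam detection technique on high-speed backbone networks. Preliminary experiments show that the method achieves high processing speed and identifies repeated content accurately.
Key words:
Spam; Rabin fingerprint; High-speed network environment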
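The idea summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the sample size, the repeat threshold, and the names `fingerprint`, `body_sample`, and `RepeatDetector` are all assumptions made for illustration, and a polynomial hash modulo a Mersenne prime is used as a simplified stand-in for a true Rabin fingerprint over GF(2). The sketch hashes a fixed-size slice of the raw, undecoded body, so the work per message does not grow with the mail size.

```python
from collections import Counter

SAMPLE_BYTES = 256          # assumed fixed sample size, independent of mail size
BASE = 257                  # assumed hash base
MODULUS = (1 << 61) - 1     # Mersenne prime 2^61 - 1
REPEAT_THRESHOLD = 3        # assumed number of sightings before flagging


def fingerprint(sample: bytes) -> int:
    """Polynomial hash of the sample (a stand-in for a Rabin fingerprint)."""
    value = 0
    for byte in sample:
        # The +1 keeps leading zero bytes from being ignored by the hash.
        value = (value * BASE + byte + 1) % MODULUS
    return value


def body_sample(raw_message: bytes) -> bytes:
    """Return the first SAMPLE_BYTES of the raw body, without decoding it."""
    # Headers and body are separated by the first blank line.
    _, _, body = raw_message.partition(b"\r\n\r\n")
    return body[:SAMPLE_BYTES]


class RepeatDetector:
    """Counts fingerprints and flags bodies that have been seen repeatedly."""

    def __init__(self, threshold: int = REPEAT_THRESHOLD) -> None:
        self.threshold = threshold
        self.counts = Counter()

    def observe(self, raw_message: bytes) -> bool:
        """Record one message; return True once its sample has repeated enough."""
        fp = fingerprint(body_sample(raw_message))
        self.counts[fp] += 1
        return self.counts[fp] >= self.threshold
```

Because only the raw bytes of the sample are hashed, the same check applies to base64-encoded or other binary attachments without decoding them. A production system would also need to expire old fingerprints and choose the sampled region so that small per-recipient variations do not defeat the match; both are outside this sketch.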
LIU Jie, CHENG Xueqi. Fast Spam Detecting Method Under High-speed Network[J]. Computer Engineering, 2006, 32(4): 139-141.
刘 杰,程学旗. 高速网络环境下的垃圾邮件快速检测技术[J]. 计算机工程, 2006, 32(4): 139-141.