Computer Engineering, 2007, Vol. 33, Issue 01: 267-269. doi: 10.3969/j.issn.1000-3428.2007.01.093

• Developmental Research •

Chinese Spam Filter System Based on Analysis Using Milter Interface

YANG Jie, ZHANG Jianzhong, SHEN Qingyong, HE Yun   

  1. (Department of Computer Science and Technology, Nankai University, Tianjin 300071)
  • Online: 2007-01-05 Published: 2007-01-05

基于Milter实现的中文垃圾邮件过滤系统

杨 洁,张建忠,申庆永,何 云   

  1. (南开大学计算机科学与技术系,天津 300071)

Abstract: This paper presents an implementation scheme for a real-time, content-based filtering system for Chinese spam e-mail. The system runs on a Sendmail mail server under Linux. It uses the Milter interface to extract e-mail content in real time, then classifies and filters each message using Chinese word segmentation and text categorization algorithms. Because multiple text categorization algorithms can be embedded in it, the system is highly extensible. The embedded categorization algorithms are analyzed and compared experimentally.

Key words: Mail classification, Chinese word segmentation, Bayes, KNN
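
The pipeline the abstract describes (segment the message body, then pass the tokens to a pluggable classifier) can be sketched as below. This is a minimal illustration and not the authors' implementation. The toy lexicon, the forward-maximum-matching segmenter, the `NaiveBayes` class, and `filter_message` are all hypothetical names chosen for the example. `filter_message` merely stands in for the point where a Milter callback would hand over the message body; no real Milter API is called.

```python
import math
from collections import Counter

# Toy lexicon, for illustration only; a real system would load a full dictionary.
LEXICON = {"免费", "发票", "优惠", "会议", "通知", "项目", "明天"}
MAX_WORD_LEN = 4


def segment(text):
    """Forward maximum matching: take the longest lexicon word at each position,
    falling back to a single character when nothing matches."""
    tokens, i = [], 0
    while i < len(text):
        for size in range(min(MAX_WORD_LEN, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if size == 1 or piece in LEXICON:
                tokens.append(piece)
                i += size
                break
    return tokens


class NaiveBayes:
    """Multinomial naive Bayes with add-one smoothing.
    Assumes at least one training message per label before classify() is called."""

    LABELS = ("spam", "ham")

    def __init__(self):
        self.word_counts = {label: Counter() for label in self.LABELS}
        self.doc_counts = Counter()

    def train(self, tokens, label):
        self.word_counts[label].update(tokens)
        self.doc_counts[label] += 1

    def classify(self, tokens):
        vocab = set(self.word_counts["spam"]) | set(self.word_counts["ham"])
        total_docs = sum(self.doc_counts.values())
        best_label, best_score = None, -math.inf
        for label in self.LABELS:
            total_words = sum(self.word_counts[label].values())
            # Log prior plus smoothed log likelihood of each token.
            score = math.log(self.doc_counts[label] / total_docs)
            for tok in tokens:
                count = self.word_counts[label][tok]
                score += math.log((count + 1) / (total_words + len(vocab)))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


def filter_message(body, classifier):
    """Hypothetical hook: decide what to do with one message body.
    Any object with a classify(tokens) method can be plugged in."""
    return "reject" if classifier.classify(segment(body)) == "spam" else "accept"


nb = NaiveBayes()
nb.train(segment("免费发票优惠"), "spam")
nb.train(segment("明天项目会议通知"), "ham")
print(filter_message("免费优惠", nb))
print(filter_message("会议通知", nb))
```

Because `filter_message` depends only on a `classify(tokens)` method, a KNN classifier exposing the same method could replace `NaiveBayes` without touching the segmentation step, which is the kind of extensibility the abstract refers to.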
