Computer Engineering ›› 2010, Vol. 36 ›› Issue (1): 40-42. doi: 10.3969/j.issn.1000-3428.2010.01.015

• Software Technology and Database •

Data Mining of Massive Mail Based on Storage Procedure

GUO Shao-zhong, ZHEN Tao, JIA Qi

  1. (School of Information Engineering, PLA Information Engineering University, Zhengzhou 450002)
  • Received: 1900-01-01 Revised: 1900-01-01 Online: 2010-01-05 Published: 2010-01-05

Abstract: Existing mail systems lack functions for analyzing and mining massive mail data, and the traditional approach of classifying mail one message at a time is inefficient. To address this problem, this paper studies the characteristics of text mining and proposes an efficient algorithm for mining the content of massive mail data, implemented with the stored procedures of a Relational Database Management System (RDBMS). The algorithm's performance is then optimized at multiple levels. Experimental results show that the algorithm is efficient, stable, and generally applicable.

Key words: mail classifier, data mining, stored procedure

CLC number: