
Computer Engineering, 2008, Vol. 34, Issue (9): 70-72,7. doi: 10.3969/j.issn.1000-3428.2008.09.025

• Software Technology and Database •

Massive E-mail Data Classification Technology Based on Database Programming Language

DUAN Dan (段丹), GUO Shao-zhong (郭绍忠), ZHEN Tao (甄涛), LIU Xiao-nan (刘晓楠)

  1. (Information Engineering Institute, PLA Information Engineering University, Zhengzhou 450002)
  • Online: 2008-05-05 Published: 2008-05-05

Abstract: To meet the processing requirements and practical business needs of massive e-mail data, this paper designs and implements an automatic classification system for massive e-mail based on a database programming language. The system consists of three parts: a feature learning module, a database query module, and a Bayesian classification module. Combining the Bayesian classification algorithm with the efficiency of PL/SQL when interacting with the database, the system extracts and represents the features of unknown e-mails inside ORACLE PL/SQL stored procedures, achieving effective classification of massive e-mail data.

Key words: massive e-mail data, feature learning, database programming language, stored procedure, Bayesian classification
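
The abstract names the three modules but gives no implementation details, so the following is only a minimal sketch of how such a pipeline might fit together, not the authors' method. It assumes Python with the standard-library `sqlite3` module as a stand-in for ORACLE PL/SQL stored procedures. The table names (`token_counts`, `class_counts`), the function names (`tokenize`, `train`, `classify`), the whitespace-style tokenizer, the Laplace smoothing, and the `spam`/`ham` labels are all hypothetical choices made for illustration.

```python
import math
import re
import sqlite3


def tokenize(text):
    """Lowercase the text and split it into simple word tokens (assumed feature form)."""
    return re.findall(r"[a-z']+", text.lower())


def train(conn, labeled_mails):
    """Feature learning: store per-class document counts and token counts in the database."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS class_counts (label TEXT PRIMARY KEY, docs INTEGER)"
    )
    conn.execute(
        "CREATE TABLE IF NOT EXISTS token_counts ("
        "label TEXT, token TEXT, n INTEGER, PRIMARY KEY (label, token))"
    )
    for label, text in labeled_mails:
        conn.execute("INSERT OR IGNORE INTO class_counts VALUES (?, 0)", (label,))
        conn.execute("UPDATE class_counts SET docs = docs + 1 WHERE label = ?", (label,))
        for tok in tokenize(text):
            conn.execute("INSERT OR IGNORE INTO token_counts VALUES (?, ?, 0)", (label, tok))
            conn.execute(
                "UPDATE token_counts SET n = n + 1 WHERE label = ? AND token = ?",
                (label, tok),
            )
    conn.commit()


def classify(conn, text):
    """Database query + Bayesian classification: return the label with the highest log-posterior."""
    tokens = tokenize(text)
    total_docs = conn.execute("SELECT SUM(docs) FROM class_counts").fetchone()[0]
    vocab = conn.execute("SELECT COUNT(DISTINCT token) FROM token_counts").fetchone()[0]
    classes = conn.execute("SELECT label, docs FROM class_counts ORDER BY label").fetchall()

    best_label, best_score = None, -math.inf
    for label, docs in classes:
        class_total = conn.execute(
            "SELECT COALESCE(SUM(n), 0) FROM token_counts WHERE label = ?", (label,)
        ).fetchone()[0]
        score = math.log(docs / total_docs)  # log prior of the class
        for tok in tokens:
            row = conn.execute(
                "SELECT n FROM token_counts WHERE label = ? AND token = ?", (label, tok)
            ).fetchone()
            n = row[0] if row else 0
            # Laplace-smoothed log likelihood of the token given the class
            score += math.log((n + 1) / (class_total + vocab))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    train(conn, [
        ("spam", "win free prize now"),
        ("spam", "free money offer"),
        ("ham", "meeting schedule for project"),
        ("ham", "project report attached"),
    ])
    print(classify(conn, "free prize offer"))
```

In this sketch the counts live in database tables and the classifier only issues queries against them, which loosely mirrors the idea of keeping feature extraction and lookup close to the data. A stored-procedure version would move the loops in `classify` into the database itself.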
