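Abstract: To meet the processing requirements of massive e-mail data and practical business needs, an automatic classification system for massive e-mail is designed on the basis of a database programming language. The system consists of three parts: a feature learning module, a database query module, and a Bayesian classification module. Combining the Bayesian classification algorithm with the efficiency of PL/SQL when interacting with the database, it extracts and represents the features of unknown e-mails inside Oracle PL/SQL stored procedures, achieving effective classification of massive e-mail data.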
Key words:
massive e-mail data,
feature learning,
database programming language,
stored procedure,
Bayesian classification
Abstract: To meet the processing requirements of massive e-mail data and practical operational demands, this paper designs and implements a massive e-mail classification system based on a database programming language. The system consists of a feature learning module, a database query module, and a Bayesian classification module. Combining Bayesian classification with the efficiency of PL/SQL in interacting with the database, it extracts and represents e-mail features inside Oracle PL/SQL stored procedures, and classifies massive e-mail data accurately and efficiently.
Key words:
massive e-mail data,
feature learning,
database programming language,
stored procedure,
Bayesian classification
CLC number:
段 丹; 郭绍忠; 甄 涛; 刘晓楠. Massive E-mail Data Classification Technology Based on Database Programming Language[J]. 计算机工程 (Computer Engineering), 2008, 34(9): 70-72,7.
DUAN Dan; GUO Shao-zhong; ZHEN Tao; LIU Xiao-nan. Massive E-mail Data Classification Technology Based on Database Programming Language[J]. Computer Engineering, 2008, 34(9): 70-72,7.