
Computer Engineering ›› 2010, Vol. 36 ›› Issue (16): 63-64. doi: 10.3969/j.issn.1000-3428.2010.16.023

• Software Technology and Database •

Feature Extraction Method in Chinese Text Based on Flexible Matching

SHUAI Zheng-hua, ZHOU Xue-guang

  1. (College of Electronic Engineering, Naval University of Engineering, Wuhan 430033)
  • Publication date: 2010-08-20 Release date: 2010-08-17
  • About the authors: SHUAI Zheng-hua (born 1984), male, master's student, main research interest: information security; ZHOU Xue-guang, professor and doctoral supervisor

Abstract:

To address the problem of filtering objectionable information that contains keywords in a transformed (transmutative) form, this paper presents a feature information extraction method for Chinese text based on flexible matching. The method uses flexible matching to identify and extract transformed keywords, and it improves the calculation of feature term weights in the Vector Space Model (VSM) by assigning a higher weight to keywords that appear in a transformed form, which improves the efficiency of feature information extraction. Experimental results show that the method achieves a high recall rate while maintaining filtering precision.

Key words: flexible matching, feature information extraction, transmutative keyword, feature item weight
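
The two ideas in the abstract, flexible matching of transformed keywords and a higher feature weight for them, can be illustrated with a minimal Python sketch. This is not the authors' algorithm, which the abstract does not specify. It assumes that a "transformed" keyword is one whose characters are separated by a few non-Chinese noise characters, and that the weight is a raw term frequency in which transformed occurrences count more. The names `MAX_GAP`, `TRANSFORM_BOOST`, `flexible_pattern`, `count_matches`, and `feature_weights` are hypothetical.

```python
import re

# Hypothetical settings for illustration; none of these values come from the paper.
MAX_GAP = 2            # max noise characters tolerated between keyword characters
TRANSFORM_BOOST = 2.0  # weight of one transformed occurrence relative to an exact one


def flexible_pattern(keyword, max_gap=MAX_GAP):
    """Build a regex that matches the keyword even when up to max_gap
    non-CJK characters (spaces, symbols, letters) sit between its characters."""
    noise = r"[^\u4e00-\u9fff]{0,%d}" % max_gap
    return re.compile(noise.join(re.escape(ch) for ch in keyword))


def count_matches(text, keywords):
    """Return {keyword: (exact_count, transformed_count)} for the given text."""
    counts = {}
    for kw in keywords:
        exact = transformed = 0
        for m in flexible_pattern(kw).finditer(text):
            if m.group(0) == kw:
                exact += 1        # keyword written normally
            else:
                transformed += 1  # keyword written with inserted noise
        counts[kw] = (exact, transformed)
    return counts


def feature_weights(text, keywords, boost=TRANSFORM_BOOST):
    """Assumed weighting: term frequency where each transformed occurrence
    contributes `boost` instead of 1."""
    counts = count_matches(text, keywords)
    return {kw: exact + boost * transformed
            for kw, (exact, transformed) in counts.items()}
```

For example, with the keyword "发票" (invoice), `feature_weights("代开发票,代开发*票,发 票", ["发票"])` counts one exact occurrence and two transformed ones ("发*票" and "发 票") and returns `{'发票': 5.0}`. A production filter would need a richer notion of transformation (homophones, split characters, pinyin) and would have to guard against false matches across sentence boundaries, which this sketch ignores.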

CLC Number: