Computer Engineering (计算机工程), 2007, Vol. 33, Issue (21): 155-156. doi: 10.3969/j.issn.1000-3428.2007.21.055

• Artificial Intelligence and Recognition Technology •

Transformed Keywords Identification in Information Security (信息安全中的变形关键词的识别)

LI Dun (李钝), CAO Yuan-da (曹元大), WAN Yue-liang (万月亮)

  1. (School of Computer Science & Technology, Beijing Institute of Technology, Beijing 100081)
  • Received: 1900-01-01 Revised: 1900-01-01 Online: 2007-11-05 Published: 2007-11-05

Abstract: To evade security filtering, malicious users on the Internet deform the text of harmful content before spreading it online. To identify and filter such text, this paper analyzes the features of the deformation and preprocesses the text according to word co-occurrence and differences in character encoding rules, extracting the harmful strings that carry deformation features. Because the characters in these strings tend to occur adjacently, in order, and frequently, a self-learning algorithm based on association rules is proposed to extract security-specific keywords and update the harmful-feature dictionary. Experiments show that the method can identify deformed keywords that traditional methods miss, complements topic filtering, and improves the efficiency of content-based security filtering.

Key words: association rules, security filtering, keyword identification, transformed text

CLC Number: