Abstract:
This paper proposes an improved algorithm that determines the nature of a text by optimizing dictionary matching. The algorithm classifies text through real-time analysis of its content; at a throughput of 200,000 Chinese characters per second, it identifies bad text on Web pages effectively and in real time. The bad Web filter, built on anti-jamming preprocessing and a misjudgment-prevention algorithm, achieves a recognition rate above 95% and reduces the misjudgment rate to below 1%. It provides a foundation for further blocking of junk information.
Key words:
Web filter algorithm,
anti-jamming preprocessing,
dictionary match algorithm
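The abstract names three stages (anti-jamming preprocessing, dictionary matching, and a safeguard against misjudgment) without giving their details. The following Python sketch shows one plausible shape for such a pipeline; it is not the authors' implementation. Everything in it is an assumption made for illustration: the term list `BAD_TERMS`, the choice to treat non-alphanumeric characters as interference, the function names, and the `min_hits` threshold used as a crude misjudgment guard. The optimized matching that gives the reported throughput is not reproduced here.

```python
import unicodedata

# Hypothetical dictionary of flagged terms (placeholders, not taken from the paper).
BAD_TERMS = {"badword", "spamoffer", "不良"}


def preprocess(text: str) -> str:
    """Assumed anti-jamming step: undo simple obfuscation before matching.

    NFKC folds full-width and compatibility forms into their plain equivalents,
    lower() removes case variation, and the filter drops spaces, punctuation
    and symbols that may have been inserted to break up dictionary terms.
    """
    normalized = unicodedata.normalize("NFKC", text).lower()
    return "".join(ch for ch in normalized if ch.isalnum())


def count_hits(cleaned: str, terms=BAD_TERMS) -> int:
    """Dictionary match: total non-overlapping occurrences of all terms."""
    return sum(cleaned.count(term) for term in terms)


def is_bad(text: str, min_hits: int = 2) -> bool:
    """Assumed misjudgment guard: flag only when several hits accumulate."""
    return count_hits(preprocess(text)) >= min_hits
```

For example, `is_bad("b.a.d w-o-r-d and S P A M offer")` returns `True` because both obfuscated terms are recovered, while `is_bad("one badword in passing")` returns `False` because a single hit stays under the threshold. Removing all separators also lets terms match across word boundaries, which is one reason a guard against misjudgment is needed at all; a production filter would likely weigh hits rather than merely count them.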
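Abstract (translated from the Chinese): An improved algorithm is proposed that determines the nature of a text through optimized dictionary matching. The nature of the text is determined by real-time analysis of its content; the algorithm can analyze 200,000 Chinese characters per second and identifies bad text on Web pages effectively and in real time. The anti-jamming bad Web page filter is designed and developed on the principle of anti-jamming preprocessing and a misjudgment-prevention algorithm, bringing the recognition rate above 95% and the misjudgment rate below 1%, and providing a basis for further blocking of junk information.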
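Key words (translated from the Chinese):
Web page filtering algorithm,
anti-jamming preprocessing,
dictionary matching algorithm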
LAI Yonghao; XIE Zanfu. Research on Anti-jamming Bad Web Filter Algorithm[J]. Computer Engineering, 2007, 33(11): 98-99.
赖勇浩;谢赞福. 防干扰的不良网页过滤算法研究[J]. 计算机工程, 2007, 33(11): 98-99.