
Computer Engineering ›› 2012, Vol. 38 ›› Issue (20): 21-25. doi: 10.3969/j.issn.1000-3428.2012.20.006

• Special Column •

Reasoning Model of Multi-level Chinese Text Error-detecting Based on Knowledge Bases

WU Lin, ZHANG Yang-sen

  1. (Institute of Intelligence Information Processing, Beijing Information Science and Technology University, Beijing 100192, China)
  • Received: 2011-11-03 Revised: 2011-12-28 Online: 2012-10-20 Published: 2012-10-17
  • About the authors: WU Lin (born 1988), male, master's student, main research interest: intelligent information processing; ZHANG Yang-sen, professor
  • Supported by: National Natural Science Foundation of China (60873013, 61070119); Open Project Fund of the Key Laboratory of Computational Linguistics (Peking University), Ministry of Education (KLCL-1005); Funding Project for Academic Human Resources Development in Institutions of Higher Learning under the Jurisdiction of Beijing Municipality (PHR201007131)

Abstract:

Previous research on Chinese text error detection has focused mainly on character- and word-level errors and has paid too little attention to reasoning about syntactic and semantic errors. To address this, statistical models and the large-scale People's Daily corpus are used to build and extend error-detection knowledge bases, and a multi-level error-detection reasoning model is proposed that covers three levels of error: word, syntax, and semantics. Error-detection algorithms for the three levels are designed and implemented, and an automatic error-detection system is built to check text at all three levels together. Experimental results show that the system performs well, reaching a recall of 85.62%.

Key words: Chinese text, knowledge bases, multi-level error-detecting, error-detecting reasoning, semantic error-detecting
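
The abstract does not spell out the algorithms, so the following is only a minimal sketch of the layered idea it describes, not the authors' method. It assumes the text has already been segmented into words, and it uses tiny hand-written knowledge bases in place of ones derived from a corpus. All names (`LEXICON`, `ALLOWED_POS_BIGRAMS`, `VERB_OBJECT_PAIRS`, `check_text`, `recall`) and all data are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    level: str   # "word", "syntax", or "semantic"
    index: int   # position of the (first) suspicious token
    reason: str


# Hypothetical toy knowledge bases, written by hand for illustration only.
# Glosses: 我们 = we, 阅读 = read, 喝 = drink, 文章 = article, 茶 = tea
LEXICON = {"我们": "PRON", "阅读": "VERB", "喝": "VERB", "文章": "NOUN", "茶": "NOUN"}
ALLOWED_POS_BIGRAMS = {("PRON", "VERB"), ("VERB", "NOUN")}
VERB_OBJECT_PAIRS = {("阅读", "文章"), ("喝", "茶")}


def check_words(tokens):
    """Word level: flag tokens that are missing from the lexicon."""
    return [Finding("word", i, f"unknown word {t}")
            for i, t in enumerate(tokens) if t not in LEXICON]


def check_syntax(tokens):
    """Syntax level: flag adjacent part-of-speech pairs that are not allowed."""
    findings = []
    for i in range(len(tokens) - 1):
        left, right = LEXICON.get(tokens[i]), LEXICON.get(tokens[i + 1])
        if left is None or right is None:
            continue  # unknown words are already reported at the word level
        if (left, right) not in ALLOWED_POS_BIGRAMS:
            findings.append(Finding("syntax", i, f"unexpected tag sequence {left} {right}"))
    return findings


def check_semantics(tokens):
    """Semantic level: flag verb + noun pairs absent from the collocation list."""
    findings = []
    for i in range(len(tokens) - 1):
        verb, obj = tokens[i], tokens[i + 1]
        is_verb_object = LEXICON.get(verb) == "VERB" and LEXICON.get(obj) == "NOUN"
        if is_verb_object and (verb, obj) not in VERB_OBJECT_PAIRS:
            findings.append(Finding("semantic", i, f"unusual verb-object pair {verb} {obj}"))
    return findings


def check_text(tokens):
    """Run the three levels in order and collect every finding."""
    return check_words(tokens) + check_syntax(tokens) + check_semantics(tokens)


def recall(detected, actual):
    """Fraction of the true error positions that were detected."""
    if not actual:
        return 0.0
    return len(set(detected) & set(actual)) / len(actual)


if __name__ == "__main__":
    for finding in check_text(["我们", "喝", "文章"]):
        print(finding)
```

Recall here is the standard ratio of correctly detected errors to all true errors; the abstract reports 85.62% for the full system but does not say how errors were counted, so `recall` only illustrates the general definition.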

CLC number: