
Computer Engineering ›› 2011, Vol. 37 ›› Issue (15): 164-167. doi: 10.3969/j.issn.1000-3428.2011.15.052

• Artificial Intelligence and Recognition Technology •

Chinese Temporal Phrase Recognition Based on Conditional Random Fields

ZHU Sha-sha, LIU Zong-tian, FU Jian-feng, ZHU Fang

  1. (School of Computer Engineering and Science, Shanghai University, Shanghai 200072, China)
  • Received: 2011-02-10 Online: 2011-08-05 Published: 2011-08-05
  • About the authors: ZHU Sha-sha (b. 1987), female, master's student, main research interests: event ontology, text mining; LIU Zong-tian, professor and doctoral supervisor; FU Jian-feng, doctoral student; ZHU Fang, master's student
  • Supported by:
    National Natural Science Foundation of China (60975033); Shanghai Leading Academic Discipline Project (J50103); Shanghai University Graduate Innovation Fund (SHUCX091041, SHUCX102174)

Abstract: Traditional rule-based methods handle temporal phrases poorly because their surface forms are complex and diverse: in Chinese text, phrase boundaries are located inaccurately, and long-distance dependencies in temporal phrases that span many tokens are hard to capture. To address these problems, this paper proposes a temporal phrase recognition method based on Conditional Random Fields (CRFs), a machine-learning model that can integrate features from different levels. The linguistic features of temporal phrases in Chinese text, including lexical features, syntactic features, and context information, are analyzed; temporal phrases are divided into two types, date-type (time-denoting) and event-type (event-denoting); and three commonly used word lists are built semi-automatically as external features. On this basis, recognition is cast as a sequence labeling problem. Experimental results show that the method achieves F1 values of 95.70% on date-type temporal phrases and 85.75% on event-type temporal phrases.

Key words: Chinese temporal phrase, temporal phrase recognition, Conditional Random Fields (CRFs), temporal information processing
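
The abstract describes turning temporal phrase recognition into a sequence labeling problem with lexical, contextual, and word-list features. A minimal Python sketch of that data preparation step follows. It is only an illustration under stated assumptions: the BIO tag scheme, the label names `DATE` and `EVENT`, the feature names, the part-of-speech tags, the sample sentence, and the `cue_words` list are all hypothetical and are not taken from the paper.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical label names; the abstract does not give the paper's tag set.
DATE, EVENT = "DATE", "EVENT"


def bio_labels(n_tokens: int, spans: List[Tuple[int, int, str]]) -> List[str]:
    """Turn (start, end_exclusive, type) phrase spans into one BIO label per token."""
    labels = ["O"] * n_tokens
    for start, end, kind in spans:
        labels[start] = f"B-{kind}"          # first token of the phrase
        for i in range(start + 1, end):
            labels[i] = f"I-{kind}"          # remaining tokens of the phrase
    return labels


def token_features(tokens: List[str], pos_tags: List[str], i: int,
                   lexicon: Set[str]) -> Dict[str, object]:
    """Build a feature dict for token i: word, POS, neighbours, word-list membership."""
    return {
        "word": tokens[i],
        "pos": pos_tags[i],
        "prev_word": tokens[i - 1] if i > 0 else "<BOS>",
        "next_word": tokens[i + 1] if i < len(tokens) - 1 else "<EOS>",
        "in_lexicon": tokens[i] in lexicon,  # external word-list feature
    }


# Made-up example sentence, already segmented into words.
tokens = ["2008年", "5月", "12日", "，", "地震", "发生", "后", "，", "救援", "开始"]
pos_tags = ["t", "t", "t", "w", "n", "v", "f", "w", "v", "v"]  # assumed tag set
spans = [(0, 3, DATE), (4, 7, EVENT)]   # one date-type and one event-type phrase
cue_words = {"后", "前", "期间"}         # hypothetical word list of temporal cues

labels = bio_labels(len(tokens), spans)
features = [token_features(tokens, pos_tags, i, cue_words) for i in range(len(tokens))]
```

The resulting `features` and `labels` sequences have one entry per token, which is the usual input shape for a linear-chain CRF trainer; which toolkit the authors used is not stated in the abstract.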

CLC Number: