Computer Engineering ›› 2012, Vol. 38 ›› Issue (17): 189-191. doi: 10.3969/j.issn.1000-3428.2012.17.052

• Networks and Communications •

Research of Unsupervised Chinese Noun Phrase Coreference Resolution

GAO Jun-wei 1,2, KONG Fang 1,2, ZHU Qiao-ming 1,2, LI Pei-feng 1,2, HUA Xiu-li 1,2   

  1. School of Computer Science & Technology, Soochow University, Suzhou 215006, China; 2. Key Lab of Computer Information Processing Technology of Jiangsu Province, Suzhou 215006, China
  • Received:2011-10-26 Revised:2011-12-15 Online:2012-09-05 Published:2012-09-03

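  • About the authors: GAO Jun-wei (born 1986), male, master's candidate, main research interest: natural language processing; KONG Fang, associate professor; ZHU Qiao-ming, professor and doctoral supervisor; LI Pei-feng, associate professor; HUA Xiu-li, master's candidate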
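  • Supported by:
    National Natural Science Foundation of China (90920004, 60970056, 61070123, 61003153); Major Basic Research Program of Natural Science for Universities of Jiangsu Province (08KJA520002)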

Abstract: The lack of publicly available corpora is a major obstacle in Chinese NLP research. To reduce the dependence of Chinese coreference resolution on annotated corpora, this paper presents a Chinese noun phrase coreference resolution platform based on unsupervised clustering and describes its preprocessing, feature selection, and clustering stages. Three evaluation tools are used to assess the platform on a Chinese news corpus; in the automatic setting, the average F-measure reaches 59.43%. Experimental results show that the platform copes well with the shortage of Chinese corpora.

Key words: unsupervised, noun phrase, coreference resolution, clustering, natural language, corpus
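
As a rough illustration of the clustering idea named in the abstract, the sketch below groups noun phrase mentions by linking each one to its closest earlier mention that falls under a distance threshold, and averages F-measures over several precision/recall pairs. It is only a minimal sketch under stated assumptions: the abstract does not specify the features, the distance function, the threshold, or the three evaluation tools, so `toy_distance`, the threshold value, and all names here are hypothetical placeholders rather than the authors' method.

```python
from typing import Callable, Dict, List, Tuple


def cluster_mentions(
    mentions: List[str],
    distance: Callable[[str, str], float],
    threshold: float,
) -> List[List[str]]:
    """Greedy closest-antecedent clustering (an assumed stand-in technique).

    Each mention joins the cluster of the nearest preceding mention whose
    distance is below `threshold`; otherwise it starts a new cluster.
    """
    cluster_of = list(range(len(mentions)))  # every mention starts alone
    for j in range(1, len(mentions)):
        # Scan antecedents from nearest to farthest.
        for i in range(j - 1, -1, -1):
            if distance(mentions[i], mentions[j]) < threshold:
                cluster_of[j] = cluster_of[i]
                break
    clusters: Dict[int, List[str]] = {}
    for idx, cid in enumerate(cluster_of):
        clusters.setdefault(cid, []).append(mentions[idx])
    return list(clusters.values())


def toy_distance(a: str, b: str) -> float:
    """Hypothetical feature: 0.0 if the last two characters match, else 1.0."""
    return 0.0 if a[-2:] == b[-2:] else 1.0


def f_measure(precision: float, recall: float) -> float:
    """Balanced F-measure: harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def average_f(scores: List[Tuple[float, float]]) -> float:
    """Mean F-measure over (precision, recall) pairs, one per metric."""
    return sum(f_measure(p, r) for p, r in scores) / len(scores)


if __name__ == "__main__":
    mentions = ["苏州大学", "这所大学", "李教授", "教授"]
    print(cluster_mentions(mentions, toy_distance, threshold=0.5))
```

A real system would replace `toy_distance` with a weighted combination of linguistic features (for example head match, number, or semantic class), and the averaged F-measure would be computed from the output of the evaluation tools rather than from hand-supplied pairs.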
