Abstract: This paper studies information extraction from domain-specific text, approaching the problem mainly from the perspective of text distribution. Topics and the relations between topics are first learned from an unannotated corpus and are then applied to information extraction from text in the same domain. Experiments indicate that extraction performance improves to some extent.
Key words:
Topic; Information extraction; Clustering; K-nearest neighbor
ZHENG Jiaheng, JIAN Xiaoyan. Design and Realization of the System of Farm Crop Information Extraction[J]. Computer Engineering, 2006, 32(7): 197-198, 220.