Name Disambiguation Based on Dependency Feature in Web Page Text

doi:10.3969/j.issn.1000-3428.2012.19.035

Computer Engineering ›› 2012, Vol. 38 ›› Issue (19): 133-136. doi: 10.3969/j.issn.1000-3428.2012.19.035

• Networks and Communications •

YANG Xin-xin^1,2, LI Pei-feng^1,2, ZHU Qiao-ming^1,2

(1. School of Computer Science & Technology, Soochow University, Suzhou 215006, China; 2. Jiangsu Provincial Key Lab of Computer Information Processing Technology, Suzhou 215006, China)

Received: 2011-12-30 Online: 2012-10-05 Published: 2012-09-29

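About the authors: YANG Xin-xin (born 1988), male, master's student, research interests: natural language processing and name disambiguation; LI Pei-feng, associate professor; ZHU Qiao-ming, professor
Supported by:
National Natural Science Foundation of China (60970056, 61070123, 61003155); Natural Science Foundation of Jiangsu Province (BK2008160); Specialized Research Fund for the Doctoral Program of Higher Education (20093201110006); Open Project Fund of the National Laboratory of Pattern Recognition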

Abstract

This paper addresses the problem of person name ambiguity on the Internet. The proposed method extracts dependency features related to the target person name entity in Web page text, together with auxiliary features such as named entities. A two-stage clustering algorithm then uses the dependency features to cluster the documents with high reliability in the first stage, and uses the auxiliary features to merge the remaining documents into the existing clusters in the second stage. Experimental results show that the proposed method outperforms other name disambiguation methods.

Key words: name ambiguity, dependency feature, name disambiguation, named entity, clustering
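
The two-stage clustering outlined in the abstract can be illustrated with a minimal sketch. The abstract does not specify the feature representation, the similarity measure, or any thresholds, so everything here is an assumption made for illustration: documents are modeled as sets of feature strings, similarity is Jaccard overlap, linking in the first stage is single-link, and the names `Doc`, `jaccard`, `two_stage_cluster`, `dep_threshold`, and `aux_threshold` are hypothetical rather than taken from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Doc:
    """One Web page mentioning the ambiguous name (hypothetical representation)."""
    doc_id: str
    dep: frozenset  # dependency features around the name, e.g. "nsubj:teaches"
    aux: frozenset  # auxiliary features, e.g. named entities on the page


def jaccard(a, b):
    """Jaccard overlap of two feature sets (an assumed similarity measure)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def two_stage_cluster(docs, dep_threshold=0.5, aux_threshold=0.2):
    """Sketch of a two-stage clustering; thresholds are illustrative guesses."""
    # Stage 1: single-link grouping on dependency features with a strict
    # threshold, so only documents that clearly match end up together.
    clusters = []
    for doc in docs:
        linked, rest = [], []
        for c in clusters:
            hit = any(jaccard(doc.dep, m.dep) >= dep_threshold for m in c)
            (linked if hit else rest).append(c)
        merged = [doc] + [m for c in linked for m in c]
        clusters = rest + [merged]

    # Clusters with at least two members are treated as the reliable seeds;
    # documents left alone after stage 1 are handled in stage 2.
    seeds = [c for c in clusters if len(c) > 1]
    leftovers = [c[0] for c in clusters if len(c) == 1]

    # Stage 2: attach each leftover document to the seed cluster whose members
    # share the most auxiliary features with it, if the overlap is high enough.
    singles = []
    for doc in leftovers:
        best, best_score = None, 0.0
        for c in seeds:
            score = max(jaccard(doc.aux, m.aux) for m in c)
            if score > best_score:
                best, best_score = c, score
        if best is not None and best_score >= aux_threshold:
            best.append(doc)
        else:
            singles.append([doc])  # no good match: keep as its own person
    return seeds + singles
```

Calling `two_stage_cluster(docs)` on a list of `Doc` objects returns a list of clusters, each intended to correspond to one real-world person. Using a strict threshold in the first stage and a looser one in the second mirrors the idea of trusting dependency features for the initial grouping and falling back on auxiliary features only for the remaining documents.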

CLC Number: TP391

YANG Xin-xin, LI Pei-feng, ZHU Qiao-ming. Name Disambiguation Based on Dependency Feature in Web Page Text[J]. Computer Engineering, 2012, 38(19): 133-136.

URL: http://www.ecice06.com/EN/10.3969/j.issn.1000-3428.2012.19.035

http://www.ecice06.com/EN/Y2012/V38/I19/133
