Semantics Preference Method Based on MDL and LSC

Computer Engineering, 2011, Vol. 37, Issue (17): 15-18. doi: 10.3969/j.issn.1000-3428.2011.17.004

LI Dong-ming¹, ZHANG Li-juan², ZHAO Wei¹, SHI Jing²

(1. College of Information Technology, Jilin Agricultural University, Changchun 130118, China; 2. College of Computer Science and Engineering, Changchun University of Technology, Changchun 130012, China)

Received: 2011-04-07 Published: 2011-09-05 Online: 2011-09-05
About the authors: LI Dong-ming (born 1979), male, lecturer, M.S.; main research interests: intelligent information processing and information theory. ZHANG Li-juan, lecturer, M.S. ZHAO Wei, professor, Ph.D. SHI Jing, lecturer, Ph.D.
Supported by:
Key Project of the Science and Technology Support Fund, Jilin Province Scientific Research Development Plan (2010 0214); Youth Fund Project of the Jilin Province Science and Technology Development Plan (20100155)

Abstract:

To enable a predicate verb to select its arguments automatically, this paper proposes a semantics preference method based on Minimum Description Length (MDL) and Latent Semantic Clustering (LSC). Following the MDL principle, an MDL-based score is computed for each noun that collocates with a verb. The collocation probability P(v,n) of a verb and a noun is obtained from the LSC model, whose parameters are estimated with the Expectation Maximization (EM) algorithm. For each verb-noun pair, the sum of the MDL-based score and P(v,n) is taken as the measure of the semantic association between the two. Experimental results show that the method reaches an F1 value of 85.26%, which is better than using MDL or LSC alone.

Key words: semantics preference, Minimum Description Length (MDL), Latent Semantic Clustering (LSC), unsupervised learning, Expectation Maximization (EM)
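The LSC step summarized in the abstract can be illustrated with a minimal sketch. It assumes the standard latent-class formulation P(v,n) = Σ_c P(c) P(v|c) P(n|c), fitted by EM on verb-noun co-occurrence counts; the abstract does not give the paper's exact formulation, so this is an assumption. The function names `train_lsc`, `pair_probability`, and `association`, as well as the toy counts, are hypothetical. The MDL-based value is passed in as a plain number because the abstract does not say how it is computed.

```python
import random
from collections import defaultdict


def train_lsc(counts, n_classes=2, n_iters=50, seed=0):
    """Fit P(c), P(v|c), P(n|c) by EM on counts of the form {(verb, noun): freq}."""
    rng = random.Random(seed)
    verbs = sorted({v for v, _ in counts})
    nouns = sorted({n for _, n in counts})

    def random_dist(keys):
        # Random positive weights, normalized to sum to 1.
        weights = [rng.random() + 0.1 for _ in keys]
        z = sum(weights)
        return {k: w / z for k, w in zip(keys, weights)}

    p_c = [1.0 / n_classes] * n_classes
    p_v = [random_dist(verbs) for _ in range(n_classes)]
    p_n = [random_dist(nouns) for _ in range(n_classes)]
    total = float(sum(counts.values()))

    for _ in range(n_iters):
        exp_c = [0.0] * n_classes
        exp_v = [defaultdict(float) for _ in range(n_classes)]
        exp_n = [defaultdict(float) for _ in range(n_classes)]

        # E-step: distribute each observed pair over the latent classes.
        for (v, n), freq in counts.items():
            joint = [p_c[c] * p_v[c][v] * p_n[c][n] for c in range(n_classes)]
            z = sum(joint)
            if z == 0.0:
                continue
            for c in range(n_classes):
                w = freq * joint[c] / z
                exp_c[c] += w
                exp_v[c][v] += w
                exp_n[c][n] += w

        # M-step: re-estimate parameters from the expected counts.
        for c in range(n_classes):
            p_c[c] = exp_c[c] / total
            if exp_c[c] > 0.0:
                p_v[c] = {v: exp_v[c][v] / exp_c[c] for v in verbs}
                p_n[c] = {n: exp_n[c][n] / exp_c[c] for n in nouns}

    return p_c, p_v, p_n


def pair_probability(model, v, n):
    """P(v,n) = sum over classes of P(c) * P(v|c) * P(n|c); 0.0 for unseen words."""
    p_c, p_v, p_n = model
    return sum(
        p_c[c] * p_v[c].get(v, 0.0) * p_n[c].get(n, 0.0)
        for c in range(len(p_c))
    )


def association(mdl_value, model, v, n):
    """Sum of an externally supplied MDL-based value and P(v,n)."""
    return mdl_value + pair_probability(model, v, n)


# Toy verb-noun counts (illustrative only, not data from the paper).
toy_counts = {
    ("eat", "apple"): 4,
    ("eat", "bread"): 3,
    ("drink", "water"): 5,
    ("drink", "tea"): 2,
}
toy_model = train_lsc(toy_counts, n_classes=2, n_iters=50, seed=0)
print(pair_probability(toy_model, "eat", "apple"))
```

The number of latent classes and the iteration count are free choices in this sketch. The MDL-based value and P(v,n) may lie on different scales, and the abstract does not say whether the paper rescales either before summing.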

CLC Number:

TP301

LI Dong-ming, ZHANG Li-juan, ZHAO Wei, SHI Jing. Semantics Preference Method Based on MDL and LSC[J]. Computer Engineering, 2011, 37(17): 15-18.

http://www.ecice06.com/CN/Y2011/V37/I17/15

