Joint Extraction Method for Chinese Entity Relationship Based on Bidirectional Semantics

doi:10.19678/j.issn.1000-3428.0063788

Abstract

Existing Chinese entity relation extraction methods typically rely on one-way relational semantic features between entities. One-way features alone cannot fully exploit the semantic relationships between entities, which limits the effectiveness of extraction. This paper proposes a joint extraction method for Chinese entities and relations based on bidirectional semantics. The RoBERTa pretrained model is used to obtain context-aware character-level vector representations of the text, and entities that may participate in relations are identified through start and end pointer tagging. To exploit relational semantics in both directions at once, each entity is treated in turn as the subject and as the object of a relation, which establishes positive and negative relations. Two sets of fully connected neural networks form the positive and negative relation mappers, so that candidate relation triples are constructed for every input entity from both the positive and the negative perspective. The probability distribution sequences of the candidate triples under the positive and negative relations are then combined with entity position embedding features to discriminate among the candidates and determine the final relation triples. Comparative experiments on the DuIE dataset show that the method outperforms baseline models such as MultiR and CoType in precision and recall, and that its F1 value reaches 0.805, an average improvement of 12.8% over the baseline models.

Key words: joint entity and relation extraction, bidirectional relational semantics, positive-negative relation mapping, fully connected neural network, pretrained language model
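The abstract gives no implementation details, so the following is only a minimal sketch of two ideas it names: decoding entity spans from start and end pointer probabilities, and building candidate triples for one entity in both directions (as subject and as object). The function names, the 0.5 threshold, the nearest-end pairing rule, the sample sentence, and all probability values are illustrative assumptions, not the authors' implementation. The final discrimination step that uses position embeddings is not shown.

```python
from typing import Dict, List, Tuple

Span = Tuple[int, int]                           # inclusive (start, end) character indices
Triple = Tuple[Span, str, Span]                  # (subject span, relation, object span)
PointerScores = Tuple[List[float], List[float]]  # (start probabilities, end probabilities)


def decode_spans(start_probs: List[float],
                 end_probs: List[float],
                 threshold: float = 0.5) -> List[Span]:
    """Pair every confident start with the nearest confident end at or after it."""
    spans = []
    for i, p_start in enumerate(start_probs):
        if p_start < threshold:
            continue
        for j in range(i, len(end_probs)):
            if end_probs[j] >= threshold:
                spans.append((i, j))
                break
    return spans


def candidate_triples(entity: Span,
                      forward: Dict[str, PointerScores],
                      backward: Dict[str, PointerScores],
                      threshold: float = 0.5) -> List[Triple]:
    """Build candidates with `entity` as subject (forward) and as object (backward)."""
    candidates = []
    for relation, (starts, ends) in forward.items():
        for obj in decode_spans(starts, ends, threshold):
            candidates.append((entity, relation, obj))
    for relation, (starts, ends) in backward.items():
        for subj in decode_spans(starts, ends, threshold):
            candidates.append((subj, relation, entity))
    return candidates


# Hypothetical pointer outputs for a six-character sentence.
text = "鲁迅写了呐喊"
entity_start = [0.9, 0.1, 0.0, 0.0, 0.8, 0.1]
entity_end = [0.1, 0.9, 0.0, 0.0, 0.2, 0.7]
entities = decode_spans(entity_start, entity_end)

# Hypothetical mapper outputs for the first entity (characters 0 to 1).
# Forward: nothing is found with this entity as the subject of "author".
forward_scores = {"author": ([0.0] * 6, [0.0] * 6)}
# Backward: a subject is found at characters 4 to 5 with this entity as the object.
backward_scores = {"author": ([0.0, 0.0, 0.0, 0.0, 0.9, 0.0],
                              [0.0, 0.0, 0.0, 0.0, 0.1, 0.8])}
triples = candidate_triples(entities[0], forward_scores, backward_scores)

for subj, relation, obj in triples:
    print(text[subj[0]:subj[1] + 1], relation, text[obj[0]:obj[1] + 1])
```

In this toy case the triple is only recoverable from the backward direction, which is the kind of situation the bidirectional formulation is meant to cover.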

CLC number: TP181

YU Keqiang, HUANG Fang, WU Qi, OUYANG Yang. Joint Extraction Method for Chinese Entity Relationship Based on Bidirectional Semantics[J]. Computer Engineering, 2023, 49(1): 92-99,112.

https://www.ecice06.com/CN/Y2023/V49/I1/92

Figures/Tables: 11

