Joint Extraction of Binary Tagging Entity Relation for Enhanced Semantic and Syntactic Information

doi:10.19678/j.issn.1000-3428.0064545

Abstract

Abstract: With the continuous development of Internet technology, the volume of data is growing explosively, and there is an urgent need to extract key information efficiently from massive data. Entity relation extraction, the core task of information extraction, plays an irreplaceable role. However, existing deep-learning-based entity relation extraction methods suffer from error accumulation, entity redundancy, lack of interaction, and overlapping entity relations. To make full use of the semantic and syntactic information of a sentence, this paper proposes SSERel, a binary tagging model for joint entity and relation extraction that strengthens semantic and syntactic information. The input text is first encoded with BERT, and the start and end positions of each triple's subject are predicted and tagged. The global semantic features of the text, the local semantic features between the subject and each word, and the syntactic features are then extracted and fused into the encoding vector. Finally, the object positions for each relation in the sentence are predicted and tagged, completing the extraction of triples. Experimental results on the NYT and WebNLG datasets show that, compared with the CasRel model, SSERel improves the F1 value by 2.7 and 1.4 percentage points, respectively, and effectively handles overlapping triples and multiple triples in complex data.

Key words: information extraction, joint extraction of entity relation, semantic information, syntactic dependency analysis, Graph Convolution Neural Network (GCNN)
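
The binary tagging step described in the abstract can be illustrated with a minimal sketch. It assumes the model has already produced, for every token, a probability of being a subject start and a probability of being a subject end. The 0.5 threshold, the nearest-end pairing rule, the function name `decode_spans`, and the example sentence and probabilities are all assumptions made for illustration; the abstract does not specify them.

```python
from typing import List, Tuple


def decode_spans(
    start_probs: List[float],
    end_probs: List[float],
    threshold: float = 0.5,  # assumed cutoff, not taken from the paper
) -> List[Tuple[int, int]]:
    """Turn per-token start/end probabilities into (start, end) spans.

    Each token gets two binary tags (start / not start, end / not end).
    Every tagged start is paired with the nearest tagged end at or
    after it, which is an assumed heuristic for this sketch.
    """
    starts = [i for i, p in enumerate(start_probs) if p > threshold]
    ends = [i for i, p in enumerate(end_probs) if p > threshold]

    spans = []
    for s in starts:
        following = [e for e in ends if e >= s]
        if following:
            spans.append((s, following[0]))
    return spans


# Hypothetical sentence and tagger outputs, for illustration only.
tokens = ["Jackie", "Brown", "was", "born", "in", "Washington"]
start_probs = [0.9, 0.1, 0.0, 0.0, 0.0, 0.8]
end_probs = [0.1, 0.95, 0.0, 0.0, 0.0, 0.7]

spans = decode_spans(start_probs, end_probs)
subjects = [" ".join(tokens[s:e + 1]) for s, e in spans]
print(subjects)
```

In a cascade setup, the same decoding would be repeated once per relation on the object start and end tags, conditioned on each extracted subject, so that one subject can take part in several triples.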

CLC number:

TP391

HENG Hongjun, MIAO Jing. Joint Extraction of Binary Tagging Entity Relation for Enhanced Semantic and Syntactic Information[J]. Computer Engineering, 2023, 49(4): 77-84.

https://www.ecice06.com/CN/Y2023/V49/I4/77

Figures/Tables: 14
