
Computer Engineering ›› 2024, Vol. 50 ›› Issue (4): 87-94. doi: 10.19678/j.issn.1000-3428.0068501

• Artificial Intelligence and Pattern Recognition •

Text Relation Extraction Algorithm Based on Large Language Model and Semantic Enhancement

Jingcan LI, Cuilin XIAO, Xiaoting QIN, Xia XIE*

  1. School of Computer Science and Technology, Hainan University, Haikou 570228, Hainan, China
  • Received: 2023-10-07; Online: 2024-04-15; Published: 2024-01-25
  • Contact: Xia XIE


Abstract:

Relation extraction is a basic and important task that aims to extract the relations between entities from unstructured text. Recent studies show that combining Large Language Models (LLMs) with foundation models can improve the performance of many Natural Language Processing (NLP) tasks. These models exploit the language-representation ability of deep learning and pre-trained models and can automatically learn the semantic features of relations. However, effectively using a large model to address entity overlap and poor information interaction remains a challenge. To address these problems, a relation-extraction algorithm based on a large language model is proposed. First, the Large Language Model Meta AI (LLaMA) is fine-tuned to better suit the relation-extraction task. On the basis of the extracted relations, a self-attention mechanism is used to strengthen the correlation between entity pairs and the information sharing between relations and entities, and average pooling is then applied to generalize this information over the entire sentence. A filtering matrix is designed for entity pairs, part-of-speech information is introduced for semantic enhancement, and invalid triples are filtered out according to the relevance of the entity pairs in the filtering matrix. Experimental results show that the proposed model achieves F1 scores of 93.1% and 90.4% on the public New York Times (NYT) and WebNLG datasets, respectively. With the fine-tuned LLaMA model serving as the encoder, the proposed algorithm outperforms the baseline models in both precision and F1 score, verifying its effectiveness.
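The abstract describes a pipeline rather than an implementation, so the following is a minimal, hypothetical PyTorch sketch of the described head: part-of-speech embeddings for semantic enhancement, self-attention to share information between entities, average pooling over the sentence, and a bilinear "filtering matrix" that scores entity pairs so invalid triples can be discarded. All module names, dimensions, and the 0.5 cut-off are assumptions for illustration, not details from the paper.

import torch
import torch.nn as nn

class PairFilteringHead(nn.Module):
    """Illustrative head: POS-enhanced self-attention, sentence-level
    average pooling, and a bilinear entity-pair filtering matrix."""

    def __init__(self, hidden: int = 4096, n_relations: int = 24, n_pos_tags: int = 17):
        super().__init__()
        self.pos_embed = nn.Embedding(n_pos_tags, hidden)   # part-of-speech semantic enhancement
        self.self_attn = nn.MultiheadAttention(hidden, num_heads=8, batch_first=True)
        self.pair_score = nn.Bilinear(hidden, hidden, 1)    # entity-pair "filtering matrix"
        self.rel_cls = nn.Linear(3 * hidden, n_relations)   # subject + object + sentence features

    def forward(self, token_states, pos_ids, entity_reps):
        # token_states: (B, T, H) hidden states from the fine-tuned LLaMA encoder
        # pos_ids:      (B, T)    part-of-speech tag ids per token
        # entity_reps:  (B, E, H) pooled representations of candidate entities
        x = token_states + self.pos_embed(pos_ids)          # inject POS information
        x, _ = self.self_attn(x, x, x)                      # share information across tokens
        sent = x.mean(dim=1)                                # average pooling over the sentence

        B, E, H = entity_reps.shape
        subj = entity_reps.unsqueeze(2).expand(B, E, E, H)  # subject of each candidate pair
        obj = entity_reps.unsqueeze(1).expand(B, E, E, H)   # object of each candidate pair

        # (B, E, E) relevance scores: the filtering matrix over entity pairs
        filt = self.pair_score(subj.reshape(-1, H), obj.reshape(-1, H)).view(B, E, E)

        pair_feat = torch.cat([subj, obj, sent[:, None, None, :].expand(B, E, E, H)], dim=-1)
        logits = self.rel_cls(pair_feat)                    # (B, E, E, n_relations)
        keep = filt.sigmoid() > 0.5                         # assumed cut-off for invalid triples
        return logits, filt, keep

In practice, the entity representations would be pooled from the same LLaMA token states, and the filtering threshold would be tuned on a development set.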

Key words: relation extraction, artificial intelligence, attention mechanism, Large Language Model (LLM), part of speech
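The abstract states that LLaMA is fine-tuned to the relation-extraction task but does not give the recipe. Purely as an illustration, the sketch below uses LoRA adapters via the Hugging Face transformers and peft libraries; the checkpoint path, prompt format, and LoRA hyperparameters are assumptions, not the authors' setup.

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

model_name = "path/to/llama-7b"   # placeholder checkpoint path, not from the paper

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

lora_cfg = LoraConfig(
    r=8, lora_alpha=16, lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],  # attention projections, a common LoRA choice
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_cfg)
model.print_trainable_parameters()        # only the small adapter weights are trained

# One illustrative training step on a relation-annotated sentence.
text = "Barack Obama was born in Honolulu. Relation: place_of_birth"
batch = tokenizer(text, return_tensors="pt")
out = model(**batch, labels=batch["input_ids"])
out.loss.backward()                       # gradients flow only through the LoRA adapters

After fine-tuning, the adapted model's hidden states would serve as the encoder outputs consumed by the head sketched after the abstract above.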
