
Computer Engineering ›› 2024, Vol. 50 ›› Issue (10): 137-144. doi: 10.19678/j.issn.1000-3428.0068399

• Artificial Intelligence and Pattern Recognition •

Predicate Center Word Recognition Model Fused with Multiscale Span Features

SHI Junxiao1,2,3, CHEN Yanping1,2,3,*, MU Zhaonan1

  1. Text Computing and Cognitive Intelligence Engineering Research Center of National Education Ministry, Guiyang 550025, Guizhou, China
    2. State Key Laboratory of Public Big Data, Guizhou University, Guiyang 550025, Guizhou, China
    3. College of Computer Science and Technology, Guizhou University, Guiyang 550025, Guizhou, China
  • Received: 2023-09-17  Online: 2024-10-15  Published: 2024-10-11
  • Contact: CHEN Yanping
  • Supported by: National Natural Science Foundation of China (62166007); Natural Science Foundation of Guizhou Province (黔科合基础-ZK[2022]027); Youth Science and Technology Talent Growth Project of the Guizhou Provincial Department of Education (黔教合KY字[2022]205号)

Abstract:

To address the missing span length information and multiscale span correlation information in predicate center word recognition models, this study proposes a Chinese predicate center word recognition model fused with multiscale span features. First, a Chinese Bidirectional Encoder Representations from Transformers (ChineseBERT) pre-trained language model and a Bidirectional Long Short-Term Memory (BiLSTM) network extract character vector sequences containing contextual information from the text. Second, a linear neural network performs initial recognition on the character vectors to form a span-masking matrix. The character vector sequence is then represented two-dimensionally as a span information matrix, and a Multiscale Convolutional Neural Network (MSCNN) operates on this matrix to extract multiscale correlation information among spans. Finally, a feature-embedding neural network embeds the length information of the spans, enriching the span feature vectors for predicate center word recognition. The experimental results demonstrate that the model effectively fuses the multiscale correlation information and length information of spans and improves predicate center word recognition performance: compared with the best-performing predicate center word recognition model of the same type, it achieves an F1 improvement of 0.43 percentage points.
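As a rough illustration of the pipeline summarized above, the sketch below wires the main components together in PyTorch: a contextual character encoder, an initial per-character recognition layer that yields a span-masking matrix, a two-dimensional span information matrix, multiscale convolutions over that matrix, and a span-length embedding. This is not the authors' implementation: the ChineseBERT encoder is replaced by a plain embedding layer, and all layer names, dimensions, and kernel scales are assumptions made only for illustration.

```python
# Minimal sketch of the described architecture (illustrative, not the paper's code).
import torch
import torch.nn as nn

class MultiscaleSpanModel(nn.Module):
    def __init__(self, vocab_size=21128, emb_dim=768, hidden=256,
                 max_len=128, len_emb_dim=32, scales=(1, 3, 5), channels=64):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)        # stand-in for ChineseBERT
        self.bilstm = nn.LSTM(emb_dim, hidden, batch_first=True,
                              bidirectional=True)             # contextual character vectors
        self.boundary = nn.Linear(2 * hidden, 2)               # initial per-character recognition
        # multiscale 2-D convolutions over the span information matrix
        self.convs = nn.ModuleList([
            nn.Conv2d(4 * hidden, channels, kernel_size=k, padding=k // 2)
            for k in scales
        ])
        self.len_embed = nn.Embedding(max_len, len_emb_dim)    # span length feature
        self.classifier = nn.Linear(len(scales) * channels + len_emb_dim, 2)

    def forward(self, token_ids):
        B, L = token_ids.shape
        h, _ = self.bilstm(self.embed(token_ids))              # (B, L, 2*hidden)
        boundary_logits = self.boundary(h)                     # initial recognition result
        span_mask = boundary_logits.argmax(-1).bool()          # (B, L) rough candidate mask
        # span information matrix: pair every start position i with every end position j
        start = h.unsqueeze(2).expand(B, L, L, -1)
        end = h.unsqueeze(1).expand(B, L, L, -1)
        span_mat = torch.cat([start, end], dim=-1)             # (B, L, L, 4*hidden)
        span_mat = span_mat.permute(0, 3, 1, 2)                # (B, C, L, L) for Conv2d
        feats = torch.cat([conv(span_mat) for conv in self.convs], dim=1)
        feats = feats.permute(0, 2, 3, 1)                      # (B, L, L, scales*channels)
        # span length embedding for every (i, j) cell
        idx = torch.arange(L, device=token_ids.device)
        length = (idx.view(1, L, 1) - idx.view(1, 1, L)).abs()
        length = length.clamp(max=self.len_embed.num_embeddings - 1)
        feats = torch.cat([feats, self.len_embed(length).expand(B, -1, -1, -1)], dim=-1)
        return self.classifier(feats), boundary_logits, span_mask  # per-span scores
```

In the model described in the abstract, the span-masking matrix produced by the initial recognition step restricts which (start, end) cells of the span information matrix are scored as candidate predicate center words; the sketch simply returns the mask alongside the per-span logits.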

Key words: predicate center word recognition, multiscale convolution, ChineseBERT pre-trained language model, span length information, multiscale span correlation information