
Computer Engineering ›› 2025, Vol. 51 ›› Issue (6): 116-126. doi: 10.19678/j.issn.1000-3428.0069205

• Artificial Intelligence and Pattern Recognition •

Cross-Domain Aspect Term Extraction Fusing Global and Local Semantics

LIU Dage1, YOU Jinguo1,2,*, GENG Qiqi1

  1. Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming 650500, Yunnan, China
    2. Artificial Intelligence Key Laboratory of Yunnan Province, Kunming 650500, Yunnan, China
  • Received: 2024-01-11 Online: 2025-06-15 Published: 2024-06-20
  • Contact: YOU Jinguo


Abstract:

Aspect Term Extraction (ATE) is a core task in aspect-level sentiment analysis, but its extraction and annotation costs are extremely high. When training and testing samples come from different domains, the performance of traditional methods often degrades significantly owing to the distribution differences between the domains. Existing methods focus on domain adaptation techniques that exploit the rich semantic information within local contexts to achieve cross-domain ATE. However, they overlook the potential global long-range dependencies of aspect terms within the text, which limits the performance, scalability, and robustness of the models. To address these issues, this study proposes CBiLSTM, a cross-domain ATE model that integrates global and local semantic information and requires no additional manual labeling. The model uses semantic information as a pivot: it first incorporates external semantic information into the word embeddings to construct pivot information shared by the source and target domains, and then encodes global and local contextual semantics in parallel, thereby capturing more comprehensive semantic features and bridging the gap between the source and target domains to achieve cross-domain ATE. On three benchmark datasets, CBiLSTM achieves an average F1-score of 53.87%, outperforming the current state-of-the-art model by 0.49 percentage points. The experimental results demonstrate the superior performance and lower computational cost of CBiLSTM.
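To make the parallel global/local encoding concrete, the following is a minimal, hypothetical PyTorch sketch inferred from the abstract and keywords (a CNN for local context, a BiLSTM for global context, and a token-level tagger over their fused features). All module names, dimensions, and the fusion-by-concatenation step are illustrative assumptions, not the authors' implementation.

```python
# Hypothetical sketch of a parallel global/local encoder for cross-domain ATE.
# Assumptions: PyTorch, pivot-enriched word embeddings as input, BIO-style tags.
import torch
import torch.nn as nn


class GlobalLocalEncoder(nn.Module):
    def __init__(self, embed_dim=300, hidden_dim=128, kernel_size=3, num_tags=3):
        super().__init__()
        # Local context: 1D convolution over token embeddings (window semantics).
        self.local_cnn = nn.Conv1d(embed_dim, hidden_dim, kernel_size,
                                   padding=kernel_size // 2)
        # Global context: bidirectional LSTM over the whole sentence
        # (long-range dependencies between aspect terms and their context).
        self.global_bilstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True,
                                     bidirectional=True)
        # Token-level tagger over the concatenated global + local features.
        self.tagger = nn.Linear(hidden_dim * 3, num_tags)

    def forward(self, embeddings):
        # embeddings: (batch, seq_len, embed_dim), already fused with
        # external (pivot) semantic information at the embedding stage.
        local = self.local_cnn(embeddings.transpose(1, 2)).transpose(1, 2)
        global_out, _ = self.global_bilstm(embeddings)
        fused = torch.cat([global_out, local], dim=-1)
        return self.tagger(fused)  # (batch, seq_len, num_tags)
```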

Key words: semantic information, cross-domain, Aspect Term Extraction (ATE), Convolutional Neural Network (CNN), Bidirectional Long Short-Term Memory (BiLSTM) network
