
Computer Engineering ›› 2022, Vol. 48 ›› Issue (1): 119-126. doi: 10.19678/j.issn.1000-3428.0060079

• Artificial Intelligence and Pattern Recognition •

Research on Weibo Text Generation Technology Based on User Intention

GAO Yongbing1, LI Yuxuan1, GAO Juntian1, MA Zhanfei2

  1. School of Information Engineering, Inner Mongolia University of Science and Technology, Baotou, Inner Mongolia 014010, China;
  2. Department of Information Engineering, Baotou Teachers' College, Baotou, Inner Mongolia 014010, China
  • Received: 2020-11-23 Revised: 2021-01-22 Published: 2021-01-26
  • About the authors: GAO Yongbing (born 1974), male, associate professor, M.S.; main research interests: text mining and information retrieval. LI Yuxuan and GAO Juntian, master's students. MA Zhanfei, professor, Ph.D.
  • Funding:
    National Natural Science Foundation of China (61762071); Natural Science Foundation of Inner Mongolia Autonomous Region (2015MS0621).

Abstract: Weibo is a major social platform on which individual and organizational users share and obtain short, real-time information, and automatic Weibo text generation can help users quickly realize various social intentions on the platform. To assist users in publishing posts and expressing their social intentions, this paper proposes a Weibo text generation technique based on user intention. The technique mines and extracts features of Weibo text and, given a Weibo topic, generates text that is consistent with the user's intention. Combining a pre-trained language model with fine-tuning, it implements two functions on the pre-trained language model GPT2: controlled text generation conditioned jointly on topic and user intention, and predictive text generation with a user dialogue function. Experimental results show that the generated text is highly readable and matches the language style of Weibo posts. Samples generated from a topic combined with five types of user intention receive manual scores of more than 77 points.

Key words: Weibo text, automatic generation, user intention, topic, pre-trained language model, fine-tuning
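
The abstract does not state how topic and user intention are fed to GPT2 during fine-tuning. One common way to condition a language model on such attributes is to prepend them as a control prefix to each training post. The sketch below illustrates that idea only; it is not the authors' format. The marker strings `<|topic|>`, `<|intent|>`, and `<|text|>`, the function names, and the example topic and intent labels are all hypothetical. Only `<|endoftext|>` is GPT-2's standard end-of-text token.

```python
# Hypothetical control markers; the paper does not specify its input format.
TOPIC_TAG = "<|topic|>"
INTENT_TAG = "<|intent|>"
TEXT_TAG = "<|text|>"
EOS = "<|endoftext|>"  # GPT-2's standard end-of-text token


def build_prompt(topic: str, intent: str) -> str:
    """Control prefix handed to the model at generation time."""
    return f"{TOPIC_TAG}{topic}{INTENT_TAG}{intent}{TEXT_TAG}"


def build_training_sample(topic: str, intent: str, post: str) -> str:
    """Fine-tuning string: control prefix, then the post, then end-of-text."""
    return build_prompt(topic, intent) + post.strip() + EOS


def extract_post(generated: str) -> str:
    """Recover the post body from a string that starts with a control prefix."""
    body = generated.split(TEXT_TAG, 1)[-1]   # drop everything up to the text marker
    return body.split(EOS, 1)[0].strip()      # cut at the first end-of-text token


if __name__ == "__main__":
    # "travel" and "recommend" are illustrative labels, not categories from the paper.
    sample = build_training_sample("travel", "recommend", "Great weekend in Baotou!")
    print(sample)
    print(extract_post(sample))
```

Under this assumed scheme, fine-tuning would use strings from `build_training_sample`, while generation would start from `build_prompt` and let the model continue until it emits the end-of-text token.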

CLC number: