Computer Engineering ›› 2019, Vol. 45 ›› Issue (3): 293-299,308. doi: 10.19678/j.issn.1000-3428.0051932

• Development Research and Engineering Application •

Study of Financial News Topic Detection Based on Multi-feature Fusion

TAN Mengjie (谭梦婕), LÜ Xin (吕鑫), TAO Feifei (陶飞飞)

  1. College of Computer and Information, Hohai University, Nanjing 211100, China
  • Received: 2018-06-27 Online: 2019-03-15 Published: 2019-03-15
  • About the authors: TAN Mengjie (born 1994), female, master's student, whose main research interests are data mining and privacy protection; LÜ Xin (corresponding author), lecturer, Ph.D.; TAO Feifei, associate professor, Ph.D.
  • Supported by:

    National Key Research and Development Program of China (2016YFC0400910); General Program of the National Natural Science Foundation of China (61272543); Key Program of the NSFC-Guangdong Joint Fund (U1301252).

Abstract:

To help investors identify investment hot spots promptly in the short term, this paper proposes a topic detection model tailored to the characteristics of financial news. The model constructs a time window based on financial news to segment the news stream, extracts text features from the topic events, feature words, news semantics, and financial named entities in the news text, and applies the Nearest Neighbor-Hierarchical Agglomerative Clustering (NNHAC) algorithm to obtain topic clusters. Experimental results show that, compared with traditional multi-feature topic detection models, the model effectively reduces the running time of the clustering algorithm, improves the accuracy of topic detection, and to a certain extent helps investors make decisions and judgments.

Key words: financial news, topic detection, multi-feature fusion, Hierarchical Agglomerative Clustering (HAC), K-Nearest Neighbor (KNN)
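
The pipeline outlined in the abstract (time-window segmentation, multi-feature similarity, agglomerative clustering) can be illustrated with a minimal Python sketch. It is not the authors' NNHAC algorithm, whose details the abstract does not give. It uses plain average-link agglomerative clustering with a stopping threshold, and Jaccard overlap stands in for every feature similarity, including the semantic one. The field names (`events`, `words`, `entities`), the weights, the 24-hour window, the threshold, and the sample items are all hypothetical.

```python
from datetime import datetime, timedelta
from itertools import combinations

# Hypothetical fusion weights; the abstract does not state any values.
WEIGHTS = {"events": 0.4, "words": 0.3, "entities": 0.3}


def jaccard(a, b):
    """Jaccard similarity of two sets; 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def fused_similarity(x, y):
    """Weighted sum of the per-feature Jaccard similarities."""
    return sum(w * jaccard(x[k], y[k]) for k, w in WEIGHTS.items())


def split_by_window(news, hours=24):
    """Group news items into consecutive fixed-length time windows."""
    if not news:
        return []
    start = min(n["time"] for n in news)
    width = timedelta(hours=hours)
    windows = {}
    for n in news:
        windows.setdefault((n["time"] - start) // width, []).append(n)
    return [windows[i] for i in sorted(windows)]


def cluster(items, threshold=0.3):
    """Average-link agglomerative clustering, stopping below `threshold`."""
    clusters = [[i] for i in range(len(items))]

    def link(c1, c2):
        sims = [fused_similarity(items[a], items[b]) for a in c1 for b in c2]
        return sum(sims) / len(sims)

    while len(clusters) > 1:
        # Find the most similar pair of clusters (i < j always holds).
        (i, j), best = max(
            (((i, j), link(clusters[i], clusters[j]))
             for i, j in combinations(range(len(clusters)), 2)),
            key=lambda t: t[1],
        )
        if best < threshold:
            break
        clusters[i] += clusters[j]
        del clusters[j]
    return [[items[k]["id"] for k in c] for c in clusters]


def detect_topics(news, hours=24, threshold=0.3):
    """Cluster each time window independently into topic clusters."""
    return [cluster(w, threshold) for w in split_by_window(news, hours)]


# Made-up sample items, for illustration only.
sample = [
    {"id": "n1", "time": datetime(2018, 6, 1, 9), "events": {"rate cut"},
     "words": {"bank", "rate"}, "entities": {"PBOC"}},
    {"id": "n2", "time": datetime(2018, 6, 1, 15), "events": {"rate cut"},
     "words": {"rate", "loan"}, "entities": {"PBOC"}},
    {"id": "n3", "time": datetime(2018, 6, 1, 18), "events": {"ipo"},
     "words": {"listing", "shares"}, "entities": {"Xiaomi"}},
    {"id": "n4", "time": datetime(2018, 6, 3, 10), "events": {"ipo"},
     "words": {"listing"}, "entities": {"Xiaomi"}},
]
topics = detect_topics(sample)
```

Clustering each window separately keeps the number of pairwise comparisons small, which is one plausible reading of why windowing would reduce clustering time. In this sketch, items about the same topic that fall into different windows (here `n3` and `n4`) are not linked.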

CLC number: