
Computer Engineering (计算机工程)

• Advanced Computing and Data Processing •

基于首播前搜索数据的电视剧流行度预测

朱寒婷,尹敏,贺樑   

  1. (华东师范大学 计算机科学与技术系,上海 200241)
  • 收稿日期:2016-06-14 出版日期:2017-07-15 发布日期:2017-07-15
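  • About the authors: ZHU Hanting (born 1991), female, master's student, main research interest: data mining; YIN Min, lecturer; HE Liang, professor, Ph.D., doctoral supervisor.
  • Funding:
    National Key Technology R&D Program of China (2015BAH01F02); Scientific Research Program of the Science and Technology Commission of Shanghai Municipality (16511102702); Project of the Shanghai Municipal Commission of Economy and Informatization (150643).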

TV Drama Popularity Prediction Based on Search Data Before the Premiere

ZHU Hanting, YIN Min, HE Liang

  1. (Department of Computer Science and Technology, East China Normal University, Shanghai 200241, China)
  • Received: 2016-06-14 Online: 2017-07-15 Published: 2017-07-15

摘要:

现有对视频网站电视剧流行度预测的研究中考虑因素较少,并且极少能在电视剧首播前进行预测,这会使视频网站在做出版权购买、广告投放等决策时考虑不全面并且出现预测时间滞后的问题。为此,提出一种在首播前预测视频网站电视剧流行度的方法,综合考虑电视剧剧名和演员搜索数据,通过分析时间序列确定最早预测时间,使用多元线性回归模型实现电视剧流行度的预测。实验结果表明,该方法可利用首播前第13—18天的剧名和演员的百度搜索指数对PPTV和优酷2014年、2015年上线的电视剧预测上线后30天的点播量,预测值与真实值之间的皮尔森相关系数分别达到0.943 7和0.967 6,具有较好的预测效果。

关键词: 电视剧流行度, 电视剧点播量排名, 多元线性回归, 特征融合, 最早预测时间, 百度搜索指数

Abstract: Existing methods for predicting TV drama popularity on video websites consider only a single factor, and most cannot make a prediction before the premiere. As a result, video websites decide on copyright purchases and advertising placement with incomplete information and with a time lag. To address this problem, this paper proposes a method that predicts the popularity of a TV drama on video websites before its premiere. The method first combines search data for the drama's name and for its actors. It then determines the earliest prediction time through time series analysis. Finally, it uses a multiple linear regression model to select the optimal features and predict popularity. Experimental results show that the method can use the Baidu search index of the name and actors from the 13th to the 18th day before the premiere to predict the ranking of on-demand view counts over the 30 days after the premiere, for TV dramas launched on PPTV and YOUKU in 2014 and 2015. The Spearman correlation coefficient between the predicted and actual rankings reaches 0.9437 on PPTV and 0.9676 on YOUKU, indicating good predictive performance.

Key words: TV drama popularity, TV drama on-demand view count ranking, Multiple Linear Regression (MLR), feature fusion, earliest prediction time, Baidu search index
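
The regression and rank-correlation steps named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the data are synthetic, the function names (`fit_mlr`, `predict`, `ranks`, `spearman`) are hypothetical, and the feature selection and earliest-prediction-time analysis are not reproduced. It assumes two features per drama (a name search index and an actor search index) and tie-free values.

```python
import random


def fit_mlr(X, y):
    """Ordinary least squares via the normal equations (X^T X) b = X^T y.

    Returns [intercept, coef_1, ..., coef_k].
    """
    rows = [[1.0] + list(r) for r in X]  # prepend the intercept column
    k = len(rows[0])
    # Augmented matrix [X^T X | X^T y]
    A = [
        [sum(r[i] * r[j] for r in rows) for j in range(k)]
        + [sum(r[i] * t for r, t in zip(rows, y))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting
    for c in range(k):
        p = max(range(c, k), key=lambda i: abs(A[i][c]))
        A[c], A[p] = A[p], A[c]
        pivot = A[c][c]
        A[c] = [v / pivot for v in A[c]]
        for i in range(k):
            if i != c:
                f = A[i][c]
                A[i] = [vi - f * vc for vi, vc in zip(A[i], A[c])]
    return [A[i][k] for i in range(k)]


def predict(coef, X):
    """Apply fitted coefficients to new feature rows."""
    return [coef[0] + sum(c * v for c, v in zip(coef[1:], r)) for r in X]


def ranks(values):
    """Rank 1 is the largest value; assumes there are no ties."""
    order = sorted(range(len(values)), key=lambda i: -values[i])
    result = [0] * len(values)
    for pos, i in enumerate(order, start=1):
        result[i] = pos
    return result


def spearman(a, b):
    """Spearman rank correlation for tie-free data."""
    ra, rb = ranks(a), ranks(b)
    n = len(a)
    d2 = sum((x - y) ** 2 for x, y in zip(ra, rb))
    return 1 - 6 * d2 / (n * (n * n - 1))


if __name__ == "__main__":
    random.seed(0)
    # Synthetic stand-ins for (name search index, actor search index)
    X = [[random.uniform(1e3, 1e5), random.uniform(1e3, 1e5)] for _ in range(50)]
    # Synthetic "view counts": an assumed linear relation plus noise
    y = [50 * n + 20 * a + random.gauss(0, 2e5) for n, a in X]
    coef = fit_mlr(X[:40], y[:40])   # fit on 40 synthetic dramas
    pred = predict(coef, X[40:])     # predict the remaining 10
    print("coefficients:", coef)
    print("Spearman on held-out dramas:", spearman(pred, y[40:]))
```

Comparing rankings rather than raw values, as the English abstract describes, makes the evaluation insensitive to the scale of the view counts; only the ordering of dramas matters.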

CLC number: