Computer Engineering

• Development Research and Engineering Application •

Tag Recommendation System Based on the Duckling Document Library Platform

CAI Fang 1,2, SHEN Yi 1,2, NAN Kai 1

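  1. Computer Network Information Center, Chinese Academy of Sciences, Beijing 100190, China; 2. University of Chinese Academy of Sciences, Beijing 100049, China
  • Received: 2013-03-05 Online: 2014-05-15 Published: 2014-05-14
  • About the authors: CAI Fang (born 1990), female, master's student, main research interests: network collaboration and recommender systems; SHEN Yi, doctoral student; NAN Kai, research professor.
  • Funding:
    Supported by the Chinese Academy of Sciences Twelfth Five-Year Plan Informatization Fund project "Scientific Research Informatization Application Promotion Project" (XXH12503).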

Tag Recommendation System Based on Duckling Document Library Platform

CAI Fang  1,2, SHEN Yi  1,2, NAN Kai  1   

  1. Computer Network Information Center, Chinese Academy of Sciences, Beijing 100190, China; 2. University of Chinese Academy of Sciences, Beijing 100049, China
  • Received: 2013-03-05 Online: 2014-05-15 Published: 2014-05-14

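Abstract: The Duckling Document Library (DDL) is a team-oriented document collaboration and management tool that provides a cooperation platform for virtual teams. It organizes all of its documents with a tag system. As the library has been used, untagged documents have accumulated and the tags that users add to documents have tended to be of low quality, which hampers document classification and sharing. To address this problem, a tag recommendation method suited to the DDL platform is adopted. It consists of two parts, collaborative filtering and keyword extraction, and prompts users to add suitable tags to documents, which improves the efficiency of the document system. The collaborative filtering part is evaluated with precision and recall, and the keyword extraction part with a user survey. The experiments show that providing three candidate tags for each document achieves satisfactory results. In the actual usage environment, the system shows relatively high precision and reliability, and it is simple and easy to implement.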

Key words: tag recommendation, tag system, collaborative filtering, keyword extraction, cold-start, document collaboration

Abstract: Duckling Document Library (DDL) is a tool for document collaboration and management among research teams. It provides a cooperation platform for virtual teams and uses a tag system to manage all of its documents. As the library has been used, documents without any tags have gradually accumulated, and the tags that users assign to some documents are of low quality. Both problems impede effective management of the documents. To solve them, this paper proposes a tag recommendation method suited to the DDL platform. It combines collaborative filtering recommendation with keyword extraction recommendation, so that users are prompted to add qualified tags and the efficiency of the document library improves. The collaborative filtering part is evaluated with precision and recall, and the keyword extraction part with a user survey. Experimental results show that a recommendation list of three tags achieves the desired effect. In a production environment, the tag recommendation system offers adequate accuracy and reliability and is easy to implement.

Key words: tag recommendation, tag system, collaborative filtering, keyword extraction, cold-start, document collaboration
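
The abstract names two components, collaborative filtering and keyword extraction, and an evaluation by precision and recall on a list of three tags, but it does not give the algorithms themselves. The following Python sketch is therefore only an illustration under stated assumptions, not the authors' implementation: it assumes user-based collaborative filtering with Jaccard similarity over each user's past tags, a plain term-frequency keyword extractor as the cold-start fallback, and a simple fill-up rule for combining the two. All function names, the stop-word list, and the data layout are hypothetical.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list for the illustration.
STOPWORDS = {"the", "and", "for", "with", "from", "that", "this"}


def jaccard(a, b):
    """Jaccard similarity of two sets; 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cf_tags(user, user_tags, k=3):
    """Assumed user-based CF: score tags used by similar users.

    user_tags maps a user name to the set of tags that user has applied.
    Tags the user already uses are not scored.
    """
    own = user_tags.get(user, set())
    scores = Counter()
    for other, tags in user_tags.items():
        if other == user:
            continue
        sim = jaccard(own, tags)
        for tag in tags - own:
            scores[tag] += sim
    # Sort by descending score, then alphabetically for a stable order.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [tag for tag, score in ranked[:k] if score > 0]


def keyword_tags(text, k=3):
    """Assumed keyword extractor: the k most frequent non-stop-words."""
    words = [w for w in re.findall(r"[a-z]+", text.lower())
             if len(w) > 2 and w not in STOPWORDS]
    ranked = sorted(Counter(words).items(), key=lambda kv: (-kv[1], kv[0]))
    return [word for word, _ in ranked[:k]]


def recommend(user, text, user_tags, k=3):
    """Use CF tags first, then fill the list up to k with keywords.

    For a user with no tagging history (cold start) the CF part is
    empty, so the list comes entirely from keyword extraction.
    """
    tags = cf_tags(user, user_tags, k)
    for keyword in keyword_tags(text, k):
        if len(tags) >= k:
            break
        if keyword not in tags:
            tags.append(keyword)
    return tags


def precision_recall(recommended, relevant):
    """Precision and recall of a recommended list against true tags."""
    hits = len(set(recommended) & set(relevant))
    precision = hits / len(recommended) if recommended else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall
```

In this sketch, `k=3` mirrors the three candidate tags reported in the abstract. Calling `recommend(user, text, user_tags)` for a user who is absent from `user_tags` exercises the cold-start path, and `precision_recall` can then compare the returned list with the tags a user actually chose.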

CLC number: