Computer Engineering ›› 2010, Vol. 36 ›› Issue (23): 274-276, 279. doi: 10.3969/j.issn.1000-3428.2010.23.092

• Development Research and Design Technology •

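Mapping Between IPC and CLC Based on Similarity of Words

ZHOU Linzhi1, QI Jiandong1, WANG Jianxin1, ZHU Lijun2

  1. School of Informatics, Beijing Forestry University, Beijing 100083, China; 2. Institute of Scientific and Technical Information of China, Beijing 100038, China
  • Online: 2010-12-05 Published: 2010-12-14
  • About the authors: ZHOU Linzhi (born 1984), male, master's student, main research interest: classification interoperability; QI Jiandong and WANG Jianxin, associate professors; ZHU Lijun, associate research fellow, Ph.D.
  • Funding:
    National Key Technology R&D Program of the 11th Five-Year Plan, project "Research and Implementation of the Integration and Service System for Knowledge Organization Systems" (2006BAH03B03); Key Work Fund of the Institute of Scientific and Technical Information of China, project "Chinese Scientific and Technical Vocabulary System Construction and Application Project (New Energy Vehicle Field)" (200KP0131)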

Abstract: Patent literature is a special kind of document that contains advanced technical solutions, but it is difficult to manage, relatively isolated, and underused. To address this problem, a conceptual model for the categories of a classification scheme is defined, and category mappings between the International Patent Classification (IPC) and the Chinese Library Classification (CLC) are established by calculating the conceptual similarity between categories. The category similarity calculation introduces the semantic similarity of the words associated with each category and also takes into account the effect of a category's context on the relationships between categories. This reduces the isolation of patent data and enables interoperability between patent data and journal data. Experiments show that the method can effectively improve the accuracy of the similarity calculated between categories.

Key words: classification mapping, International Patent Classification (IPC), Chinese Library Classification (CLC), similarity of words
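
The approach summarized in the abstract (word-level similarity combined with a category's context) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give the paper's formulas, so the `Category` model, the character-bigram `word_sim` stand-in for a semantic word similarity, the weight `alpha`, the `threshold`, and the example category labels are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Category:
    """Hypothetical concept model: a code, its own label terms, and context terms."""
    code: str
    words: list[str]
    context: list[str] = field(default_factory=list)  # e.g. parent/child label terms


def word_sim(a: str, b: str) -> float:
    """Stand-in word similarity (assumption): Dice coefficient over character bigrams."""
    if a == b:
        return 1.0
    ga = {a[i:i + 2] for i in range(len(a) - 1)}
    gb = {b[i:i + 2] for i in range(len(b) - 1)}
    if not ga or not gb:
        return 0.0
    return 2 * len(ga & gb) / (len(ga) + len(gb))


def set_sim(xs: list[str], ys: list[str]) -> float:
    """Symmetric average of each word's best match in the other word list."""
    if not xs or not ys:
        return 0.0
    fwd = sum(max(word_sim(x, y) for y in ys) for x in xs) / len(xs)
    bwd = sum(max(word_sim(y, x) for x in xs) for y in ys) / len(ys)
    return (fwd + bwd) / 2


def category_sim(c1: Category, c2: Category, alpha: float = 0.7) -> float:
    """Blend label similarity with context similarity; alpha is an assumed weight."""
    own = set_sim(c1.words, c2.words)
    if not c1.context or not c2.context:
        return own  # no context available on one side: fall back to label terms
    return alpha * own + (1 - alpha) * set_sim(c1.context, c2.context)


def best_mapping(source: Category, targets: list[Category], threshold: float = 0.5):
    """Return (target code, score) for the most similar target, or None below threshold."""
    if not targets:
        return None
    score, code = max((category_sim(source, t), t.code) for t in targets)
    return (code, score) if score >= threshold else None


# Illustrative labels only; real IPC and CLC entries are longer and hierarchical.
ipc = Category("A01G", ["horticulture", "forestry"], ["agriculture"])
clc = [
    Category("S6", ["horticulture"], ["agriculture", "forestry"]),
    Category("TP3", ["computing", "technology"], ["automation"]),
]
result = best_mapping(ipc, clc)
```

In this sketch the context terms only adjust the score obtained from the label terms. A real system would replace `word_sim` with a semantic measure drawn from a lexical resource, and would tune `alpha` and `threshold` on evaluation data.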

CLC number: