Abstract: Patent literature is a special kind of document that contains advanced technical solutions, but it suffers from difficult management, relative isolation, and low utilization. Reducing this isolation and achieving interoperability between patent data and other journal literature is an effective way to improve the utilization of patents. To this end, a conceptual model for a category in a classification scheme is defined, and a category mapping between the International Patent Classification (IPC) and the Chinese Library Classification (CLC) is established by calculating the conceptual similarity between categories. The category similarity calculation draws on the semantic similarity of the words that represent each category, and the effect of a category's context on the relationship between categories is also taken into account. Experiments show that the method effectively improves the accuracy of the similarity calculation between categories.
Key words:
classification mapping,
International Patent Classification (IPC),
Chinese Library Classification (CLC),
similarity of words
CLC number:
ZHOU Lin-Zhi, QI Jian-Dong, WANG Jian-Xin, ZHU Li-Jun. Mapping Between IPC and CLC Based on Similarity of Words[J]. Computer Engineering, 2010, 36(23): 274-276, 279.
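
The abstract describes the mapping only at a high level: each category is represented by words, category similarity is derived from word similarity, and the category's context also contributes. The following is a minimal sketch of that idea under stated assumptions, not the authors' implementation. The `Category` model, the bigram-based `word_similarity` stand-in (the abstract calls for a semantic word similarity, which this is not), the best-match averaging, the weight `alpha`, and the sample categories with codes `S1`, `T1`, and `T2` are all hypothetical and chosen only for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Category:
    """Hypothetical concept model: a code, its own terms, and context terms."""
    code: str
    terms: list[str]
    # Assumption: context is modeled as terms taken from neighboring categories
    # (for example the parent category).
    context: list[str] = field(default_factory=list)


def word_similarity(a: str, b: str) -> float:
    """Stand-in lexical measure: Dice coefficient over character bigrams.

    A real system would plug a semantic word-similarity measure in here.
    """
    if a == b:
        return 1.0

    def bigrams(s: str) -> set[str]:
        s = s.lower()
        return {s[i:i + 2] for i in range(len(s) - 1)}

    x, y = bigrams(a), bigrams(b)
    if not x or not y:
        return 0.0
    return 2 * len(x & y) / (len(x) + len(y))


def term_set_similarity(xs: list[str], ys: list[str]) -> float:
    """Symmetric average of each word's best match in the other set."""
    if not xs or not ys:
        return 0.0
    fwd = sum(max(word_similarity(x, y) for y in ys) for x in xs) / len(xs)
    bwd = sum(max(word_similarity(y, x) for x in xs) for y in ys) / len(ys)
    return (fwd + bwd) / 2


def category_similarity(a: Category, b: Category, alpha: float = 0.7) -> float:
    """Weighted mix of own-term similarity and context similarity (alpha is assumed)."""
    own = term_set_similarity(a.terms, b.terms)
    ctx = term_set_similarity(a.context, b.context)
    return alpha * own + (1 - alpha) * ctx


def best_mapping(source: Category, targets: list[Category], alpha: float = 0.7) -> Category:
    """Map a source category to the most similar target category."""
    return max(targets, key=lambda t: category_similarity(source, t, alpha))


# Illustrative data only; these are not real IPC or CLC entries.
source = Category("S1", ["digital", "data", "processing"], ["computing", "calculating"])
targets = [
    Category("T1", ["data", "processing", "computer"], ["computing", "automation"]),
    Category("T2", ["textile", "dyeing"], ["light", "industry"]),
]
print(best_mapping(source, targets).code)
```

Separating the own-term score from the context score keeps the contribution of the surrounding hierarchy explicit, so the weight can be tuned or the word-level measure replaced without touching the mapping step.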