
Computer Engineering ›› 2011, Vol. 37 ›› Issue (8): 34-36. doi: 10.3969/j.issn.1000-3428.2011.08.012

• Software Technology and Database •

Research on Code and Documentation Traceability Association Mining Based on LSI

YANG Xue-min, ZHANG Yi-kun, CUI Ying-an, ZHANG Bao-wei, XIA Hui

  1. (School of Computer Science and Engineering, Xi’an University of Technology, Xi’an 710048, China)
  • Online: 2011-04-20 Published: 2012-10-31
  • About the authors: YANG Xue-min (born 1985), female, master's student, main research interest: software testing; ZHANG Yi-kun, professor, Ph.D.; CUI Ying-an, lecturer, Ph.D.; ZHANG Bao-wei, engineer, M.S.; XIA Hui, lecturer, M.S.
  • Supported by:
    Natural Science Foundation of Shaanxi Province (2009JM8003-1); Special Fund Project of the Education Department of Shaanxi Province (09JK679)

Abstract: Recovering traceability links among software process artifacts is important in many areas, including software maintenance and requirements tracing. This paper proposes a method based on Latent Semantic Indexing (LSI) that extracts link information between program source code and the related Chinese documentation. The method improves on the vector space model: it determines the degree of association by analyzing the latent semantic structure shared by the texts rather than by matching terms directly. Experimental results show that the method does not depend on a predefined thesaurus or knowledge base for the code and documentation, and that it improves recall and precision to some extent.

Key words: software maintenance, traceability association mining, Latent Semantic Indexing (LSI), Information Retrieval (IR), Cross-Language Information Retrieval (CLIR)
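
The contrast the abstract draws between term matching in the vector space model and latent semantic structure in LSI can be illustrated with a small sketch. It is not the authors' implementation: the artifact names, the token lists, the choice of k = 2, and the use of power iteration on the document Gram matrix in place of a library SVD are all assumptions made for illustration, and the Chinese documents are assumed to be already segmented into terms (shown here as English glosses).

```python
import math

def term_document_matrix(artifacts):
    """Raw term-frequency matrix: one row per term, one column per artifact."""
    vocab = sorted({t for tokens in artifacts.values() for t in tokens})
    names = list(artifacts)
    matrix = [[artifacts[n].count(t) for n in names] for t in vocab]
    return names, matrix

def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def top_eigenpairs(gram, k, iterations=500):
    """Top-k eigenpairs of a symmetric PSD matrix (power iteration + deflation)."""
    n = len(gram)
    g = [row[:] for row in gram]
    pairs = []
    for _ in range(k):
        v = [1.0 + 0.1 * i for i in range(n)]  # fixed, deterministic start vector
        scale = math.sqrt(sum(x * x for x in v))
        v = [x / scale for x in v]
        for _ in range(iterations):
            w = [sum(g[i][j] * v[j] for j in range(n)) for i in range(n)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm < 1e-12:
                break
            v = [x / norm for x in w]
        gv = [sum(g[i][j] * v[j] for j in range(n)) for i in range(n)]
        lam = sum(a * b for a, b in zip(v, gv))  # Rayleigh quotient of unit v
        if lam < 1e-12:
            break  # remaining spectrum is numerically zero
        pairs.append((lam, v))
        # Deflate: remove the component just found before the next pass.
        g = [[g[i][j] - lam * v[i] * v[j] for j in range(n)] for i in range(n)]
    return pairs

def lsi_document_vectors(matrix, k):
    """Rank-k document coordinates, i.e. the rows of V_k * Sigma_k."""
    n_docs = len(matrix[0])
    gram = [[sum(row[i] * row[j] for row in matrix) for j in range(n_docs)]
            for i in range(n_docs)]
    pairs = top_eigenpairs(gram, k)
    return [[math.sqrt(lam) * vec[d] for lam, vec in pairs] for d in range(n_docs)]

# Hypothetical corpus: source files and documents treated as one collection.
artifacts = {
    "LoginService.java": ["login", "password"],
    "requirements_auth": ["user", "authentication"],
    "design_auth": ["login", "password", "user", "authentication"],
    "requirements_report": ["report", "export"],
    "ReportExporter.java": ["report", "export"],
}

names, matrix = term_document_matrix(artifacts)
raw = {n: [row[j] for row in matrix] for j, n in enumerate(names)}
lsi = dict(zip(names, lsi_document_vectors(matrix, k=2)))

for doc in ("requirements_auth", "requirements_report"):
    print(doc,
          "raw:", round(cosine(raw["LoginService.java"], raw[doc]), 3),
          "lsi:", round(cosine(lsi["LoginService.java"], lsi[doc]), 3))
```

In this toy corpus `LoginService.java` and `requirements_auth` share no terms, so their raw cosine similarity is zero, yet both co-occur with `design_auth`, which places them close together in the rank-2 space. A candidate link would then be reported when the LSI similarity exceeds a chosen threshold; the threshold and the rank k are tuning parameters that the abstract does not specify.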

CLC Number: