Computer Engineering (计算机工程), 2018, Vol. 44, Issue (6): 117-121, 129. doi: 10.19678/j.issn.1000-3428.0048197

• Security Technology •

Automatic Identification Method of API Key in Source Code

XUE Min 1, FANG Yong 2, HUANG Cheng 1, LIU Liang 2

  1. College of Electronics and Information, Sichuan University, Chengdu 610065, China; 2. College of Cybersecurity, Sichuan University, Chengdu 610207, China
  • Received: 2017-07-31 Online: 2018-06-15 Published: 2018-06-15
  • About the authors: XUE Min (born 1993), female, master's student, main research interest: Web security; FANG Yong, professor, Ph.D.; HUANG Cheng, Ph.D.; LIU Liang, lecturer, Ph.D.

Abstract: The leakage of Application Programming Interface (API) keys may allow the associated services to be used maliciously, causing economic losses that are hard to estimate. To address this, common characteristics of API keys across the code of different projects are extracted through basic feature statistics and static structure analysis of the source code in the samples. An automatic, machine-learning-based method for identifying API keys in source code is then built on these characteristics. Results of 10-fold cross-validation experiments show that the method achieves better retrieval performance than traditional detection approaches such as full-text matching search, keyword search and information entropy search.

Key words: Application Programming Interface (API) key, source code, machine learning, static structure, information entropy
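
To make the information entropy search baseline named in the abstract concrete, the following is a minimal Python sketch of one common way such a search can work: compute the Shannon entropy of each quoted token and flag long, high-entropy ones. It is not the authors' method, and it is not their machine learning classifier. The names `shannon_entropy` and `find_candidate_keys`, the regular expression, and the values of `MIN_LENGTH` and `ENTROPY_THRESHOLD` are all assumptions chosen for illustration.

```python
import math
import re
from collections import Counter

# Hypothetical settings for illustration; none of these come from the paper.
MIN_LENGTH = 20
ENTROPY_THRESHOLD = 4.0
# Matches quoted tokens made only of characters typical of keys.
STRING_LITERAL = re.compile(r"[\"']([A-Za-z0-9+/=_-]+)[\"']")


def shannon_entropy(text: str) -> float:
    """Return the Shannon entropy of `text` in bits per character."""
    if not text:
        return 0.0
    counts = Counter(text)
    total = len(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def find_candidate_keys(source: str) -> list[str]:
    """Return quoted tokens that are long enough and have high entropy."""
    return [
        token
        for token in STRING_LITERAL.findall(source)
        if len(token) >= MIN_LENGTH
        and shannon_entropy(token) > ENTROPY_THRESHOLD
    ]
```

A threshold-only search of this kind flags any long random-looking string, including hashes and encoded data that are not keys, which may be one reason such a baseline can be outperformed by a classifier that also uses structural features of the code.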

CLC number: