
Computer Engineering ›› 2023, Vol. 49 ›› Issue (5): 38-47. doi: 10.19678/j.issn.1000-3428.0064678

• Research Hotspots and Reviews •

Password Guessing Method Based on Improved PCFG Algorithm

LI Jingwen, ZHAO Kui   

  1. School of Cyber Science and Engineering, Sichuan University, Chengdu 610065, China
  • Received: 2022-05-11 Revised: 2022-06-20 Published: 2022-11-29

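  • About the authors: LI Jingwen (born 1998), female, master's student; main research interests are big data security and password security. ZHAO Kui (corresponding author), professor, Ph.D.
  • Supported by:
    National Natural Science Foundation of China (U19A2068, 61872254).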

Abstract: Password leaks have occurred frequently in recent years, and effective password guessing methods are an important means of safeguarding password security. Among them, methods based on Probabilistic Context-Free Grammar (PCFG) are particularly effective, but they cannot generate new password substrings and they estimate the probability of generated passwords inaccurately. Taking the PCFG-based password guessing method as the research object, this paper analyzes its hit rate at the key stages of the password construction process and proposes an improved PCFG password guessing method based on a Backoff-Recurrent Neural Network (Backoff-RNN) model and probability balancing. In the password structure division stage, users' behaviors and preferences when constructing passwords are analyzed, and passwords are divided at a finer granularity in terms of Chinese Pinyin and English words to extract deeper structural information. In the password filling stage, the idea of backoff is applied to a character-level RNN (char-RNN) model to generate long character substrings within substructures, improving the accuracy and generalization ability of the model. In the password probability calculation stage, the calculation of the password generation probability is improved to resolve the probability imbalance caused by inconsistent password structure lengths under the traditional calculation rules. Experimental results show that, on cross-datasets in Chinese and English language environments, the trawling-attack hit rate of the proposed method is 20.6% and 22.4% higher, respectively, than that of the traditional PCFG-based method; on a dataset in the Chinese language environment, its targeted-attack hit rate is 2.8% higher than that of the TarGuess-I model.

Key words: password guessing attack, natural language processing, Probabilistic Context-Free Grammar(PCFG), deep learning, password security
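
As a rough illustration of the baseline that the abstract refers to, the following minimal sketch shows a classic letter/digit/symbol PCFG: passwords are split into runs, and a password's probability is the product of its structure probability and the probabilities of the substrings filling each run. This is not the authors' implementation. The function names, the toy training list, and in particular `balanced_score` (a geometric mean over the factors, used here only as one plausible way to offset the penalty on longer structures) are assumptions made for illustration; the abstract does not state the paper's actual balancing formula.

```python
import re
import string
from collections import Counter
from math import prod

def segment(password):
    """Split a password into (tag, substring) runs: L=letters, D=digits, S=other."""
    runs = []
    for m in re.finditer(r"[A-Za-z]+|[0-9]+|[^A-Za-z0-9]+", password):
        s = m.group()
        if s[0] in string.ascii_letters:
            kind = "L"
        elif s[0] in string.digits:
            kind = "D"
        else:
            kind = "S"
        runs.append((f"{kind}{len(s)}", s))
    return runs

def train(passwords):
    """Count structures (tuples of tags) and the substrings seen for each tag."""
    structures = Counter()
    fills = {}
    for pw in passwords:
        runs = segment(pw)
        structures[tuple(tag for tag, _ in runs)] += 1
        for tag, s in runs:
            fills.setdefault(tag, Counter())[s] += 1
    return structures, fills

def factors(password, structures, fills):
    """Return [P(structure), P(fill_1), ..., P(fill_k)] for a password."""
    runs = segment(password)
    struct = tuple(tag for tag, _ in runs)
    ps = [structures[struct] / sum(structures.values())]
    for tag, s in runs:
        counts = fills.get(tag)
        ps.append(counts[s] / sum(counts.values()) if counts else 0.0)
    return ps

def pcfg_probability(password, structures, fills):
    """Traditional rule: multiply all factors, so more segments means more factors."""
    return prod(factors(password, structures, fills))

def balanced_score(password, structures, fills):
    """Hypothetical balancing: geometric mean of the factors (an assumption)."""
    ps = factors(password, structures, fills)
    return prod(ps) ** (1.0 / len(ps))

# Toy training data, invented for this example only.
SAMPLE = ["abc123", "abc123", "love2020!", "abc2020"]
structures, fills = train(SAMPLE)

print(segment("love2020!"))
print(pcfg_probability("abc123", structures, fills))
print(pcfg_probability("xyz999", structures, fills))
```

In this sketch, a password such as `xyz999` whose substrings never appeared in training receives probability zero even though its structure is common, which is the "cannot generate new substrings" limitation that the abstract says the Backoff-RNN filling stage is meant to address.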

