Password Guessing Method Based on Improved PCFG Algorithm

doi:10.19678/j.issn.1000-3428.0064678

Abstract: Password leaks have occurred frequently in recent years, and effective password guessing methods are an important means of securing passwords. Among them, the method based on Probabilistic Context-Free Grammar (PCFG) is particularly effective. However, it cannot generate new password substrings, and it estimates the generation probability of passwords inaccurately. Taking the PCFG-based password guessing method as the research object, this paper analyzes its hit rate at the key stages of the password construction process and proposes an improved PCFG password guessing method based on a Backoff-Recurrent Neural Network (Backoff-RNN) model and probability balancing. In the password structure division stage, users' behavior and preferences when constructing passwords are analyzed, and passwords are divided at a finer granularity in terms of Chinese Pinyin and English words to extract deeper structural information. In the password filling stage, the Backoff idea is applied to a character-level RNN (char-RNN) model to generate long character substrings within the substructures, improving the accuracy and generalization ability of the model. In the password probability calculation stage, the calculation of the password generation probability is improved to resolve the probability imbalance that inconsistent password structure lengths cause under the traditional calculation rules. The experimental results show that, on cross-datasets in Chinese and English language environments, the trawling-attack hit rate of the proposed method is 20.6% and 22.4% higher, respectively, than that of the traditional PCFG-based method. On a dataset in the Chinese language environment, its targeted-attack hit rate is 2.8% higher than that of the TarGuess-I model.

Key words: password guessing attack, natural language processing, Probabilistic Context-Free Grammar (PCFG), deep learning, password security
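The traditional PCFG scoring that the abstract contrasts against can be sketched as follows. This is a minimal Python illustration of the classic letter/digit/symbol segmentation and product-rule probability, not the authors' implementation. The names `segment`, `train`, `factors`, and `length_normalized` are hypothetical, the tiny training list is made up for illustration, and the geometric-mean normalization is only one assumed way to balance structures of different lengths, since the abstract does not give the paper's formula.

```python
import re
from collections import Counter
from math import prod

# L = letters, D = digits, S = everything else (the classic PCFG character classes).
TOKEN = re.compile(r"(?P<L>[A-Za-z]+)|(?P<D>[0-9]+)|(?P<S>[^A-Za-z0-9]+)")


def segment(password):
    """Split a password into tagged runs, e.g. 'abc123' -> [('L3', 'abc'), ('D3', '123')]."""
    return [(f"{m.lastgroup}{len(m.group())}", m.group()) for m in TOKEN.finditer(password)]


def train(passwords):
    """Count base structures and the strings that fill each segment tag."""
    structures, fills = Counter(), {}
    for pw in passwords:
        segs = segment(pw)
        structures["".join(tag for tag, _ in segs)] += 1
        for tag, text in segs:
            fills.setdefault(tag, Counter())[text] += 1
    return structures, fills


def factors(password, structures, fills):
    """Return [P(structure), P(fill_1), ...]; their product is the traditional PCFG score."""
    segs = segment(password)
    total = sum(structures.values())
    base = "".join(tag for tag, _ in segs)
    out = [structures[base] / total if total else 0.0]
    for tag, text in segs:
        counts = fills.get(tag, Counter())
        seen = sum(counts.values())
        out.append(counts[text] / seen if seen else 0.0)
    return out


def length_normalized(fs):
    """Geometric mean of the factors (an assumed balancing rule, not the paper's formula)."""
    return prod(fs) ** (1.0 / len(fs))


# Toy training data, invented for this sketch.
structures, fills = train(["abc123", "xyz123", "abc!1"])
short = factors("abc123", structures, fills)  # structure L3D3: three factors
long_ = factors("abc!1", structures, fills)   # structure L3S1D1: four factors
print(prod(short), prod(long_))
print(length_normalized(short), length_normalized(long_))
```

Under the product rule, every additional segment multiplies in another factor no greater than 1, so passwords with longer structures tend to score lower regardless of how common their parts are. The normalized score is shown only to illustrate the kind of adjustment the probability calculation stage refers to.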


CLC Number: TP309

LI Jingwen, ZHAO Kui. Password Guessing Method Based on Improved PCFG Algorithm[J]. Computer Engineering, 2023, 49(5): 38-47.



URL: http://www.ecice06.com/EN/10.19678/j.issn.1000-3428.0064678

http://www.ecice06.com/EN/Y2023/V49/I5/38


