
Computer Engineering

   

LLM-Based Few-Shot Credential Tweaking Attack

  

  • Published: 2026-04-20

A Few-Shot Credential Tweaking Attack Method Based on Large Language Models

Abstract: Passwords remain the most important authentication factor and are used across a wide range of security scenarios. Improving password security depends on simulating and studying password guessing. In practice, the hit rate of data-driven credential tweaking attacks is tightly constrained by the quantity and quality of training samples, and existing few-shot password guessing frameworks are not suited to the credential tweaking task. To address these issues, this paper proposes a few-shot credential tweaking attack method based on large language models and data augmentation. The method automatically synthesizes pseudo-aligned password data from as few high-quality samples as possible, reducing the heavy dependence of credential tweaking attacks on data quantity and quality. The contributions are as follows: 1) a credential tweaking attack framework based on reinforcement learning, named PasswordRL; 2) a few-shot credential tweaking attack framework based on large language models and data augmentation, named PasswordRL-FS. Using four mainstream guessing methods as baselines, the two frameworks are evaluated in comparative experiments on two real leaked password datasets. In a few-shot scenario that simulates real-world conditions (1000 training samples), with guess budgets of 5, 10, and 100, the hit rates of the proposed framework exceed those of the second-best model by 39.54%, 23.72%, and 42.40% in relative terms, and reach 83.72%, 81.85%, and 93.68% of the hit rates obtained in the data-rich scenario (more than 10^7 training samples). These results demonstrate the effectiveness of the proposed method.
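The abstract reports hit rates under a guess budget and improvements over the second-best baseline in relative terms. As a minimal sketch of how such figures are typically computed, the Python below defines a budgeted hit rate and two ratios. The function names, the per-account data layout, and the toy passwords are all hypothetical illustrations; they are not taken from the paper and do not reproduce its results.

```python
from typing import Dict, List


def hit_rate(guesses: Dict[str, List[str]], targets: Dict[str, str], budget: int) -> float:
    """Fraction of accounts whose target password is among the first `budget` guesses."""
    if not targets:
        raise ValueError("targets must not be empty")
    hits = sum(
        1
        for account, target in targets.items()
        if target in guesses.get(account, [])[:budget]
    )
    return hits / len(targets)


def relative_improvement(ours: float, baseline: float) -> float:
    """Relative gain of `ours` over `baseline`, in percent."""
    return (ours - baseline) / baseline * 100.0


def share_of_reference(few_shot: float, data_rich: float) -> float:
    """Few-shot hit rate expressed as a percentage of the data-rich hit rate."""
    return few_shot / data_rich * 100.0


# Toy data (hypothetical): one target password and a ranked guess list per account.
toy_targets = {"u1": "Summer2024!", "u2": "dragon99", "u3": "P@ssw0rd1", "u4": "qwerty12"}
toy_guesses = {
    "u1": ["Summer2023!", "Summer2024!", "summer2024"],
    "u2": ["dragon99", "Dragon99"],
    "u3": ["Passw0rd1", "P@ssword1", "P@ssw0rd1"],
    "u4": ["qwerty1", "qwerty123"],
}

for budget in (1, 2, 3):
    print(budget, hit_rate(toy_guesses, toy_targets, budget))
```

Under this reading, a relative improvement of 39.54% means the proposed hit rate is 1.3954 times the baseline's, not 39.54 percentage points higher.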

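Abstract: Passwords remain the most important authentication factor, and improving password security depends on simulating and studying password guessing. Credential tweaking attacks are a widely studied class of password guessing methods. In real-world settings, the hit rate of data-driven credential tweaking attacks is tightly constrained by the number and quality of training samples, and existing few-shot password guessing frameworks are not suited to the credential tweaking task. To address these problems, a few-shot credential tweaking attack method based on large language models is proposed. It uses as few high-quality samples as possible to automatically synthesize pseudo-aligned password data, effectively reducing the heavy dependence of data-driven credential tweaking attacks on the number and quality of training samples. The main contributions are: 1) Based on reinforcement learning, a credential tweaking attack framework called PasswordRL is proposed. The framework uses a loss function that mixes reinforcement learning with maximum likelihood estimation and further improves the guessing hit rate over traditional methods; 2) Based on large language models and data augmentation, a credential tweaking attack framework for few-shot scenarios, PasswordRL-FS, is proposed. Using four mainstream guessing methods as baselines, comparative experiments on the two proposed frameworks were carried out on two real leaked password datasets. The results show that in a few-shot scenario simulating real-world conditions (1000 training samples), with guess budgets of 5, 10, and 100, the hit rates of the proposed framework are 39.54%, 23.72%, and 42.40% higher (in relative terms) than those of the second-best model, and reach 83.72%, 81.85%, and 93.68% of the hit rates in the data-rich scenario (more than 10^7 training samples). These results demonstrate the effectiveness of the method.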
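The abstract states that PasswordRL uses a loss that mixes reinforcement learning with maximum likelihood estimation but does not give its form. One common way to combine the two is a weighted sum of a policy-gradient term and a negative log-likelihood term, sketched below. Everything in this block is an assumption for illustration: the mixing weight \lambda, the reward r, the baseline b, and the notation x (source password), y (target password), and \hat{y} (sampled guess) are not from the paper.

```latex
% Maximum likelihood term: negative log-likelihood of the target password y given source x
\mathcal{L}_{\mathrm{MLE}}(\theta) = -\sum_{t=1}^{T} \log p_\theta\bigl(y_t \mid y_{<t}, x\bigr)

% Reinforcement learning term (REINFORCE with baseline b) on a sampled guess \hat{y}
\mathcal{L}_{\mathrm{RL}}(\theta) = -\bigl(r(\hat{y}) - b\bigr) \sum_{t=1}^{\hat{T}} \log p_\theta\bigl(\hat{y}_t \mid \hat{y}_{<t}, x\bigr)

% Mixed objective with an assumed weight 0 \le \lambda \le 1
\mathcal{L}(\theta) = \lambda\, \mathcal{L}_{\mathrm{RL}}(\theta) + (1 - \lambda)\, \mathcal{L}_{\mathrm{MLE}}(\theta)
```

In a sketch like this, the likelihood term keeps the model close to observed password pairs while the reward term can directly favor guesses that hit the target; the actual balance used by PasswordRL would have to be taken from the full paper.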