
Computer Engineering (计算机工程)


Reference-based phishing detection scheme using LLM

  • Published: 2025-05-20

Abstract: Phishing attacks are becoming increasingly complex and frequent. Traditional reference-based phishing detection schemes rely on a predefined brand-domain mapping list: they use visual feature matching to identify the brand a page intends to imitate, then verify that the page's domain is consistent with that brand, which yields explainable detection. These schemes can counter zero-day phishing attacks, but the reference list must be updated continuously to cover emerging brands, which leads to high maintenance costs. To address this, the paper proposes Phish-RAGLLM, a novel reference-based phishing detection scheme built on Large Language Models (LLMs) and Retrieval-Augmented Generation (RAG). Phish-RAGLLM does not depend on a predefined reference list. It reframes the traditional visual problem as a language problem, draws on the extensive brand knowledge embedded in the LLM, and uses RAG with an external brand knowledge base to strengthen generation. This mitigates LLM hallucination and improves detection precision and robustness. Experimental results show that Phish-RAGLLM balances model performance, inference cost, and knowledge base completeness: with GPT-3.5-turbo-instruct as the backbone LLM, it achieves a 5.88% increase in F1 score and a 12.5% improvement in runtime efficiency over PhishLLM, the current state-of-the-art model. It also shows strong robustness against dataset variations and prompt injection attacks. Because it builds on an LLM, Phish-RAGLLM adapts well to multilingual phishing websites and effectively detects phishing webpages in different language environments. In addition, a field evaluation shows that Phish-RAGLLM has broader detection capability than VirusTotal, a threat intelligence source, which further supports its feasibility and effectiveness.
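The reference-based idea described in the abstract (identify the brand a page claims to represent, then check whether its domain belongs to that brand) can be illustrated with a minimal sketch. This is not the paper's implementation: the knowledge base `BRAND_KB`, the functions `retrieve_brands`, `identify_brand`, `registered_domain`, and `classify`, and the example URLs are all hypothetical. The LLM and RAG steps are replaced by simple keyword stand-ins so that the example runs without any external service.

```python
from urllib.parse import urlparse

# Hypothetical toy knowledge base: brand -> legitimate registered domains.
BRAND_KB = {
    "paypal": {"paypal.com"},
    "microsoft": {"microsoft.com", "live.com"},
}


def retrieve_brands(page_text, kb=BRAND_KB):
    """Stand-in for retrieval: return knowledge-base brands named in the text."""
    words = set(page_text.lower().split())
    return [brand for brand in kb if brand in words]


def identify_brand(page_text, candidates):
    """Stand-in for the LLM step: pick the first retrieved candidate, if any."""
    return candidates[0] if candidates else None


def registered_domain(url):
    """Naive last-two-labels heuristic (assumption for brevity).

    A real system would consult the Public Suffix List instead.
    """
    host = urlparse(url).hostname or ""
    return ".".join(host.split(".")[-2:])


def classify(url, page_text, kb=BRAND_KB):
    """Flag a page whose claimed brand does not match its domain."""
    brand = identify_brand(page_text, retrieve_brands(page_text, kb))
    if brand is None:
        return "unknown"  # no brand intent found, so nothing to compare
    return "benign" if registered_domain(url) in kb[brand] else "phishing"
```

For example, `classify("https://paypal.com.secure-login.example/signin", "Log in to your PayPal account")` compares the registered domain `secure-login.example` against the domains recorded for `paypal` and reports a mismatch, while the same text served from `https://www.paypal.com/signin` passes the consistency check.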