PiMRef: Detecting and Explaining Ever-evolving Spear Phishing Emails with Knowledge Base Invariants

Liu, Ruofan; Lin, Yun; Yu, Silas Yeo Shuen; Teoh, Xiwen; Liang, Zhenkai; Dong, Jin Song

arXiv:2507.15393v1 (cs)

[Submitted on 21 July 2025]

Authors: Ruofan Liu, Yun Lin, Silas Yeo Shuen Yu, Xiwen Teoh, Zhenkai Liang, Jin Song Dong

Abstract: Phishing emails are a critical component of the cybercrime kill chain due to their wide reach and low cost. Their ever-evolving nature renders traditional rule-based and feature-engineered detectors ineffective in the ongoing arms race between attackers and defenders. The rise of large language models (LLMs) further exacerbates the threat, enabling attackers to craft highly convincing phishing emails at minimal cost. This work demonstrates that LLMs can generate psychologically persuasive phishing emails tailored to victim profiles, successfully bypassing nearly all commercial and academic detectors. To defend against such threats, we propose PiMRef, the first reference-based phishing email detector that leverages knowledge-based invariants. Our core insight is that persuasive phishing emails often contain disprovable identity claims, which contradict real-world facts. PiMRef reframes phishing detection as an identity fact-checking task. Given an email, PiMRef (i) extracts the sender's claimed identity, (ii) verifies the legitimacy of the sender's domain against a predefined knowledge base, and (iii) detects call-to-action prompts that push user engagement. Contradictory claims are flagged as phishing indicators and serve as human-understandable explanations. Compared to existing methods such as D-Fence, HelpHed, and ChatSpamDetector, PiMRef boosts precision by 8.8% with no loss in recall on standard benchmarks like Nazario and PhishPot. In a real-world evaluation of 10,183 emails across five university accounts over three years, PiMRef achieved 92.1% precision, 87.9% recall, and a median runtime of 0.05s, outperforming the state-of-the-art in both effectiveness and efficiency.
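The three steps named in the abstract (claimed-identity extraction, domain verification against a knowledge base, call-to-action detection) can be illustrated with a toy sketch. This is not the authors' implementation: the abstract does not say how each step is realized, so the keyword matching, the `KNOWLEDGE_BASE` contents, the `CTA_PATTERN` cues, and the rule that a contradiction plus a call to action yields a phishing verdict are all assumptions made here for illustration.

```python
import re
from dataclasses import dataclass, field
from email.utils import parseaddr

# Hypothetical knowledge base: brand name -> domains the brand legitimately sends from.
KNOWLEDGE_BASE = {
    "paypal": {"paypal.com"},
    "dhl": {"dhl.com", "dhl.de"},
}

# Hypothetical call-to-action cues; a real detector would need far richer coverage.
CTA_PATTERN = re.compile(
    r"\b(click|verify|confirm|log ?in|sign ?in|download|reply)\b|https?://",
    re.IGNORECASE,
)


@dataclass
class Verdict:
    is_phishing: bool
    reasons: list = field(default_factory=list)


def claimed_identities(display_name, body):
    """Step (i), simplified: brands from the knowledge base named in the email."""
    text = f"{display_name} {body}".lower()
    return [b for b in KNOWLEDGE_BASE if re.search(rf"\b{re.escape(b)}\b", text)]


def sender_domain(from_header):
    """Domain part of the address in the From header (empty string if absent)."""
    _, address = parseaddr(from_header)
    return address.rpartition("@")[2].lower()


def domain_matches(domain, legit_domains):
    """Step (ii), simplified: exact match or subdomain of a known-good domain."""
    return any(domain == d or domain.endswith("." + d) for d in legit_domains)


def check_email(from_header, body):
    display_name, _ = parseaddr(from_header)
    domain = sender_domain(from_header)
    has_cta = bool(CTA_PATTERN.search(body))  # step (iii), simplified

    reasons = []
    for brand in claimed_identities(display_name, body):
        if not domain_matches(domain, KNOWLEDGE_BASE[brand]):
            reasons.append(f"claims to be '{brand}' but was sent from '{domain}'")
    contradiction = bool(reasons)
    if contradiction and has_cta:
        reasons.append("contains a call-to-action prompt")
    # Assumed decision rule: an identity contradiction combined with a call to action.
    return Verdict(is_phishing=contradiction and has_cta, reasons=reasons)


if __name__ == "__main__":
    verdict = check_email(
        "PayPal Support <security@paypa1-alerts.net>",
        "Your account is limited. Click here to verify it.",
    )
    print(verdict.is_phishing, verdict.reasons)
```

The `reasons` list plays the role of the human-understandable explanation mentioned in the abstract: each entry states which identity claim is contradicted by which observed sender domain.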

Subjects:	Cryptography and Security (cs.CR); Artificial Intelligence (cs.AI)
Cite as:	arXiv:2507.15393 [cs.CR]
	(or arXiv:2507.15393v1 [cs.CR] for this version)
	https://doi.org/10.48550/arXiv.2507.15393

Submission history

From: Ruofan Liu
[v1] Mon, 21 Jul 2025 08:53:41 UTC (4,255 KB)
