Diversified Compilation Method Based on LLVM

doi:10.19678/j.issn.1000-3428.0068314

Abstract

When diversifying a project made up of multiple C/C++ source files, most existing software diversification tools apply the same diversification method to every function in a given source file. As a result, each function or source file is diversified in only one way, and the diversification is not tailored to the code. To address this problem, this paper proposes a diversified compilation method, built on the LLVM intermediate representation (IR), that combines grouped obfuscation with code awareness. A preselection library of obfuscation techniques is designed from several perspectives and contains multiple schemes for grouping those techniques. During compilation, each function is traversed, analyzed, and processed to identify its obfuscation-relevant characteristics; a matching grouping strategy is then selected, and an obfuscation technique is chosen at random from within that group. The diversification scheme therefore differs substantially from function to function, producing a diverse set of heterogeneous executors that provides basic software support for mimic defense and moving target defense. The method is evaluated for security and performance on a standard benchmark suite and typical cases. The experimental results indicate that the method preserves security with almost no impact on performance, which supports its effectiveness and feasibility in practical applications.

Keywords: diversified compilation, code obfuscation, software protection, software diversification, active defense
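
The selection step described in the abstract (analyze each function, pick a technique group that matches its characteristics, then pick a technique at random within that group) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the group names, the group members, the `FunctionFeatures` fields, and the thresholds in `select_group` are all hypothetical, and a real version would run as a pass over LLVM IR rather than over hand-written feature records.

```python
import random
from dataclasses import dataclass

# Hypothetical preselection library: the grouping below is an illustrative
# assumption, not the grouping scheme used in the paper.
PRESELECTION_LIBRARY = {
    "control_flow": ["control_flow_flattening", "bogus_control_flow"],
    "arithmetic": ["instruction_substitution", "constant_encoding"],
    "layout": ["basic_block_reordering", "nop_insertion"],
}


@dataclass(frozen=True)
class FunctionFeatures:
    """Toy stand-in for what a code-awareness analysis might measure."""
    name: str
    basic_blocks: int
    branches: int
    arithmetic_ops: int


def select_group(features):
    """Map a function's features to a technique group (assumed thresholds)."""
    if features.branches >= 2 and features.basic_blocks >= 4:
        return "control_flow"
    if features.arithmetic_ops >= 3:
        return "arithmetic"
    return "layout"


def plan_diversification(functions, seed):
    """Choose a group per function, then a random technique inside that group."""
    rng = random.Random(seed)  # the seed identifies one program variant
    plan = {}
    for features in functions:
        group = select_group(features)
        technique = rng.choice(PRESELECTION_LIBRARY[group])
        plan[features.name] = (group, technique)
    return plan


# Hypothetical functions with made-up feature counts.
functions = [
    FunctionFeatures("parse_input", basic_blocks=9, branches=5, arithmetic_ops=1),
    FunctionFeatures("checksum", basic_blocks=2, branches=0, arithmetic_ops=8),
    FunctionFeatures("get_version", basic_blocks=1, branches=0, arithmetic_ops=0),
]

plan_a = plan_diversification(functions, seed=1)
plan_b = plan_diversification(functions, seed=2)
for name, (group, technique) in plan_a.items():
    print(f"{name}: {group} -> {technique}")
```

In this sketch the group choice is deterministic for a given function, while the technique within the group depends on the seed, so building with different seeds may yield different variants of the same program.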

CHEN Yingchao, WANG Junchao, PANG Jianmin, YUE Feng. Diversified Compilation Method Based on LLVM[J]. Computer Engineering, 2025, 51(7): 275-283.

https://www.ecice06.com/CN/Y2025/V51/I7/275

Figures/Tables 10

Fig.1 Architecture of diversified compilation

Fig.2 Procedure of the code awareness module

Fig.3 Framework of the diversified compilation grouping scheme

Fig.4 Comparison of gadgets

Fig.5 Comparison of program control flow graphs

Fig.6 Time and space overhead of programs after diversified compilation

