
计算机工程 ›› 2024, Vol. 50 ›› Issue (1): 68-78. doi: 10.19678/j.issn.1000-3428.0066797

• 人工智能与模式识别 •

面向类集成测试序列确定的强化学习方法

张晓天1, 王雅文1,2,*, 谢志庆3, 金大海1, 宫云战1

1. 网络与交换技术国家重点实验室, 北京 100876
    2. 广西密码学与信息安全重点实验室, 广西 桂林 541004
    3. 北京邮电大学计算机学院(国家示范性软件学院), 北京 100876
  • 收稿日期:2023-01-18 出版日期:2024-01-15 发布日期:2023-04-25
  • 通讯作者: 王雅文
  • 基金资助:
    国家自然科学基金(U1736110); 广西密码学与信息安全重点实验室研究课题(GCIS202103)

Reinforcement Learning Method for Class Integration Test Order Determination

Xiaotian ZHANG1, Yawen WANG1,2,*, Zhiqing XIE3, Dahai JIN1, Yunzhan GONG1

1. State Key Laboratory of Networking and Switching Technology, Beijing 100876, China
    2. Guangxi Key Laboratory of Cryptography and Information Security, Guilin 541004, Guangxi, China
    3. School of Computer Science(National Pilot Software Engineering School), Beijing University of Posts and Telecommunications, Beijing 100876, China
  • Received:2023-01-18 Online:2024-01-15 Published:2023-04-25
  • Contact: Yawen WANG

摘要:

面向类集成测试序列的强化学习方法能够自适应地根据系统集成状态调整集成测试策略,是测试优化的关键技术之一,但现有方法普遍存在计算成本高且不适用于大规模软件系统、忽略测试风险的滞后性问题,大幅降低了适用性和可靠性。针对上述问题,提出一种具有重要值加权奖励的基于测试顺序的强化学习方法。优化强化学习建模,忽略节点在测试序列上的具体位置,减弱状态之间的相关性,提升模型可用性。结合深度强化学习模型,端到端地更新集成测试策略,减少值函数的误差。在奖励函数的设计上,引入修正的节点重要值,实现降低整体测试桩复杂度且提升关键类优先级的多目标优化求解。在SIR开源系统上的实验结果表明:优化的强化学习建模方式能够有效降低整体测试桩复杂度,并适用于大规模软件系统;融入修正节点重要值的奖励函数能够有效提升软件系统中关键类的优先级,平均提升幅度为55.38%。

关键词: 测试序列, 强化学习, 节点重要值, 奖励函数, 集成测试

Abstract:

The Reinforcement Learning (RL) method for class integration test order determination can adaptively adjust the integration test strategy according to the system integration state and is one of the key technologies for test optimization. However, existing methods have high computational costs, are unsuitable for large-scale software systems, and ignore the delayed nature of test risk, which greatly reduces their applicability and reliability. To address these issues, this study proposes a test order-based RL method with importance-value-weighted rewards. First, the RL modeling is optimized: the specific position of a node in the test order is ignored, the correlation between states is weakened, and the usability of the model is improved. On this basis, the integration test strategy is updated end-to-end with a deep RL model, reducing the error of the value function. Finally, a modified node importance value is introduced into the reward function to achieve a multi-objective optimization that reduces the Overall Complexity (OCplx) of test stubs and raises the priority of key classes. Experimental results on the SIR open-source systems show that the optimized RL modeling effectively reduces overall test stub complexity and is applicable to large-scale software systems, and that the reward function incorporating the modified node importance value effectively improves the priority of key classes in the test order, with an average improvement of 55.38%.

Key words: test order, Reinforcement Learning (RL), node importance value, reward function, integration test
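To make the reward design described in the abstract concrete, the following is a minimal, illustrative sketch and not the authors' implementation: the state is treated as the set of already-integrated classes (the specific position of a node in the order is ignored), and each candidate class is scored by an importance-weighted reward that penalizes dependencies that would have to be stubbed. All names (DEPENDS_ON, IMPORTANCE, LAMBDA, stub_cost, reward, greedy_order) and the toy dependency data are hypothetical, and the greedy rollout merely stands in for the deep RL policy trained end-to-end in the paper.

```python
# Illustrative sketch only: hypothetical names and toy data, not the paper's code.

# Hypothetical system under test: class -> set of classes it depends on.
DEPENDS_ON = {
    "A": {"B", "C"},
    "B": {"C"},
    "C": set(),
    "D": {"A"},
}
# Hypothetical (modified) node importance value per class.
IMPORTANCE = {"A": 0.9, "B": 0.4, "C": 0.7, "D": 0.2}
LAMBDA = 1.0  # trade-off weight between stub cost and class importance

def stub_cost(cls, integrated):
    """Dependencies of `cls` that are not yet integrated and would need stubs
    (a crude stand-in for test stub complexity such as OCplx)."""
    return len(DEPENDS_ON[cls] - integrated)

def reward(cls, integrated):
    """Importance-weighted reward: penalize stubs, favour important classes early."""
    return -stub_cost(cls, integrated) + LAMBDA * IMPORTANCE[cls]

def greedy_order(classes):
    """Greedy rollout over the reward; note the state is just the *set* of
    integrated classes, so two orders reaching the same set share a state."""
    integrated, order = set(), []
    remaining = set(classes)
    while remaining:
        best = max(remaining, key=lambda c: reward(c, integrated))
        order.append(best)
        integrated.add(best)
        remaining.remove(best)
    return order

if __name__ == "__main__":
    print(greedy_order(DEPENDS_ON))  # with the toy data: ['C', 'B', 'A', 'D']
```

In the paper's setting, the greedy selection above would be replaced by a policy learned end-to-end with deep RL; the sketch only illustrates how a single reward can trade off lower overall stub complexity against giving key (high-importance) classes earlier positions in the test order.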