
Computer Engineering

   

Workflow Generation Method Based on Monte Carlo Tree Search and Self-Refine

  

  • Published: 2025-12-09


Abstract: Small-scale language models used for automated workflow generation suffer from limited single-round generation quality and low search efficiency. To address these issues, a workflow generation method based on Monte Carlo Tree Search and Self-Refine (WGM-MCTSR) is proposed. The method improves workflow generation through two core mechanisms. First, a workflow self-refine mechanism runs iterative generation-evaluation-reconstruction cycles, using evaluation feedback to restructure or correct workflows and thereby compensate for the limited reasoning ability of small-scale language models. Second, the selection and backpropagation phases of Monte Carlo Tree Search are improved: an Upper Confidence Bounds applied to Trees (UCT) selection strategy replaces the traditional soft-mix selection probability, and a child-node score backpropagation mechanism dynamically adjusts the selection probability of parent nodes, steering the search direction. Experiments on six datasets (GSM8K, MATH, DROP, HotpotQA, HumanEval, and MBPP) show that the method achieves solve rates of 70.11% and 23.45% on the mathematical reasoning tasks (GSM8K and MATH, respectively), F1 scores of 54.87% and 52.47% on the question-answering tasks, and pass rates of 81.83% and 58.82% on the code generation tasks. Compared with existing workflow generation methods, it improves performance by 5.4% on GSM8K and 9.6% on MATH and obtains the best results across all task types, which validates the effectiveness of the improved mechanisms in raising workflow generation efficiency and quality for small-scale language models.
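
The abstract names three building blocks: UCT-based selection, backpropagation of child-node scores, and a generate-evaluate-reconstruct refinement loop. The Python sketch below illustrates how such pieces might fit together. It is not the authors' implementation: the `Node` class, the textbook UCT formula with exploration constant `c`, and the `evaluate` and `refine` callables are all assumptions introduced for illustration, since the abstract does not specify them.

```python
import math
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple


@dataclass
class Node:
    """One candidate workflow in the search tree (hypothetical structure)."""
    workflow: str
    parent: Optional["Node"] = None
    children: List["Node"] = field(default_factory=list)
    visits: int = 0
    total_score: float = 0.0

    @property
    def mean_score(self) -> float:
        return self.total_score / self.visits if self.visits else 0.0


def uct(node: Node, c: float = 1.41) -> float:
    """Textbook UCT value: mean score plus an exploration bonus (assumed form)."""
    if node.visits == 0:
        return math.inf  # unvisited children are explored first
    parent_visits = node.parent.visits if node.parent else node.visits
    return node.mean_score + c * math.sqrt(math.log(parent_visits) / node.visits)


def select(root: Node) -> Node:
    """Descend from the root, following the child with the highest UCT value."""
    node = root
    while node.children:
        node = max(node.children, key=uct)
    return node


def backpropagate(node: Node, score: float) -> None:
    """Add an evaluation score to the node and every ancestor up to the root."""
    current: Optional[Node] = node
    while current is not None:
        current.visits += 1
        current.total_score += score
        current = current.parent


def self_refine(
    workflow: str,
    evaluate: Callable[[str], float],      # hypothetical scoring function
    refine: Callable[[str, float], str],   # hypothetical rewrite step (e.g. an LLM call)
    rounds: int = 3,
) -> Tuple[str, float]:
    """Generate-evaluate-reconstruct loop that keeps the best-scoring workflow."""
    best, best_score = workflow, evaluate(workflow)
    for _ in range(rounds):
        candidate = refine(best, best_score)
        score = evaluate(candidate)
        if score > best_score:
            best, best_score = candidate, score
    return best, best_score
```

In this sketch a search iteration would call `select` to pick a node, expand it with a workflow produced by `self_refine`, and pass the resulting score to `backpropagate`, so that well-scoring children raise the UCT value of their ancestors in later iterations.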
