
Computer Engineering (计算机工程)


Design of a modular soft-hard trunk gripper based on a reinforcement learning algorithm

  • Published: 2025-07-31

Abstract: This study designs and implements a novel modular bionic trunk gripper whose motion is controlled by a policy trained with the proximal policy optimization (PPO) reinforcement learning algorithm. The gripper combines rigid and flexible design principles in a modular structure, which improves the flexibility and scalability of the system. In the hardware, the rigid parts provide structural strength and stability, while the flexible parts conform to objects of varying shape and hardness so that they can be grasped precisely. Using PPO for motion control, the study reproduces the complex motion behavior of an elephant trunk and applies the trained model to actual grasping tasks. A variety of grasping tasks were set up in a physics simulation environment, and through iterative PPO training the gripper gradually learned to adapt its motion trajectory to different environmental conditions and to grasp different objects accurately. Experimental results show that the modular bionic trunk gripper performs well across diverse grasping tasks, with a grasping success rate above 90% and smooth grasping motions. The results verify the effectiveness of PPO for complex robotic grasping tasks and open a new avenue for the design and application of modular robot systems, with potential applications in intelligent manufacturing, medical assistance, and disaster relief.
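The abstract does not describe the training setup, so the following is only a reminder of what PPO optimizes: a minimal sketch of the standard clipped surrogate objective, written in plain Python. The function name `ppo_clip_objective`, the default `clip_eps=0.2`, and the input format are illustrative assumptions and are not taken from the paper.

```python
import math


def ppo_clip_objective(new_log_probs, old_log_probs, advantages, clip_eps=0.2):
    """Mean PPO clipped surrogate objective over a batch of samples.

    Hypothetical helper for illustration only; the paper's actual
    implementation and hyperparameters are not given in the abstract.
    """
    total = 0.0
    for new_lp, old_lp, adv in zip(new_log_probs, old_log_probs, advantages):
        # Probability ratio between the updated policy and the policy
        # that collected the data.
        ratio = math.exp(new_lp - old_lp)
        # Restrict the ratio to the interval [1 - eps, 1 + eps].
        clipped_ratio = max(1.0 - clip_eps, min(1.0 + clip_eps, ratio))
        # Take the more pessimistic of the unclipped and clipped terms.
        total += min(ratio * adv, clipped_ratio * adv)
    return total / len(advantages)
```

Training maximizes this quantity. The clipping removes any incentive to move the policy far from the one that generated the data in a single update, which is one reason PPO is a common choice for continuous robot control tasks.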