
Computer Engineering ›› 2023, Vol. 49 ›› Issue (4): 159-165, 173. doi: 10.19678/j.issn.1000-3428.0063495

• Advanced Computing and Data Processing •

Optimization Strategy of FFmpeg Multimedia Algorithm Library Based on RISC-V

ZHANG Zhen1,2, LIANG Jun1, JIA Haipeng2, ZHANG Yunquan2, LI Qing1

  1. Beijing Key Laboratory of Information Service Engineering, Beijing Union University, Beijing 100101, China;
  2. State Key Laboratory of Computer Architecture, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190, China
  • Received: 2021-12-09 Revised: 2022-02-19 Published: 2023-04-07
  • About the authors: ZHANG Zhen (born 1996), male, master's student, main research interest: parallel algorithm optimization; LIANG Jun (corresponding author), professor; JIA Haipeng, senior engineer, Ph.D.; ZHANG Yunquan, research fellow, Ph.D.; LI Qing, associate professor, Ph.D.
  • Funding:
    National Natural Science Foundation of China (61972376); Beijing Union University Research Project (ZK50202002).

Abstract: The widespread adoption of RISC-V processors has made a high-performance implementation of the FFmpeg multimedia algorithm library on the RISC-V platform increasingly important. This study proposes a series of optimization strategies based on the RISC-V architecture. Targeting algorithms with different characteristics and computational densities in the open-source FFmpeg audio and video multimedia library, it exploits the extensibility of the RISC-V instruction set to apply instruction-level acceleration and parallel optimization to some of the library's time-consuming algorithms. Building on an in-depth study of the open-source RISC-V architecture, a high-performance FFmpeg algorithm library for RISC-V is constructed. Three classes of algorithms are addressed: discontinuous memory access, data dependency, and fast data conversion. They are optimized in four aspects: vector unit configuration, vectorized memory access, assembly optimization, and instruction pipeline optimization, which substantially improves the performance of the FFmpeg algorithm library on RISC-V processors. Experimental results show that, with these strategies applied, the performance of the FFmpeg algorithm library on the RISC-V-based XT-910 chip improves markedly: the speedup ratios of the discontinuous memory access, data dependency, and fast data conversion algorithms are 8.20, 3.67, and 3.62, respectively.

Key words: open-source Instruction Set Architecture (ISA), FFmpeg multimedia algorithm library, vectorized memory access, assembly optimization, instruction pipeline optimization
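
The three algorithm classes named in the abstract can be pictured with scalar reference loops. The following is an illustrative sketch in portable C only: the function names, the SKETCH_VLMAX constant, and the choice of kernels are assumptions made for illustration and are taken from neither FFmpeg nor the paper. The inner strip loop merely imitates, in scalar code, how a vector unit would be configured to handle a bounded number of elements per pass; it is not vector code.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the number of elements one vector strip holds. */
#define SKETCH_VLMAX 8

/* Discontinuous (strided) memory access: extract channel `ch` from
 * interleaved samples. Consecutive outputs read inputs that are
 * `channels` elements apart. */
void sketch_extract_channel(const int16_t *src, int16_t *dst,
                            size_t frames, size_t channels, size_t ch)
{
    size_t i = 0;
    while (i < frames) {
        /* Strip-mining: handle at most SKETCH_VLMAX elements per pass. */
        size_t remaining = frames - i;
        size_t vl = remaining < SKETCH_VLMAX ? remaining : SKETCH_VLMAX;
        for (size_t k = 0; k < vl; k++)
            dst[i + k] = src[(i + k) * channels + ch];
        i += vl;
    }
}

/* Data dependency: every output depends on the previous one (running sum),
 * so iterations cannot simply be executed side by side. */
void sketch_prefix_sum(const int32_t *src, int64_t *dst, size_t n)
{
    int64_t acc = 0;
    for (size_t i = 0; i < n; i++) {
        acc += src[i];
        dst[i] = acc;
    }
}

/* Fast data conversion: signed 16-bit PCM to float in [-1.0, 1.0). */
void sketch_s16_to_float(const int16_t *src, float *dst, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = (float)src[i] * (1.0f / 32768.0f);
}
```

In this sketch the first and third loops have independent iterations, whereas the running sum carries a value from one iteration to the next. That difference is one plausible reason such classes would call for different optimization strategies.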
