
Computer Engineering ›› 2025, Vol. 51 ›› Issue (11): 294-303. doi: 10.19678/j.issn.1000-3428.0069556

• Graphics and Image Processing •

Research on Pre-processing Techniques for Video Coding

LÜ Mengfan, SHANG Xiwu*, LI Guoping, WANG Guozhong

  1. School of Electronic and Electrical Engineering, Shanghai University of Engineering Science, Shanghai 201620, China
  • Received: 2024-03-13 Revised: 2024-06-24 Online: 2025-11-15 Published: 2024-08-20
  • Contact: SHANG Xiwu
  • Supported by:
    National Natural Science Foundation of China (62001283)

Abstract:

The rapid growth in video data volume poses a severe challenge to limited bandwidth, making improved video coding efficiency necessary. Video coding pre-processing techniques reduce the video data volume without altering the encoder's core algorithms or parameter settings, which improves coding efficiency while preserving compatibility. This paper proposes a Degradation Compensation and Multi-dimensional Reconstruction (DCMR) pre-processing method, which extracts, across multiple dimensions, the features of video images that are closely related to the subsequent coding process and reconstructs these features into video images. First, a degradation compensation model is designed to remove coding noise while restoring the image degradation introduced during transmission. Second, a lightweight multi-dimensional feature reconstruction network is constructed that combines the principles of residual learning and feature distillation to extract coding-related features along the spatial and channel dimensions and to reconstruct the extracted features. Finally, to restore the high-frequency details lost during denoising, an auxiliary branch carrying a detail enhancement convolution module based on weighted guided filtering is added to DCMR. For the loss function, a combination of the Mean Absolute Error (MAE) loss and the Multi-Scale Structural Similarity (MS-SSIM) loss is used, with different weights assigned to the two terms to achieve multi-objective optimization. At deployment, DCMR is placed directly in front of any existing standard video encoder, with no changes to the coding, streaming, or decoding settings. Experimental results show that, under H.266/VVC, DCMR achieves average gains of 21.6% in BD-rate (VMAF) and 6.98% in BD-rate (MOS).

Key words: video coding, pre-processing techniques, high-frequency information, detail enhancement, H.266/VVC
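
The abstract describes the training loss as a weighted combination of an MAE term and an MS-SSIM term but does not give the weights or the MS-SSIM configuration. The following is a minimal sketch of such a combination, not the authors' implementation. Several things here are assumptions made for illustration: the weight `alpha = 0.8` is a placeholder, the per-scale exponents are the ones from the original MS-SSIM formulation, and the similarity statistics are computed over the whole image at each scale instead of over local Gaussian windows, which keeps the example short and dependency-free. Images are plain 2-D lists with values in [0, 1] and at least 16 pixels per side.

```python
import random

# Per-scale exponents from the original MS-SSIM formulation (five scales).
MS_WEIGHTS = (0.0448, 0.2856, 0.3001, 0.2363, 0.1333)
C1, C2 = 0.01 ** 2, 0.03 ** 2  # SSIM stabilizers for pixel values in [0, 1]


def mae(x, y):
    """Mean absolute error between two equally sized 2-D images."""
    n = len(x) * len(x[0])
    return sum(abs(a - b) for rx, ry in zip(x, y) for a, b in zip(rx, ry)) / n


def _stats(x, y):
    """Global means, variances and covariance (one window spanning the image)."""
    fx = [v for row in x for v in row]
    fy = [v for row in y for v in row]
    n = len(fx)
    mx, my = sum(fx) / n, sum(fy) / n
    vx = sum((a - mx) * (a - mx) for a in fx) / n
    vy = sum((b - my) * (b - my) for b in fy) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(fx, fy)) / n
    return mx, my, vx, vy, cov


def _halve(x):
    """Downsample by a factor of 2 with 2x2 average pooling."""
    return [[(x[i][j] + x[i][j + 1] + x[i + 1][j] + x[i + 1][j + 1]) / 4
             for j in range(0, len(x[0]) - 1, 2)]
            for i in range(0, len(x) - 1, 2)]


def ms_ssim(x, y, weights=MS_WEIGHTS):
    """Simplified MS-SSIM: global statistics per scale (assumption, see lead-in)."""
    score = 1.0
    for level, w in enumerate(weights):
        mx, my, vx, vy, cov = _stats(x, y)
        # Contrast-structure term, clamped at 0 so the fractional power stays real.
        cs = max((2 * cov + C2) / (vx + vy + C2), 0.0)
        if level == len(weights) - 1:
            # Luminance is only compared at the coarsest scale.
            lum = (2 * mx * my + C1) / (mx * mx + my * my + C1)
            score *= (lum * cs) ** w
        else:
            score *= cs ** w
            x, y = _halve(x), _halve(y)
    return score


def combined_loss(pred, target, alpha=0.8):
    """alpha * (1 - MS-SSIM) + (1 - alpha) * MAE; alpha is a hypothetical weight."""
    return alpha * (1.0 - ms_ssim(pred, target)) + (1.0 - alpha) * mae(pred, target)


if __name__ == "__main__":
    rng = random.Random(0)
    target = [[rng.random() for _ in range(32)] for _ in range(32)]
    noisy = [[min(1.0, max(0.0, v + rng.gauss(0, 0.1))) for v in row]
             for row in target]
    print("loss(target, target):", combined_loss(target, target))
    print("loss(noisy, target): ", combined_loss(noisy, target))
```

Raising `alpha` makes the loss favor structural fidelity, while lowering it favors pixel-wise accuracy. In a real training setup the same combination would be written with a differentiable, windowed MS-SSIM in a deep learning framework.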