Research on Pre-processing Techniques for Video Coding

doi:10.19678/j.issn.1000-3428.0069556

Abstract:

The rapid growth of video data volume poses a severe challenge to limited bandwidth, making improved video coding efficiency necessary. Video coding pre-processing techniques reduce the data volume of a video without altering the encoder's core algorithms or parameter settings, which improves coding efficiency while preserving compatibility. This paper proposes a Degradation Compensation and Multi-dimensional Reconstruction (DCMR) pre-processing method that extracts, across multiple dimensions, the features of video images most closely related to the subsequent coding process and reconstructs those features into video images. First, a degradation compensation model is designed to remove coding noise while compensating for the image degradation introduced during transmission. Second, a lightweight multi-dimensional feature reconstruction network is built that combines residual learning and feature distillation to extract coding-related features along the spatial and channel dimensions and to reconstruct the extracted features. Finally, to recover the high-frequency details lost during denoising, an auxiliary branch carrying a detail enhancement convolution module based on weighted guided filtering is added to DCMR. The loss function combines the Mean Absolute Error (MAE) loss with the Multi-Scale Structural Similarity (MS-SSIM) loss, and assigning different weights to the two terms achieves multi-objective optimization. At deployment, DCMR is placed directly in front of any existing standard video encoder, with no changes to the encoding, streaming, or decoding settings. Experimental results show that, under H.266/VVC, DCMR achieves average gains of 21.6% in BD-rate (VMAF) and 6.98% in BD-rate (MOS).

Key words: video coding, pre-processing techniques, high-frequency information, detail enhancement, H.266/VVC
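The abstract describes the training objective as a weighted combination of an MAE term and an MS-SSIM term. The sketch below illustrates that idea only; it is not the authors' implementation. The weight `alpha`, its default of 0.5, the number of scales, and all function names are hypothetical, and the structural term is a simplified stand-in (SSIM from whole-image statistics averaged over 2x2-pooled scales) rather than the windowed MS-SSIM of the literature. Images are plain 2-D lists with values in [0, 1].

```python
from statistics import fmean


def mae(x, y):
    """Mean absolute error between two equally sized 2-D images."""
    return fmean(abs(a - b) for rx, ry in zip(x, y) for a, b in zip(rx, ry))


def global_ssim(x, y, data_range=1.0):
    """SSIM from whole-image statistics (no sliding window; a simplification)."""
    c1, c2 = (0.01 * data_range) ** 2, (0.03 * data_range) ** 2
    fx = [a for row in x for a in row]
    fy = [b for row in y for b in row]
    mx, my = fmean(fx), fmean(fy)
    vx = fmean((a - mx) * (a - mx) for a in fx)
    vy = fmean((b - my) * (b - my) for b in fy)
    cov = fmean((a - mx) * (b - my) for a, b in zip(fx, fy))
    num = (2 * mx * my + c1) * (2 * cov + c2)
    den = (mx * mx + my * my + c1) * (vx + vy + c2)
    return num / den


def downsample(img):
    """Halve the resolution with 2x2 average pooling."""
    h, w = len(img) // 2 * 2, len(img[0]) // 2 * 2
    return [[(img[i][j] + img[i][j + 1] + img[i + 1][j] + img[i + 1][j + 1]) / 4
             for j in range(0, w, 2)]
            for i in range(0, h, 2)]


def multiscale_ssim(x, y, scales=3):
    """Average global SSIM over several scales (simplified stand-in for MS-SSIM).

    Each image side must be at least 2 ** (scales - 1) pixels.
    """
    total = 0.0
    for s in range(scales):
        total += global_ssim(x, y)
        if s < scales - 1:
            x, y = downsample(x), downsample(y)
    return total / scales


def combined_loss(pred, target, alpha=0.5, scales=3):
    """alpha * (1 - structural similarity) + (1 - alpha) * MAE; alpha is hypothetical."""
    structural = 1.0 - multiscale_ssim(pred, target, scales)
    return alpha * structural + (1.0 - alpha) * mae(pred, target)


# Usage: an 8x8 gradient image and a slightly brightened copy.
target = [[(i * 8 + j) / 63 for j in range(8)] for i in range(8)]
pred = [[min(1.0, v + 0.05) for v in row] for row in target]
loss = combined_loss(pred, target)
```

Raising `alpha` shifts the objective toward structural fidelity, and lowering it shifts it toward pixel-wise accuracy. A training setup would normally use a tensor library with a differentiable, windowed MS-SSIM instead of this pure-Python version.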

LÜ Mengfan (吕梦帆), SHANG Xiwu (商习武), LI Guoping (李国平), WANG Guozhong (王国中). Research on Pre-processing Techniques for Video Coding[J]. Computer Engineering, 2025, 51(11): 294-303.

https://www.ecice06.com/CN/Y2025/V51/I11/294

Figures and Tables (11)

Fig.1 Overall architecture of the DCMR pre-processing method

Fig.2 Deployment workflow of the DCMR pre-processing method

Fig.3 DCM structure

Fig.4 Comparison before and after DCM degradation

Fig.5 Comparison of RFDB output feature maps before and after SCconv processing

Fig.6 Comparison before and after WDEC detail enhancement

Fig.7 Training loss curve of DCMR

Fig.8 Comparison between H.266/VVC and DCMR + H.266/VVC

Fig.9 RD curves for the VVC Class A~E datasets under VMAF

Fig.10 Results of the ablation experiment
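The reported results are BD-rate figures, which summarize a pair of rate-distortion (RD) curves such as those in Fig.9 as a single average bitrate difference at equal quality. The sketch below shows one common way to compute such a figure and is not taken from the paper. It assumes exactly four RD points per curve, so the cubic through the points is obtained by interpolation; the function names and the sample numbers in the usage lines are hypothetical. In this sketch a negative result means the test configuration needs less bitrate than the anchor; the paper's own sign convention is not stated in the abstract.

```python
import math


def lagrange(xs, ys):
    """Return the interpolating polynomial through (xs, ys) as a callable."""
    def p(x):
        total = 0.0
        for i, (xi, yi) in enumerate(zip(xs, ys)):
            term = yi
            for j, xj in enumerate(xs):
                if j != i:
                    term *= (x - xj) / (xi - xj)
            total += term
        return total
    return p


def simpson(f, a, b):
    """Simpson's rule on a single interval; exact for polynomials up to degree 3."""
    return (b - a) / 6.0 * (f(a) + 4.0 * f((a + b) / 2.0) + f(b))


def bd_rate(rates_anchor, quality_anchor, rates_test, quality_test):
    """Average bitrate change (%) of the test curve versus the anchor at equal quality.

    Each curve must have exactly four points with distinct quality values.
    """
    # Model log-bitrate as a cubic function of the quality score.
    pa = lagrange(quality_anchor, [math.log(r) for r in rates_anchor])
    pt = lagrange(quality_test, [math.log(r) for r in rates_test])
    # Integrate only over the quality range covered by both curves.
    lo = max(min(quality_anchor), min(quality_test))
    hi = min(max(quality_anchor), max(quality_test))
    avg_log_diff = (simpson(pt, lo, hi) - simpson(pa, lo, hi)) / (hi - lo)
    return (math.exp(avg_log_diff) - 1.0) * 100.0


# Usage with made-up numbers: bitrates in kbit/s and a quality score per point.
anchor_rates = [1000.0, 2000.0, 4000.0, 8000.0]
anchor_quality = [70.0, 80.0, 88.0, 94.0]
test_rates = [850.0, 1650.0, 3300.0, 6900.0]
test_quality = [71.0, 81.0, 88.5, 94.0]
saving = bd_rate(anchor_rates, anchor_quality, test_rates, test_quality)
```

The quality axis can be any monotonic score, for example VMAF or a mean opinion score, which is how the same calculation yields both BD-rate (VMAF) and BD-rate (MOS).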

