Multi-scale Feature Fusion Based New Energy Data Imputation Method

doi:10.19678/j.issn.1000-3428.0252894

Abstract

In new energy power generation systems, missing data severely limits the accuracy of equipment condition assessment and fault early warning. Data in these scenarios are typically highly complex, strongly volatile, and dependent on long sequences, so conventional imputation methods fall short in both accuracy and generalization. To address this, the paper proposes a missing-data imputation method for new energy systems based on multi-scale feature fusion. First, Pearson correlation coefficients and maximal information coefficients are used to screen the multivariate features, improving the relevance and information quality of the input data. Next, a new time-series imputation model, AFMFormer (Adaptive Frequency-aware Multi-scale Transformer), is designed. The model first applies an adaptive frequency-domain feature enhancement module that decomposes the input sequence in the frequency domain and amplifies its dominant frequencies, which highlights the main features of complex long sequences. It then introduces two parallel temporal feature extraction branches: a Patch-based Transformer that captures short-term temporal features and a Standard Transformer that extracts long-term temporal features. Finally, a feature fusion module combines the outputs of the two branches to produce the imputed values. Experiments show that the proposed model outperforms the baseline methods on all evaluation metrics; on the wind power and photovoltaic (PV) datasets, its mean squared error is 49.3% and 31.5% lower, respectively, than that of the best baseline model.
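The first two steps named in the abstract, correlation-based feature screening and dominant-frequency enhancement, can be illustrated with a minimal standard-library sketch. It is not the authors' implementation: the function names, the `threshold`, `top_k`, and `gain` parameters, and the sample data are all hypothetical. The maximal information coefficient is omitted, and a fixed gain stands in for the adaptive (presumably learned) weighting of the AFMFormer module.

```python
import cmath
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0  # a constant series carries no linear information
    return cov / (sx * sy)


def select_features(features, target, threshold=0.5):
    """Keep feature names whose |r| with the target reaches the threshold.

    `threshold` is an assumed value, not one taken from the paper.
    """
    return [name for name, series in features.items()
            if abs(pearson(series, target)) >= threshold]


def dft(x):
    """Naive O(n^2) discrete Fourier transform."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(spec):
    """Inverse transform; returns the real part of each sample."""
    n = len(spec)
    return [sum(spec[k] * cmath.exp(2j * cmath.pi * k * t / n)
                for k in range(n)).real / n
            for t in range(n)]


def enhance_dominant(x, top_k=1, gain=2.0):
    """Scale the top_k strongest non-DC frequency components by a fixed gain.

    A fixed gain is a simplification; the paper describes an adaptive module.
    """
    n = len(x)
    spec = dft(x)
    # Rank the positive-frequency bins 1 .. n//2 by magnitude.
    bins = sorted(range(1, n // 2 + 1),
                  key=lambda k: abs(spec[k]), reverse=True)[:top_k]
    for k in bins:
        spec[k] *= gain
        mirror = (n - k) % n
        if mirror != k:  # scale the conjugate bin too so the output stays real
            spec[mirror] *= gain
    return idft(spec)


if __name__ == "__main__":
    # Hypothetical toy data: one informative feature, one uninformative one.
    target = [2.0, 4.0, 6.0, 8.0, 10.0]
    features = {"wind_speed": [1.0, 2.0, 3.0, 4.0, 5.0],
                "noise": [2.0, -1.0, 2.0, -1.0, 2.0]}
    print(select_features(features, target))

    n = 16
    signal = [math.sin(2 * math.pi * 2 * t / n)
              + 0.2 * math.sin(2 * math.pi * 5 * t / n) for t in range(n)]
    print([round(v, 3) for v in enhance_dominant(signal)])
```

In this toy run the strongest component (two cycles per window) is doubled while the weaker one is left unchanged, which is the intuition behind emphasizing the main pattern of a long sequence before the two Transformer branches process it.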

LIU Jiale, DENG Weisi, HU Jiaqiu, JING Zhaoxia, ZOU Wenzhong. Multi-scale Feature Fusion Based New Energy Data Imputation Method[J]. Computer Engineering, doi: 10.19678/j.issn.1000-3428.0252894.

