
Computer Engineering

   

Data Augmentation Methods for Imbalanced Samples in Power Systems

  

  • Published: 2026-01-30


Abstract: As critical infrastructure, power systems are vulnerable to threats such as equipment failures and malicious data tampering, and the scarcity of abnormal samples limits the performance of traditional detection models. To address this imbalance in abnormal power system data, this paper proposes LT-MoEWGAN, a data augmentation method based on a Mixture-of-Experts Wasserstein Generative Adversarial Network. The method uses a Long Short-Term Memory (LSTM) network and a Temporal Convolutional Network (TCN) as dual expert modules, and a gating network assigns dynamic weights at the feature level, forming a multi-scale temporal feature extractor for generating high-quality samples. Simulation experiments on real power system datasets show that: 1) measured by Wasserstein distance, the data generated by this method differ least in distribution from the real samples (medians of 0.043 and 0.135, respectively), and generation stability improves by 33% relative to the WGAN baseline; 2) on the XGBoost, LightGBM, Random Forest, Decision Tree, CNN, GAT, and MTGF-Conv classifiers, the proposed method improves the Area Under the Curve (AUC) by 1.5%–2% over baseline methods including SMOTE, ADASYN, Borderline-SMOTE, GAN, WGAN, WGAN-GP, DCGAN, and WM_CVAE. By augmenting the data with high-quality samples, the method improves anomaly detection performance and offers a reliable data augmentation solution for anomaly detection in power systems; its architecture also offers a theoretical reference for time-series data generation tasks.
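
The feature-level gating described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the function names (`softmax`, `gated_fusion`), the variable names (`h_lstm`, `h_tcn`, `gate_logits`), and the choice of a two-way softmax per feature are assumptions made for illustration. In the actual model the expert outputs would come from the LSTM and TCN modules and the logits from a trained gating network; here all three are supplied by hand.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def gated_fusion(h_lstm, h_tcn, gate_logits):
    """Mix two expert feature vectors using per-feature gate weights.

    h_lstm, h_tcn: expert outputs of equal length (hypothetical names).
    gate_logits:   one (lstm_logit, tcn_logit) pair per feature, standing in
                   for the output of a gating network (assumption).
    Returns the fused feature vector and the weights that were applied.
    """
    if not (len(h_lstm) == len(h_tcn) == len(gate_logits)):
        raise ValueError("all inputs must have the same length")
    fused, weights = [], []
    for a, b, logits in zip(h_lstm, h_tcn, gate_logits):
        w_lstm, w_tcn = softmax(list(logits))  # the two weights sum to 1
        fused.append(w_lstm * a + w_tcn * b)
        weights.append((w_lstm, w_tcn))
    return fused, weights


# Toy inputs: three features from each expert.
h_lstm = [1.0, 0.0, 2.0]
h_tcn = [0.0, 1.0, 4.0]
# Feature 0: experts weighted equally; feature 1: LSTM dominates;
# feature 2: TCN dominates.
gate_logits = [(0.0, 0.0), (50.0, -50.0), (-50.0, 50.0)]

fused, weights = gated_fusion(h_lstm, h_tcn, gate_logits)
print(fused)
```

Because the weights are computed per feature rather than once per sample, each feature can lean on whichever expert the gate favors for it, which is one plausible reading of "dynamic weight allocation at the feature level".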

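
The abstract evaluates generated data with the Wasserstein distance and reports medians. The sketch below shows one way such a figure could be computed. It uses the standard result that, for two one-dimensional samples of equal size, the empirical 1-Wasserstein distance is the mean absolute difference between their sorted values. Taking the median over per-feature distances is an assumption, since the abstract does not say how the distances are aggregated; the function names and toy data are likewise hypothetical.

```python
import statistics


def wasserstein_1d(xs, ys):
    """Empirical 1-Wasserstein distance between two equal-size 1-D samples.

    For equal sample sizes with uniform weights, this equals the mean
    absolute difference between the sorted values of the two samples.
    """
    if len(xs) != len(ys) or not xs:
        raise ValueError("samples must be non-empty and of equal size")
    pairs = zip(sorted(xs), sorted(ys))
    return sum(abs(a - b) for a, b in pairs) / len(xs)


def median_feature_distance(real_features, generated_features):
    """Median of per-feature distances (the aggregation is an assumption)."""
    distances = [
        wasserstein_1d(r, g)
        for r, g in zip(real_features, generated_features)
    ]
    return statistics.median(distances), distances


# Toy data: three features, four samples each (illustrative values only).
real = [
    [0.0, 1.0, 2.0, 3.0],
    [1.0, 1.0, 2.0, 2.0],
    [5.0, 6.0, 7.0, 8.0],
]
generated = [
    [0.5, 1.5, 2.5, 3.5],  # shifted by 0.5
    [2.0, 1.0, 2.0, 1.0],  # same values in a different order
    [6.0, 7.0, 8.0, 9.0],  # shifted by 1.0
]

median_distance, distances = median_feature_distance(real, generated)
print(distances, median_distance)
```

A smaller distance means the generated distribution sits closer to the real one for that feature. For samples of unequal size or with weights, a library routine such as SciPy's `wasserstein_distance` would be the more appropriate tool.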