
Computer Engineering ›› 2022, Vol. 48 ›› Issue (5): 98-103, 111. doi: 10.19678/j.issn.1000-3428.0061529

• Artificial Intelligence and Pattern Recognition •

Factorization Machine Model for Power Outage Classification Prediction

RAN Yi1, WANG Runnian1, PAN Hongwei1, YU Haimeng2, YUAN Peisen3

  1. Marketing Service Center, State Grid Xinjiang Electric Power Co., Ltd., Urumqi 830000, China;
  2. NARI-TECH Nanjing Control Systems Co., Ltd., Nanjing 211106, China;
  3. College of Artificial Intelligence, Nanjing Agricultural University, Nanjing 210095, China
  • Received: 2021-04-30  Revised: 2021-06-06  Published: 2021-06-09
  • About the authors: RAN Yi (born 1988), male, engineer; main research interests: electric energy metering, power data analysis, and data mining. WANG Runnian, PAN Hongwei, and YU Haimeng, engineers. YUAN Peisen (corresponding author), associate professor, Ph.D.
  • Supported by:
    National Natural Science Foundation of China (61502236, 61806097).

Abstract: A reliable power supply is essential to industrial production and daily life. Analyzing and mining the outage data held in power data platforms helps reveal the underlying patterns of distribution network outages. Classification prediction is a common technique in data mining and analysis, and outage classification prediction can inform the outage planning of enterprises and public institutions. To address this problem, an outage data classification prediction model based on the Factorization Machine (FM) is proposed. A decision-tree algorithm computes the Gini index of each feature in the outage data to obtain importance scores, and the non-sparse features most strongly associated with outage prediction are selected. A spatial location matrix is built from the geographical relationships between regions, and matrix decomposition is applied to it to construct features that capture the spatial correlation between regions. To prevent the FM model from overfitting, L2-norm regularization is added. The FM model is then trained using Stochastic Gradient Descent (SGD), and the trained model is used to classify the outage data. Experimental results on a real outage dataset show that the model's F1-score and accuracy on the training and test datasets reach 0.90 and 0.89, respectively, outperforming models such as DNN, SVM, and XGBoost.

Key words: power outage classification prediction, decision tree, matrix decomposition, Factorization Machine (FM), Stochastic Gradient Descent (SGD) method
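
To make the training step named in the abstract concrete, the following is a minimal sketch of a second-order FM trained with SGD and L2-norm regularization. It is not the authors' implementation. It assumes binary labels in {0, 1} and a logistic loss, neither of which the abstract specifies, and the function names (`fm_score`, `train_fm`, `predict`) and hyperparameter values (`k`, `lr`, `lam`, `epochs`) are hypothetical choices made for illustration.

```python
import math
import random


def fm_score(x, w0, w, V):
    """Raw FM output: bias + linear terms + pairwise interaction terms.

    The pairwise part uses the standard O(k*n) identity
    0.5 * sum_f [(sum_i V[i][f]*x[i])**2 - sum_i (V[i][f]*x[i])**2].
    Also returns the per-factor sums, which the gradient reuses.
    """
    n, k = len(x), len(V[0])
    linear = sum(wi * xi for wi, xi in zip(w, x))
    sums = [sum(V[i][f] * x[i] for i in range(n)) for f in range(k)]
    squares = [sum((V[i][f] * x[i]) ** 2 for i in range(n)) for f in range(k)]
    pairwise = 0.5 * sum(s * s - q for s, q in zip(sums, squares))
    return w0 + linear + pairwise, sums


def sigmoid(z):
    # Clamp the input so math.exp cannot overflow.
    z = max(-35.0, min(35.0, z))
    return 1.0 / (1.0 + math.exp(-z))


def train_fm(X, y, k=2, lr=0.05, lam=0.001, epochs=200, seed=0):
    """Train FM parameters with per-sample SGD (assumed logistic loss)."""
    rng = random.Random(seed)
    n = len(X[0])
    w0 = 0.0
    w = [0.0] * n
    # Small random factors; all-zero factors would never receive a gradient.
    V = [[rng.gauss(0.0, 0.1) for _ in range(k)] for _ in range(n)]
    order = list(range(len(X)))
    for _ in range(epochs):
        rng.shuffle(order)
        for idx in order:
            x, target = X[idx], y[idx]
            score, sums = fm_score(x, w0, w, V)
            err = sigmoid(score) - target  # d(log loss) / d(score)
            w0 -= lr * err  # the bias is left unregularized here
            for i in range(n):
                if x[i] == 0.0:
                    continue  # inactive features are skipped entirely
                # L2-norm regularization adds lam * parameter to each gradient.
                w[i] -= lr * (err * x[i] + lam * w[i])
                for f in range(k):
                    grad = x[i] * (sums[f] - V[i][f] * x[i])
                    V[i][f] -= lr * (err * grad + lam * V[i][f])
    return w0, w, V


def predict(x, w0, w, V):
    """Return the predicted class (0 or 1) for one feature vector."""
    score, _ = fm_score(x, w0, w, V)
    return 1 if sigmoid(score) >= 0.5 else 0
```

In this sketch each row of `X` would hold the selected non-sparse features together with the region features obtained from the spatial location matrix, and `train_fm(X, y)` returns the parameters that `predict` consumes. Skipping zero-valued features keeps each update proportional to the number of active features, which matters when most entries of a row are zero.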

CLC Number: