
Computer Engineering (计算机工程), 2025, Vol. 51, Issue 1: 51-59. doi: 10.19678/j.issn.1000-3428.0068372

• Artificial Intelligence and Pattern Recognition •

Seasonal PM2.5 Concentration Prediction Based on SARIMA-SVM Model

SONG Yinghua, XU Yaan, ZHANG Yuanjin*

  1. School of Safety Science and Emergency Management, Wuhan University of Technology, Wuhan 430070, Hubei, China
  • Received: 2023-09-11 Online: 2025-01-15 Published: 2025-02-10
  • Contact: ZHANG Yuanjin
  • Funding:
    Natural Science Foundation of Hubei Province (2021CFB017); Open Project of the Hubei Collaborative Innovation Center for Early Warning and Emergency Response Technology (AY2023-1-3)

Abstract:

Air pollution is one of the primary challenges in urban environmental governance, and PM2.5 is a major factor affecting air quality. Traditional time-series models for PM2.5 concentration prediction often omit seasonal factor analysis and achieve insufficient accuracy. To address this, this paper proposes a machine-learning-based fusion model that combines the Seasonal Autoregressive Integrated Moving Average (SARIMA) model with the Support Vector Machine (SVM). The fusion model is a tandem (serial) model that splits the data into a linear part and a nonlinear part. The SARIMA model extends the Autoregressive Integrated Moving Average (ARIMA) model with seasonal parameters, so it can capture the seasonal pattern of PM2.5 concentration data and predict its future linear trend well. The SVM model then refines the residual series of the SARIMA predictions: a sliding step size prediction method determines the optimal prediction step size for the residual series, and grid search determines the optimal model parameters. This enables long-term prediction of PM2.5 concentration and improves overall prediction accuracy. An analysis of PM2.5 concentration monitoring data from Wuhan over the past five years shows that the fusion model predicts considerably more accurately than any single model. In the same experimental environment, its prediction accuracy improves by 99%, 99%, and 98% over the single ARIMA, Auto ARIMA, and SARIMA models, respectively, and it is also more stable, providing a new approach to PM2.5 concentration prediction research.

Key words: Seasonal Autoregressive Integrated Moving Average (SARIMA), Support Vector Machine (SVM), fusion model, PM2.5 concentration, seasonal prediction
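
The tandem structure described in the abstract can be illustrated with a minimal, standard-library-only sketch. It is not the authors' implementation. A seasonal-naive forecast stands in for the SARIMA stage and a k-nearest-neighbour average stands in for the SVM stage, and the grid search over SVM parameters is not shown. All function names (`seasonal_naive`, `make_windows`, `knn_predict`, `best_step`, `hybrid_next`), the candidate step sizes, and the synthetic series are assumptions made for illustration. The sketch shows only the flow: fit the linear part, take residuals, slide a window over the residuals, choose the window length by held-out error, and add the predicted residual back to the linear forecast.

```python
import math
from statistics import mean

def seasonal_naive(series, season):
    """Stand-in for the SARIMA stage: fit y[t] with y[t - season]."""
    return [series[t - season] for t in range(season, len(series))]

def make_windows(residuals, step):
    """Slide a window of length `step` over the residuals.
    Each pair is (previous `step` residuals, the residual that follows)."""
    return [(residuals[i:i + step], residuals[i + step])
            for i in range(len(residuals) - step)]

def knn_predict(train_pairs, window, k=3):
    """Stand-in for the SVM stage: mean target of the k closest windows."""
    nearest = sorted(train_pairs, key=lambda pair: math.dist(pair[0], window))[:k]
    return mean(target for _, target in nearest)

def best_step(residuals, candidate_steps, n_valid):
    """Return the window length with the lowest RMSE on the last n_valid pairs."""
    scores = {}
    for step in candidate_steps:
        pairs = make_windows(residuals, step)
        train, valid = pairs[:-n_valid], pairs[-n_valid:]
        squared = [(knn_predict(train, w) - target) ** 2 for w, target in valid]
        scores[step] = math.sqrt(mean(squared))
    return min(scores, key=scores.get), scores

def hybrid_next(series, season, candidate_steps=(2, 3, 4, 6), n_valid=12):
    """One-step-ahead forecast: linear part plus predicted residual."""
    fitted = seasonal_naive(series, season)
    residuals = [y - f for y, f in zip(series[season:], fitted)]
    step, _ = best_step(residuals, candidate_steps, n_valid)
    linear_part = series[-season]  # seasonal-naive forecast for the next point
    residual_part = knn_predict(make_windows(residuals, step), residuals[-step:])
    return linear_part + residual_part, step

# Synthetic seasonal data for illustration only (not real PM2.5 measurements).
season = 12
series = [50 + 20 * math.sin(2 * math.pi * t / season) + 5 * math.sin(t / 3) ** 2
          for t in range(120)]
forecast, step = hybrid_next(series, season)
print(f"chosen step: {step}, next-value forecast: {forecast:.2f}")
```

In a closer reproduction, the two stand-ins would presumably be replaced by a fitted SARIMA model and a support vector regressor tuned by grid search, while the residual windowing and step selection would keep the same shape.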