
Computer Engineering ›› 2026, Vol. 52 ›› Issue (3): 355-363. doi: 10.19678/j.issn.1000-3428.0069841

• High Performance Computing and Big Data •

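Cloud Database Product Usage Prediction Based on Component Decomposition and Multimodal Fusion

YANG Dingyu1, DENG Yufeng2, QIAN Shiyou2,*, CAO Jian2, XUE Guangtao2

  1. Department of Technology Risk and Efficiency, Alibaba Group, Hangzhou 310000, Zhejiang, China
  2. Department of Computer Science and Engineering, Shanghai Jiao Tong University, Shanghai 200240, China
  • Received: 2024-05-13 Revised: 2024-08-31 Publication date: 2026-03-15 Release date: 2024-10-29
  • Corresponding author: QIAN Shiyou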
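  • About the authors:

    YANG Dingyu (CCF member), male, Ph.D.; main research interests: distributed performance prediction and big data service computing

    DENG Yufeng, master's student

    QIAN Shiyou (CCF member, corresponding author), associate researcher

    CAO Jian (CCF senior member), professor

    XUE Guangtao (CCF senior member), professor

  • Funding:
    General Program of the National Natural Science Foundation of China (62072301)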

Cloud Database Product Usage Prediction Based on Component Decomposition and Multimodal Fusion

YANG Dingyu1, DENG Yufeng2, QIAN Shiyou2,*, CAO Jian2, XUE Guangtao2

  1. Department of Technology Risk and Efficiency, Alibaba Group, Hangzhou 310000, Zhejiang, China
  2. Department of Computer Science and Engineering, Shanghai Jiao Tong University, Shanghai 200240, China
  • Received: 2024-05-13 Revised: 2024-08-31 Published: 2026-03-15 Online: 2024-10-29
  • Contact: QIAN Shiyou

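Abstract:

Cloud database technology is widely used because it scales flexibly, is easy to manage, and is billed on demand. Customers typically choose cloud database products according to their application scenarios and actual needs, while service providers decide how much of each resource type in a product (such as compute and storage) to provision to meet those needs. Accurate prediction of cloud product usage is essential for improving resource efficiency, reducing operating costs, and guaranteeing Quality of Service (QoS). However, the usage prediction scenario for cloud database products is complex: a usage series usually consists of several intricately entangled components, and the behavior of different customers varies considerably across cloud products and billing items, which makes usage prediction very challenging. To address this problem, a cloud database product usage prediction model based on component decomposition and multimodal fusion is proposed. The model decomposes the entangled time series into components, fuses multimodal demand data, builds a mapping between demand trends and usage trends, and automatically adjusts the component weight parameters to obtain accurate predictions. The model is evaluated on real production data from four major cloud database products of the cloud computing service provider Alibaba Cloud and compared with five prediction algorithms. On metrics such as Mean Absolute Percentage Error (MAPE), the experimental results show that the model improves prediction accuracy on all four cloud database products, by approximately 18.6% to 51.8%; it is therefore better suited to cloud database product usage prediction and helps cloud service providers plan resource capacity more accurately.

Keywords: cloud computing, cloud database product, time series, component decomposition, multimodal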

Abstract:

Cloud database technology is widely used because of its flexible scaling, ease of management, and on-demand billing. Businesses usually select cloud database products based on their specific application scenarios and requirements, while service providers determine the usage of different types of resources in each product, such as computing and storage, to satisfy those requirements. Accurate prediction of cloud product usage is critical for improving resource efficiency, reducing operational costs, and ensuring Quality of Service (QoS). However, predicting cloud database product usage is difficult. A usage sequence typically comprises multiple components with complex entanglements. In addition, the behavioral characteristics of different businesses vary considerably across cloud products and billing items, which poses a significant challenge for accurate usage prediction. To address this problem, this study proposes a cloud database product usage prediction model based on component decomposition and multimodal fusion. The model decomposes a time series with complex entanglement into components, fuses multimodal demand data, builds a mapping between demand trends and usage trends, and automatically adjusts the weight parameters of the components to obtain accurate predictions. Real production data from four major cloud database products of the cloud computing service provider Alibaba Cloud are used to evaluate the model, and its performance is compared with that of five other prediction algorithms. Evaluation metrics such as the Mean Absolute Percentage Error (MAPE) show that the proposed model improves prediction accuracy on all four cloud database products, by approximately 18.6% to 51.8%. The model is therefore better suited to cloud database product usage prediction scenarios and can help cloud service providers plan resource capacity more accurately.

Key words: cloud computing, cloud database product, time series, component decomposition, multimodal
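
The abstract names three ingredients that a small example can make concrete: decomposing a usage series into components, recombining the components with per-component weights, and scoring a prediction with MAPE. The sketch below is illustrative only and is not the authors' model. It assumes a classical additive decomposition (moving-average trend, periodic seasonal profile, residual), hand-set weights in place of the automatically adjusted ones, and hypothetical function names and toy data; it does not cover the multimodal fusion of demand data.

```python
from statistics import mean


def moving_average_trend(series, window):
    """Centered moving average; the window shrinks at the series edges."""
    half = window // 2
    trend = []
    for i in range(len(series)):
        lo, hi = max(0, i - half), min(len(series), i + half + 1)
        trend.append(mean(series[lo:hi]))
    return trend


def seasonal_component(detrended, period):
    """Average the detrended values at each position within the cycle.

    Requires at least one full period of data.
    """
    profile = [mean(detrended[p::period]) for p in range(period)]
    return [profile[i % period] for i in range(len(detrended))]


def decompose(series, period):
    """Additive split: series = trend + seasonal + residual."""
    trend = moving_average_trend(series, period)
    detrended = [y - t for y, t in zip(series, trend)]
    seasonal = seasonal_component(detrended, period)
    residual = [d - s for d, s in zip(detrended, seasonal)]
    return trend, seasonal, residual


def recombine(components, weights):
    """Weighted sum of components, one weight per component."""
    length = len(components[0])
    return [sum(w * c[i] for w, c in zip(weights, components))
            for i in range(length)]


def mape(actual, predicted):
    """Mean absolute percentage error in percent; actual values must be nonzero."""
    return 100.0 * mean(abs((a - p) / a) for a, p in zip(actual, predicted))


# Toy daily usage values with a weekly cycle (made-up numbers).
usage = [10.0, 12.0, 15.0, 11.0, 9.0, 14.0, 18.0,
         12.0, 13.0, 16.0, 12.0, 10.0, 15.0, 19.0]
trend, seasonal, residual = decompose(usage, period=7)

# Weights of 1 on every component reproduce the input series;
# a weight of 0 on the residual keeps only trend plus seasonality.
smoothed = recombine([trend, seasonal, residual], [1.0, 1.0, 0.0])
print(f"MAPE of trend + seasonal against the raw series: {mape(usage, smoothed):.2f}%")
```

In this sketch the weights are fixed by hand; a model that adjusts them automatically, as the abstract describes, would instead choose them to minimize an error measure such as MAPE on held-out data.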