Computer Engineering ›› 2026, Vol. 52 ›› Issue (2): 79-88. doi: 10.19678/j.issn.1000-3428.0069787

• Computational Intelligence and Pattern Recognition •

Adaptive Lossless Segmented Compression Method Integrating Temporal Dependencies and Data Features

CHEN Zhenqing¹, WAN Jiafu¹, ZHANG Rui²

  1. School of Mechanical and Automotive Engineering, South China University of Technology, Guangzhou 510640, Guangdong, China;
  2. Shanxi Information Industry Technology Research Institute Co., Ltd., Taiyuan 030000, Shanxi, China
  • Received: 2024-04-25 Revised: 2024-08-29 Published: 2024-09-09

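  • About the authors: CHEN Zhenqing, male, master's student, whose main research interests are time series data compression and machine learning; WAN Jiafu (corresponding author), professor, Ph.D., E-mail: mejwan@scut.edu.cn; ZHANG Rui, engineer, M.S.
  • Supported by:
    National Natural Science Foundation of China (U1801264).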

Abstract: A single compression algorithm struggles to maintain a high compression ratio across the complex and diverse patterns found in time series data, so a suitable algorithm must be selected for each pattern. Existing adaptive compression schemes, however, identify the optimal algorithm with low accuracy. To address this issue, this paper proposes an Adaptive Lossless Segmented Compression method integrating Temporal Dependencies and data Features (ALSC-TDF). The method compresses time series data segment by segment and selects the most suitable compression algorithm for each segment according to its pattern. ALSC-TDF recasts compression algorithm selection as a time series classification task. It uses a Gated Recurrent Unit (GRU) to capture temporal dependencies and adds compression efficiency features that are closely related to the compression ratio: basic statistical features, permutation and variation features, and compression degree features. A modified GRU-Fully Convolutional Network (GRU-FCN) jointly analyzes the temporal dependencies and the proposed features to improve classification accuracy and robustness, which in turn raises the overall compression ratio. Experiments on multiple datasets verify the effectiveness of ALSC-TDF: it outperforms the comparison models in classification accuracy and F1 score, reaching an accuracy of 88.86%. ALSC-TDF also achieves a significantly better compression ratio than existing compression algorithms, improving the overall compression ratio by 15.62% relative to the Elf algorithm. The results indicate that jointly analyzing the data features and temporal dependencies of time series effectively improves the accuracy and robustness of adaptive compression algorithm selection and thereby yields a higher compression ratio.

Key words: time series data, adaptive compression, pattern recognition, Gated Recurrent Unit (GRU), feature extraction
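
The segment-wise selection described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the candidate compressors (`zlib`, `bz2`, `lzma` from the standard library), the fixed segment length, the ratio definition (raw bytes divided by compressed bytes), and the helper names `split_segments`, `best_algorithm`, `basic_features`, and `label_series` are all assumptions made for illustration. The sketch finds the best algorithm for each segment by brute force, which is one plausible way to produce the class labels that a classifier would later learn to predict, and computes a few simple statistics of the kind the abstract calls basic statistical features.

```python
import bz2
import lzma
import struct
import zlib
from statistics import mean, pstdev

# Stand-in candidate set (assumption: the abstract does not list the
# algorithms ALSC-TDF chooses among, so stdlib compressors are used here).
CANDIDATES = {
    "zlib": zlib.compress,
    "bz2": bz2.compress,
    "lzma": lzma.compress,
}


def split_segments(values, segment_len):
    """Cut a series into consecutive segments; the last one may be shorter."""
    return [values[i:i + segment_len] for i in range(0, len(values), segment_len)]


def to_bytes(segment):
    """Serialize floats as little-endian IEEE 754 doubles (8 bytes each)."""
    return struct.pack(f"<{len(segment)}d", *segment)


def best_algorithm(segment):
    """Try every candidate and return (name, ratio) of the smallest output.

    Ratio is raw size / compressed size (assumed definition).
    """
    raw = to_bytes(segment)
    sizes = {name: len(compress(raw)) for name, compress in CANDIDATES.items()}
    name = min(sizes, key=sizes.get)
    return name, len(raw) / sizes[name]


def basic_features(segment):
    """A few illustrative statistics; the paper's feature set is richer."""
    diffs = [b - a for a, b in zip(segment, segment[1:])]
    return {
        "mean": mean(segment),
        "std": pstdev(segment),
        "range": max(segment) - min(segment),
        "mean_abs_diff": mean(abs(d) for d in diffs) if diffs else 0.0,
    }


def label_series(values, segment_len=256):
    """Build one row per segment: brute-force label, ratio, and features."""
    rows = []
    for segment in split_segments(values, segment_len):
        name, ratio = best_algorithm(segment)
        rows.append({"label": name, "ratio": ratio, **basic_features(segment)})
    return rows
```

Calling `label_series(values)` on a list of floats returns one dictionary per segment. In a scheme like the one the abstract describes, a trained classifier would predict the label from the segment itself, so the brute-force trial of every compressor is needed only when preparing training data.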
