作者投稿和查稿 主编审稿 专家审稿 编委审稿 远程编辑

计算机工程 ›› 2024, Vol. 50 ›› Issue (8): 328-335. doi: 10.19678/j.issn.1000-3428.0067865

• 开发研究与工程应用 • 上一篇    下一篇

基于LT码的分布式矩阵计算研究

刘怡, 张磊*()   

  1. 首都师范大学信息工程学院, 北京 100048
  • 收稿日期:2023-06-16 出版日期:2024-08-15 发布日期:2023-11-14
  • 通讯作者: 张磊
  • 基金资助:
    科技创新2030—“新一代人工智能”重大项目(2020AAA0109700)

Research on Distributed Matrix Computing Based on LT Code

Yi LIU, Lei ZHANG*()   

  1. College of Information Engineering, Capital Normal University, Beijing 100048, China
  • Received:2023-06-16 Online:2024-08-15 Published:2023-11-14
  • Contact: Lei ZHANG

摘要:

在如今大数据和机器学习应用范围不断扩大的背景下, 分布式计算系统成为处理庞大数据的必要工具。对于具有一定规模的计算集群, 其性能会不可避免地受到系统噪声的影响, 应考虑在分布式计算系统中借助编码技术来增强系统的鲁棒性。现有应用于分布式矩阵计算的编码方案多为固定速率编码, 无法适应节点数量动态变化的实际情况。同时, 由于部分任务有截止期限制, 应在保证任务顺利完成的前提下尽可能地减少平均开销从而降低时延。针对上述问题, 提出将LT码应用于雾计算场景下的分布式矩阵计算, 设计Remo2算法。依托LT码的无速率特性自适应信道状态变化, 通过合适的度分布函数设计以及双向切割、因子化度数的方法达到降低时延、增强分布式计算系统鲁棒性的预期效果。令k1A矩阵被切分后的子矩阵行值, k2B矩阵被切分后的子矩阵列值, 实验结果表明, 在$ {k}_{1} $值固定的前置条件下, 与FLT码及BDC-LT算法相比, Remo2算法的平均开销相对于前者稳定降低了33.3%, 相对于后者减少了7.7%的冗余。此外, 当$ {k}_{1}{k}_{2} $大小的码长固定时, $ {k}_{1} $$ {k}_{2} $的离散化程度越低, 即$ \lim\left|{k}_{1}-{k}_{2}\right|\to 0 $, 会带来更小的平均开销。

关键词: LT码, 分布式矩阵计算, 双向切割, 因式化, 平均开销

Abstract:

Distributed computing systems have emerged as necessary tools for processing large amounts of data in the context of the continuous expansion of big data and machine learning applications. For computing clusters with a certain scale, their performance are inevitably affected by system noise. Therefore, it is necessary to consider encoding technology in distributed computing systems to enhance their robustness. Most of the existing encoding schemes used in distributed matrix computing are fixed rate codes, which cannot adapt to the actual situation of dynamic changes in the number of nodes. Meanwhile, owing to deadlines for some tasks, the average cost should be minimized completely to reduce latency while ensuring smooth task completion. To address the above issues, this paper proposes the application of the Luby Transform(LT) code to distributed matrix computing in fog computing scenarios, and designs Remo2 algorithm. Based on the rate-free characteristics of LT codes, adaptive channel state changes can be achieved through an appropriate degree distribution function design, bidirectional cutting, and degree factorization methods to reduce latency and enhance the robustness of distributed computing systems. This paper lets k1 be the row value of the submatrix after the partition of the A matrix and k2 be the column value of the submatrix after the partition of the B matrix. The experimental results indicate that under a fixed $ {k}_{1} $ value precondition, compared with the Factored LT(FLT) code and Block-diagonal Coding-LT(BDC-LT) algorithm, the average cost of the Remo2 algorithm can be stably reduced by 33.3% compared to that of the former, and the redundancy can be reduced by 7.7% compared to that of the latter. In addition, when the code length of $ {k}_{1}{k}_{2} $ is fixed, a lower degree of discretization of $ {k}_{1} $, $ {k}_{2} $, and $ \lim\left|{k}_{1}-{k}_{2}\right|\to 0 $ results in a smaller average overhead.

Key words: Luby Transform(LT) code, distributed matrix computing, bidirectional cutting, factorization, average cost