
Computer Engineering


TM-HEDGE: Tracing Reconstruction for Digital Service Networks

  • Published: 2025-04-21

Abstract: With the widespread adoption of microservice architectures in digital service networks, the sheer number of service nodes and the complexity of their call relationships pose serious challenges for operations and maintenance. Distributed tracing has made significant progress in both research and practice, but it still has notable limitations: it often requires intrusive changes to system source code, depends on specific middleware, and can generate trace paths that are inaccurate or incomplete. The resulting missing links in call chains undermine the reliability of downstream analysis tasks based on observability data. To address this problem, we propose a Trace-Metric Oriented Heterogeneous Dynamic Graph Neural Network, named TM-HEDGE. First, we construct a heterogeneous dynamic directed acyclic call graph that incorporates metric data, and learn heterogeneous spatiotemporal node representations with an intra-snapshot heterogeneous attention encoder and an inter-snapshot Transformer encoder. Then, a link completion classifier fills in the missing call links, which completes tracing reconstruction. Experimental results show that, on three public datasets, TM-HEDGE improves the accuracy of tracing reconstruction by 5.22% on average over existing state-of-the-art link completion GNNs. The method significantly improves the completeness of call chains in digital service networks and provides reliable technical support for their efficient governance.
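To make the input structure described in the abstract more concrete, the following is a minimal Python sketch of a snapshot-based call graph that carries metric data, together with a naive way of listing candidate missing links. It is not the TM-HEDGE model: the abstract does not specify data formats, so the `Snapshot` class, the span and metric tuple layouts, the fixed `window` size, and the "seen in another snapshot" heuristic are all assumptions made for illustration. In the approach the abstract describes, a learned link completion classifier would score such candidates using the encoder representations instead of this simple rule.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Snapshot:
    """One time window of the call graph (hypothetical structure)."""
    calls: set = field(default_factory=set)      # directed (caller, callee) edges
    metrics: dict = field(default_factory=dict)  # service -> {metric name: value}


def build_snapshots(spans, metric_samples, window):
    """Bucket spans and metric samples into fixed-width time windows.

    spans:          iterable of (timestamp, caller, callee)        (assumed layout)
    metric_samples: iterable of (timestamp, service, name, value)  (assumed layout)
    """
    snapshots = defaultdict(Snapshot)
    for ts, caller, callee in spans:
        snapshots[ts // window].calls.add((caller, callee))
    for ts, service, name, value in metric_samples:
        snapshots[ts // window].metrics.setdefault(service, {})[name] = value
    # Return the snapshots ordered by window index.
    return dict(sorted(snapshots.items(), key=lambda kv: kv[0]))


def candidate_missing_links(snapshots, t):
    """List edges observed in other snapshots but absent at window t.

    Only edges whose two endpoints both report metrics at t are kept,
    on the assumption that a service emitting metrics is still active.
    """
    seen_elsewhere = set()
    for k, snap in snapshots.items():
        if k != t:
            seen_elsewhere |= snap.calls
    current = snapshots[t]
    active = set(current.metrics)
    return sorted(
        edge for edge in seen_elsewhere - current.calls
        if edge[0] in active and edge[1] in active
    )


# Toy data: the orders -> payments call is traced in window 0 but not in window 1,
# although all three services still report metrics in window 1.
spans = [(0, "gateway", "orders"), (1, "orders", "payments"), (10, "gateway", "orders")]
metric_samples = [
    (0, "gateway", "cpu", 0.3), (1, "orders", "cpu", 0.5), (2, "payments", "cpu", 0.2),
    (10, "gateway", "cpu", 0.4), (11, "orders", "cpu", 0.6), (12, "payments", "cpu", 0.7),
]
snapshots = build_snapshots(spans, metric_samples, window=10)
candidates = candidate_missing_links(snapshots, t=1)
```

In this toy run, `candidates` contains the single edge `("orders", "payments")`, the call that was traced in the first window but is missing from the second even though both services remain active according to their metrics.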