
Computer Engineering

   

Lattice Structure-Based Distributed Table Join Optimization

  

Published: 2025-07-03


Abstract: In distributed computing frameworks, inefficient data transfer during the Shuffle phase has become a key bottleneck for table joins. Existing join methods have notable limitations: both broadcast joins and hash joins in Spark are susceptible to data skew, which leads to load imbalance across nodes. To address this problem, this paper focuses on join-aggregate queries and proposes a table join method based on a lattice structure. Table partition data is precomputed and stored in the form of a lattice, and the convex-set property of equivalence classes is exploited: any data cell that contains the upper bound of an equivalence class and is contained by its lower bound has an aggregate value equal to that of the equivalence class. This allows the query cells generated by mapping a query statement onto the lattice to be matched and computed quickly. Because query cells are a compressed form of the base table data, they are smaller in size and more uniformly distributed; the paper therefore transfers and joins query cells instead of base table data, greatly reducing the volume of data shuffled and the computational complexity. The proposed method has been implemented in Spark, and experiments on the TPC-H dataset show that it reduces Shuffle data by about 45.06% in large-dataset scenarios, balances the workload across nodes better than the baseline methods, and shortens query response time by 14.23% on average.
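The central idea, precomputing partition data as a lattice of aggregate cells and answering join-aggregate queries from matched cells rather than from base rows, can be illustrated with a minimal single-node sketch. All names here are hypothetical and the measure is a simple SUM; the paper's actual Spark implementation, equivalence-class bounds, and convex-set matching are not reproduced:

```python
from collections import defaultdict
from itertools import combinations

def build_lattice(rows, dims, measure):
    """Precompute one aggregate cell per (dimension-subset, value) combination,
    forming the lattice of group-bys over one partition's rows."""
    cells = defaultdict(int)
    for row in rows:
        # Every subset of the dimensions is one level of the lattice.
        for r in range(len(dims) + 1):
            for subset in combinations(dims, r):
                key = tuple((d, row[d]) for d in subset)
                cells[key] += row[measure]
    return dict(cells)

def answer_from_lattice(cells, dims, conditions):
    """Answer an aggregate query by looking up its query cell in the lattice,
    instead of scanning (and shuffling) the base rows."""
    key = tuple((d, conditions[d]) for d in dims if d in conditions)
    return cells.get(key, 0)

# Toy partition: three fact rows.
rows = [
    {"region": "EU", "year": 2024, "sales": 10},
    {"region": "US", "year": 2024, "sales": 5},
    {"region": "EU", "year": 2025, "sales": 7},
]
lattice = build_lattice(rows, ["region", "year"], "sales")
print(answer_from_lattice(lattice, ["region", "year"], {"region": "EU"}))  # 17
print(answer_from_lattice(lattice, ["region", "year"], {}))                # 22
```

In a distributed setting, only these small precomputed cells would need to cross the network during the Shuffle phase, which is what yields the reported reduction in Shuffle volume relative to moving the skewed base table rows themselves.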
