[1] 3rd Gen Intel Xeon Scalable processors brief[EB/OL].[2022-09-18].https://www.intel.com/content/www/us/en/products/docs/processors/xeon/3rd-gen-xeon-scalable-processors-brief.html.
[2] Amazon EC2 high memory instances-Amazon Web Services(AWS)[EB/OL].[2022-09-18].https://aws.amazon.com/ec2/instance-types/high-memory/.
[3] ZAHARIA M, XIN R S, WENDELL P, et al.Apache Spark[J].Communications of the ACM, 2016, 59(11):56-65.
[4] MPI Forum 4.0[EB/OL].[2022-09-18].https://www.mpi-forum.org/mpi-40/.
[5] DEAN J, GHEMAWAT S.MapReduce[J].Communications of the ACM, 2008, 51(1):107-113.
[6] ZAHARIA M, CHOWDHURY M, DAS T, et al.Resilient distributed datasets:a fault-tolerant abstraction for in-memory cluster computing[C]//Proceedings of the 9th USENIX Conference on Networked Systems Design and Implementation.San Diego, USA:USENIX Association, 2012:2-10.
[7] HUANG T H, WANG Y L, WANG Z, et al.Spark I/O performance optimization based on memory and file sharing mechanism[J].Computer Engineering, 2017, 43(3):1-6.(in Chinese)
[8] XIA L B, LIU X Y, SUN W, et al.Exploring message passing method between Spark tasks[J].Computer Engineering and Applications, 2022, 58(21):91-97.(in Chinese)
[9] ZHANG S J, GAO S.Improved garbage collection algorithm for incremental JVM[J].Computer Engineering, 2012, 38(1):71-73.(in Chinese)
[10] KOLOKASIS I G, PAPAGIANNIS A, PRATIKAKIS P, et al.Say goodbye to off-heap caches! On-heap caches using memory-mapped I/O[C]//Proceedings of the 12th USENIX Conference on Hot Topics in Storage and File Systems.San Diego, USA:USENIX Association, 2020:4-10.
[11] VIEßMANN H N, ŠINKAROVS A, SCHOLZ S B.Extended memory reuse:an optimisation for reducing memory allocations[C]//Proceedings of the 30th Symposium on Implementation and Application of Functional Languages.New York, USA:ACM Press, 2018:107-118.
[12] Apache Arrow[EB/OL].[2022-09-18].https://arrow.apache.org/.
[13] MERRILL M, REUS W, NEUMANN T.Arkouda:interactive data exploration backed by Chapel[C]//Proceedings of ACM SIGPLAN Conference on Chapel Implementers and Users.New York, USA:ACM Press, 2019:28-32.
[14] BAUER M, GARLAND M.Legate NumPy:accelerated and distributed array computing[EB/OL].[2022-09-18].https://dl.acm.org/doi/10.1145/3295500.3356175.
[15] GROSSMAN M, POOLE S, PRITCHARD H, et al.SHMEM-ML:leveraging OpenSHMEM and Apache Arrow for scalable, composable machine learning[EB/OL].[2022-09-18].https://link.springer.com/chapter/10.1007/978-3-031-04888-3_7.
[16] OpenSHMEM application programming interface, version 1.5[EB/OL].[2022-09-18].http://www.openshmem.org.
[17] DALCIN L, FANG Y L L.mpi4py:status update after 12 years of development[J].Computing in Science & Engineering, 2021, 23(4):47-54.
[18] SI M, FU H S, HAMMOND J R, et al.OpenSHMEM over MPI as a performance contender:thorough analysis and optimizations[EB/OL].[2022-09-18].https://link.springer.com/chapter/10.1007/978-3-031-04888-3_3.
[19] FRIEDLEY A, HOEFLER T, BRONEVETSKY G, et al.Ownership passing[J].ACM SIGPLAN Notices, 2013, 48(8):177-186.
[20] AHMAD T, AHMED N, PELTENBURG J, et al.ArrowSAM:in-memory genomics data processing using Apache Arrow[C]//Proceedings of the 3rd International Conference on Computer Applications & Information Security.Washington D.C., USA:IEEE Press, 2020:1-6.
[21] AHMAD T, MA C, AL-ARS Z, et al.Communication-efficient cluster scalable genomics data processing using Apache Arrow Flight[EB/OL].[2022-09-18].http://biorxiv.org/lookup/doi/10.1101/2022.04.01.486780.
[22] MUSHTAQ H, LIU F, COSTA C, et al.SparkGA:a Spark framework for cost effective, fast and accurate DNA analysis at scale[C]//Proceedings of the 8th ACM International Conference on Bioinformatics, Computational Biology, and Health Informatics.New York, USA:ACM Press, 2017:148-157.
[23] JARLIER F, JOLY N, FEDY N, et al.QUARTIC:quick parallel algorithms for high-throughput sequencing data processing[EB/OL].[2022-09-18].https://f1000research.com/articles/9-240.
[24] DataFusion[EB/OL].[2022-09-18].https://github.com/apache/arrow-datafusion.
[25] LI H.Alluxio:a virtual distributed file system[EB/OL].[2022-09-18].https://escholarship.org/uc/item/4n80320w#main.
[26] LIAO W J, HUANG Y F, BAO C K.Memory optimization of Spark parallel computing framework[J].Computer Engineering & Science, 2018, 40(4):587-593.(in Chinese)
[27] KIM M, LI J, VOLOS H, et al.Sparkle:optimizing Spark for large memory machines and analytics[EB/OL].[2022-09-18].https://arxiv.org/abs/1708.05746.
[28] RANG W, YANG D, CHENG D.A shared memory cache layer across multiple executors in Apache Spark[EB/OL].[2022-09-18].https://ieeexplore.ieee.org/document/9378179/.
[29] RODRIGUEZ S A, CHAKRABORTY J, CHU A, et al.Zero-cost, Arrow-enabled data interface for Apache Spark[EB/OL].[2022-09-18].https://arxiv.org/abs/2106.13020.
[30] KHANNA D, SHARMA S, RODRÍGUEZ C, et al.Dynamic symbolic verification of MPI programs[EB/OL].[2022-09-18].http://pag.iiitd.edu.in/sites/default/files/FM2018.pdf.
[31] LIU X, TONG W, LIU J N, et al.A review of dynamic memory allocator research[J].Chinese Journal of Computers, 2018, 41(10):2359-2378.(in Chinese)