[1]
[2] 何晓斌, 高洁, 肖伟, 等. 应用透明的超算多层存储加速技术研究. 计算机工程, 2022, 48(12): 1-8. doi: 10.19678/j.issn.1000-3428.0065928
HE X B, GAO J, XIAO W, et al. Research on application-transparent supercomputing multi-tier storage acceleration technology. Computer Engineering, 2022, 48(12): 1-8.
[3] PUMMA S, SI M, FENG W C, et al. Scalable deep learning via I/O analysis and optimization. ACM Transactions on Parallel Computing, 2019, 6(2): 1-34.
[4] PATEL T, BYNA S, LOCKWOOD G K, et al. Revisiting I/O behavior in large-scale storage systems: the expected and the unexpected[C]//Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis. New York, USA: ACM Press, 2019: 1-13.
[5] ISAKOV M, DEL ROSARIO E, MADIREDDY S, et al. HPC I/O throughput bottleneck analysis with explainable local models[C]//Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis. Washington D.C., USA: IEEE Press, 2020: 1-10.
[6] PATEL T, BYNA S. Uncovering access, reuse, and sharing characteristics of I/O-intensive files on large-scale production HPC systems[C]//Proceedings of the 18th USENIX Conference on File and Storage Technologies. [S.l.]: USENIX, 2020: 91-101.
[7] WANG F Y, SIM H, HARR C, et al. Diving into petascale production file systems through large scale profiling and analysis[C]//Proceedings of the 2nd Joint International Workshop on Parallel Data Storage & Data Intensive Scalable Computing Systems. New York, USA: ACM Press, 2017: 37-42.
[8] DAI Y Q, DONG Y, LU K, et al. Towards scalable resource management for supercomputers[C]//Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis. Washington D.C., USA: IEEE Press, 2022: 324-338.
[9] GARLICK J E. I/O forwarding on Livermore Computing commodity Linux clusters: LLNL-TR-609233[R]. Livermore, USA: Lawrence Livermore National Laboratory, 2012: 1-9.
[10] PAUL A K, FAALAND O, MOODY A, et al. Understanding HPC application I/O behavior using system level statistics[C]//Proceedings of the IEEE 27th International Conference on High Performance Computing, Data, and Analytics (HiPC). Washington D.C., USA: IEEE Press, 2020: 202-211.
[11] 周隆放, 杨文祥, 韩永国, 等. 作业名层次化聚类算法预测作业运行时间. 国防科技大学学报, 2022, 44(5): 13-23. doi: 10.11887/j.cn.202205002
ZHOU L F, YANG W X, HAN Y G, et al. Predicting the job running time with job name hierarchical clustering algorithm. Journal of National University of Defense Technology, 2022, 44(5): 13-23.
[12] XIAN G, ZHANG X R, YU J, et al. PreF: predicting job failure on supercomputers with job path and user behavior. Concurrency and Computation: Practice and Experience, 2022, 34(23). doi: 10.1002/cpe.7202
[13] 唐阳坤, 鲜港, 杨文祥, 等. 基于用户行为的超级计算机作业失败预测方法. 计算机工程与科学, 2022, 44(10): 1753-1761.
TANG Y K, XIAN G, YANG W X, et al. Job failure prediction based on user behavior on supercomputers. Computer Engineering & Science, 2022, 44(10): 1753-1761.
[14] ZHANG H T, XIAN G, YANG W X, et al. A study of job failure prediction on supercomputers with application semantic enhancement. Journal of Computing Science and Engineering, 2022, 16(4): 222-232.
[15]
[16] LOCKWOOD G K, YOO W, BYNA S, et al. UMAMI: a recipe for generating meaningful metrics through holistic I/O performance analysis[C]//Proceedings of the 2nd Joint International Workshop on Parallel Data Storage & Data Intensive Scalable Computing Systems. New York, USA: ACM Press, 2017: 55-60.
[17] KUNKEL J M, BETKE E, BRYSON M, et al. Tools for analyzing parallel I/O[C]//Proceedings of the 2018 International Workshops on High Performance Computing. Berlin, Germany: Springer, 2018: 49-70.
[18] LOCKWOOD G K, WRIGHT N, SNYDER S, et al. TOKIO on ClusterStor: connecting standard tools to enable holistic I/O performance analysis[EB/OL]. [2023-06-15]. https://www.osti.gov/biblio/1632125.
[19] PARK B H, HUKERIKAR S, ADAMSON R, et al. Big data meets HPC log analytics: scalable approach to understanding systems at extreme scale[C]//Proceedings of the IEEE International Conference on Cluster Computing. Washington D.C., USA: IEEE Press, 2017: 758-765.
[20] NEUWIRTH S, PAUL A K. Parallel I/O evaluation techniques and emerging HPC workloads: a perspective[C]//Proceedings of the IEEE International Conference on Cluster Computing. Washington D.C., USA: IEEE Press, 2021: 671-679.
[21] LIU Z C, LEWIS R, KETTIMUTHU R, et al. Characterization and identification of HPC applications at leadership computing facility[C]//Proceedings of the 34th ACM International Conference on Supercomputing. New York, USA: ACM Press, 2020: 1-12.
[22] LU S, LUO B, PATEL T, et al. Making disk failure predictions SMARTer![C]//Proceedings of the 18th USENIX Conference on File and Storage Technologies. [S.l.]: USENIX, 2020: 151-168.
[23] CHIEN S W D, PODOBAS A, PENG I B, et al. tf-Darshan: understanding fine-grained I/O performance in machine learning workloads[C]//Proceedings of the IEEE International Conference on Cluster Computing. Washington D.C., USA: IEEE Press, 2020: 359-370.
[24] MADIREDDY S, BALAPRAKASH P, CARNS P, et al. Analysis and correlation of application I/O performance and system-wide I/O activity[C]//Proceedings of the International Conference on Networking, Architecture, and Storage. Washington D.C., USA: IEEE Press, 2017: 1-10.
[25] KIM S, SUNG D K, SON Y. IFLustre: towards interference-free and efficient storage allocation in distributed file system[C]//Proceedings of the 30th International Symposium on Modeling, Analysis, and Simulation of Computer and Telecommunication Systems. Washington D.C., USA: IEEE Press, 2022: 105-112.
[26] YANG B, ZOU Y L, LIU W G, et al. An end-to-end and adaptive I/O optimization tool for modern HPC storage systems[C]//Proceedings of the IEEE International Parallel and Distributed Processing Symposium. Washington D.C., USA: IEEE Press, 2022: 1294-1304.