[ 1 ] Lundstrom M S, Alam M A. Moore’s law: The journey
ahead [J]. Science, 2022, 378(6621):722-723.
[ 2 ] Lee E A. The problem with threads [J]. Computer, 2006,
39(5):33-42.
[ 3 ] Gropp W, Lusk E, Doss N, et al. A high-performance,
portable implementation of the MPI message passing
interface standard [J]. Parallel Computing, 1996,
22(6):789-828.
[ 4 ] Dagum L, Menon R. OpenMP: An industry-standard API
for shared-memory programming [J]. IEEE Computational
Science and Engineering, 1998, 5(1):46-55.
[ 5 ] Sanders J, Kandrot E. CUDA by example: an introduction
to general-purpose GPU programming [M]. Boston:
Addison-Wesley Professional, 2010.
[ 6 ] Stone J E, Gohara D, Shi G. OpenCL: A parallel
programming standard for heterogeneous computing systems
[J]. Computing in Science & Engineering, 2010,
12(3):66.
[ 7 ] Dennis J B. Data flow supercomputers [J]. Computer,
1980, 13(11):48-56.
[ 8 ] Flynn M J. Some computer organizations and their
effectiveness [J]. IEEE Transactions on Computers, 1972,
C-21(9):948-960.
[ 9 ] Johnston W M, Hanna J P, Millar R J. Advances in
dataflow programming languages [J]. ACM Computing
Surveys (CSUR), 2004, 36(1):1-34.
[ 10 ] Oviedo E I. Control flow, data flow and program
complexity [D]. Buffalo, NY: State University of New York
at Buffalo, 1984.
[ 11 ] Bauer M, Treichler S, Slaughter E, et al. Legion:
Expressing locality and independence with logical regions
[C]. SC’12: Proceedings of the International Conference
on High Performance Computing, Networking, Storage
and Analysis, IEEE, 2012:1-11.
[ 12 ] Pheatt C. Intel threading building blocks [J]. Journal of
Computing Sciences in Colleges, 2008, 23(4):298.
[ 13 ] Gray A L. CUDA Graphs: Flexible Dependency
Representation for Efficient GPU Work Submission [EB/OL].
https://developer.nvidia.com/blog/cuda-graphs. 2019.
[ 14 ] Ben-Nun T, de Fin Licht J, Ziogas A N, et al. Stateful
dataflow multigraphs: a data-centric model for
performance portability on heterogeneous architectures [C].
Proceedings of the International Conference for High
Performance Computing, Networking, Storage and
Analysis (SC), 2019:1-14.
[ 15 ] Lattner C, Amini M, Bondhugula U, et al. MLIR: scaling
compiler infrastructure for domain specific computation
[C]. 2021 IEEE/ACM International Symposium on Code
Generation and Optimization (CGO), 2021:2-14.
[ 16 ] Ben-Nun T, Ates B, Calotoiu A, et al. Bridging
control-centric and data-centric optimization [C].
Proceedings of the 21st ACM/IEEE International Symposium on
Code Generation and Optimization, 2023:173-185.
[ 17 ] Suetterlein J, Zuckerman S, Gao G R. An implementation
of the Codelet model [C]. Euro-Par 2013 Parallel
Processing: 19th International Conference, 2013:633-644.
[ 18 ] Suetterlein J. DARTS: a runtime based on the Codelet
execution model [D]. Newark: University of Delaware,
2014.
[ 19 ] Yuki T, Pouchet L-N. PolyBench 4.2.1 (pre-release)
[EB/OL].
https://github.com/MatthiasJReisinger/PolyBenchC-4.2.1/blob/master/polybench.pdf. 2016.
[ 20 ] Moses W S, Chelini L, Zhao R Z, et al. Polygeist: raising
C to polyhedral MLIR [C]. 2021 30th International
Conference on Parallel Architectures and Compilation
Techniques (PACT), 2021:45-59.
[ 21 ] Flamegraphs for Rust [EB/OL].
https://github.com/flamegraph-rs/flamegraph. 2025.
[ 22 ] Chen L, Tang S, Fu Y, et al. AceMesh: a structured data
driven programming language for high performance
computing [J]. CCF Transactions on High Performance
Computing, 2020, 2(4):309-322.
[ 23 ] Bosilca G, Bouteiller A, Danalis A, et al. PaRSEC:
Exploiting heterogeneity to enhance scalability [J].
Computing in Science & Engineering, 2013, 15(6):36-45.
[ 24 ] Kabrick R, Perdomo D A, Raskar S, et al. CODIR:
Towards an MLIR Codelet Model Dialect [C]. 2020
IEEE/ACM Fourth Annual Workshop on Emerging
Parallel and Distributed Runtime Systems and Middleware
(IPDRM), 2020:33-40.
[ 25 ] Li J X, Yin S Y, Wei S J, et al. A codelet model based on
MLIR [J]. Computer Engineering and Science, 2024,
46(7):1151-1157. (in Chinese)