
Computer Engineering ›› 2024, Vol. 50 ›› Issue (12): 1-15. doi: 10.19678/j.issn.1000-3428.0068694

• Research Hotspots and Reviews •

Survey on GPGPU and CUDA Unified Memory Research Status

PANG Wenhao1, WANG Jialun2, WENG Chuliang1,*()   

  1. School of Data Science and Engineering, East China Normal University, Shanghai 200062, China
    2. Research Institute of Interdisciplinary Innovation, Zhejiang Laboratory, Hangzhou 310000, Zhejiang, China
  • Received: 2023-10-25  Online: 2024-12-15  Published: 2024-12-28
  • Contact: WENG Chuliang

  • Supported by: National Natural Science Foundation of China (62272171); "Pioneer" and "Leading Goose" R&D Program of Zhejiang Province (2022C04006)

Abstract:

In the era of big data, the rapid advancement of fields such as scientific computing and artificial intelligence has driven an ever-increasing demand for computational power. The unique hardware architecture of the Graphics Processing Unit (GPU) makes it well suited to highly parallel computing. In recent years, GPUs and fields such as artificial intelligence and scientific computing have developed in tandem, refining GPU capabilities and giving rise to mature General-Purpose Graphics Processing Units (GPGPUs). The GPGPU is now one of the most important co-processors for the Central Processing Unit (CPU). However, a GPU's hardware configuration is fixed after delivery, and its limited memory capacity can significantly hinder performance, particularly when processing large datasets. To address this issue, Compute Unified Device Architecture (CUDA) 6.0 introduced unified memory, which allows the GPGPU and CPU to share a single virtual memory space, thereby simplifying heterogeneous programming and expanding the memory space accessible to the GPGPU. Unified memory offers a viable approach to processing large datasets on GPGPUs and alleviates the constraint of limited GPU memory capacity. However, its use also introduces performance issues, and effective data management within unified memory is the key to improving performance. This article surveys the development and application of CUDA unified memory, covering its features and evolution, its advantages and limitations, its applications in artificial intelligence and big data processing systems, and its future prospects, providing a valuable reference for future work on applying and optimizing CUDA unified memory.
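As an illustrative sketch (not code from the surveyed work), the unified-memory model described above replaces the traditional explicit `cudaMalloc`/`cudaMemcpy` pairs with a single `cudaMallocManaged` allocation that both the CPU and the GPGPU can address directly; the runtime migrates pages between host and device memory on demand. The runtime calls below are standard CUDA API, while the kernel, sizes, and launch configuration are illustrative assumptions:

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Illustrative kernel: scales each element of the array in place.
__global__ void scale(float *data, int n, float factor) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= factor;
}

int main() {
    const int n = 1 << 20;
    float *data = nullptr;

    // One allocation visible to both CPU and GPU: the CUDA runtime
    // migrates pages on demand, so no explicit cudaMemcpy is needed.
    cudaMallocManaged(&data, n * sizeof(float));

    for (int i = 0; i < n; ++i) data[i] = 1.0f;   // CPU writes directly

    scale<<<(n + 255) / 256, 256>>>(data, n, 2.0f);
    cudaDeviceSynchronize();                       // wait before the CPU reads

    printf("data[0] = %f\n", data[0]);             // CPU reads the GPU result
    cudaFree(data);
    return 0;
}
```

Because managed allocations can exceed physical GPU memory (the "memory oversubscription" listed in the keywords), this same pattern lets a kernel operate on datasets larger than device memory, at the cost of the page-migration overhead the article identifies as the central data-management challenge.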

Key words: General-Purpose Graphics Processing Unit (GPGPU), unified memory, memory oversubscription, data management, heterogeneous system
