
Computer Engineering (计算机工程)



Survey on GPGPU and CUDA Unified Memory Research Status

  • Published: 2024-04-26

Abstract: In the era of big data, the rapid development of scientific computing, artificial intelligence, and other fields has placed ever-higher demands on hardware computing power. The GPU's distinctive hardware architecture makes it well suited to highly parallel computation, and its mutually reinforcing development with artificial intelligence and scientific computing in recent years has made it one of the CPU's most important co-processors. However, a GPU's hardware configuration is difficult to change after manufacture, and its device memory capacity is limited; when processing large datasets, insufficient memory capacity significantly degrades computational performance. CUDA 6.0 introduced Unified Memory (UM), which allows the GPU and CPU to share the same virtual address space, thereby simplifying heterogeneous programming and expanding the memory space accessible to the GPU. UM offers a practical solution for GPU processing of large datasets and, to some extent, alleviates the problem of limited GPU memory capacity; its use, however, also introduces performance issues, and effective memory management within UM is the key to improving performance. This article surveys the development and application of CUDA UM, covering its features, evolution, advantages, and limitations; its applications in fields such as artificial intelligence and big data processing systems; and its future prospects. It aims to provide a valuable reference for future work on using and optimizing CUDA UM.
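The abstract's central mechanism — a single allocation addressable from both processors, with driver-managed page migration replacing explicit host-device copies — can be sketched with `cudaMallocManaged`. The kernel, sizes, and prefetch step below are illustrative assumptions, not taken from the survey:

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Illustrative kernel: scales each element in place.
__global__ void scale(float *data, int n, float factor) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= factor;
}

int main() {
    const int n = 1 << 20;
    float *data = nullptr;

    // One Unified Memory allocation, visible to both CPU and GPU; the
    // driver migrates pages on demand, so no cudaMemcpy is required.
    cudaMallocManaged(&data, n * sizeof(float));

    for (int i = 0; i < n; ++i) data[i] = 1.0f;  // written on the host

    // Optional tuning: prefetch the pages to the GPU before the launch to
    // avoid demand-paging faults inside the kernel — one instance of the
    // UM memory management the survey identifies as key to performance.
    int device = 0;
    cudaGetDevice(&device);
    cudaMemPrefetchAsync(data, n * sizeof(float), device);

    scale<<<(n + 255) / 256, 256>>>(data, n, 2.0f);
    cudaDeviceSynchronize();  // ensure the GPU is done before the host reads

    printf("data[0] = %f\n", data[0]);  // each element is now 2.0
    cudaFree(data);
    return 0;
}
```

Without UM, the same program would need separate host and device buffers plus explicit `cudaMemcpy` calls in each direction, and the working set could not exceed device memory; with UM it can oversubscribe device memory, at the cost of the migration overheads the survey discusses.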