LLVM Support for CUDA

Notes

Clang 21 supports CUDA 11.8 (CUDA 12 is only partially supported).

Commands for compiling CUDA with clang++:

# Compile to a PTX file (device side only); PTX is still a text file
clang++ --cuda-path=/usr/local/cuda-11.8 --cuda-gpu-arch=sm_89 --cuda-device-only -S cuda_add_vector.cu -o cuda_add_vector.ptx

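# [Untested] Assemble PTX into a cubin
# Note: a host object file (.o) cannot be linked directly with a cubin (GPU binary)
# A cubin holds the GPU kernel binary (SASS). It uses an ELF container, but it contains no host-executable code and no host entry point, so the host linker cannot treat it as a regular object file
# The two are object formats from different "worlds" and cannot be merged into one executable by ordinary linking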
${CUDA_HOME}/bin/ptxas kernel.ptx -o kernel.cubin -arch=sm_70
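
One way to keep the ptxas -arch value consistent with the PTX being assembled is to read it from the PTX header instead of hard-coding it. The following is a minimal sketch, assuming the PTX file carries a ".target sm_XX" directive (as compiler-generated PTX normally does). The function name ptx_target is hypothetical, and the usage line is as untested as the ptxas command above.

```shell
# Hypothetical helper: print the architecture named by the first
# ".target" directive of a PTX file, for example "sm_89".
ptx_target() {
    # A PTX header line looks like ".target sm_89"; anything after the
    # architecture name (such as extra target options) is discarded.
    sed -n 's/^\.target[[:space:]]*\(sm_[0-9a-z]*\).*/\1/p' "$1" | head -n 1
}

# Usage (assumes CUDA_HOME points at a CUDA installation):
#   ${CUDA_HOME}/bin/ptxas kernel.ptx -o kernel.cubin -arch="$(ptx_target kernel.ptx)"
```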

# Compile the host side only
clang++ --cuda-path=/usr/local/cuda-11.8 --cuda-host-only -c cuda_add_vector.cu -o cuda_add_vector_host.o -I/usr/local/cuda-11.8/include

# Without --cuda-path, clang picks up the default CUDA installation (CUDA 12.6 here) and prints a warning
clang++ --cuda-gpu-arch=sm_60 --cuda-device-only -S cuda_add_vector.cu -o cuda_add_vector.ptx
clang++: warning: CUDA version 12.6 is only partially supported [-Wunknown-cuda-version]
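
To avoid depending on whichever CUDA installation clang finds by default, the installation can be chosen explicitly and passed through --cuda-path. A minimal sketch, assuming the candidate directories listed in the usage comment exist on some machines; the helper name first_existing_dir and the candidate list are illustrative, not part of clang.

```shell
# Hypothetical helper: print the first argument that is an existing directory.
# Returns non-zero if none of the candidates exists.
first_existing_dir() {
    for d in "$@"; do
        if [ -d "$d" ]; then
            printf '%s\n' "$d"
            return 0
        fi
    done
    return 1
}

# Usage (the candidate paths are examples only):
#   cuda=$(first_existing_dir "$CUDA_HOME" /usr/local/cuda-11.8 /usr/local/cuda) &&
#   clang++ --cuda-path="$cuda" --cuda-gpu-arch=sm_60 --cuda-device-only -S \
#       cuda_add_vector.cu -o cuda_add_vector.ptx
```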

# Build command from the LLVM documentation (compiles host and device code automatically, then bundles them)
clang++ axpy.cu -o axpy --cuda-gpu-arch=<GPU arch> \
    -L<CUDA install path>/<lib64 or lib>           \
    -lcudart_static -ldl -lrt -pthread
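
The "<lib64 or lib>" placeholder in the command above depends on how the CUDA toolkit is laid out on the machine. A minimal sketch of resolving it in a script, assuming the install root contains one of those two directories; the function name cuda_lib_dir is hypothetical.

```shell
# Hypothetical helper: print the CUDA library directory under an install root,
# preferring lib64 and falling back to lib.
cuda_lib_dir() {
    if [ -d "$1/lib64" ]; then
        printf '%s\n' "$1/lib64"
    else
        printf '%s\n' "$1/lib"
    fi
}

# Usage, with sm_70 and /usr/local/cuda-11.8 as example values:
#   clang++ axpy.cu -o axpy --cuda-gpu-arch=sm_70 \
#       -L"$(cuda_lib_dir /usr/local/cuda-11.8)" \
#       -lcudart_static -ldl -lrt -pthread
```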

# A somewhat unusual flow: step-by-step compilation driven by clang++
# Compile the .cu file with clang++
clang++ --cuda-gpu-arch=sm_70 -c add_vector.cu -o add_vector.o -I$CUDA_HOME/include
# Do the device linking with nvcc
nvcc -arch=sm_70 -dlink add_vector.o -o add_vector_dlink.o
# Final link
nvcc add_vector.o add_vector_dlink.o -o add_vector

References

Clang で CUDA コードを NVPTX に変換するメモ (notes on converting CUDA code to NVPTX with Clang) https://qiita.com/syoyo/items/4e60543aded0210fde49
Compiling CUDA with clang https://llvm.org/docs/CompileCudaWithLLVM.html
User Guide for NVPTX Back-end
1. https://llvm.org/docs/NVPTXUsage.html
2. https://rocm.docs.amd.com/projects/llvm-project/en/latest/LLVM/llvm/html/NVPTXUsage.html
Clang 如何支持 CUDA 程序 (How Clang supports CUDA programs) https://jia.je/software/2023/10/17/clang-cuda-support/
CUDA LLVM Compiler https://developer.nvidia.com/cuda-llvm-compiler
1. NVIDIA has contributed key enhancements to the LLVM project to enable support of CUDA and massively parallel accelerators such as GPUs.

Zhonghui
