The CUDA Compiler SDK includes the following components. Documentation is provided, including the specification for LLVM IR, an API document for libnvvm, and an API document for libdevice. A set of samples illustrates the use of the compiler SDK. A set of libraries, libdevice.bc, implements the common math functions for devices in LLVM bitcode format. An optimizing compiler library (libnvvm.so, nvvm.dll/nvvm.lib, libnvvm.dylib) and its header file nvvm.h are provided for compiler developers who want to generate PTX from a program written in NVVM IR, a compiler internal representation based on LLVM.

The version of Thrust included with the current CUDA Toolkit was upgraded from version 1.5.3 to version 1.7.0. The cublasgtsv() routines have been replaced with a version that supports pivoting. CURAND 5.5 introduces support for the Philox4x32-10 random number generator. CUFFT 5.5 provides FFTW3 interfaces that enable applications using FFTW to gain performance from NVIDIA CUFFT with minimal changes to program source code. New CUFFT calls allow a plan handle to be created separately from the plan itself, allow plan attributes to be set before the work of plan creation is done, and give advanced users more control over memory space allocation. The limitation on the dimension n of the routine cublasgetrfbatched() has been removed. The cublasmatinvBatched() routines have been added to the CUBLAS Library.

The Toolkit uses a new installer on Windows. The CUDA Sample projects have makefiles that are now more self-contained and robust. The CUDA Toolkit and the CUDA Driver are now available for installation as .deb installation packages for all the supported Linux distributions, except Ubuntu 10.04 and RHEL 5.5. Installations can be updated when a new version of the CUDA Toolkit is available.
CUDA 5.5 also adds support for Linux on the ARMv7 architecture.

Each CUDA SDK release supports the following compute capabilities:

CUDA SDK 11.0 – 11.2: compute capability 3.5 – 8.6 (Kepler (in part), Maxwell, Pascal, Volta, Turing, Ampere). New data types: Bfloat16 and TF32 on third-generation Tensor Cores.
CUDA SDK 10.0 – 10.2: compute capability 3.0 – 7.5 (Kepler, Maxwell, Pascal, Volta, Turing). Last version with support for compute capability 3.x (Kepler). 10.2 is the last official release for macOS; newer releases will not support macOS.
CUDA SDK 9.0 – 9.2: compute capability 3.0 – 7.2 (Kepler, Maxwell, Pascal, Volta) (Pascal GTX 1070Ti not supported).
CUDA SDK 8.0: compute capability 2.0 – 6.x (Fermi, Kepler, Maxwell, Pascal). Last version with support for compute capability 2.x (Fermi) (Pascal GTX 1070Ti not supported).
CUDA SDK 7.0 – 7.5: compute capability 2.0 – 5.x (Fermi, Kepler, Maxwell).
CUDA SDK 6.5: compute capability 1.1 – 5.x (Tesla, Fermi, Kepler, Maxwell). Last version with support for compute capability 1.x (Tesla).
CUDA SDK 6.0: compute capability 1.0 – 3.5 (Tesla, Fermi, Kepler).
CUDA SDK 5.0 – 5.5: compute capability 1.0 – 3.5 (Tesla, Fermi, Kepler).
CUDA SDK 4.0 – 4.2: compute capability 1.0 – 2.1+x (Tesla, Fermi, more?).
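To find which SDK releases cover a given GPU, the support list can be turned into a small lookup table. The following is only a sketch: the names `SUPPORT`, `parse_cc`, and `sdks_supporting` are made up for illustration, an upper bound written as "N.x" is assumed to mean (N, 9), and the uncertain "2.1+x" bound is conservatively treated as 2.1. A range match does not guarantee that every individual card is supported (the GTX 1070Ti notes are an example of an exception).

```python
# Support ranges transcribed from the list above:
# (SDK versions, lowest compute capability, highest compute capability).
# Assumption: "N.x" upper bounds are encoded as (N, 9); "2.1+x" as (2, 1).
SUPPORT = [
    ("11.0 - 11.2", (3, 5), (8, 6)),
    ("10.0 - 10.2", (3, 0), (7, 5)),
    ("9.0 - 9.2", (3, 0), (7, 2)),
    ("8.0", (2, 0), (6, 9)),
    ("7.0 - 7.5", (2, 0), (5, 9)),
    ("6.5", (1, 1), (5, 9)),
    ("6.0", (1, 0), (3, 5)),
    ("5.0 - 5.5", (1, 0), (3, 5)),
    ("4.0 - 4.2", (1, 0), (2, 1)),
]


def parse_cc(text):
    """Turn a string such as "7.5" into the tuple (7, 5)."""
    major, minor = text.split(".")
    return int(major), int(minor)


def sdks_supporting(cc_text):
    """Return the SDK ranges whose listed bounds include the capability."""
    cc = parse_cc(cc_text)
    # Tuples compare element by element, so (3, 0) <= (3, 5) <= (7, 5) holds.
    return [sdk for sdk, low, high in SUPPORT if low <= cc <= high]


if __name__ == "__main__":
    for cc in ("1.0", "3.0", "7.5", "8.6"):
        print(cc, "->", sdks_supporting(cc))
```

Storing capabilities as (major, minor) tuples avoids the pitfalls of comparing version strings or floats directly.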