
Leveraging the Power of GPUs with CuPy in Python

Image by Author

 

 

CuPy is a Python library compatible with NumPy and SciPy arrays, designed for GPU-accelerated computing. By replacing NumPy with CuPy syntax, you can run your code on NVIDIA CUDA or AMD ROCm platforms. This allows you to perform array-related tasks using GPU acceleration, which results in faster processing of larger arrays.

By swapping out just a few lines of code, you can take advantage of the massive parallel processing power of GPUs to significantly speed up array operations like indexing, normalization, and matrix multiplication.

CuPy also provides access to low-level CUDA features. It allows passing ndarrays to existing CUDA C/C++ programs via RawKernels, streamlining performance with Streams, and enables direct calls to CUDA Runtime APIs.
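To give a sense of what that looks like, here is a minimal RawKernel sketch (not taken from this article, and it requires a CUDA-capable GPU to run): a small CUDA C kernel is compiled from a string and launched on a CuPy array.

```python
import cupy as cp

# Compile a tiny CUDA C kernel from source; 'add_one' is the kernel name.
add_kernel = cp.RawKernel(r'''
extern "C" __global__
void add_one(const float* x, float* y, int n) {
    int i = blockDim.x * blockIdx.x + threadIdx.x;
    if (i < n) y[i] = x[i] + 1.0f;
}
''', 'add_one')

x = cp.arange(8, dtype=cp.float32)
y = cp.empty_like(x)

# Launch with a grid of 1 block and 8 threads per block.
add_kernel((1,), (8,), (x, y, x.size))
```

The kernel receives raw device pointers, so the same C/C++ code you already have for CUDA can operate directly on CuPy arrays without copies.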

 

 

You can install CuPy using pip, but before that you have to find the correct CUDA version using the command below.

nvcc --version

 

nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2022 NVIDIA Corporation
Built on Wed_Sep_21_10:33:58_PDT_2022
Cuda compilation tools, release 11.8, V11.8.89
Build cuda_11.8.r11.8/compiler.31833905_0

 

It seems that the current version of Google Colab is using CUDA version 11.8. Therefore, we will proceed to install the cupy-cuda11x version.

If you are running an older CUDA version, I have provided a table below to help you determine the appropriate CuPy package to install.

 

Image from CuPy 12.2.0

 

After selecting the correct version, we will install the Python package using pip.

pip install cupy-cuda11x

 

You can also use the conda command to automatically detect and install the correct version of the CuPy package if you have Anaconda installed.

conda install -c conda-forge cupy

 

 

In this section, we will compare the syntax of CuPy with NumPy; they are about 95% similar. Instead of using np, you will replace it with cp.

We will first create a NumPy and a CuPy array from a Python list. After that, we will calculate the norm of the vector.

import cupy as cp
import numpy as np

x = [3, 4, 5]
 
x_np = np.array(x)
x_cp = cp.array(x)
 
l2_np = np.linalg.norm(x_np)
l2_cp = cp.linalg.norm(x_cp)
 
print("Numpy: ", l2_np)
print("Cupy: ", l2_cp)

 

As we can see, we got the same result.

Numpy:  7.0710678118654755
Cupy:  7.0710678118654755

 

To convert a NumPy array to a CuPy array, you can simply use cp.asarray(X).

x_array = np.array([10, 22, 30])
x_cp_array = cp.asarray(x_array)
type(x_cp_array)

 

 

Or, use .get() to convert a CuPy array back to a NumPy array.

x_np_array = x_cp_array.get()
type(x_np_array)

 

 

 

In this section, we will compare the performance of NumPy and CuPy.

We will use time.time() to measure code execution time. Then, we will create a 3D NumPy array and perform some mathematical functions on it.

import time

# NumPy and CPU Runtime
s = time.time()
x_cpu = np.ones((1000, 100, 1000))
np_result = np.sqrt(np.sum(x_cpu**2, axis=-1))
e = time.time()
np_time = e - s
print("Time consumed by NumPy: ", np_time)

 

Time consumed by NumPy: 0.5474584102630615

 

Similarly, we will create a 3D CuPy array, perform mathematical operations, and time its performance.

# CuPy and GPU Runtime
s = time.time()
x_gpu = cp.ones((1000, 100, 1000))
cp_result = cp.sqrt(cp.sum(x_gpu**2, axis=-1))
e = time.time()
cp_time = e - s
print("\nTime consumed by CuPy: ", cp_time)

 

Time consumed by CuPy: 0.001028299331665039

 

To calculate the difference, we will divide the NumPy time by the CuPy time. It looks like we got over a 500X performance boost using CuPy.

diff = np_time/cp_time
print(f'\nCuPy is {diff:.2f} X faster than NumPy')

 

CuPy is 532.39 X faster than NumPy

 

Note: To achieve reliable results, it is recommended to conduct a few warm-up runs to minimize timing fluctuations.
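The warm-up advice can be sketched as a small timing helper. The bench function below is hypothetical (not part of CuPy or NumPy); its sync parameter is where you would pass a device-synchronization callable such as cp.cuda.Device(0).synchronize, since CuPy kernels launch asynchronously and the clock would otherwise stop before the GPU has finished.

```python
import time
import numpy as np

def bench(fn, warmup=3, repeats=5, sync=None):
    # Hypothetical helper: warm-up runs absorb one-off costs
    # (kernel compilation, memory-pool growth) before timing.
    for _ in range(warmup):
        fn()
    if sync is not None:
        sync()  # drain any pending asynchronous work
    best = float("inf")
    for _ in range(repeats):
        s = time.time()
        fn()
        if sync is not None:
            sync()  # wait for the device before reading the clock
        best = min(best, time.time() - s)
    return best

# CPU usage; for CuPy you would pass sync=cp.cuda.Device(0).synchronize
x = np.ones((100, 100, 100))
t = bench(lambda: np.sqrt(np.sum(x**2, axis=-1)))
print(f"best of 5: {t:.6f} s")
```

Reporting the best of several synchronized runs gives a steadier number than a single cold measurement.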

 

Beyond its speed advantage, CuPy offers multi-GPU support, letting you harness the collective power of multiple GPUs.
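As a minimal sketch of that idea (assuming a machine with at least two CUDA devices, which the Colab runtime used here does not have), each cp.cuda.Device context directs allocations and kernel launches to one GPU:

```python
import cupy as cp

# Minimal sketch, assuming devices 0 and 1 exist: arrays created
# inside a Device context live on (and compute on) that GPU.
norms = []
for dev_id in range(2):
    with cp.cuda.Device(dev_id):
        a = cp.ones((1000, 1000))
        norms.append(float(cp.linalg.norm(a)))

print(norms)
```

Splitting independent chunks of work across devices this way lets the GPUs run in parallel.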

Also, you can check out my Colab notebook if you want to compare the results.

 

 

In conclusion, CuPy provides a simple way to accelerate NumPy code on NVIDIA GPUs. By making just a few changes to swap out NumPy for CuPy, you can achieve order-of-magnitude speedups on array computations. This performance boost lets you work with much larger datasets and models, enabling more advanced machine learning and scientific computing.

 


Abid Ali Awan (@1abidaliawan) is a certified data scientist professional who loves building machine learning models. Currently, he is focusing on content creation and writing technical blogs on machine learning and data science technologies. Abid holds a Master's degree in technology management and a bachelor's degree in telecommunication engineering. His vision is to build an AI product using a graph neural network for students struggling with mental illness.
