NetSci
CuArray< T > Class Template Reference

Manages CUDA-supported arrays, offering initialization, memory management, and data manipulation. Implemented as a template class in C++, with Python and Tcl wrapper interfaces. In Python and Tcl, use as <ElementType>CuArray (e.g., FloatCuArray, IntCuArray), as they don't support templates. Supports float and int types in Python and Tcl, and all numeric types in C++.

#include <cuarray.h>

Public Member Functions

 CuArray ()
 Constructs an empty CuArray object.
 
CuArrayError init (int m, int n)
 Initializes CuArray with specified dimensions and allocates memory on host and device.
 
CuArrayError init (T *host, int m, int n)
 Initializes CuArray with specified host data and dimensions, performing a shallow copy. Allocates memory on both the host and the device. The data is shallow copied, so the ownership remains unchanged.
 
CuArrayError fromCuArrayShallowCopy (CuArray< T > *cuArray, int start, int end, int m, int n)
 Performs a shallow copy of data from another CuArray within a specified row range. Copies the host data from the given CuArray, within the inclusive range specified by 'start' and 'end'. This CuArray does not own the copied data, and deallocation is handled by the source CuArray.
 
CuArrayError fromCuArrayDeepCopy (CuArray< T > *cuArray, int start, int end, int m, int n)
 Performs a deep copy of data from another CuArray within a specified row range. Copies the host data from the given CuArray, including all data within the inclusive range defined by 'start' and 'end'. Memory for the copied data is allocated in this CuArray's host memory.
 
 ~CuArray ()
 Destructor for CuArray. Deallocates memory on both the host and the device.
 
int n () const
 Returns the number of columns in the CuArray.
 
int m () const
 Returns the number of rows in the CuArray.
 
int size () const
 Returns the total number of elements in the CuArray.
 
size_t bytes () const
 Returns the total size in bytes of the CuArray data.
 
T *& host ()
 Returns a reference to the host data.
 
T *& device ()
 Returns a reference to the device data.
 
CuArrayError allocateHost ()
 Allocates memory for the host data.
 
CuArrayError allocateDevice ()
 Allocates memory for the device data.
 
CuArrayError allocatedHost () const
 Checks if memory is allocated for the host data.
 
CuArrayError allocatedDevice () const
 Checks if memory is allocated for the device data.
 
CuArrayError toDevice ()
 Copies data from the host to the device.
 
CuArrayError toHost ()
 Copies data from the device to the host.
 
CuArrayError deallocateHost ()
 Deallocates memory for the host data.
 
CuArrayError deallocateDevice ()
 Deallocates memory for the device data.
 
CuArrayError fromNumpy (T *NUMPY_ARRAY, int NUMPY_ARRAY_DIM1, int NUMPY_ARRAY_DIM2)
 Copies data from a NumPy array to the CuArray.
 
CuArrayError fromNumpy (T *NUMPY_ARRAY, int NUMPY_ARRAY_DIM1)
 Copies data from a NumPy array to the CuArray.
 
void toNumpy (T **NUMPY_ARRAY, int **NUMPY_ARRAY_DIM1, int **NUMPY_ARRAY_DIM2)
 Copies data from the CuArray to a NumPy array.
 
void toNumpy (T **NUMPY_ARRAY, int **NUMPY_ARRAY_DIM1)
 Copies data from the CuArray to a NumPy array.
 
T get (int i, int j) const
 Returns the value at a specified position in the CuArray.
 
CuArrayError set (T value, int i, int j)
 Sets a value at a specified position in the CuArray.
 
CuArrayError load (const std::string &fname)
 Loads CuArray data from a specified file.
 
void save (const std::string &fname)
 Saves CuArray data to a specified file.
 
CuArray< T > * sort (int i)
 Sorts CuArray based on the values in a specified row.
 
T & operator[] (int i) const
 Returns a reference to the element at a specified index in the CuArray.
 
int owner () const
 Returns the owner status of the CuArray. Indicates whether the CuArray is responsible for memory deallocation.
 
CuArray< int > * argsort (int i)
 Performs an argsort on a specified row of the CuArray. Returns a new CuArray containing sorted indices.
 

Private Attributes

T * host_
 
T * device_
 
int n_ {}
 
int m_ {}
 
int size_ {}
 
size_t bytes_ {}
 
int allocatedDevice_ {}
 
int allocatedHost_ {}
 
int owner_ {}
 

Detailed Description

template<typename T>
class CuArray< T >

Manages CUDA-supported arrays, offering initialization, memory management, and data manipulation. Implemented as a template class in C++, with Python and Tcl wrapper interfaces. In Python and Tcl, use as <ElementType>CuArray (e.g., FloatCuArray, IntCuArray), as they don't support templates. Supports float and int types in Python and Tcl, and all numeric types in C++.

Parameters
T  Data type of the array elements.

Constructor & Destructor Documentation

◆ CuArray()

template<typename T >
CuArray< T >::CuArray ( )

Constructs an empty CuArray object.

C++ Example

#include <iostream>
#include "cuarray.h"
int main() {
std::cout
<< "Running "
<< std::endl;
/* Creates a new float CuArray instance */
auto *cuArray = new CuArray<float>();
/* Free memory */
delete cuArray;
return 0;
}

Python Example

"""
Always precede CuArray with the data type
Here we are importing CuArray int and float templates.
"""
from cuarray import FloatCuArray, IntCuArray
print("Running", __file__)
"""Create a new float CuArray instance"""
float_cuarray = FloatCuArray()
"""Create a new int CuArray instance"""
int_cuarray = IntCuArray()

Member Function Documentation

◆ allocatedDevice()

template<typename T >
CuArrayError CuArray< T >::allocatedDevice ( ) const

Checks if memory is allocated for the device data.

Returns
CuArrayError that is 1 if device memory is allocated and 0 otherwise.

C++ Example

#include <cuarray.h>
#include <iostream>
#include <random>
int main() {
std::cout
<< "Running "
<< std::endl;
/* Creates a new float CuArray instance */
auto *cuArray = new CuArray<float>();
/* Initialize the CuArray with 300 rows and 300 columns */
auto rows = 300;
auto cols = 300;
cuArray->init(rows,
cols);
/* Allocate device memory. */
cuArray->allocateDevice();
/* Check if device memory is allocated. If it is,
* allocatedDevice() will return 1, otherwise it
* will return 0. This is convenient for boolean checks.*/
auto deviceMemoryAllocated = cuArray->allocatedDevice();
/* Print whether or not device memory is allocated. */
std::cout
<< "Device memory allocated: "
<< deviceMemoryAllocated
<< std::endl;
delete cuArray;
return 0;
}

◆ allocateDevice()

template<typename T >
CuArrayError CuArray< T >::allocateDevice ( )

Allocates memory for the device data.

Returns
CuArrayError indicating success or failure of the operation; 0 indicates success.

C++ Example

#include <cuarray.h>
#include <random>
#include <iostream>
int main() {
std::cout
<< "Running "
<< std::endl;
/* Creates a new float CuArray instance */
auto *cuArray = new CuArray<float>();
/* Initialize the CuArray with 300 rows and 300 columns */
auto rows = 300;
auto cols = 300;
cuArray->init(rows,
cols);
/* Fill the CuArray with random values */
for (int i = 0; i < cuArray->m(); i++) {
for (int j = 0; j < cuArray->n(); j++) {
cuArray->host()[i * cuArray->n() + j] =
static_cast<float>(rand() / (float) RAND_MAX);
}
}
/* Allocate device memory. If successful, allocateDevice returns 0.*/
auto err = cuArray->allocateDevice();
/* Check if device memory allocation was successful. */
if (err == 0) {
std::cout
<< "Device memory allocated successfully."
<< std::endl;
} else {
std::cout
<< "Device memory allocation failed."
<< std::endl;
}
/* Frees host and device memory. */
delete cuArray;
return 0;
}

◆ allocatedHost()

template<typename T >
CuArrayError CuArray< T >::allocatedHost ( ) const

Checks if memory is allocated for the host data.

Returns
CuArrayError that is 1 if host memory is allocated and 0 otherwise.

C++ Example

#include <cuarray.h>
#include <random>
#include <iostream>
int main() {
std::cout
<< "Running "
<< std::endl;
/* Creates a new float CuArray instance */
auto *cuArray = new CuArray<float>();
/* Initialize the CuArray with 300 rows and 300 columns */
auto rows = 300;
auto cols = 300;
cuArray->init(rows,
cols);
/* Check if host memory is allocated. If it is,
* allocatedHost() will return 1, otherwise it
* will return 0. This is convenient for boolean checks.*/
auto hostMemoryAllocated = cuArray->allocatedHost();
/* Print whether or not host memory is allocated. */
std::cout
<< "Host memory allocated: "
<< hostMemoryAllocated
<< std::endl;
delete cuArray;
return 0;
}

◆ allocateHost()

template<typename T >
CuArrayError CuArray< T >::allocateHost ( )

Allocates memory for the host data.

Returns
CuArrayError indicating success or failure of the operation; 0 indicates success.

C++ Example

#include <cuarray.h>
#include <random>
#include <iostream>
int main() {
std::cout
<< "Running "
<< std::endl;
/* Creates a new float CuArray instance */
auto *cuArray = new CuArray<float>();
/* Initialize the CuArray with 300 rows and 300 columns */
auto rows = 300;
auto cols = 300;
cuArray->init(rows,
cols);
/* Fill the CuArray with random values */
for (int i = 0; i < cuArray->m(); i++) {
for (int j = 0; j < cuArray->n(); j++) {
cuArray->host()[i * cuArray->n() + j] =
static_cast<float>(rand() / (float) RAND_MAX);
}
}
/* Allocate device memory. */
cuArray->allocateDevice();
/* Copy data from host to device. */
cuArray->toDevice();
/* Free host memory, since it is no longer needed.*/
cuArray->deallocateHost();
/*Do some complicated GPU calculations
* and then allocate host memory when you need it again.
* Also, this is extremely wasteful, it's just an example of
* how to use this method. Realistically, most users will never have
* to manually allocate host memory as that is handled by the
* init methods. If memory allocation is successful, allocateHost
* returns 0*/
auto err = cuArray->allocateHost();
/* Check if host memory allocation was successful. */
if (err == 0) {
std::cout
<< "Host memory allocated successfully."
<< std::endl;
} else {
std::cout
<< "Host memory allocation failed."
<< std::endl;
}
/* Copy data from device to host. */
cuArray->toHost();
delete cuArray;
return 0;
}

◆ argsort()

template<typename T >
CuArray< int > * CuArray< T >::argsort ( int  i)

Performs an argsort on a specified row of the CuArray. Returns a new CuArray containing sorted indices.

Parameters
i  Row index to argsort.
Returns
Pointer to a new CuArray with sorted indices.
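
How sort and argsort relate can be sketched in plain Python without the cuarray module. The helpers argsort_row and sort_row are hypothetical names, the matrix is a flat row-major list, and the descending default is an assumption taken from the Python example on this page rather than a confirmed property of the library.

```python
def argsort_row(data, n, i, descending=True):
    """Return the column indices that sort row i of a flat row-major matrix."""
    row = data[i * n:(i + 1) * n]
    return sorted(range(n), key=lambda j: row[j], reverse=descending)


def sort_row(data, n, i, descending=True):
    """Return the values of row i in sorted order."""
    row = data[i * n:(i + 1) * n]
    return sorted(row, reverse=descending)


# 2 x 3 matrix stored row by row
matrix = [0.2, 0.9, 0.5,
          0.7, 0.1, 0.4]
indices = argsort_row(matrix, 3, 1)
values = sort_row(matrix, 3, 1)

# Indexing the original row with the argsort result reproduces the sorted row
reordered = [matrix[1 * 3 + j] for j in indices]
print(indices, values, reordered)
```

The final list comprehension mirrors what the examples below do when they look up float_cuarray[7][sort_idx].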

C++ Example

#include <cuarray.h>
#include <iostream>
int main() {
std::cout
<< "Running "
<< std::endl;
/* Creates a new float CuArray instance */
auto *cuArray = new CuArray<float>();
/* Initialize the CuArray with 300 rows and 300 columns */
auto rows = 300;
auto cols = 300;
cuArray->init(rows,
cols);
/* Fill the CuArray with random values */
for (int i = 0; i < cuArray->m(); i++) {
for (int j = 0; j < cuArray->n(); j++) {
cuArray->host()[i * cuArray->n() + j] =
static_cast<float>(rand() / (float) RAND_MAX);
}
}
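/* Create a new CuArray with indices that sort the 8th row
* of the original CuArray.*/
auto cuArrayRowIndex = 7;
auto sortedIndicesCuArray = cuArray->argsort(cuArrayRowIndex);
/* Create a new CuArray containing sorted data from the 8th row
* of the original CuArray.*/
auto sortedCuArray = cuArray->sort(cuArrayRowIndex);
/* Print the sorted CuArray and the corresponding values from the
* original CuArray using the sortedIndicesCuArray.*/
for (int j = 0; j < sortedCuArray->n(); j++) {
auto sortedIndex = sortedIndicesCuArray->get(0,
j);
auto sortedValue = sortedCuArray->get(0,
j);
auto valueFromIndex = cuArray->get(cuArrayRowIndex,
sortedIndex);
std::cout
<< sortedIndex
<< " "
<< sortedValue
<< " "
<< valueFromIndex
<< std::endl;
}
/* Cleanup time. */
delete cuArray;
delete sortedCuArray;
delete sortedIndicesCuArray;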
return 0;
}

Python Example

import numpy as np
"""
Always precede CuArray with the data type
Here we are importing the CuArray int and float templates
"""
from cuarray import FloatCuArray
print("Running", __file__)
"""
Create a new float CuArray instance
"""
float_cuarray = FloatCuArray()
"""
Create a random float32 numpy array with 10 rows
and 10 columns
"""
numpy_array = np.random.rand(10, 10).astype(np.float32)
"""Load the numpy array into the CuArray"""
float_cuarray.fromNumpy2D(numpy_array)
"""
Perform a descending sort on
the 8th row of float_cuarray
"""
sorted_cuarray = float_cuarray.sort(7)
"""
Get the indices that sort the 8th row of float_cuarray
"""
argsort_cuarray = float_cuarray.argsort(7)
"""
Print the sorted 8th row of float_cuarray using
sorted_cuarray and argsort_cuarray indices
"""
for _ in range(10):
    sort_idx = argsort_cuarray[0][_]
    print(
        sorted_cuarray[0][_],
        float_cuarray[7][sort_idx]
    )

◆ bytes()

template<typename T >
size_t CuArray< T >::bytes ( ) const

Returns the total size in bytes of the CuArray data.

Includes both the host and device memory.

Returns
Size in bytes.
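
As a quick arithmetic check, the sketch below assumes the reported size is rows times columns times the element size, which is consistent with the 200-byte output the C++ example prints for a 10 x 5 float array. The helper array_bytes is a hypothetical name and is not part of the cuarray module.

```python
import struct


def array_bytes(m, n, fmt="f"):
    """Size in bytes of an m x n array whose elements use struct format fmt."""
    return m * n * struct.calcsize(fmt)


# 10 rows x 5 columns of 4-byte floats
float_bytes = array_bytes(10, 5)
print("Number of bytes:", float_bytes)
```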

C++ Example

#include <cuarray.h>
#include <iostream>
int main() {
std::cout
<< "Running "
<< std::endl;
/* Creates a new float CuArray instance */
auto *cuArray = new CuArray<float>();
/*
* Initializes the CuArray with 10 rows and 5 columns
* and allocates memory on host.
*/
cuArray->init(10,
5);
/* Get the number of bytes the CuArray data occupies */
auto bytes_ = cuArray->bytes();
/* Print the total number of bytes in cuArray. */
std::cout
<< "Number of bytes: "
<< bytes_
<< std::endl;
/* Output:
* Number of bytes: 200
*/
delete cuArray;
return 0;
}

◆ deallocateDevice()

template<typename T >
CuArrayError CuArray< T >::deallocateDevice ( )

Deallocates memory for the device data.

Returns
CuArrayError indicating success or failure of the operation.

◆ deallocateHost()

template<typename T >
CuArrayError CuArray< T >::deallocateHost ( )

Deallocates memory for the host data.

Returns
CuArrayError indicating success or failure of the operation.

C++ Example

#include <cuarray.h>
#include <random>
int main() {
/* Creates a new float CuArray instance */
auto *cuArray = new CuArray<float>();
/* Initialize the CuArray with 300 rows and 300 columns */
auto rows = 300;
auto cols = 300;
cuArray->init(rows,
cols);
/* Fill the CuArray with random values */
for (int i = 0; i < cuArray->m(); i++) {
for (int j = 0; j < cuArray->n(); j++) {
cuArray->host()[i * cuArray->n() + j] =
static_cast<float>(rand() / (float) RAND_MAX);
}
}
/* Allocate device memory. */
cuArray->allocateDevice();
/* Copy data from host to device. */
cuArray->toDevice();
/* Deallocate the host array to reduce memory usage if it's not needed again. */
cuArray->deallocateHost();
/* Set the number of threads per block to 1024 */
auto threadsPerBlock = 1024;
/* Set the number of blocks to the ceiling of the number of elements
* divided by the number of threads per block. */
auto blocksPerGrid = (cuArray->size() + threadsPerBlock - 1) / threadsPerBlock;
/* Launch a CUDA kernel that does something cool and only takes
* a single float array as an argument
* kernel<<<blocksPerGrid, threadsPerBlock>>>(cuArray->device()); */
/* Free device memory. */
delete cuArray;
return 0;
}

◆ device()

template<typename T >
T *& CuArray< T >::device ( )

Returns a reference to the device data.

Returns
Reference to the device data.

C++ Example

#include <cuarray.h>
#include <iostream>
int main() {
std::cout
<< "Running "
<< std::endl;
/* Creates a new float CuArray instance */
auto *cuArray = new CuArray<float>();
/* Initialize the CuArray with 3 rows and 3 columns */
cuArray->init(3,
3);
/*Set each i, j element equal to i*3 + j */
for (int i = 0; i < 9; i++) {
cuArray->host()[i] = i;
}
/* Allocate device memory. */
cuArray->allocateDevice();
/* Copy data from host to device. */
cuArray->toDevice();
/* Set deviceArray equal to cuArray's device data via the
* device() method, */
auto deviceArray = cuArray->device();
/* which can be used in CUDA kernels.
* E.g., kernel<<<1, 1>>>(deviceArray) */
/* delete frees both host and device memory. */
delete cuArray;
return 0;
}

◆ fromCuArrayDeepCopy()

template<typename T >
CuArrayError CuArray< T >::fromCuArrayDeepCopy ( CuArray< T > *  cuArray,
int  start,
int  end,
int  m,
int  n 
)

Performs a deep copy of data from another CuArray within a specified row range. Copies the host data from the given CuArray, including all data within the inclusive range defined by 'start' and 'end'. Memory for the copied data is allocated in this CuArray's host memory.

Parameters
cuArray  Pointer to the source CuArray.
start    Index of the first row to copy.
end      Index of the last row to copy.
m        Number of rows in this CuArray.
n        Number of columns in this CuArray.
Returns
CuArrayError indicating the operation's success or failure.
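
The inclusive row range and the independence of a deep copy can be illustrated with a small stand-alone sketch. It models the host data as a flat row-major Python list; deep_copy_rows is a hypothetical helper, not a cuarray function.

```python
def deep_copy_rows(src, n, start, end):
    """Copy rows start..end (inclusive) of a flat row-major matrix with n columns."""
    return list(src[start * n:(end + 1) * n])


# 3 x 3 matrix where element (i, j) equals i*3 + j
src = [float(v) for v in range(9)]

# Copy the last two rows, then modify only the copy
copy = deep_copy_rows(src, 3, 1, 2)
copy[0] = -1.0

print(copy)
print(src[3])  # the source keeps its original value
```

Because the copy has its own storage, writing to it leaves the source untouched, which is the property that makes the deep copy the safer of the two methods.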

C++ Example

#include <cuarray.h>
#include <iostream>
int main() {
std::cout
<< "Running "
<< std::endl;
/* Create a new float CuArray instance */
auto cuArray = new CuArray<float>;
/* Initialize the CuArray with 3 rows and 3 columns */
cuArray->init(3,
3);
/*Set each i, j element equal to i*3 + j */
for (int i = 0; i < 9; i++) {
cuArray->host()[i] = i;
}
/*
* Create a float 'CuArray' that
* will be a deep copy of the last two cuArray rows
*/
auto cuArray2x3Copy = new CuArray<float>;
cuArray2x3Copy->init(2,
3);
/* First row to copy from cuArray into cuArray2x3Copy */
int startRowIndex = 1;
/* Last row to copy from cuArray into cuArray2x3Copy */
int endRowIndex = 2;
cuArray2x3Copy->fromCuArrayDeepCopy(
cuArray, /*Source for copying data into cuArray2x3Copy. This method is
* significantly safer than its shallow copy equivalent. However, it is also
* slower, which can impact performance if it's called a lot.*/
startRowIndex, /* First row to copy from cuArray into cuArray2x3Copy */
endRowIndex, /* Last row to copy from cuArray into cuArray2x3Copy */
cuArray2x3Copy->m(), /* Number of rows in cuArray2x3Copy */
cuArray2x3Copy->n() /* Number of columns in cuArray2x3Copy */
);
/* Print each element in cuArray2x3Copy */
for (int i = 0; i < cuArray2x3Copy->m(); i++) {
for (int j = 0; j < cuArray2x3Copy->n(); j++) {
std::cout << cuArray2x3Copy->get(i,
j) << " ";
}
std::cout << std::endl;
}
/* Output:
* 3 4 5
* 6 7 8
*/
/* Both cuArray and cuArray2x3Copy own their data.*/
std::cout
<< cuArray->owner() << " "
<< cuArray2x3Copy->owner()
<< std::endl;
/* Output:
* 1 1
*/
/* Each CuArray owns its data, so both are deleted. */
delete cuArray2x3Copy;
delete cuArray;
return 0;
}

Python Example

"""
Always precede CuArray with the data type
Here we are importing float templates.
"""
from cuarray import FloatCuArray
import numpy as np
print("Running", __file__)
"""Create two new float CuArray instances"""
float_cuarray1 = FloatCuArray()
float_cuarray2 = FloatCuArray()
"""Initialize float_cuarray1 with 10 rows and 10 columns"""
float_cuarray1.init(10, 10)
"""Fill float_cuarray1 with random values"""
for i in range(float_cuarray1.m()):
    for j in range(float_cuarray1.n()):
        val = np.random.random()
        float_cuarray1[i][j] = val
"""Copy the data from float_cuarray1 into float_cuarray2"""
float_cuarray2.fromCuArray(float_cuarray1, 0, 9, 10, 10)
"""
Print both CuArrays. Also this performs a deep copy for
memory safety.
"""
for i in range(float_cuarray1.m()):
    for j in range(float_cuarray1.n()):
        print(float_cuarray1[i][j], float_cuarray2[i][j])

◆ fromCuArrayShallowCopy()

template<typename T >
CuArrayError CuArray< T >::fromCuArrayShallowCopy ( CuArray< T > *  cuArray,
int  start,
int  end,
int  m,
int  n 
)

Performs a shallow copy of data from another CuArray within a specified row range. Copies the host data from the given CuArray, within the inclusive range specified by 'start' and 'end'. This CuArray does not own the copied data, and deallocation is handled by the source CuArray.

Parameters
cuArray  Pointer to the source CuArray.
start    Index of the first row to copy.
end      Index of the last row to copy.
m        Number of rows in this CuArray.
n        Number of columns in this CuArray.
Returns
CuArrayError indicating the operation's success or failure.
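
The aliasing that makes a shallow copy fast but risky can be shown with a memoryview over a standard-library array. This is only an analogy under stated assumptions: shallow_copy_rows is a hypothetical helper, and the memoryview's obj attribute stands in loosely for the owner flag described in the C++ example.

```python
from array import array


def shallow_copy_rows(src, n, start, end):
    """View rows start..end (inclusive) of a flat row-major array without copying."""
    return memoryview(src)[start * n:(end + 1) * n]


# 3 x 3 float matrix where element (i, j) equals i*3 + j
src = array("f", [float(v) for v in range(9)])

# The view covers the last two rows and shares memory with src
view = shallow_copy_rows(src, 3, 1, 2)
view[0] = 42.0

print(src[3])           # the write through the view is visible in the source
print(view.obj is src)  # the view does not own the buffer
```

A write through the view changes the source, so the view is only valid while the source is alive.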

C++ Example

#include <cuarray.h>
#include <iostream>
int main() {
std::cout
<< "Running "
<< std::endl;
/* Create a new float CuArray instance */
auto cuArray = new CuArray<float>;
/* Initialize the CuArray with 3 rows and 3 columns */
cuArray->init(3,
3);
/*Set each i, j element equal to i*3 + j */
for (int i = 0; i < 9; i++) {
cuArray->host()[i] = i;
}
/*
* Create a float 'CuArray' that
* will be a shallow copy of the last two cuArray rows
*/
auto cuArray2x3Copy = new CuArray<float>;
cuArray2x3Copy->init(2,
3);
/* First row to copy from cuArray into cuArray2x3Copy */
int startRowIndex = 1;
/* Last row to copy from cuArray into cuArray2x3Copy */
int endRowIndex = 2;
cuArray2x3Copy->fromCuArrayShallowCopy(
cuArray, /* Source for copying data into cuArray2x3Copy.
* Both cuArray and cuArray2x3Copy will point to the same
* data, which helps with
* performance at the expense of being extremely dangerous. As an
* attempt to make this method somewhat safe, there is an "owner"
* attribute that is set to 1 if the CuArray owns the data and 0
* otherwise. Logic is implemented in the destructor to check for ownership
* and only delete data if the CuArray owns the data. As of now, this method has
* passed all real life stress tests, and CUDA-MEMCHECK doesn't hate it,
* but it still shouldn't be used in the vast majority of cases.
* The legitimate reason this should ever be called is when you have to
* pass the CuArray data as a double pointer to a function that
* cannot itself take a CuArray object. Eg.) A CUDA kernel.*/
startRowIndex, /* First row to copy from cuArray into cuArray2x3Copy */
endRowIndex, /* Last row to copy from cuArray into cuArray2x3Copy */
cuArray2x3Copy->m(), /* Number of rows in cuArray2x3Copy */
cuArray2x3Copy->n() /* Number of columns in cuArray2x3Copy */
);
/* Print each element in cuArray2x3Copy */
for (int i = 0; i < cuArray2x3Copy->m(); i++) {
for (int j = 0; j < cuArray2x3Copy->n(); j++) {
std::cout << cuArray2x3Copy->get(i,
j) << " ";
}
std::cout << std::endl;
}
/* Output:
* 3 4 5
* 6 7 8
*/
delete cuArray;
return 0;
}

◆ fromNumpy() [1/2]

template<typename T >
CuArrayError CuArray< T >::fromNumpy ( T *  NUMPY_ARRAY,
int  NUMPY_ARRAY_DIM1 
)

Copies data from a NumPy array to the CuArray.

Parameters
NUMPY_ARRAY       Pointer to the input NumPy array.
NUMPY_ARRAY_DIM1  Dimension 1 of the NumPy array.
Returns
CuArrayError indicating success or failure of the operation.

C++ Example

#include <cuarray.h>
#include <iostream>
#include <random>
int main() {
std::cout
<< "Running "
<< std::endl;
/* Creates a new float CuArray instance */
auto *cuArray = new CuArray<float>();
/* Create a float vector with 10 elements.*/
auto *NUMPY_ARRAY = new float[10];
int rows = 10;
/* Fill the NUMPY_ARRAY with random values */
for (int i = 0; i < rows; i++) {
NUMPY_ARRAY[i] = static_cast<float>(rand() / (float) RAND_MAX);
}
/* Copy the NUMPY_ARRAY data into the CuArray. The
* CuArray has the same dimensions as the NUMPY_ARRAY. */
cuArray->fromNumpy(
NUMPY_ARRAY,
rows
);
/* Print the CuArray. */
for (int i = 0; i < rows; i++) {
std::cout
<< cuArray->host()[i]
<< " ";
}
std::cout
<< std::endl;
/* Free the NUMPY_ARRAY and CuArray. */
delete cuArray;
delete[] NUMPY_ARRAY;
return 0;
}

Python Example

import numpy as np
"""
Always precede CuArray with the data type
Here we are importing the CuArray float template
"""
from cuarray import FloatCuArray
print("Running", __file__)
"""Create a new float CuArray instance"""
float_cuarray = FloatCuArray()
"""
Create a random float32, 1-dimension numpy array,
with 10 elements
"""
np_array = np.random.rand(10).astype(np.float32)
"""Copy the numpy array to the CuArray instance"""
float_cuarray.fromNumpy1D(np_array)
"""Print the CuArray and numpy array to compare."""
for _ in range(10):
    print(float_cuarray[0][_], np_array[_])

◆ fromNumpy() [2/2]

template<typename T >
CuArrayError CuArray< T >::fromNumpy ( T *  NUMPY_ARRAY,
int  NUMPY_ARRAY_DIM1,
int  NUMPY_ARRAY_DIM2 
)

Copies data from a NumPy array to the CuArray.

Parameters
NUMPY_ARRAY       Pointer to the input NumPy array.
NUMPY_ARRAY_DIM1  Dimension 1 of the NumPy array.
NUMPY_ARRAY_DIM2  Dimension 2 of the NumPy array.
Returns
CuArrayError indicating success or failure of the operation.

C++ Example

#include <cuarray.h>
#include <iostream>
int main() {
std::cout
<< "Running "
<< std::endl;
/* Creates a new float CuArray instance */
auto *cuArray = new CuArray<float>();
/* Create a linear float array that has 10 rows and 10 columns.*/
auto *NUMPY_ARRAY = new float[100];
int rows = 10;
int cols = 10;
/* Fill the NUMPY_ARRAY with random values */
for (int i = 0; i < rows; i++) {
for (int j = 0; j < cols; j++) {
NUMPY_ARRAY[i * cols + j] = static_cast<float>(rand() / (float) RAND_MAX);
}
}
/* Copy the NUMPY_ARRAY data into the CuArray. The
* CuArray has the same dimensions as the NUMPY_ARRAY. */
cuArray->fromNumpy(
NUMPY_ARRAY,
rows,
cols
);
/* Print the CuArray. */
for (int i = 0; i < rows; i++) {
for (int j = 0; j < cols; j++) {
std::cout
<< cuArray->host()[i * cols + j]
<< " ";
}
std::cout
<< std::endl;
}
std::cout
<< std::endl;
/* Free the NUMPY_ARRAY and CuArray. */
delete cuArray;
delete[] NUMPY_ARRAY;
return 0;
}

Python Example

import numpy as np
"""
Always precede CuArray with the data type
Here we are importing the CuArray float template
"""
from cuarray import FloatCuArray
print("Running", __file__)
"""Create a new float CuArray instance"""
float_cuarray = FloatCuArray()
"""
Create a random float32, 2-dimension numpy array
with 10 rows and 10 columns.
"""
np_array = np.random.random((10, 10)).astype(np.float32)
"""Copy the numpy array to the CuArray instance"""
float_cuarray.fromNumpy2D(np_array)
"""Print the CuArray and numpy array to compare."""
for i in range(10):
    for j in range(10):
        print(float_cuarray[i][j], np_array[i][j])

◆ get()

template<typename T >
T CuArray< T >::get ( int  i,
int  j 
) const

Returns the value at a specified position in the CuArray.

Parameters
i  Row index.
j  Column index.
Returns
Value at the specified position.

C++ Example

#include "cuarray.h"
#include <iostream>
#include <random>
int main() {
std::cout
<< "Running "
<< std::endl;
/* Creates a new float CuArray instance that will have 10 rows
* and 10 columns*/
auto *cuArray = new CuArray<float>();
int m = 10; /* Number of rows */
int n = 10; /* Number of columns */
cuArray->init(m,
n);
/* Fill the CuArray with random values */
for (int i = 0; i < m; i++) {
for (int j = 0; j < n; j++) {
cuArray->set((float) rand() / (float) RAND_MAX,
i,
j);
}
}
/* As its name implies, get(i, j) returns the value at the
* specified position (i, j) in the CuArray. */
/* Use the get method to print the value at each position in the CuArray. */
for (int i = 0; i < m; i++) {
for (int j = 0; j < n; j++) {
std::cout
<< cuArray->get(i,
j)
<< " ";
}
std::cout
<< std::endl;
}
/* Free the CuArray. */
delete cuArray;
return 0;
}

Python Example

import numpy as np
"""
Always precede CuArray with the data type
Here we are importing the CuArray float template
"""
from cuarray import FloatCuArray
print("Running", __file__)
"""
Create a new float CuArray instance with
10 rows and 10 columns
"""
float_cuarray = FloatCuArray()
float_cuarray.init(10, 10)
"""Fill the array with random values"""
for i in range(10):
    for j in range(10):
        val = np.random.random()
        float_cuarray.set(val, i, j)
"""Print the array"""
print(float_cuarray)

◆ host()

template<typename T >
T *& CuArray< T >::host ( )

Returns a reference to the host data.

Returns
Reference to the host data.
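
The host buffer is linear and row-major, so element (i, j) lives at offset i*n + j, where n is the number of columns. The sketch below shows that mapping on a plain Python list; linear_index is a hypothetical helper used only for illustration.

```python
def linear_index(i, j, n):
    """Row-major offset of element (i, j) in a matrix with n columns."""
    return i * n + j


# 3 x 3 buffer filled so that element (i, j) equals i*3 + j
m, n = 3, 3
host = [float(k) for k in range(m * n)]

# Walk the flat buffer as a 2-D matrix
rows = [[host[linear_index(i, j, n)] for j in range(n)] for i in range(m)]
for row in rows:
    print(" ".join(str(value) for value in row))
```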

C++ Example

#include <cuarray.h>
#include <iostream>
int main() {
std::cout
<< "Running "
<< std::endl;
/* Creates a new float CuArray instance */
auto *cuArray = new CuArray<float>();
/* Initialize the CuArray with 3 rows and 3 columns */
cuArray->init(3,
3);
/*Set each i, j element equal to i*3 + j */
for (int i = 0; i < 9; i++) {
cuArray->host()[i] = i;
}
/* Print each element in cuArray's host memory.
* The host data is linear and stored in row major order. To
* access element i,j you would use the linear index
* i*n+j, where n is the number of columns.*/
for (int i = 0; i < cuArray->m(); i++) {
for (int j = 0; j < cuArray->n(); j++) {
std::cout << cuArray->host()[i * cuArray->n() + j] << " ";
}
std::cout << std::endl;
}
/* Output:
* 0 1 2
* 3 4 5
* 6 7 8
*/
delete cuArray;
return 0;
}

◆ init() [1/2]

template<typename T >
CuArrayError CuArray< T >::init ( int  m,
int  n 
)

Initializes CuArray with specified dimensions and allocates memory on host and device.

Parameters
m  Number of rows.
n  Number of columns.
Returns
CuArrayError indicating operation success or failure.

C++ Example

#include <iostream>
#include "cuarray.h"
int main() {
std::cout
<< "Running "
<< std::endl;
/* Creates a new float CuArray instance */
auto *cuArray = new CuArray<float>();
/*
* Initializes the CuArray with 10 rows and 5 columns
* and allocates memory on host.
*/
cuArray->init(10,
5);
/* Print the cuArray */
for (int i = 0; i < cuArray->m(); i++) {
for (int j = 0; j < cuArray->n(); j++) {
std::cout << cuArray->get(i,
j) << " ";
}
std::cout << std::endl;
}
/* Free the memory allocated on host and device */
delete cuArray;
return 0;
}

Python Example

"""
Always precede CuArray with the data type
Here we are importing float templates.
"""
from cuarray import FloatCuArray
print("Running", __file__)
"""Create a new float CuArray instance"""
float_cuarray = FloatCuArray()
"""Initialize the float CuArray with 10 rows and 10 columns"""
float_cuarray.init(10, 10)
"""
Print the CuArray,
which has a __repr__ method implemented in the SWIG interface
"""
print(float_cuarray)

◆ init() [2/2]

template<typename T >
CuArrayError CuArray< T >::init ( T *  host,
int  m,
int  n 
)

Initializes CuArray with specified host data and dimensions, performing a shallow copy. Allocates memory on both the host and the device. The data is shallow copied, so the ownership remains unchanged.

Parameters
host  Pointer to input host data.
m     Number of rows.
n     Number of columns.
Returns
CuArrayError indicating operation success or failure.

C++ Example

#include "cuarray.h"
#include <iostream>
int main() {
std::cout
<< "Running "
<< std::endl;
/* Creates a new float CuArray instance */
auto *cuArray = new CuArray<float>();
/*
* Initializes the CuArray with 10 rows and 5 columns
* and allocates memory on host.
*/
cuArray->init(10,
5);
/* Create a 50-element float vector and fill it with random values */
auto a = new float[50];
for (int i = 0; i < 50; i++) {
a[i] = static_cast<float>(rand() / (float) RAND_MAX);
}
/* Initialize the CuArray with data from "a", preserving
* overall size while setting new dimensions
* (similar to NumPy's reshape method). */
cuArray->init(a,
10,
5);
/* Print each element in cuArray's host memory.
* The host data is linear and stored in row major order. To
* access element i,j you would use the linear index
* i*n+j, where n is the number of columns.*/
for (int i = 0; i < cuArray->m(); i++) {
for (int j = 0; j < cuArray->n(); j++) {
std::cout << cuArray->get(i,
j) << " ";
std::cout << a[i * cuArray->n() + j] << std::endl;
}
std::cout << std::endl;
}
/* Delete "a" and cuArray */
delete[] a;
delete cuArray;
return 0;
}

◆ load()

template<typename T >
CuArrayError CuArray< T >::load ( const std::string &  fname)

Loads CuArray data from a specified file.

Parameters
fname  File name to load from.
Returns
CuArrayError indicating success or failure of the operation.
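
The C++ example below warns that loading a float64 file into a CuArray<float> causes a segmentation fault, so it can help to inspect a file's element type before loading it. The sketch reads the header of a .npy file using only the standard library and NumPy's published .npy layout (magic string, version bytes, header length, dictionary literal). The helper names read_npy_header and matches_float32 are hypothetical, and treating "<f4" as the type expected by FloatCuArray is an assumption that holds only on little-endian machines.

```python
import ast
import os
import struct
import tempfile


def read_npy_header(path):
    """Return (descr, fortran_order, shape) parsed from a .npy file header."""
    with open(path, "rb") as fh:
        if fh.read(6) != b"\x93NUMPY":
            raise ValueError("not a .npy file")
        major, _minor = fh.read(2)
        # Format version 1.x stores the header length in 2 bytes, later versions in 4
        if major == 1:
            (header_len,) = struct.unpack("<H", fh.read(2))
        else:
            (header_len,) = struct.unpack("<I", fh.read(4))
        header = ast.literal_eval(fh.read(header_len).decode("latin1"))
    return header["descr"], header["fortran_order"], header["shape"]


def matches_float32(path):
    """True if the file holds little-endian 32-bit floats."""
    return read_npy_header(path)[0] == "<f4"


# Build a tiny version 1.0 .npy file by hand: a 2 x 3 float64 array
header = "{'descr': '<f8', 'fortran_order': False, 'shape': (2, 3), }"
header = header.ljust(117) + "\n"  # pad so the data starts at byte 128
data = struct.pack("<6d", 0.0, 1.0, 2.0, 3.0, 4.0, 5.0)
payload = (b"\x93NUMPY" + bytes([1, 0]) + struct.pack("<H", len(header))
           + header.encode("latin1") + data)

fd, path = tempfile.mkstemp(suffix=".npy")
with os.fdopen(fd, "wb") as fh:
    fh.write(payload)

descr, fortran_order, shape = read_npy_header(path)
safe_for_float_cuarray = matches_float32(path)
os.remove(path)
print(descr, shape, safe_for_float_cuarray)
```

With a check like this, a float64 file could be converted or routed to a double array before load is called.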

C++ Example

#include "cuarray.h"
#include <iostream>
#include <random>
int main() {
std::cout
<< "Running "
<< std::endl;
/* Create a new double CuArray instance. We're using a double vs. float
* here because the numpy array is a float64 array. If you tried
* to load this file into a CuArray<float> it would cause a
* segmentation fault.*/
auto *cuArray = new CuArray<double>();
/*
* Load a serialized numpy array with 2000 elements from the C++ test data directory.
* NETSCI_ROOT_DIR, used here, is defined in CMakeLists. Ignore warnings in IDEs
* about it being undefined; it's a known issue and does not affect functionality.
*/
auto npyFname = NETSCI_ROOT_DIR
"/tests/netcalc/cpp/data/2X_1D_1000_4.npy";
cuArray->load(npyFname);
/* Print the CuArray. */
for (int i = 0; i < cuArray->m(); i++) {
for (int j = 0; j < cuArray->n(); j++) {
std::cout
<< cuArray->get(i,
j)
<< std::endl;
}
}
/* Free the CuArray. */
delete cuArray;
return 0;
}

Python Example

import numpy as np
"""
Always precede CuArray with the data type
Here we are importing the CuArray float template
"""
from cuarray import FloatCuArray
print("Running", __file__)
"""
Create a new float CuArray instance with
10 rows and 10 columns
"""
float_cuarray = FloatCuArray()
"""
Create a random float32 numpy array with 10 rows
and 10 columns
"""
numpy_array = np.random.rand(10, 10).astype(np.float32)
"""Save the numpy array to a .npy file"""
np.save("tmp.npy", numpy_array)
"""
Load the .npy file into the float CuArray instance
"""
float_cuarray.load("tmp.npy")
"""Print the CuArray and the numpy array"""
for i in range(10):
    for j in range(10):
        print(float_cuarray[i][j], numpy_array[i, j])

◆ m()

template<typename T >
int CuArray< T >::m ( ) const

Returns the number of rows in the CuArray.

Returns
Number of rows.

C++ Example

#include <cuarray.h>
#include <iostream>
int main() {
std::cout
<< "Running "
<< std::endl;
/* Create a new float CuArray instance */
auto cuArray = new CuArray<float>;
/*
* Initialize the CuArray with 10 rows and 5 columns
* and allocate memory on host.
*/
cuArray->init(10,
5);
/* Get the number of rows in the CuArray */
int m = cuArray->m();
/* Print the number of rows */
std::cout
<< "Number of rows: "
<< m
<< std::endl;
/* Output:
* Number of rows: 10
*/
delete cuArray;
return 0;
}

Python Example

import numpy as np
"""
Always precede CuArray with the data type
Here we are importing float template.
"""
from cuarray import FloatCuArray
print("Running", __file__)
"""Create a new float CuArray instance"""
float_cuarray = FloatCuArray()
"""Initialize the float CuArray with 10 rows and 2 columns"""
float_cuarray.init(10, 2)
"""Print the number of rows in the CuArray"""
print(float_cuarray.m())

◆ n()

template<typename T >
int CuArray< T >::n ( ) const

Returns the number of columns in the CuArray.

Returns
Number of columns.

C++ Example

#include <cuarray.h>
#include <iostream>
int main() {
std::cout
<< "Running "
<< std::endl;
/* Create a new float CuArray instance */
auto cuArray = new CuArray<float>;
/*
* Initializes the CuArray with 10 rows and 5 columns
* and allocates memory on host.
*/
cuArray->init(10,
5);
/* Get the number of columns in the CuArray */
int n = cuArray->n();
/* Print the number of columns */
std::cout
<< "Number of columns: "
<< n
<< std::endl;
/* Output:
* Number of columns: 5
*/
delete cuArray;
return 0;
}

Python Example

"""
Always precede CuArray with the data type
Here we are importing the float template.
"""
from cuarray import FloatCuArray
print("Running", __file__)
"""Create a new float CuArray instance"""
float_cuarray = FloatCuArray()
"""Initialize the float CuArray with 10 rows and 2 columns"""
float_cuarray.init(10, 2)
"""Print the number of columns in the CuArray"""
print(float_cuarray.n())

◆ operator[]()

template<typename T >
T & CuArray< T >::operator[] ( int  i) const

Returns a reference to the element at a specified index in the CuArray.

Parameters
iIndex of the element.
Returns
Reference to the element at the specified index.

C++ Example

#include <cuarray.h>
#include <iostream>
int main() {
std::cout
<< "Running "
<< std::endl;
/* Create a new float CuArray instance */
auto cuArray = new CuArray<float>;
/* Initialize the CuArray with 3 rows and 3 columns */
cuArray->init(3,
3);
/*Set each i, j element equal to i*3 + j */
for (int i = 0; i < 9; i++) {
cuArray->host()[i] = i;
}
/* Calculate the linear index that
* retrieves the 3rd element in the 2nd row of the CuArray. */
int i = 1;
int j = 2;
int linearIndex = i * cuArray->n() + j;
/* Read the same element through operator[] and through get(i, j). */
auto linearVal = (*cuArray)[linearIndex];
auto ijVal = cuArray->get(i,
j);
/* Print the values at the linear index and the (i, j) index. */
std::cout
<< linearVal
<< " "
<< ijVal
<< std::endl;
/*Deallocate memory*/
delete cuArray;
return 0;
}

Python Example

import numpy as np
"""
Always precede CuArray with the data type
Here we are importing the CuArray float template
"""
from cuarray import FloatCuArray
print("Running", __file__)
"""
Create a new float CuArray instance
with 10 rows and 10 columns.
"""
float_cuarray = FloatCuArray()
float_cuarray.init(10, 10)
"""Fill it with random values"""
for i in range(10):
for j in range(10):
val = np.random.rand()
float_cuarray.set(val, i, j)
"""Print the 8th row"""
print(float_cuarray[7])
"""Print the 5th element of the 8th row"""
print(float_cuarray[7][4])
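
The C++ example addresses element (i, j) through the offset i * n + j, which implies that the host buffer is stored in row-major order. The sketch below illustrates that mapping and its inverse on a plain Python list; it does not use CuArray, and linear_index and row_col are hypothetical helper names.

```python
def linear_index(i, j, n_cols):
    """Row-major offset of element (i, j) in an array with n_cols columns."""
    return i * n_cols + j


def row_col(index, n_cols):
    """Inverse mapping: recover (i, j) from a row-major offset."""
    return divmod(index, n_cols)


n_rows, n_cols = 3, 3
# Same fill as the C++ example: flat[k] = k.
flat = [float(k) for k in range(n_rows * n_cols)]

# Offset and value of the 3rd element in the 2nd row.
k = linear_index(1, 2, n_cols)
print(k, flat[k])
print(row_col(k, n_cols))
```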

◆ owner()

template<typename T >
int CuArray< T >::owner ( ) const

Returns the owner status of the CuArray. Indicates whether the CuArray is responsible for memory deallocation.

Returns
Owner status of the CuArray.

C++ Example

#include <cuarray.h>
#include <iostream>
int main() {
std::cout
<< "Running "
<< std::endl;
/* Create a new float CuArray instance */
auto cuArray = new CuArray<float>;
/* Initialize the CuArray with 3 rows and 3 columns */
cuArray->init(3,
3);
/*Set each i, j element equal to i*3 + j */
for (int i = 0; i < 9; i++) {
cuArray->host()[i] = i;
}
/*
* Create a float 'CuArray' that
* will be a shallow copy of the last two cuArray rows
*/
auto cuArray2x3Copy = new CuArray<float>;
cuArray2x3Copy->init(2,
3);
/* First row to copy from cuArray into cuArray2x3Copy */
int startRowIndex = 1;
/* Last row to copy from cuArray into cuArray2x3Copy */
int endRowIndex = 2;
cuArray2x3Copy->fromCuArrayShallowCopy(
cuArray, /* Source for copying data into cuArray2x3Copy. See
* CuArray::fromCuArrayShallowCopy for more info. */
startRowIndex, /* First row to copy from cuArray into cuArray2x3Copy */
endRowIndex, /* Last row to copy from cuArray into cuArray2x3Copy */
cuArray2x3Copy->m(), /* Number of rows in cuArray2x3Copy */
cuArray2x3Copy->n() /* Number of columns in cuArray2x3Copy */
);
/* Now make another CuArray that is a deep copy of the same two cuArray rows */
auto cuArray2x3DeepCopy = new CuArray<float>;
cuArray2x3DeepCopy->init(2,
3);
cuArray2x3DeepCopy->fromCuArrayDeepCopy(
cuArray, /* Source for copying data into cuArray2x3DeepCopy. See
* CuArray::fromCuArrayDeepCopy for more info. */
startRowIndex, /* First row to copy from cuArray into cuArray2x3DeepCopy */
endRowIndex, /* Last row to copy from cuArray into cuArray2x3DeepCopy */
cuArray2x3DeepCopy->m(), /* Number of rows in cuArray2x3DeepCopy */
cuArray2x3DeepCopy->n() /* Number of columns in cuArray2x3DeepCopy */
);
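/* Check if cuArray2x3Copy owns the host data. */
auto cuArray2x3CopyOwnsHostData = cuArray2x3Copy->owner();
/* Check if cuArray2x3DeepCopy owns the host data.
* Sorry for the verbosity :), I'm sure this is painful for
* Python devs to read (though Java devs are probably loving it).*/
auto cuArray2x3DeepCopyOwnsHostData = cuArray2x3DeepCopy->owner();
/* Print data in both arrays. */
for (int i = 0; i < cuArray2x3Copy->m(); i++) {
for (int j = 0; j < cuArray2x3Copy->n(); j++) {
std::cout
<< cuArray2x3Copy->get(i,
j)
<< " "
<< cuArray2x3DeepCopy->get(i,
j)
<< std::endl;
}
}
/* Print ownership info. */
std::cout
<< "cuArray2x3Copy owns host data: "
<< cuArray2x3CopyOwnsHostData
<< " cuArray2x3DeepCopy owns host data: "
<< cuArray2x3DeepCopyOwnsHostData
<< std::endl;
/* Free all three CuArrays. The shallow copy does not own its host data. */
delete cuArray2x3Copy;
delete cuArray2x3DeepCopy;
delete cuArray;
return 0;
}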
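
The difference between an owning and a non-owning array can be seen without any GPU code. The sketch below models it with the Python standard library only: a memoryview stands in for a shallow copy that shares the source buffer, and a new array stands in for a deep copy with its own storage. It illustrates the ownership idea, not CuArray's implementation.

```python
from array import array

n_cols = 3
# 3x3 source buffer filled with 0..8, as in the C++ example.
source = array("f", range(9))

start_row, end_row = 1, 2  # inclusive row range, as in the copy methods
lo = start_row * n_cols
hi = (end_row + 1) * n_cols

# Shallow copy: a view onto the source buffer; it owns nothing.
shallow = memoryview(source)[lo:hi]
# Deep copy: a new buffer with its own storage.
deep = array("f", source[lo:hi])

# A write to the source is visible through the view but not in the deep copy.
source[lo] = 100.0
seen_through_view = shallow[0]
seen_in_deep_copy = deep[0]
print(seen_through_view, seen_in_deep_copy)

# Release the view before the source buffer goes away.
shallow.release()
```

The same reasoning explains why only an owning array should free the host buffer: a view that outlives its source would point at released memory.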

◆ save()

template<typename T >
void CuArray< T >::save ( const std::string &  fname)

Saves CuArray data to a specified file.

Parameters
fnameFile name to save to.

C++ Example

#include "cuarray.h"
#include <iostream>
#define NETSCI_ROOT_DIR "."
int main() {
std::cout
<< "Running "
<< std::endl;
/* Create a new double CuArray instance that will have 10 rows and 10
* columns*/
auto cuArray = new CuArray<double>;
cuArray->init(10,
10
);
/* Fill the CuArray with random values. */
for (int i = 0; i < cuArray->m(); i++) {
for (int j = 0; j < cuArray->n(); j++) {
float val = static_cast <float> (rand()) /
static_cast <float> (RAND_MAX);
cuArray->set(val,
i,
j);
}
}
/* Save the CuArray to a .npy file. */
auto npyFname = NETSCI_ROOT_DIR "/tmp.npy";
cuArray->save(npyFname);
/* Create a new CuArray instance of the same element type from the .npy file. */
auto cuArrayFromNpy = new CuArray<double>;
cuArrayFromNpy->load(npyFname);
/*Print (i, j) elements of the CuArray's next to each other.
* and check for equality*/
for (int i = 0; i < cuArray->m(); i++) {
for (int j = 0; j < cuArray->n(); j++) {
auto val1 = cuArray->get(i,
j);
auto val2 = cuArrayFromNpy->get(i,
j);
bool equal = val1 == val2;
std::cout
<< val1
<< " "
<< val2
<< " "
<< equal
<< std::endl;
if (!equal) {
std::cout
<< "Values at ("
<< i
<< ", "
<< j
<< ") are not equal."
<< std::endl;
return 1;
}
}
}
delete cuArray;
delete cuArrayFromNpy;
return 0;
}

Python Example

import numpy as np
"""
Always precede CuArray with the data type
Here we are importing the CuArray float template
"""
from cuarray import FloatCuArray
print("Running", __file__)
"""
Create a new float CuArray instance with
10 rows and 10 columns
"""
float_cuarray = FloatCuArray()
"""
Create a random float32 numpy array with 10 rows
and 10 columns
"""
numpy_array = np.random.rand(10, 10).astype(np.float32)
"""Save the numpy array to a .npy file"""
np.save("tmp.npy", numpy_array)
"""
Load the .npy file into the float CuArray instance
"""
float_cuarray.load("tmp.npy")
"""Print the CuArray and the numpy array"""
for i in range(10):
for j in range(10):
print(float_cuarray[i][j], numpy_array[i, j])

◆ set()

template<typename T >
CuArrayError CuArray< T >::set ( T  value,
int  i,
int  j 
)

Sets a value at a specified position in the CuArray.

Parameters
valueValue to set.
iRow index.
jColumn index.
Returns
CuArrayError indicating success or failure of the operation.

C++ Example

#include "cuarray.h"
#include <iostream>
int main() {
std::cout
<< "Running "
<< std::endl;
/* Create a new float CuArray instance that will have 10 rows
* and 10 columns*/
auto cuArray = new CuArray<float>;
int m = 10; /* Number of rows */
int n = 10; /* Number of columns */
cuArray->init(m,
n);
/* As its name implies, set(value, i, j) sets the value at the
* specified position (i, j) in the CuArray. */
/* Use the set method to set the value at each position in the CuArray
* to a random number.*/
for (int i = 0; i < m; i++) {
for (int j = 0; j < n; j++) {
cuArray->set((float) rand() / (float) RAND_MAX,
i,
j);
}
}
/* Print the CuArray. */
for (int i = 0; i < m; i++) {
for (int j = 0; j < n; j++) {
std::cout
<< cuArray->get(i,
j)
<< " ";
}
std::cout
<< std::endl;
}
/* Free the CuArray. */
delete cuArray;
return 0;
}

Python Example

import numpy as np
"""
Always precede CuArray with the data type
Here we are importing the CuArray float template
"""
from cuarray import FloatCuArray
print("Running", __file__)
"""
Create a new float CuArray instance with
10 rows and 10 columns
"""
float_cuarray = FloatCuArray()
float_cuarray.init(10, 10)
"""Fill the array with random values"""
for i in range(10):
for j in range(10):
val = np.random.random()
float_cuarray.set(val, i, j)
"""Print the array using the get method"""
for i in range(10):
for j in range(10):
val = float_cuarray.get(i, j)
print('{0:.{1}f}'.format(val, 5), end=" ")
print()

◆ size()

template<typename T >
int CuArray< T >::size ( ) const

Returns the total number of elements in the CuArray.

Returns
Total number of elements (rows multiplied by columns).

C++ Example

#include <cuarray.h>
#include <iostream>
int main() {
std::cout
<< "Running "
<< std::endl;
/* Create a new float CuArray instance */
auto cuArray = new CuArray<float>;
/*
* Initializes the CuArray with 10 rows and 5 columns
* and allocates memory on host.
*/
cuArray->init(10,
5);
/* Get the total number of values in the CuArray */
int size = cuArray->size();
/* Print the total number of values in cuArray. */
std::cout
<< "Number of values: "
<< size
<< std::endl;
/* Output:
* Number of values: 50
*/
delete cuArray;
return 0;
}

Python Example

"""
Always precede CuArray with the data type
Here we are importing the float template.
"""
from cuarray import FloatCuArray
print("Running", __file__)
"""Create a new float CuArray instance"""
float_cuarray = FloatCuArray()
"""Initialize the float CuArray with 10 rows and 2 columns"""
float_cuarray.init(10, 2)
"""Print the total number of values in the CuArray"""
print(float_cuarray.size())

◆ sort()

template<typename T >
CuArray< T > * CuArray< T >::sort ( int  i)

Sorts CuArray based on the values in a specified row.

Parameters
iIndex of the row to sort by.
Returns
Pointer to a new CuArray with sorted data.

C++ Example

#include <cuarray.h>
#include <random>
#include <iostream>
int main() {
std::cout
<< "Running "
<< std::endl;
/* Create a new float CuArray instance */
auto cuArray = new CuArray<float>;
/* Initialize the CuArray with 300 rows and 300 columns */
auto rows = 300;
auto cols = 300;
cuArray->init(rows,
cols);
/* Fill the CuArray with random values */
for (int i = 0; i < cuArray->m(); i++) {
for (int j = 0; j < cuArray->n(); j++) {
cuArray->host()[i * cuArray->n() + j] =
static_cast<float>(rand() / (float) RAND_MAX);
}
}
/* Create a new CuArray that contains the sorted data from the
* 8th row of the original CuArray. */
auto sortedCuArray = cuArray->sort(7);
/* Print the sorted CuArray. */
for (int j = 0; j < sortedCuArray->n(); j++) {
std::cout
<< sortedCuArray->get(0,
j)
<< std::endl;
}
/* Cleanup time. */
delete cuArray;
delete sortedCuArray;
return 0;
}

Python Example

import numpy as np
"""
Always precede CuArray with the data type
Here we are importing the CuArray float template
"""
from cuarray import FloatCuArray
print("Running", __file__)
"""
Create a new float CuArray instance
"""
float_cuarray = FloatCuArray()
"""
Create a random float32 numpy array with 10 rows
and 10 columns
"""
numpy_array = np.random.rand(10, 10).astype(np.float32)
"""Load the numpy array into the CuArray"""
float_cuarray.fromNumpy2D(numpy_array)
"""
Perform an out-of-place descending sort on the
8th row of float_cuarray
"""
sorted_cuarray = float_cuarray.sort(7)
"""
Print the 8th row of the original
CuArray and sorted_cuarray
"""
print(sorted_cuarray)
print(float_cuarray[7])
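
The idea of an out-of-place sort can be shown on plain Python lists. This is a minimal sketch, not the library's algorithm: it assumes descending order, as the comment in the Python example states, and it assumes that only the chosen row is returned. sort_row_descending is a hypothetical helper name.

```python
def sort_row_descending(rows, i):
    """Return a new list holding row i sorted in descending order."""
    return sorted(rows[i], reverse=True)


rows = [
    [0.3, 0.9, 0.1],
    [0.5, 0.2, 0.7],
]
sorted_row = sort_row_descending(rows, 1)
print(sorted_row)
# The source row is unchanged because sorted() builds a new list.
print(rows[1])
```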

◆ toDevice()

template<typename T >
CuArrayError CuArray< T >::toDevice ( )

Copies data from the host to the device.

Returns
CuArrayError indicating success or failure of the operation.

C++ Example

#include <cuarray.h>
#include <random>
#include <iostream>
int main() {
std::cout
<< "Running "
<< std::endl;
/* Create a new float CuArray instance */
auto cuArray = new CuArray<float>;
/* Initialize the CuArray with 300 rows and 300 columns */
auto rows = 300;
auto cols = 300;
cuArray->init(rows,
cols);
/* Fill the CuArray with random values */
for (int i = 0; i < cuArray->m(); i++) {
for (int j = 0; j < cuArray->n(); j++) {
cuArray->host()[i * cuArray->n() + j] =
static_cast<float>(rand() / (float) RAND_MAX);
}
}
/* Allocate device memory. */
cuArray->allocateDevice();
/* Copy data from host to device. */
cuArray->toDevice();
/* Frees host and device memory. */
delete cuArray;
return 0;
}

◆ toHost()

template<typename T >
CuArrayError CuArray< T >::toHost ( )

Copies data from the device to the host.

Returns
CuArrayError indicating success or failure of the operation.

C++ Example

#include <cuarray.h>
#include <iostream>
int main() {
std::cout
<< "Running "
<< std::endl;
/* Create a new float CuArray instance */
auto cuArray = new CuArray<float>;
/* Initialize the CuArray with 300 rows and 300 columns */
auto rows = 300;
auto cols = 300;
cuArray->init(rows,
cols);
/* Fill the CuArray with random values */
for (int i = 0; i < cuArray->m(); i++) {
for (int j = 0; j < cuArray->n(); j++) {
cuArray->host()[i * cuArray->n() + j] =
static_cast<float>(rand() / (float) RAND_MAX);
}
}
/* Allocate device memory. */
cuArray->allocateDevice();
/* Copy data from host to device. */
cuArray->toDevice();
/* Set the number of threads per block to 1024 */
auto threadsPerBlock = 1024;
/* Set the number of blocks to the ceiling of the number of elements
* divided by the number of threads per block. */
auto blocksPerGrid = (cuArray->size() + threadsPerBlock - 1) / threadsPerBlock;
/* Launch a CUDA kernel that does something cool and only takes
* a single float array as an argument
* kernel<<<blocksPerGrid, threadsPerBlock>>>(cuArray->device()); */
/* Copy data from device to host. */
cuArray->toHost();
/* Frees host and device memory. */
delete cuArray;
return 0;
}
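
The block count described in the comments is a ceiling division, which can be done with integer arithmetic alone. The sketch below works through the numbers for the 300 x 300 array in the example; blocks_per_grid is a hypothetical helper name.

```python
def blocks_per_grid(n_elements, threads_per_block):
    """Ceiling of n_elements / threads_per_block using integer arithmetic."""
    return (n_elements + threads_per_block - 1) // threads_per_block


rows, cols = 300, 300
threads_per_block = 1024
n_blocks = blocks_per_grid(rows * cols, threads_per_block)
print(n_blocks)

# Threads in the last block whose index falls past the end of the array.
# A kernel needs a bounds check so that these threads do nothing.
idle_threads = n_blocks * threads_per_block - rows * cols
print(idle_threads)
```

Rounding up guarantees that every element gets a thread, at the cost of a few idle threads in the last block.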

◆ toNumpy() [1/2]

template<typename T >
void CuArray< T >::toNumpy ( T **  NUMPY_ARRAY,
int **  NUMPY_ARRAY_DIM1 
)

Copies data from the CuArray to a NumPy array.

Parameters
NUMPY_ARRAYPointer to output NumPy array.
NUMPY_ARRAY_DIM1Dimension 1 of the NumPy array.

C++ Example

#include "cuarray.h"
#include <iostream>
#include <random>
int main() {
std::cout
<< "Running "
<< std::endl;
/* Create a new float CuArray instance that will have 1 row and 10 columns */
auto cuArray = new CuArray<float>;
int m = 1; /* Number of rows */
int n = 10; /* Number of columns */
cuArray->init(m,
n);
/* Create a double pointer to a float array. It will
* store the data from the CuArray. */
auto NUMPY_ARRAY = new float *[1];
/* Create a double pointer int array that will store
* the number of columns in the CuArray.
* Btw this is what the NumPy C backend is doing every time
* you create a numpy array in Python*/
auto cols = new int *[1];
/* Fill the CuArray with random values */
for (int i = 0; i < m; i++) {
for (int j = 0; j < n; j++) {
cuArray->set((float) rand() / (float) RAND_MAX,
i,
j);
}
}
/* Copy the CuArray data into the NUMPY_ARRAY. The
* NUMPY_ARRAY has the same dimensions as the CuArray. */
cuArray->toNumpy(
NUMPY_ARRAY,
cols
);
/* Print the NUMPY_ARRAY data and the CuArray data. */
for (int i = 0; i < n; i++) {
std::cout
<< cuArray->get(0,
i)
<< " ";
std::cout
<< (*(NUMPY_ARRAY))[i]
<< std::endl;
}
/* Clean this mess up. Makes you appreciate std::vectors :).*/
delete cuArray;
delete[] NUMPY_ARRAY[0];
delete[] NUMPY_ARRAY;
delete[] cols[0];
delete[] cols;
return 0;
}

Python Example

◆ toNumpy() [2/2]

template<typename T >
void CuArray< T >::toNumpy ( T **  NUMPY_ARRAY,
int **  NUMPY_ARRAY_DIM1,
int **  NUMPY_ARRAY_DIM2 
)

Copies data from the CuArray to a NumPy array.

Parameters
NUMPY_ARRAYPointer to output NumPy array.
NUMPY_ARRAY_DIM1Dimension 1 of the NumPy array.
NUMPY_ARRAY_DIM2Dimension 2 of the NumPy array.

C++ Example

#include "cuarray.h"
#include <iostream>
#include <random>
int main() {
std::cout
<< "Running "
<< std::endl;
/* Create a new float CuArray instance that will have 10 rows
* and 10 columns*/
auto cuArray = new CuArray<float>;
int m = 10; /* Number of rows */
int n = 10; /* Number of columns */
cuArray->init(m,
n);
/* Create a double pointer to a float array. It will
* store the data from the CuArray. */
auto NUMPY_ARRAY = new float *[1];
/* Create two double pointer int arrays that will store
* the number of rows and columns in the CuArray.
* Btw this is what the NumPy C backend is doing every time
* you create a numpy array in Python*/
auto rows = new int *[1];
auto cols = new int *[1];
/* Fill the CuArray with random values */
for (int i = 0; i < m; i++) {
for (int j = 0; j < n; j++) {
cuArray->set((float) rand() / (float) RAND_MAX,
i,
j);
}
}
/* Copy the CuArray data into the NUMPY_ARRAY. The
* NUMPY_ARRAY has the same dimensions as the CuArray. */
cuArray->toNumpy(
NUMPY_ARRAY,
rows,
cols
);
/* Print the NUMPY_ARRAY data and the CuArray data. */
for (int i = 0; i < m; i++) {
for (int j = 0; j < n; j++) {
std::cout
<< cuArray->get(i,
j)
<< " ";
std::cout
<< (*(NUMPY_ARRAY))[i * n + j]
<< std::endl;
}
std::cout
<< std::endl;
}
/* Clean this mess up. Makes you appreciate std::vectors :).*/
delete cuArray;
delete[] NUMPY_ARRAY[0];
delete[] NUMPY_ARRAY;
delete[] rows[0];
delete[] rows;
delete[] cols[0];
delete[] cols;
return 0;
}

Python Example

import numpy as np
"""
Always precede CuArray with the data type
Here we are importing the CuArray float template
"""
from cuarray import FloatCuArray
print("Running", __file__)
"""Create a new float CuArray instance"""
float_cuarray = FloatCuArray()
"""
Create a random float32, 2-dimension numpy array
with 10 rows and 10 columns.
"""
np_array1 = np.random.random((10, 10)).astype(np.float32)
"""Copy the numpy array to the CuArray instance"""
float_cuarray.fromNumpy2D(np_array1)
"""Convert the CuArray instance to a numpy array"""
np_array2 = float_cuarray.toNumpy2D()
"""Print the CuArray and both numpy arrays to compare."""
for i in range(10):
for j in range(10):
print(
float_cuarray[i][j],
np_array1[i][j],
np_array2[i][j]
)
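
A flat buffer that holds an m x n array in row-major order is indexed with the column count n as the stride. The sketch below uses a small non-square example on a plain Python list to show the mapping and what goes wrong when the row count is used instead; to_rows is a hypothetical helper name and no CuArray calls are involved.

```python
def to_rows(flat, m, n):
    """Split a row-major flat buffer into m rows of n values."""
    if len(flat) != m * n:
        raise ValueError("buffer length does not match m * n")
    return [flat[i * n:(i + 1) * n] for i in range(m)]


m, n = 2, 3
flat = [0, 1, 2, 10, 11, 12]
rows = to_rows(flat, m, n)
print(rows)

# Element (1, 0) lives at offset 1 * n + 0. Using m as the stride lands on
# a different element whenever m != n.
right = flat[1 * n + 0]
wrong = flat[1 * m + 0]
print(right, wrong)
```

With a square array such as the 10 x 10 case the two strides coincide, which is why the mistake is easy to miss.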

The documentation for this class was generated from the following file: