Embedded Multicore Building Blocks V1.0.0
Provides functionality to execute tasks on CUDA devices.

Functions

void mtapi_cuda_plugin_initialize(MTAPI_OUT mtapi_status_t *status)
Initializes the MTAPI CUDA environment on a previously initialized MTAPI node.

void mtapi_cuda_plugin_finalize(MTAPI_OUT mtapi_status_t *status)
Finalizes the MTAPI CUDA environment on the local MTAPI node.

mtapi_action_hndl_t mtapi_cuda_action_create(MTAPI_IN mtapi_job_id_t job_id, MTAPI_IN char *kernel_source, MTAPI_IN char *kernel_name, MTAPI_IN mtapi_size_t local_work_size, MTAPI_IN mtapi_size_t element_size, MTAPI_IN void *node_local_data, MTAPI_IN mtapi_size_t node_local_data_size, MTAPI_OUT mtapi_status_t *status)
Creates a CUDA action.

CUcontext mtapi_cuda_get_context(MTAPI_OUT mtapi_status_t *status)
Retrieves the handle of the CUDA context used by the plugin.
void mtapi_cuda_plugin_initialize(MTAPI_OUT mtapi_status_t *status)
Initializes the MTAPI CUDA environment on a previously initialized MTAPI node.
It must be called on all nodes using the MTAPI CUDA plugin.
Application software using MTAPI CUDA must call mtapi_cuda_plugin_initialize() once per node. It is an error to call mtapi_cuda_plugin_initialize() multiple times from a given node, unless mtapi_cuda_plugin_finalize() is called in between.
On success, *status is set to MTAPI_SUCCESS. On error, *status is set to the appropriate error defined below.
Error code | Description |
---|---|
MTAPI_ERR_UNKNOWN | MTAPI CUDA couldn't be initialized. |
Parameters
Direction | Name | Description |
---|---|---|
[out] | status | Pointer to error code, may be MTAPI_NULL |
void mtapi_cuda_plugin_finalize(MTAPI_OUT mtapi_status_t *status)
Finalizes the MTAPI CUDA environment on the local MTAPI node.
It has to be called by each node using MTAPI CUDA. It is an error to call mtapi_cuda_plugin_finalize() without first calling mtapi_cuda_plugin_initialize(). An MTAPI node can call mtapi_cuda_plugin_finalize() once for each call to mtapi_cuda_plugin_initialize(), but it is an error to call mtapi_cuda_plugin_finalize() multiple times from a given node unless mtapi_cuda_plugin_initialize() has been called prior to each mtapi_cuda_plugin_finalize() call.
All CUDA tasks that were started on the node where mtapi_cuda_plugin_finalize() is called and that have not yet completed will be canceled (see mtapi_task_cancel()). mtapi_cuda_plugin_finalize() blocks until all tasks that have been started on the same node return. Tasks that execute actions on the node where mtapi_cuda_plugin_finalize() is called also block finalization of the MTAPI CUDA system on that node.
On success, *status is set to MTAPI_SUCCESS. On error, *status is set to the appropriate error defined below.
Error code | Description |
---|---|
MTAPI_ERR_UNKNOWN | MTAPI CUDA couldn't be finalized. |
Parameters
Direction | Name | Description |
---|---|---|
[out] | status | Pointer to error code, may be MTAPI_NULL |
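The pairing rules for mtapi_cuda_plugin_initialize() and mtapi_cuda_plugin_finalize() can be enforced with a small guard. The following is a minimal sketch and not part of EMB²: the names cuda_plugin_start, cuda_plugin_stop, and cuda_plugin_active are hypothetical, and the status type and the two plugin functions at the top are stand-in stubs that only exist so the sketch compiles without the library. In a real build, remove the stubs and include the MTAPI and MTAPI CUDA plugin headers instead.

```c
#include <stdbool.h>
#include <stddef.h>

/* --- Stand-in stubs (assumption): replace with the real plugin headers. --- */
typedef enum { MTAPI_SUCCESS = 0, MTAPI_ERR_UNKNOWN } mtapi_status_t;

static int stub_init_calls = 0;      /* counts calls reaching the stub */
static int stub_finalize_calls = 0;

static void mtapi_cuda_plugin_initialize(mtapi_status_t *status) {
  ++stub_init_calls;
  if (status != NULL) *status = MTAPI_SUCCESS;
}

static void mtapi_cuda_plugin_finalize(mtapi_status_t *status) {
  ++stub_finalize_calls;
  if (status != NULL) *status = MTAPI_SUCCESS;
}
/* --- End of stubs. --- */

/* Hypothetical per-node flag: true while the plugin is initialized. */
static bool cuda_plugin_active = false;

/* Initializes the plugin unless it is already active on this node.
   Returns true if the plugin is active after the call. */
bool cuda_plugin_start(void) {
  mtapi_status_t status = MTAPI_ERR_UNKNOWN;
  if (cuda_plugin_active) {
    return true; /* a second initialize without finalize would be an error */
  }
  mtapi_cuda_plugin_initialize(&status);
  cuda_plugin_active = (status == MTAPI_SUCCESS);
  return cuda_plugin_active;
}

/* Finalizes the plugin only if a matching initialize succeeded earlier.
   Returns true if finalization was performed and reported success. */
bool cuda_plugin_stop(void) {
  mtapi_status_t status = MTAPI_ERR_UNKNOWN;
  if (!cuda_plugin_active) {
    return false; /* finalize without initialize would be an error */
  }
  mtapi_cuda_plugin_finalize(&status);
  if (status == MTAPI_SUCCESS) {
    cuda_plugin_active = false;
  }
  return status == MTAPI_SUCCESS;
}
```

The guard is not thread-safe; it assumes that one thread per node is responsible for starting and stopping the plugin. Passing a real status variable rather than MTAPI_NULL lets the wrapper detect the MTAPI_ERR_UNKNOWN case listed above.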
mtapi_action_hndl_t mtapi_cuda_action_create(
    MTAPI_IN mtapi_job_id_t job_id,
    MTAPI_IN char *kernel_source,
    MTAPI_IN char *kernel_name,
    MTAPI_IN mtapi_size_t local_work_size,
    MTAPI_IN mtapi_size_t element_size,
    MTAPI_IN void *node_local_data,
    MTAPI_IN mtapi_size_t node_local_data_size,
    MTAPI_OUT mtapi_status_t *status)
This function creates a CUDA action.
It is called on the node where the user wants to execute an action on a CUDA device. A CUDA action contains a reference to a local job, the kernel source to compile and execute on the CUDA device, the name of the kernel function, a local work size (see the CUDA specification for details), and the size of one element in the result buffer. After a CUDA action is created, the application references it using a node-local handle of type mtapi_action_hndl_t, or indirectly through a node-local job handle of type mtapi_job_hndl_t. A CUDA action's life cycle begins with mtapi_cuda_action_create() and ends when mtapi_action_delete() or mtapi_finalize() is called.
To create an action, the application must supply the domain-wide job ID of the job associated with the action. Job IDs are of the implementation-defined type mtapi_job_id_t and must be predefined in the application and runtime. A job ID uniquely identifies the job that the action implements; several actions may, however, implement the same job for load-balancing purposes.
If node_local_data_size is not zero, node_local_data specifies the start of node-local data shared by kernel functions executed on the same node. node_local_data_size can be used by the runtime for cache coherency operations.
On success, an action handle is returned and *status is set to MTAPI_SUCCESS. On error, *status is set to the appropriate error defined below. If the action already exists, *status is set to MTAPI_ERR_ACTION_EXISTS and the returned handle is not valid.
Error code | Description |
---|---|
MTAPI_ERR_JOB_INVALID | The job_id is not a valid job ID, i.e., no action was created for that ID or the action has been deleted. |
MTAPI_ERR_ACTION_EXISTS | This action is already created. |
MTAPI_ERR_ACTION_LIMIT | Exceeded maximum number of actions allowed. |
MTAPI_ERR_NODE_NOTINIT | The calling node is not initialized. |
MTAPI_ERR_UNKNOWN | The kernel could not be compiled or no CUDA device was available. |
Parameters
Direction | Name | Description |
---|---|---|
[in] | job_id | Job ID |
[in] | kernel_source | Pointer to kernel source |
[in] | kernel_name | Name of the kernel function |
[in] | local_work_size | Size of local work group |
[in] | element_size | Size of one element in the result buffer |
[in] | node_local_data | Data shared across tasks |
[in] | node_local_data_size | Size of shared data |
[out] | status | Pointer to error code, may be MTAPI_NULL |
CUcontext mtapi_cuda_get_context(MTAPI_OUT mtapi_status_t *status)
Retrieves the handle of the CUDA context used by the plugin.
Parameters
Direction | Name | Description |
---|---|---|
[out] | status | Pointer to error code, may be MTAPI_NULL |