gpuApplyFunction

Execute a kernel.

Call sequence

gpuApplyFunction(fonc,args,block_height,block_width,grid_height,grid_width);

Parameters

fonc

Kernel to launch, returned by gpuLoadFunction.

args

The argument list passed to the CUDA/OpenCL kernel. It is a Scilab list created with the list() function.

block_height

Height of a block, in threads.

block_width

Width of a block, in threads.

grid_height

Height of the computing grid, in blocks.

grid_width

Width of the computing grid, in blocks.

Description

gpuApplyFunction(fonc,args,block_height,block_width,grid_height,grid_width) launches the kernel fonc with the arguments in the list args. Each block contains block_height x block_width threads.

fonc is an object returned by gpuLoadFunction.

args is a Scilab list of integers and pointers to matrices stored in GPU memory (as returned by gpuAlloc or gpuSetData).

GPU kernels are executed over a grid of blocks. The grid can be seen as an array of grid_height x grid_width blocks.
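
When a matrix is larger than a single block, the grid has to contain enough blocks to cover it. The following is a minimal sketch of that sizing in plain Scilab. It assumes a square matrix and square blocks, so the result does not depend on how height and width map to the kernel's x and y indices. The size n and the variable threads_per_side are illustrative names, not part of sciGPGPU.

```scilab
n = 100;                 // hypothetical matrix dimension (n x n), for illustration only
block_height = 16;
block_width  = 16;

// Round up so that the last, partially filled block is still launched.
grid_height = ceil(n / block_height);
grid_width  = ceil(n / block_width);

// Number of threads launched along one side; this may exceed n.
threads_per_side = grid_height * block_height;

mprintf("grid: %d x %d blocks, %d threads per side\n", grid_height, grid_width, threads_per_side);
```

With these values the grid is 7 x 7 blocks and 112 threads are launched per side. Since 112 exceeds 100, the kernel needs a bounds check such as the if(x<M && y<N) test used in the example.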

Examples

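//--matrixAdd.cu--
// Element-wise sum C = A + B of two M x N matrices stored in column-major order.
extern "C"
__global__ void
matrixAdd( double* C, double* A, double* B, int M, int N)
{
  // Position of this block in the grid
  int idx = blockIdx.x;
  int idy = blockIdx.y;

  // Dimensions of a block
  int dx = blockDim.x;
  int dy = blockDim.y;

  // Position of this thread in its block
  int tx = threadIdx.x;
  int ty = threadIdx.y;

  // Global coordinates of the element handled by this thread
  int x = tx + dx*idx;
  int y = ty + dy*idy;

  // Threads that fall outside the matrix do nothing
  if (x < M && y < N)
    C[ x + y*M ] = A[ x + y*M ] + B[ x + y*M ];
}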

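//--Scilab script--
// Copy the two inputs and a zero-filled output matrix to GPU memory
A=ones(16,16);gA=gpuSetData(A)
B=2*ones(16,16);gB=gpuSetData(B)
C=zeros(16,16);gC=gpuSetData(C)

// Build the kernel source and load the matrixAdd entry point
bin=gpuBuild(gpuPATH+"/tests/unit_tests/"+"matrixAdd");
fonc=gpuLoadFunction(bin,"matrixAdd")
// Arguments follow the kernel signature: C, A, B, M, N
lst=list(gC,gA,gB,16,16)
// A single 16x16 block in a 1x1 grid covers the whole matrix
gpuApplyFunction(fonc,lst,16,16,1,1);
// Copy the result back to Scilab, then release GPU resources
C=gpuGetData(gC);
gpuFree(gA);
gpuFree(gB);
gpuFree(gC);
gpuExit();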
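
The kernel addresses element (x, y) at the linear offset x + y*M, which matches Scilab's column-major storage. The sketch below mirrors that indexing in plain Scilab, without any GPU call, so the result expected from the example (every element of C equal to 3) can be checked on the CPU. The loops stand in for the GPU threads and are illustrative only, not part of sciGPGPU.

```scilab
// CPU-only reference for the matrixAdd kernel (illustrative, not part of sciGPGPU).
M = 16; N = 16;
A = ones(M, N);
B = 2 * ones(M, N);
C = zeros(M, N);

// Each iteration plays the role of one GPU thread with 0-based coordinates (x, y).
for y = 0:N-1
    for x = 0:M-1
        k = x + y * M;                   // 0-based linear offset, as in the kernel
        C(k + 1) = A(k + 1) + B(k + 1);  // Scilab indices are 1-based
    end
end

ok = and(C == 3);   // %T when every element of C equals 3
disp(ok);
```

Comparing this reference with the matrix returned by gpuGetData is one way to check a kernel launch.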

See Also
