gpuLoadFunction

Loads a kernel written in CUDA or OpenCL and compiled with the gpuBuild function.

Call sequence

func=gpuLoadFunction(bin,"kernelname",block_height,block_width,grid_height,grid_width);

Parameters

bin

Matrix of strings containing the path of the file that holds the compiled kernel and the language the kernel is written in (CUDA or OpenCL).

kernelname

Name of the kernel

block_height

Height of a block

block_width

Width of a block

grid_height

Height of the computing grid (in blocks)

grid_width

Width of the computing grid (in blocks)

func

The loaded CUDA/OpenCL kernel, callable from Scilab.

Description

func=gpuLoadFunction(bin,kernelname);

func=gpuLoadFunction(bin,kernelname) loads a kernel from a compiled file. The returned value can then be called directly in Scilab. Its input arguments are the input arguments of the kernel; only device pointers, scalar doubles, or scalar integers can be used. Optional arguments can be used to modify the block/grid dimensions.
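When choosing block/grid dimensions for a matrix that is larger than one block, a common approach is to round the number of blocks up so that the grid covers every element, and to rely on a bounds check inside the kernel (such as the one in the matrixAdd example below) to discard the extra threads. The following host-side CUDA sketch shows that arithmetic only; the helper names `blocksNeeded` and `gridFor` are hypothetical and are not part of sciGPGPU.

```cuda
// Hypothetical helpers for illustration; not part of sciGPGPU.

// Number of blocks needed to cover `extent` elements
// with blocks of `blockSize` threads (rounded up).
static int blocksNeeded(int extent, int blockSize)
{
    return (extent + blockSize - 1) / blockSize;
}

// Grid dimensions (in blocks) that cover an M x N index space
// for a given two-dimensional block shape.
static dim3 gridFor(int M, int N, dim3 block)
{
    return dim3(blocksNeeded(M, (int)block.x),
                blocksNeeded(N, (int)block.y));
}
```

For a 16 x 16 matrix and 16 x 16 blocks this gives a 1 x 1 grid, which matches the dimensions used in the example below; a 100 x 100 matrix with the same block shape would need a 7 x 7 grid.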

bin is an absolute path to a compiled file. The user must check that the hardware meets the kernel's requirements (for example, a PTX file built for a GPU of compute capability 1.3 will return incorrect results when launched on a GPU of compute capability 1.1).
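One way to perform this check outside Scilab is to query the device with the CUDA runtime API. The sketch below is a minimal example under stated assumptions: it uses `cudaGetDeviceProperties`, which is part of the CUDA runtime, while the helper names `meetsCapability` and `checkDevice` are hypothetical and not part of sciGPGPU. It also assumes that a device whose compute capability is equal to or higher than the build target can run the kernel.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper: true when a device of capability devMajor.devMinor
// is at least as capable as the build target reqMajor.reqMinor.
static bool meetsCapability(int devMajor, int devMinor,
                            int reqMajor, int reqMinor)
{
    return devMajor > reqMajor ||
           (devMajor == reqMajor && devMinor >= reqMinor);
}

// Hypothetical helper: returns 0 if the device meets the requirement,
// 1 if it does not, and 2 if the device could not be queried.
static int checkDevice(int device, int reqMajor, int reqMinor)
{
    cudaDeviceProp prop;
    cudaError_t err = cudaGetDeviceProperties(&prop, device);
    if (err != cudaSuccess) {
        std::printf("CUDA error: %s\n", cudaGetErrorString(err));
        return 2;
    }
    std::printf("Device %d: %s, compute capability %d.%d\n",
                device, prop.name, prop.major, prop.minor);
    return meetsCapability(prop.major, prop.minor, reqMajor, reqMinor) ? 0 : 1;
}
```

Calling `checkDevice(0, 1, 3)` from a host program would report whether the first device can run a kernel built for compute capability 1.3, the case used as the example above.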

kernelname is the name of the kernel inside the PTX file. It is the name given to the corresponding function in the C for GPU source code, but it may be mangled. To avoid name mangling, add the extern "C" linkage specifier before the kernel declaration.
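The difference can be seen by compiling two otherwise identical kernels, one with and one without extern "C". The kernel names `scaleMangled` and `scalePlain` below are hypothetical and serve only as an illustration. With a typical toolchain the first kernel appears in the PTX under a mangled symbol that encodes its parameter types (of the form `_Z12scaleMangledPddi`, though the exact spelling depends on the compiler), while the second appears as plain `scalePlain`.

```cuda
// Hypothetical kernels for illustration only.

// Compiled with C++ linkage: the symbol in the PTX is mangled,
// so the plain name "scaleMangled" would not match it.
__global__ void scaleMangled(double* v, double a, int n)
{
    int i = threadIdx.x + blockDim.x * blockIdx.x;
    if (i < n)
        v[i] = a * v[i];
}

// Compiled with C linkage: the symbol in the PTX is exactly "scalePlain",
// which is the name that would be passed as kernelname.
extern "C" __global__ void scalePlain(double* v, double a, int n)
{
    int i = threadIdx.x + blockDim.x * blockIdx.x;
    if (i < n)
        v[i] = a * v[i];
}
```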

Examples

//--matrixAdd.cu--
extern "C"
__global__ void
matrixAdd( double* C, double* A, double* B, int M, int N)
{
    int idx = blockIdx.x;
    int idy = blockIdx.y;

    int dx = blockDim.x;
    int dy = blockDim.y;

    int tx = threadIdx.x;
    int ty = threadIdx.y;

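    // Global coordinates of this thread in the computing grid.
    int x = tx + dx * idx;
    int y = ty + dy * idy;

    // Skip threads that fall outside the M x N matrix (column-major storage).
    if (x < M && y < N)
        C[x + y * M] = A[x + y * M] + B[x + y * M];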
}

//--Scilab script--
A=ones(16,16);gA=gpuSetData(A)
B=2*ones(16,16);gB=gpuSetData(B)
C=0*ones(16,16);gC=gpuSetData(C)

bin=gpuBuild(gpuPATH+"/tests/unit_tests/"+"matrixAdd");
func=gpuLoadFunction(bin,"matrixAdd",16,16,1,1)

// Call the kernel with the block/grid dimensions set in the gpuLoadFunction call.
func(gC,gB,gA,int32(16),int32(16))
C=gpuGetData(gC)

// Call the kernel with new block/grid dimensions.
func(gC,gB,gA,int32(10),int32(10), blockX=10, blockY=10, gridX=1, gridY=1)
C=gpuGetData(gC)

clear gA;
clear gB;
clear gC;
gpuExit();
