Getting Started
Users who wish to use Dirac must first submit a Request for Dirac Account form. This form includes fields to describe the proposed research. Users will be expected to provide a brief written summary of their results periodically.
Access to Dirac is via batch jobs only. You first log in to the Carver system:
ssh -l username carver.nersc.gov
and then submit batch jobs to Dirac. See Running Jobs on Dirac.
Running Your First CUDA Code: Vector Addition
Open a new file called vectorAdd.cu with a text editor such as emacs or vi. Paste the following code into the file.
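#include <stdio.h>
#define SIZE 10

// Kernel definition, see also section 4.2.3 of the NVIDIA CUDA Programming Guide
__global__ void vecAdd(float* A, float* B, float* C)
{
    // threadIdx.x is a built-in variable provided by CUDA at runtime
    int i = threadIdx.x;
    // Each thread fills in and adds one element
    A[i] = 0;
    B[i] = i;
    C[i] = A[i] + B[i];
}

int main()
{
    int N = SIZE;
    // Zero-initialize the inputs so the copies below do not read uninitialized memory
    float A[SIZE] = {0}, B[SIZE] = {0}, C[SIZE];
    float *devPtrA;
    float *devPtrB;
    float *devPtrC;
    int memsize = SIZE * sizeof(float);

    // Allocate device memory for the three vectors
    cudaMalloc((void**)&devPtrA, memsize);
    cudaMalloc((void**)&devPtrB, memsize);
    cudaMalloc((void**)&devPtrC, memsize);

    // Copy the inputs to the device (the kernel overwrites them in this example)
    cudaMemcpy(devPtrA, A, memsize, cudaMemcpyHostToDevice);
    cudaMemcpy(devPtrB, B, memsize, cudaMemcpyHostToDevice);

    // __global__ functions are called: Func<<< Dg, Db, Ns >>>(parameter);
    // Dg = grid dimensions, Db = block dimensions, Ns = optional dynamic shared memory in bytes.
    // Here: one block of N threads.
    vecAdd<<<1, N>>>(devPtrA, devPtrB, devPtrC);

    // Copy the result back to the host and print it
    cudaMemcpy(C, devPtrC, memsize, cudaMemcpyDeviceToHost);
    for (int i = 0; i < SIZE; i++)
        printf("C[%d]=%f\n", i, C[i]);

    // Free each device buffer exactly once
    cudaFree(devPtrA);
    cudaFree(devPtrB);
    cudaFree(devPtrC);
    return 0;
}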
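The example above launches a single block and does not look at the return values of the CUDA calls. As a rough sketch of how the same addition might be written for vectors longer than one block, with basic error checking, consider the variant below. The names vecAddChecked, CUDA_CHECK, N and THREADS_PER_BLOCK are hypothetical and chosen for illustration; only the cuda* runtime calls are part of CUDA itself.

```cuda
#include <stdio.h>
#include <stdlib.h>
#include <cuda_runtime.h>

#define N 1000                 // assumed vector length, deliberately not a multiple of the block size
#define THREADS_PER_BLOCK 256  // assumed block size

// Hypothetical helper macro (not part of CUDA): abort with a message if a runtime call fails.
#define CUDA_CHECK(call)                                          \
    do {                                                          \
        cudaError_t err_ = (call);                                \
        if (err_ != cudaSuccess) {                                \
            fprintf(stderr, "CUDA error at %s:%d: %s\n",          \
                    __FILE__, __LINE__, cudaGetErrorString(err_)); \
            exit(EXIT_FAILURE);                                   \
        }                                                         \
    } while (0)

// Each thread handles one element; the global index combines block and thread indices.
__global__ void vecAddChecked(const float* A, const float* B, float* C, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)  // threads past the end of the vector do nothing
        C[i] = A[i] + B[i];
}

int main()
{
    static float A[N], B[N], C[N];
    float *devA, *devB, *devC;
    size_t memsize = N * sizeof(float);

    // Fill the inputs on the host this time
    for (int i = 0; i < N; i++) {
        A[i] = (float)i;
        B[i] = 2.0f * i;
    }

    CUDA_CHECK(cudaMalloc((void**)&devA, memsize));
    CUDA_CHECK(cudaMalloc((void**)&devB, memsize));
    CUDA_CHECK(cudaMalloc((void**)&devC, memsize));
    CUDA_CHECK(cudaMemcpy(devA, A, memsize, cudaMemcpyHostToDevice));
    CUDA_CHECK(cudaMemcpy(devB, B, memsize, cudaMemcpyHostToDevice));

    // Round up so that every element is covered by some thread
    int blocks = (N + THREADS_PER_BLOCK - 1) / THREADS_PER_BLOCK;
    vecAddChecked<<<blocks, THREADS_PER_BLOCK>>>(devA, devB, devC, N);
    CUDA_CHECK(cudaGetLastError());  // reports launch errors such as a bad configuration

    // This copy waits for the kernel to finish and also reports errors raised during execution
    CUDA_CHECK(cudaMemcpy(C, devC, memsize, cudaMemcpyDeviceToHost));

    // Compare against the expected sum computed on the host
    int mismatches = 0;
    for (int i = 0; i < N; i++)
        if (C[i] != 3.0f * i)
            mismatches++;
    printf("mismatches: %d of %d\n", mismatches, N);

    CUDA_CHECK(cudaFree(devA));
    CUDA_CHECK(cudaFree(devB));
    CUDA_CHECK(cudaFree(devC));
    return mismatches == 0 ? 0 : 1;
}
```

On a working GPU this should report zero mismatches. The bounds check in the kernel matters because the rounded-up grid starts more threads than there are elements.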
Compile the Program
% module load cuda
% nvcc vectorAdd.cu
Because no -o option is given, nvcc writes the executable to a.out in the current directory.
Submit Your Job to the Queue
The qsub command is used to submit a job to the Dirac compute nodes. The example below requests an interactive session:
[virajp83@cvrsvc04 ~]$ qsub -I -V -q dirac_int -l nodes=1:ppn=8:fermi
qsub: waiting for job 494318.cvrsvc09-ib to start
qsub: job 494318.cvrsvc09-ib ready
Run Your Job
Your job will be placed on a Dirac compute node. Run it as follows:
[virajp83@dirac41 ~]$ ./a.out
C[0]=0.000000
C[1]=1.000000
C[2]=2.000000
C[3]=3.000000
C[4]=4.000000
C[5]=5.000000
C[6]=6.000000
C[7]=7.000000
C[8]=8.000000
C[9]=9.000000
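If the output differs from the listing above, one thing worth checking is whether the compute node actually exposes a GPU to your session. The following is a minimal sketch that lists the visible devices using the CUDA runtime API; the file name listDevices.cu is a hypothetical choice, and it would be compiled with nvcc in the same way as vectorAdd.cu.

```cuda
// listDevices.cu (hypothetical file name): print the CUDA devices visible to this process
#include <stdio.h>
#include <cuda_runtime.h>

int main()
{
    int count = 0;
    cudaError_t err = cudaGetDeviceCount(&count);
    if (err != cudaSuccess) {
        fprintf(stderr, "cudaGetDeviceCount failed: %s\n", cudaGetErrorString(err));
        return 1;
    }
    printf("CUDA devices visible: %d\n", count);

    for (int d = 0; d < count; d++) {
        cudaDeviceProp prop;
        err = cudaGetDeviceProperties(&prop, d);
        if (err != cudaSuccess) {
            fprintf(stderr, "cudaGetDeviceProperties failed: %s\n", cudaGetErrorString(err));
            return 1;
        }
        // Name, compute capability and global memory size of each device
        printf("device %d: %s, compute capability %d.%d, %lu MB global memory\n",
               d, prop.name, prop.major, prop.minor,
               (unsigned long)(prop.totalGlobalMem / (1024 * 1024)));
    }

    // A non-zero exit status signals that no usable device was found
    return count > 0 ? 0 : 1;
}
```

The exact device names and counts depend on the node the job lands on, so no sample output is shown here.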


