National Energy Research Scientific Computing Center
  A DOE Office of Science User Facility
  at Lawrence Berkeley National Laboratory
 

Programming on Franklin: Other Topics

Vendor Documentation on Interlanguage Programming and OpenMP

Please see the PGI User's Guide (PDF), chapter 10, "Inter-Language Calling."

This chapter describes inter-language calling conventions
for C, C++ and Fortran programs using the PGI compilers.
The following sections describe how to call a Fortran function or
subroutine from a C or C++ program and how to call a C or C++ function
from a Fortran program.

Table 10-1 covers Fortran and C/C++ data type compatibility, and Table 10-2 describes complex type representations. Section 10.8 is an example of a Fortran main program calling a C function, and Section 10.9 is an example of a C main program calling a Fortran function. Section 10.12 is an example of a Fortran main program calling a C++ function, and Section 10.13 is an example of a C++ main program calling a Fortran function. The examples include invocations of the pgcc, pgCC and pgf95 compilers.

OpenMP for Fortran is described in Chapter 5 of the PGI User's Guide, and OpenMP for C/C++ is described in Chapter 6. The environment variable OMP_NUM_THREADS controls the number of threads in an OpenMP parallel region. Specifying more threads than processors may degrade performance.

For small serial codes that run on the service nodes (e.g., login nodes), which provide a standard SuSE Linux environment, follow the examples in the PGI User's Guide mentioned above.

 

Fortran Hybrid MPI/OpenMP Example

This example is based on a code from the OpenMP.org site. The code "solves a finite difference discretization of Helmholtz equation...using Jacobi iterative method." It has been modified to use a constant set of input values and to compute the over-relaxation parameter from the MPI rank. The code is run with one MPI task per node. With the default of one thread per task, the computation takes about twice as long as with two threads per task. Because Franklin nodes have only two cores, using more than two threads with one MPI task per node does not improve performance.

Fortran Source Code

 

Batch Job Script

% cat runjac

#PBS -N jac
#PBS -q debug
#PBS -l mppwidth=4
#PBS -l mppnppn=1
#PBS -l mppdepth=2
#PBS -l walltime=00:10:00
#PBS -e jacobijob.out
#PBS -j eo

cd $PBS_O_WORKDIR

ftn -o jac -mp=nonuma -Minfo=mp jac-openmp.f

setenv OMP_NUM_THREADS 2
time aprun -N 1 -n 4 ./jac

Output File

 

C/MPI Main Calling Fortran Subroutine

Here is a full example of a parallel C code using MPI which calls a Fortran subroutine. The Fortran subroutine sets a number of data values based on the MPI task ID. The code is compiled and run using the batch system.

C Source Code

% cat cmainmpi.c

#include <stdio.h>
#include "mpi.h"

int main(int argc, char* argv[]) {
   char bool1, letter1;
   int numint1, numint2;
   float numfloat1;
   double numdoub1;
   short numshor1;
   /* Fortran subroutine FORTS; the symbol name has a trailing underscore */
   extern void forts_();
   int myid, numprocs;
 
   MPI_Init(&argc,&argv);
   MPI_Comm_size(MPI_COMM_WORLD,&numprocs);
   MPI_Comm_rank(MPI_COMM_WORLD,&myid); 

   /* Fortran arguments are passed by reference; the trailing 1 is the
      hidden length of the character argument letter1, passed by value */
   forts_(&bool1, &letter1, &numint1, &numint2, &numfloat1,
          &numdoub1, &numshor1, &myid, 1);

   printf(" Hello from proc %d of %d; data values: %s %c %d %d %3.1f %.0f %d\n",
      myid, numprocs, bool1?"TRUE":"FALSE", letter1, numint1,
      numint2, numfloat1, numdoub1, numshor1); 

   MPI_Finalize();
   return 0;
}

Fortran Source Code

% cat forts.f

      subroutine forts(bool1,letter1,numint1,numint2,numfloat1,
     &                 numdoub1,numshor1,numtask)
      logical*1 bool1
      character letter1
      integer numint1, numint2, numtask
      double precision numdoub1
      real numfloat1
      integer*2 numshor1
      
      bool1 = .true.
      if (numtask .ne. 2*(numtask/2)) bool1 = .false.
      letter1 = "y"
      numint1 = 11*numtask
      numint2 = numtask
      numdoub1 = 902
      numfloat1 = 39.6 + (0.1*numtask)
      numshor1 = 299
      
      return
      end

Batch Job Script

% cat runhello

#PBS -N hellojob
#PBS -q debug
#PBS -l mppwidth=4
#PBS -l walltime=00:01:00
#PBS -e hellojob.out
#PBS -j eo

cd $PBS_O_WORKDIR

cc -c cmainmpi.c
ftn -o cftnmpi -Mnomain cmainmpi.o forts.f
aprun -n 4 ./cftnmpi

Output File

% cat hellojob.out

Warning: no access to tty (Bad file descriptor).
Thus no job control in this shell.
/opt/xt-pe/2.0.24b/bin/snos64/cc: INFO: linux target is being used
/opt/xt-pe/2.0.24b/bin/snos64/ftn: INFO: linux target is being used
forts.f:
 Hello from proc 1 of 4; data values: FALSE y 11 1 39.7 902 299
 Hello from proc 0 of 4; data values: TRUE y 0 0 39.6 902 299
 Hello from proc 3 of 4; data values: FALSE y 33 3 39.9 902 299
 Hello from proc 2 of 4; data values: TRUE y 22 2 39.8 902 299
Application 149009 resources: utime 0, stime 0

 

C++/MPI Main Calling Fortran Subroutine

Here is a full example of a parallel C++ code using MPI which calls a Fortran subroutine. The Fortran subroutine sets a number of data values based on the MPI task ID. The code is compiled and run using the batch system.

C++ Source Code

% cat cpmain.C

#include <mpi.h>
#include <iostream>
using namespace std;

extern "C" { extern void forts_(char *,char *,int *,int *,
            float *,double *,short *); }

int main (int argc, char* argv[])
{
 char bool1, letter1;
 int numprocs, myid;
 float numfloat1;
 double numdoub1;
 short numshor1;

 MPI_Init(&argc, &argv);
 MPI_Comm_size(MPI_COMM_WORLD,&numprocs);
 MPI_Comm_rank(MPI_COMM_WORLD,&myid);

 forts_(&bool1,&letter1,&myid,&numprocs,&numfloat1,&numdoub1,&numshor1);

 cout << " bool1 = ";
 bool1?cout << "TRUE ":cout << "FALSE ";
 cout << "; letter1 = " << letter1 << "; myid = " << myid << " numprocs = " << numprocs;
 cout << " myid/numprocs  = " << numfloat1 << endl;
 cout << " numdoub1 = " << numdoub1 << " numshor1 = " << numshor1 << endl;

 MPI_Finalize();
}

Fortran Source Code

% cat forts.f

       subroutine forts
     .    (bool1,letter1,numint1,numint2,numfloat1,numdoub1,numshor1)

       logical*1 bool1
       character letter1
       integer numint1, numint2
       double precision numdoub1
       real numfloat1
       integer*2 numshor1

       bool1 = .true. 
       if (numint1 .ne. 2*(numint1/2)) bool1 = .false.
       letter1 = "v" 
       numdoub1 = 902
       numfloat1 = (1.0*numint1)/(1.0*numint2)
       numshor1 = 299

       return
       end

Batch Job Script

% cat runftncpp

#PBS -N ftncpp
#PBS -q debug
#PBS -l mppwidth=4
#PBS -l walltime=00:01:00
#PBS -e ftncpp.out
#PBS -j eo

cd $PBS_O_WORKDIR

ftn -c forts.f
CC -o cppftn forts.o cpmain.C -lpgf90 -lpgf90_rpm1 -lpgf902 -lpgf90rtl -lpgftnrtl

aprun -n 4 ./cppftn

Output File

% cat ftncpp.out

Warning: no access to tty (Bad file descriptor).
Thus no job control in this shell.
/opt/xt-pe/2.0.24b/bin/snos64/ftn: INFO: linux target is being used
/opt/xt-pe/2.0.24b/bin/snos64/CC: INFO: linux target is being used
 bool1 = FALSE ; letter1 = v; myid = 1 numprocs = 4 myid/numprocs  = 0.25
 numdoub1 = 902 numshor1 = 299
 bool1 = TRUE ; letter1 = v; myid = 0 numprocs = 4 myid/numprocs  = 0
 numdoub1 = 902 numshor1 = 299
 bool1 = FALSE ; letter1 = v; myid = 3 numprocs = 4 myid/numprocs  = 0.75
 numdoub1 = 902 numshor1 = 299
 bool1 = TRUE ; letter1 = v; myid = 2 numprocs = 4 myid/numprocs  = 0.5
 numdoub1 = 902 numshor1 = 299
Application 173761 resources: utime 0, stime 0

 

C/MPI Main Calling Fortran/OpenMP Subroutine

Here is a full example of a parallel C code using MPI which calls a Fortran subroutine. The Fortran subroutine includes a multi-threaded parallel segment and sets a number of data values based on the MPI task ID and number of OpenMP threads. The code is compiled and run using the batch system.

The environment variable OMP_NUM_THREADS is used to control the number of threads in the OpenMP segment.

Use of OpenMP requires the -mp option on the Fortran compiler line. The examples on this page use ftn -mp=nonuma -Minfo=mp ..., where -Minfo=mp makes the compiler report the parallel regions it processes.

The source code for the C main program is the same as shown above. The Fortran routine, the batch script, and the output for the mixed C/Fortran and mixed MPI/OpenMP example are shown below.

Fortran Source Code

% cat forts.f

      subroutine forts(bool1,letter1,numint1,numint2,numfloat1,
     &                 numdoub1,numshor1,numtask)
      logical*1 bool1
      character letter1
      integer numint1, numint2, numtask
      integer nthreads, tid
      integer  OMP_GET_NUM_THREADS, OMP_GET_THREAD_NUM
      double precision numdoub1
      real numfloat1
      integer*2 numshor1

      bool1 = .true.
      if (numtask .ne. 2*(numtask/2)) bool1 = .false.
      letter1 = "y"
      numint1 = 11*numtask
      numint2 = numtask
      numdoub1 = 902
      numfloat1 = 39.6 + (0.1*numtask)
!$OMP PARALLEL PRIVATE(nthreads, tid)
       tid = OMP_GET_THREAD_NUM()
       IF (tid .EQ. 0) THEN
          nthreads = OMP_GET_NUM_THREADS()
          numshor1 = nthreads
       ENDIF
!$OMP END PARALLEL
      
      return
      end

Batch Job Script

% cat runhello

#PBS -N hellojob
#PBS -q debug
#PBS -l mppwidth=2
#PBS -l mppnppn=1
#PBS -l mppdepth=2
#PBS -l walltime=00:05:00
#PBS -e hellojob.out
#PBS -j eo

cd $PBS_O_WORKDIR

cc -c cmainmpi.c
ftn -o cftnmpi -Mnomain -mp=nonuma -Minfo=mp cmainmpi.o forts.f

setenv OMP_NUM_THREADS 2
aprun -n 2 -N 1 ./cftnmpi

Output File

% cat hellojob.out

Warning: no access to tty (Bad file descriptor).
Thus no job control in this shell.
/opt/xt-pe/2.0.44a2/bin/snos64/cc: INFO: linux target is being used
/opt/xt-pe/2.0.44a2/bin/snos64/ftn: INFO: linux target is being used
forts.f:
forts:
    19, Parallel region activated
    24, Parallel region terminated
 Hello from proc 0 of 2; data values: TRUE y 0 0 39.6 902 2
 Hello from proc 1 of 2; data values: FALSE y 11 1 39.7 902 2
Application 4904044 resources: utime 0, stime 5

 

C/MPI Main Calling Pthreads Subroutine

Here is a full example of a parallel C code that uses both MPI and Pthreads. In this example, MPI task 0 is single-threaded, while MPI tasks 1-3 run 4 pthreads each. The code is compiled and run using the batch system.

C Source Code

% cat ptest.c

#include <stdio.h>
#include "mpi.h"
#include "pthread.h"
#define NUM_THREADS 4
int myid, numprocs;

void *ThreadHello(void *threadid)
{
   int tid;
   tid = (int)(long)threadid;
   printf("Hello from thread %d of task %d \n",tid,myid);
   pthread_exit(NULL);
   return 0;
}

int main(int argc, char* argv[])
{
   int rc,t;
   pthread_t threads[NUM_THREADS];

   MPI_Init(&argc,&argv);
   MPI_Comm_size(MPI_COMM_WORLD,&numprocs);
   MPI_Comm_rank(MPI_COMM_WORLD,&myid);

/* Run single thread on Processor 0,
   Multiple-threads on Processors 1-n */

   if (myid != 0) 
   {
      for (t=0; t<NUM_THREADS; t++)
      { rc = pthread_create(&threads[t], NULL, ThreadHello, (void *)t); }
      pthread_exit(NULL);   /* main thread ends here; these tasks skip the MPI calls below */
   }
   else { printf("Hello from single-threaded task %d \n",myid); }

   MPI_Barrier(MPI_COMM_WORLD);
   MPI_Finalize();
   return 0;
}

Batch Script

% cat ptest

#PBS -N ptest
#PBS -q debug
#PBS -l mppwidth=4
#PBS -l walltime=00:05:00
#PBS -e ptest.out
#PBS -j eo

cd $PBS_O_WORKDIR

cc -o ptest.exe ptest.c
aprun -n 4 ./ptest.exe

Output File

% cat ptest.out
Warning: no access to tty (Bad file descriptor).
Thus no job control in this shell.
/opt/xt-pe/2.0.53b/bin/snos64/cc: INFO: linux target is being used
ptest.c:
PGC-W-0155-Pointer value created from a nonlong integral type  (ptest.c: 31)
PGC-W-0095-Type cast required for this conversion (ptest.c: 31)
PGC/x86-64 Linux 7.1-6: compilation completed with warnings
Hello from single-threaded task 0 
Hello from thread 0 of task 1 
Hello from thread 1 of task 1 
Hello from thread 2 of task 1 
Hello from thread 3 of task 1 
[NID 12828]Apid 5175879: initiated application termination
Hello from thread 0 of task 3 
Hello from thread 0 of task 2 
Hello from thread 1 of task 2 
Hello from thread 2 of task 2 
Hello from thread 3 of task 2 
Hello from thread 1 of task 3 
Hello from thread 2 of task 3 
Hello from thread 3 of task 3 
Application 5175879 exit signals: Killed
Application 5175879 resources: utime 4, stime 14
 
   -------------------------- Batch Job Report ------------------------------
 
   Job Id:         5455004.nid00003
   User Name:      fvhale
   Group Name:     
   Job Name:       ptest
   Session ID:     16288
   Resource List:  walltime=00:05:00
   Queue Name:     debug
   Account String: mpccc
 
Job Exit Summary:
   APINFO_NIDTERM:  compute node initiated termination, possible out of memory condition

 

Fortran/MPI Main Calling C/Pthreads Subroutine

This example is similar to the last one, except that it has been wrapped in a Fortran main routine which manages all the MPI calls, while the Pthreads are run by the C routines. The only data passed between the Fortran wrapper and the C routines is the MPI task number. In this example, MPI task 0 is single-threaded, while MPI tasks 1-3 run 4 pthreads each. The code is compiled and run using the batch system.

Fortran Source Code

% cat main.f
      IMPLICIT NONE
 
      INCLUDE 'mpif.h'
  
      INTEGER :: myPE, totPEs, ierr
      EXTERNAL :: cpthreads
  
      CALL MPI_INIT( ierr )
      CALL MPI_COMM_RANK( MPI_COMM_WORLD, myPE, ierr )
      CALL MPI_COMM_SIZE( MPI_COMM_WORLD, totPEs, ierr )

      call cpthreads( myPE ) 

      CALL MPI_BARRIER( MPI_COMM_WORLD, ierr)
      CALL MPI_FINALIZE( ierr )
  
      END

C Source Code

% cat ptest.c
#include <stdio.h>
#include "pthread.h"
#define NUM_THREADS 4
int myid;

void *ThreadHello(void *threadid)
{
   int tid;
   tid = (int)(long)threadid;
   printf("Hello from thread %d of task %d \n",tid,myid);
   pthread_exit(NULL);
   return 0;
}

void cpthreads_ (int *myPE )   /* called from Fortran as cpthreads */
{
   int rc,t;
   pthread_t threads[NUM_THREADS];

   myid = *myPE;

/* Run single thread on Processor 0,
   Multiple-threads on Processors 1-n */

   if (myid != 0) 
   {
      for (t=0; t<NUM_THREADS; t++)
      { rc = pthread_create(&threads[t], NULL, ThreadHello, (void *)t); }
      pthread_exit(NULL);   /* calling thread ends here; control never returns to Fortran */
   }
   else { printf("Hello from single-threaded task %d \n",myid); }

}

Batch Script

% cat ptest
#PBS -N ptest
#PBS -q debug
#PBS -l mppwidth=4
#PBS -l walltime=00:05:00
#PBS -e ptest.out
#PBS -j eo

cd $PBS_O_WORKDIR

cc -c ptest.c
ftn -o ptest.exe ptest.o main.f
aprun -n 4 ./ptest.exe

Output File

% cat ptest.out
Warning: no access to tty (Bad file descriptor).
Thus no job control in this shell.
/opt/xt-pe/2.0.53b/bin/snos64/cc: INFO: linux target is being used
PGC-W-0155-Pointer value created from a nonlong integral type  (ptest.c: 28)
PGC-W-0095-Type cast required for this conversion (ptest.c: 28)
PGC/x86-64 Linux 7.1-6: compilation completed with warnings
/opt/xt-pe/2.0.53b/bin/snos64/ftn: INFO: linux target is being used
main.f:
Hello from single-threaded task 0 
Hello from thread 0 of task 1 
Hello from thread 1 of task 1 
Hello from thread 2 of task 1 
Hello from thread 3 of task 1 
[NID 12710]Apid 5176728: initiated application termination
Hello from thread 0 of task 2 
Hello from thread 1 of task 2 
Hello from thread 2 of task 2 
Hello from thread 3 of task 2 
Hello from thread 0 of task 3 
Hello from thread 1 of task 3 
Hello from thread 3 of task 3 
Hello from thread 2 of task 3 
Application 5176728 exit signals: Killed
Application 5176728 resources: utime 4, stime 8
 
   -------------------------- Batch Job Report ------------------------------
 
   Job Id:         5455307.nid00003
   User Name:      fvhale
   Group Name:     
   Job Name:       ptest
   Session ID:     11537
   Resource List:  walltime=00:05:00
   Queue Name:     debug
   Account String: mpccc
 
Job Exit Summary:
   APINFO_NIDTERM:  compute node initiated termination, possible out of memory condition
 


Page last modified: Mon, 14 Jul 2008 23:38:12 GMT
Page URL: http://www.nersc.gov/nusers/systems/franklin/programming/interlanguage.php
Web contact: webmaster@nersc.gov
Computing questions: consult@nersc.gov
