SHMEMmer Wanna Be

My Journey To Understanding and Using SHMEM:

Lesson 0: Hello World

If you are like me, you have heard about SHMEM and its benefits over MPI, but you have struggled to find a way to really dive in and learn it. On top of that, it isn't as if you have an HPC machine at home to program on in your spare time. So how is one to learn enough of SHMEM to decide whether it is worth trying, let alone learn it well enough to be useful at work?

I invite you to join me in the adventure of learning SHMEM. I should preface this tutorial with the following: it is by no means an all-inclusive summary of SHMEM. There is much more to learn, and the documentation is quite beneficial.

As all good programming tutorials do, we will begin with the treasured program "HelloWorld".


#include <stdio.h>
#include <shmem.h>

int main(void){
	int my_pe, num_pe;         //declare variables for this pe's id and the total number of pes
	shmem_init();              //initialize shmem; must come before the other shmem calls
	num_pe = shmem_n_pes();    //obtain the number of pes that can be used
	my_pe  = shmem_my_pe();    //obtain the pe id number
	printf("Hello from %d of %d\n", my_pe, num_pe);
	shmem_finalize();          //release the shmem resources
	return 0;
}

If you are new to HPC, I will be the first to say that thinking in parallel is difficult; however, it does get easier the more you do it. So, let's look at this step by step for the time being.

What does each step of the above do?
  1. #include <stdio.h> -- including the standard I/O C library
  2. #include <shmem.h> -- including the header file for shmem; allows us to use the shmem commands
  3. int my_pe, num_pe -- declaring variables of type int for both the id of this pe (processing element) and the total number of pes
  4. shmem_init() -- initializes shmem
  5. num_pe = shmem_n_pes() -- obtain the number of pes that can be used
  6. my_pe = shmem_my_pe() -- obtain the id of this pe
  7. shmem_finalize() -- releases the shmem resources; the proverbial final curtain to shmem commands
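
If it helps to picture what happens when several PEs run the steps above, here is a small serial sketch in plain C that needs no SHMEM installation. It is only an illustration under my own assumptions: format_hello and emulate_run are hypothetical helper names, not SHMEM functions, and the loop merely imitates the fact that every PE executes the same program with a different value of my_pe.

```c
#include <stdio.h>

/* Hypothetical helper (not part of SHMEM): writes the greeting that the PE
   with id my_pe would print into buf and returns snprintf's result. */
int format_hello(char *buf, size_t len, int my_pe, int num_pe) {
    return snprintf(buf, len, "Hello from %d of %d\n", my_pe, num_pe);
}

/* Hypothetical serial stand-in for a parallel launch: a real run starts
   num_pe independent copies of main, each with its own my_pe. Here one
   loop plays all of the PEs in turn, so the lines come out in id order. */
void emulate_run(int num_pe) {
    char line[64];
    for (int my_pe = 0; my_pe < num_pe; my_pe++) {
        format_hello(line, sizeof line, my_pe, num_pe);
        fputs(line, stdout);
    }
}
```

Calling emulate_run(10) prints one greeting for each id from 0 through 9. The one thing this sketch cannot show is ordering: in a real run the PEs print independently, so the lines are not guaranteed to appear in id order.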

To compile the above code, we use the OpenSHMEM C compiler wrapper (oshcc):
        oshcc HelloWorld.c -o HelloWorld

To read more about compilation, look at Annex B.

To run your treasured code, SHMEM is kind enough to provide the launcher wrapper, oshrun:
        oshrun -np 10 HelloWorld
where 10 is the number of PEs (processing elements) we want to use.

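The output should look like the following, with one line per PE (ten lines in total). The PEs print independently, so the order of the lines may differ from run to run:
Hello from 0 of 10
Hello from 1 of 10
Hello from 2 of 10
...
Hello from 9 of 10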