SHMEMmer Wanna Be

My Journey To Understanding and Using SHMEM:

Lesson 0: Hello World

As all good programming tutorials do, we begin with the treasured program "HelloWorld".

//HelloWorld.c

#include <stdio.h>
#include <shmem.h>

int main(){
	int my_pe, num_pe;         //declare variables for both pe id of processor and the number of pes
	
	shmem_init();
	num_pe = shmem_n_pes();    //obtain the number of pes that can be used
	my_pe  = shmem_my_pe();    //obtain the pe id number
	
	printf("Hello from %d of %d\n", my_pe, num_pe);
	shmem_finalize();
	return 0;
}

If you are new to HPC, I will be the first to say that thinking in parallel is difficult; however, it does get easier with practice. So, let's look at this step by step for the time being.

What does each step of the above do?
  1. #include <stdio.h> -- includes the standard I/O C library
  2. #include <shmem.h> -- includes the header file for shmem; allows us to use the shmem commands
  3. int my_pe, num_pe -- declares variables of type int for the id of this processing element (pe) and the number of pes
  4. shmem_init() -- initializes shmem and sets up the group of pes that can communicate with one another. After this call, each pe can ask for the total number of pes and for its own id.
  5. num_pe = shmem_n_pes() -- obtains the number of pes that can be used. Note that num_pe is a private variable: each pe holds its own copy.
  6. my_pe = shmem_my_pe() -- obtains the id of the calling pe. Again, this is a private variable; it is not shared with the other pes, and each pe gets a different value.
  7. printf(...) -- every pe executes this line, so one line is printed per pe
  8. shmem_finalize() -- releases the shmem resources; the proverbial final curtain, after which no shmem commands may be called

To compile the above code, we use the OpenSHMEM C compiler wrapper (oshcc):
        oshcc HelloWorld.c -o HelloWorld

To read more about compilation, look at Annex B.

To run your treasured code, shmem was kind enough to provide the launcher, oshrun:
        oshrun -np 10 HelloWorld
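where 10 is the number of pes (processes) we want to launch. Note: on your home computer, the launcher may by default refuse to start more processes than you have cores, depending on the implementation. To check how many cores you have, type lscpu | grep Core in your terminal. This reports the cores per socket, so multiply by the number of sockets if your machine has more than one.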

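The output should look like the following, although the lines may appear in a different order from run to run because each pe prints on its own: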
Hello from 0 of 10
Hello from 1 of 10
Hello from 2 of 10
...
Hello from 9 of 10