Lab 1: Sequential 2-D Array Processing

Summary: The goal of this lab is to write a sequential version of a simple 2-D array processing program.

You will note that the labs will be less explicit than other classes you have previously taken. This is done on purpose. The goal is to let you think outside the box and create your own programs.

Outline due before end of class

Step 1 (10): Before programming, read through the lab. Create an outline (outline.txt) of how you envision your code should be. This outline is to be used in the future as comments in your code and it will be graded on how well thought out it is.

submit -c=SI458 -p=lab01 outline.txt

Step 2 (80): Complete the program as dictated by the program requirements. At the end, you should have both a makefile and a c source file, array.c. This link is a great resource if you need to recollect how to create a makefile.
Step 3 (10): Analyze Performance of Code

Grading: Grading is not all based on correctness. A few things to consider as you write your code:

comments,
layout (spacing, variable names, consistency),
main function should resemble an outline. It should not be the majority of your code.
essentially, can one read your code and easily follow it

Grid Computation

*Grid computations* are an important class of HPC scientific applications in which a physical phenomenon such as heat transfer, wind flow, fluid flow, etc. is modeled using an evenly spaced 2-dimensional or 3-dimensional grid of points. The value of each point represents the state of the phenomenon at a point in physical space at a given time and is updated iteratively to model the passage of time. The way in which a value is updated depends on the application, but in scientific applications the overall computation is often an iterative approach to solving partial differential equations (PDEs) that model the physics, such as Laplace's equation in 2 dimensions. Various iterative methods for solving Laplace's equation exist with different tradeoffs, including Jacobi iteration, Gauss-Seidel, and successive over-relaxation (SOR). The grid structure is also relevant for other types of applications like image processing. Our focus in this class is less on the physics and more on the way in which these applications are structured and parallelized in order to give the best results in the shortest amount of time. As a result, we'll abstract away the physics by using relatively simple ways to update points. This first program will update a given point using only its previous value.

2-D Array Processing Assignment

Big Picture

Your task is to write a C program that maintains a 2-D array of points that are updated in successive iterations. The executable should be called "array" and you should submit a makefile with a default target that builds an executable with that name using gcc.

Program Requirements

Arguments array should take four command line arguments and two optional, all integers:

the maximum number of iterations (generations) to run the simulation
the threshold, defined as the value that makes a grid point no longer eligible for updating (see below.)
the frequency for displaying the grid (see below.)
a seed for random: if the seed given is negative, then use time as the seed; otherwise, use the seed given.
an optional argument of rows
an optional argument of columns

The optional arguments (rows, columns) are only to be used if a matrix file is not to be used for the initialization (See Grid Initialization below.)

Grid initialization.

If args cols and rows not given, 1) ask the user for a file via stdin 2) read the initial values of the grid from the file , e.g. matrix.txt. The file will have the following format:

The first line will be three integers rows, cols, and InputSize. The first two specify that the size of the grid is rowsxcols, and the third specifies the number of subsequent lines in the input. Thus, if the first line is 20 30 100, this indicates a grid that has 20 rows and 30 columns, and that there are 100 remaining lines in the input. InputSize could be 0.
Each subsequent line read from the file will contain three integers X Y Z. This indicates that the point X, Y is initialized to the value Z. The value of X is between 1 and N, the value of Y is between 1 and M, and the value of Z can be either positive, negative, or 0.
Any point not explicitly initialized in the input starts at 0.

If args cols and rows are given, create a matrix of size rowsxcols where all points start at value 0.

Grid updating. On each iteration, a new generation is calculated by updating each grid point using the following rules:

Any point that has a value in the range [-threshold, +threshold] should be updated. We call such a point an eligible point. Thus, a point whose value is less than -threshold or greater than +threshold is not an eligible point and should not be updated.
Each eligible point should be updated by setting its value to the sum of the current value and a random integer between [-10,10] (see below.)

Stopping condition. Stop the simulation when either of the following conditions hold:

After executing the maximum number of generations as specified in the command line.
No more eligible points exist, i.e., every point has a value outside the range [-threshold, +threshold].

Display frequency. The third command argument is the display frequency, which dictates how often the grid is printed to stdout. The rules are as follows:

If the argument is a non-zero N, then display the grid every N generations by printing to stdout. Thus, if we say that the generations are numbered 0, 1, 2, 3, …. where generation 0 is the initial state of the grid, then you should display generation 0, N, 2xN, 3xN, etc. Also, always display the final grid. Label each displayed grid with its generation number.
If the frequency argument is zero, do not display anything during execution but write the contents of the final generation of the grid to a file called output.txt in a format consistent with the format of the input (see above.) Again, nothing is printed to stdout in this case.

Example:

$ ./array 5 20 2 0
Matrix File Path: matrix.txt
count: 0
------------
1 0 0 0 
0 2 0 0 
0 4 0 0 
0 0 0 0 
------------
count: 2
------------
11 3 -2 8 
-1 -9 -11 12 
-12 3 -8 -5 
7 -6 9 16 
------------
count: 4
------------
4 8 2 0 
-7 4 -7 15 
-20 -1 -11 -6 
-3 1 23 25 
------------
final
------------
11 3 3 -3 
-2 4 -3 16 
-30 -2 -2 3 
2 -4 23 25 
------------

$ ./array 100 15 0 1
Matrix File Path: matrix.txt
final
------------
22 23 18 17 
-18 16 17 18 
-21 23 -19 -18
19 -18 -16 16 
------------

$ ./array 5 15 0 1 10 10
final
------------
17 -3 23 25 9 11 12 21 11 -19 
23 -13 -7 -13 21 14 -4 -9 -24 15 
-18 -19 -5 19 4 0 14 -1 -3 -17 
1 23 -14 -18 1 8 -18 16 -6 12 
-15 0 -24 -1 -9 -16 16 16 10 19 
-4 -7 -2 1 9 -17 -18 -16 -4 -1 
-9 20 -4 -4 -16 9 7 -2 3 -19 
5 -19 -16 18 19 2 -21 -11 13 2 
8 -9 -3 21 6 -16 -15 23 -17 21 
10 1 12 21 8 20 -3 16 -6 16 
------------

Grid data structure

The data structure for the grid of points needs to be allocated dynamically as a two-dimensional array using malloc() or calloc() based on the dimensions given in the first line of the input. A 2-D array in C—called a double array in IC221—is best thought of as a 1-D array of 1-D arrays. For an NxM array called grid, for example, you’d allocate N-many pointers-to-data, and then allocate M-many items of data. (See the pseudocode below.) While you could then use pointer arithmetic to access elements, C syntax also allows you to reference the data as grid[i][j].

Random number generation

To generate the random numbers needed to update the points, use the library function srand() to set the seed and then rand() to generate the number. Invoke srand() only once in main and use the current time as a seed. As for rand(), it’s surprisingly difficult to generate numbers in a way that ensures that the distribution is uniform across a specified range. The first thing you might think to do if the range is [-10,10] is the following:

random_number = (rand() % 21) – 10;

This call to rand() generates a random number between 0 and RAND_MAX, then uses modular arithmetic to reduce it to the range 0 to 20, and finally subtracts 10 to put it in the required range. Unfortunately, the use of modular arithmetic here generates non-uniformity. A better way is as follows:

random_number = (int) (21.0 * ((double) rand()/(RAND_MAX+1.0))) - 10;

This generates a random number between 0 and RAND_MAX, converts it to a real value between 0 and 1, scales it up to give an integer between 0 and 20, and then subtracts 10 to put it in the required range. See the notes section of the rand() man page and the stackexchange discussions referenced below if you’re interested in the details: http://manpages.ubuntu.com/manpages/bionic/pt/man3/rand.3.html
https://codereview.stackexchange.com/questions/159604/uniform-random-numbers-in-an-integer-interval-in-c
https://stackoverflow.com/questions/288739/generate-random-numbers-uniformly-over-an-entire-range

Step 3: Analyze Performance of Code

To analyze the performance of the code, we are going to use raspberry pis. As you clock the code, I want you to consider memory, e.g. L1, L2. Rasperry Pi Commands:

ssh:

ssh pi@192.168.37.2

to copy:

scp file_path pi@192.168.37.2:.

Create a google document, which describes the performance.
Choose different matrix sizes and record the times. This can be done by one of the following:

library timer.h (will be using this method in future labs):

clock_t before = clock();
// code here you want to time
clock_t diff = clock() - before;
int msec = diff*1000/CLOCKS_PER_SEC;
printf("Time: %d sec %d millisec\n", msec/1000, msec%1000);

use time on terminal line: time ./2darray

Create a plot where the x-axis is number of bytes in matrix and the y-axis is the time it took in milliseconds. You can do this in excel, python, etc. You can even use ChatGPT if you would like.
Look up your cache via the commandline (lscpu - note on the raspberry pi, this command does not list cache. On the pi3, L1: 16KB and L2: 512KB). Record the numbers you find. If needed try a few more numbers with your code and update your plot. Explain what you see. Make sure you have in your analysis the layout of your computer architecture.

submit -c=SI458 -p=lab01 outline.txt array.c makefile

Google Classroom: submit analysis google doc and genAI transcript