SI 204 Spring 2017 / Notes


Unit 2: Variables and If

(Credit to Chris Brown for the original version of these notes.)

1 Starting out

1.1 Hello, world explained

The example below shows the structure of a very simple C program along the lines of our “Hello World” example from last class. It’s annotated to describe the different parts of the program.

/* Everything inside these slash-star characters is a "comment"
 * that will be ignored by the compiler. */

#include "si204.h"  /* This line "includes" code needed to do standard things
                     * such as input/output and handling strings.
                     * Later we'll use the system-provided libraries instead,
                     * but to start out you just need to include si204.h
                     * at the top of every program you write. */

int main() {  /* The "main" tells the compiler where your program should start.
               * For now, all of our code will go between the { following
               * main() and the } at the end of the file. */

  fputs("Hello, world!\n", stdout); 
    /* fputs is a standard C library function (provided through si204.h)
     * to print out a string of characters. The "Hello, world!\n" is the
     * string to be printed. stdout is the name of the output stream that
     * goes to the terminal. For now, we will always just use stdout.  */

  return 0;  /* The return code indicates to the operating system whether our
              * program worked correctly or not. The standard is that
              * a return code of 0 means "everything worked". */

} /* This is the matching "close brace" for the { following main() above. */

In the early part of the course, the C statements that comprise your program are listed between the { }’s in the body of the “main function”. Just as with JavaScript, these statements are executed one after the other from top to bottom.

Strings can be printed to “standard out” (which is the terminal window) using the function fputs. There are different functions to print different types of things like numbers, as we will see soon.

Also notice that every statement inside the main function MUST end with a semicolon. C ignores all code comments /* between slash-stars */ and it also treats all “whitespace” (spaces or newlines) as the same. So you need to put semicolons in to indicate where each statement ends. Even if it seems obvious to you from looking at the code, it’s not obvious to the computer!

1.2 Slightly longer example

Here is a slightly longer example that illustrates the main concepts from this unit. See if you can figure out what it does just by looking at the code. Feel free to copy to a text file, compile, and run it yourself!

#include "si204.h"

int main() {
  /* Read name into a variable */
  fputs("What is your name? ", stdout);
  cstring name;
  readstring(name, stdin);

  /* Read age into a variable */
  fputs("How old are you now? ", stdout);
  int age = readnum(stdin);

  /* Make a crucial decision */
  if (age >= 21) {
    fputs(name, stdout);
    fputs(" can... rent a car!\n", stdout);
  } 
  else {
    fputs("Sorry, no fun for ", stdout);
    fputs(name, stdout);
    fputs(" (yet).\n", stdout);
  }

  return 0;
}

This is a simple program, but there is a lot going on! For the first time, we have to save some pieces of information and retrieve them later. The variables name and age are used for that purpose. Also notice that they have different types: name is a cstring whereas age is an int. And finally, the program makes a decision based on whether the age is at least 21, using an if statement.

We’ll now get into all of those elements in greater detail.

2 Variables

Suppose I want to compute something like \((523 - 248)^3\). I can do this with C, but “^” doesn’t mean exponentiation in C, so instead I would need to write:

writenum((523 - 248) * (523 - 248) * (523 - 248), stdout);

This is a bit of a hassle. Normally we would think of first computing (523 - 248), then taking the resulting value and cubing it. Hopefully we think something like: “let x be (523 - 248) and compute x*x*x.” We need a variable in which to store the value (523 - 248)!

So we try creating the following program:

#include "si204.h"

int main() {
  x = 523 - 248;
  writenum(x*x*x, stdout);
  return 0;
}

When we compile this, we get an error message like

  error line 4: 'x' is an undeclared identifier

What the compiler is saying is this: “x??? you never told me there was going to be an x!” If you want to use x to store the value (523 - 248), you need to first tell the compiler that the name x is going to stand for a number - this is called declaring the variable x.

The statement int x; declares that x is a variable of type int, which means that it’s a variable that can stand for a (positive or negative) integer number. The type of a variable tells you what kind of data objects, such as an integer or real number or something else …, can be stored in the variable. There are a number of different types in C, and we will use many of them in this course.

The = operator assigns a value to the variable x. So to make the program above work, we just add a declaration of variable x of type int before the assignment:

#include "si204.h"

int main() {
  int x;                   // DECLARE x of type int
  x = 523 - 248;           // ASSIGN x to the difference 
  writenum(x*x*x, stdout); // PRINT x cubed
  fputs("\n", stdout);     // print a newline at the end
  return 0;                // success!
}

Once x has been declared as a variable of type int, it can be used in the program just like a normal number, either to print or to use in arithmetic expressions, like writenum(x*x*x,stdout), which does both.

What really goes on here is that space in your computer’s memory is allocated for one object of type int, and given the name x. When you ask for the value of x (by involving it in an arithmetic expression or by trying to print it) the computer fetches a copy of the value from memory. When you use the = operator, the computer takes the value on the right hand side, and copies it into the space reserved for x.

Thus, strange sequences like:

int x;
x = 3;
x = 200;

… make perfect sense to the compiler — even if the second line is a complete waste of time. After the statement int x;, space is reserved for x, though we have no idea what actual value is in there - at this point x is uninitialized. The statement x = 3; copies the value 3 into the space reserved for x. Finally, the statement x = 200; copies the value 200 into the space reserved for x, thus overwriting the 3 that had been there previously. Think of x as being the name of a box into which int values can be written.

2.1 Types

As you might have expected, int is not the only type of variable that we are allowed to have. Let’s look at a few more of the most important types that we can use in C. As the course goes on, we will see even more types, and we’ll even eventually see how to define types of our own! But for now, we’ll stick to int, double, char, and cstring:

Actually, there is one other type besides these four, which is the type of stdout and stdin. These are streams that you can read from or write to, but there’s not much more to say about them for a few more weeks. We’ll sort of ignore streams as types until there’s something more that we can do with them.

  • int

    The int stands for integers. This type is the real workhorse of computing. As with integers, there are no fractions, but unlike the mathematical integers there are limits on both the positive and negative ends (roughly plus or minus 2 billion for the 32-bit int on a typical PC today). If you just write a number like 65 in your program, which is called a literal, it is interpreted as an int by default.

    The operations for addition (+), subtraction (-), and multiplication (*) work pretty much as you’d expect.

    Division (/) with integers is a little funny, because the result of the division has to be an integer too. What happens is that any fractional remainder left over after the division is simply chopped off (truncated). For example,

    int x;
    x = 17 / 3;             /* x now equals 5, which is 17/3 rounded down. */
    writenum(1/2, stdout);  /* this will print 0 - surprise!! */

    This rounding down of so-called integer division is a common source of subtle program bugs, so watch out for it.

    A related, and very important, int operator is modulus, or %. This gives you the remainder when one integer is divided by another. For example, 17 % 3 equals 2, which is the remainder when 17 is divided by 3. Have a look at the Minutes and Seconds program to see an example of this in action.

  • double

    A double is a decimal number, which mathematicians would call a real number and which we might more accurately call a floating-point number.

    (The term “double” actually comes from “double-precision floating point”.)

    We saw that a literal number like 123 in your program automatically becomes an int, but if there’s a decimal place in the literal such as 4.97, that will be a double by default.

    Actually, doubles are only a “simulation” of real numbers because most numbers can only be approximated by a double, analogous to the way we approximate 1/3 as the 6-digit decimal 0.333333. We’d need an infinite number of digits to get 1/3 exactly, and in just the same way, the double type is limited to just approximating numbers like 1/3. The issue of how to get good and reliable answers when we can only approximate such numbers is the concern of numerical analysis, which is an important field on the border between mathematics and computer science.

    All of the basic math operations — including division / — work as you’d expect with doubles. There are also many other functions such as sqrt and log that you can compute with doubles that are part of the standard math library. Here is some online documentation on that library. To use it, you have to #include <math.h> at the top of your program.

  • char

    The char stands for character. These are letters or numbers, or stranger things … for example there is a “bell character”. Writing it causes your computer to beep! char literals are written inside single quotes, so for example the char a is written inside a program as 'a'. Unusual “characters”, like tabs or newlines, use “escape codes”. They are identified by a backslash. Tab is '\t', and newline is '\n'. Operations on char’s will be more important later on.

  • cstring

    The types we’ve seen so far (int, double, and char) are all built-in types. This means that they are part of the core language rather than part of some library. The last type we’ll talk about today, the type cstring, is not a built-in type; it is defined in our library si204.h. It is used for strings of characters, like we’ve already seen many times. For example "Hello World!", quotes and all, is a string literal.

    Because cstring is not a built-in type, none of the built-in operators like + or * work with strings. Instead, we have to use library functions! Some rather useful functions for strings can be used from the si204.h library, such as:

    • strcpy(s1, s2): sets string s1 equal to s2
    • strlen(s): returns the length of string s
    • strcmp(s1, s2): returns a number indicating the dictionary order between strings s1 and s2. A negative number from strcmp means that s1 comes before s2, a positive number means s1 comes after s2 alphabetically, and zero means the two strings are equal.

2.2 Variable Names

In some programming languages variables have to begin with funny characters. Such a character is usually referred to as a “sigil”. In bash scripting variables begin with a “$”, which is a common sigil. Perl has many sigils, including $, @ and %. For better or for worse, C doesn’t do that to us.
(Comic: “Sigil Cycle”, http://xkcd.com/1306/)

Variables in C consist of three things: a name, a type, and a value. We’ve just discussed the type int above, and the value of the variable is dependent on what the program computes or what the user types in, controlled of course according to the type. But what about the name? That’s up to us as programmers!

There are some rules about variable names in C. A variable name must begin with a letter (lowercase or uppercase) or an underscore ( _ ). It may be any length you want (although the compiler may ignore characters past a certain length; the C99 standard only guarantees that at least the first 31 count) but it can only contain letters, digits, and the underscore. Except in special situations, the use of the underscore to begin a variable name should be avoided. There is a special class of names called keywords, which are reserved for use by C, and you may not use one of them to name a variable. Examples of keywords are int and return. (A complete list can be found on this page.)

There are also names that you could choose for variables, but which are already used for important things. Examples of this are main and fputs. The problem with using such a name is that it creates ambiguity. For example, what would happen with the following:

int stdin; // bad choice for a variable name!!!
stdin = readnum(stdin);

As it turns out, the compiler will assume that both stdin’s refer to your new int and you won’t be able to use stdin for reading. As we proceed, it will (hopefully!) become obvious what cannot be used as variable names.

C distinguishes between uppercase and lowercase. As a result, Answer and answer will be considered different variable names. A very common mistake that beginning programmers make is to be sloppy in writing variable names, sometimes using capitals and sometimes not. It is not good programming practice to use two variable names that are spelled the same except for capitalization because it leads to errors. Your source code will be easier for mere mortals to understand (interpret this to mean the instructor grading your programs) if you use meaningful variable names.

3 Representing data in a computer

Note: I strongly recommend that you review the class on Digital Data from the si110 website. You are expected to understand about bits and bytes, binary-to-decimal and decimal-to-binary conversion, and how the ASCII table defines a mapping between characters and bytes. What’s here in the notes is just a brief overview of that. Here’s a link to a full ASCII table.

You’ve probably heard terms like bits and bytes used in connection with computers, and you’ve probably heard people say that inside a computer everything is 0’s and 1’s. If not, I’ll say it now: Inside a computer everything is 0’s and 1’s! (A bit is just a 0/1 value.) But how can all of these things — chars, ints, and doubles — be represented by zeros and ones? Our understanding of types will really depend on being able to answer these questions.

3.1 Binary numbers

First we’ll look at how 0’s and 1’s suffice to represent any integer number, then we’ll look at other types of objects. When we deal with numbers we use the decimal number system, i.e. the base 10 number system. This means that all our numbers (let’s look at non-negative integers for now) look like sequences of decimal digits, which are numbers in the range [0,9]. A number like 3027 is short-hand:

3027 → 3*10^3 + 0*10^2 + 2*10^1 + 7*10^0

Or, for another example,

1011 → 1*10^3 + 0*10^2 + 1*10^1 + 1*10^0

In the binary number system we have the same idea, but the base is now 2 rather than 10. So, binary digits are in the range [0,1], and now 1011 has a different interpretation. In binary it is short-hand for:

1011 → 1*2^3 + 0*2^2 + 1*2^1 + 1*2^0 = 2^3 + 2 + 1 = 11 (in decimal)

So, in binary the decimal number 11 is represented as 1011. The binary number 1001 = 2^3 + 1 = 9, for another example. With four bits, i.e. four binary digits, we can represent any number from 0 up to 15 (which is 2^3 + 2^2 + 2^1 + 2^0). With four decimal digits I can represent from 0 up to 9999, i.e. from 0 up to 10000 - 1. So we need more bits than decimal digits, but given enough bits we can represent any number we care to. Using k bits, we can represent the numbers from 0 up to 2^k - 1.

3.2 Bytes - How type depends on interpreting bit-sequences

The memory of a computer is simply one long sequence of bits. However, these bits are organized into chunks of 8 called bytes. To emphasize, a byte consists of 8 bits. In a byte, we can represent the numbers from 0 to 255.

The type char is one way of interpreting a byte of memory. For example, the byte 01100001 is interpreted as the character a. This interpretation of bytes as characters is called the ASCII encoding, and this table, for example, shows you the whole thing. Interpreting 01100001 as a number in binary, we get the number 97, and if you look up 97 in the table, you’ll see that it corresponds to the character a.

Already we see one of the fundamental ideas behind computing: different types of objects may be represented by treating sequences of 0’s and 1’s in different ways. That’s why C needs to keep track of the types of objects, so it knows how to interpret the contents of the chunk of memory associated with each object.

3.3 Other types

A full int on your PC consists of 4 bytes, or 32 bits, so it can represent pretty big numbers. We’re not going to get into the question of how negative numbers are represented in binary. Essentially an int looks like the binary number representation we just talked about, but in 32 bits.

Technically, the int 5 could be represented as

00000000 00000000 00000000 00000101

… or it could be represented as

00000101 00000000 00000000 00000000

… depending on what’s referred to as the “endianness” of the underlying machine. That particular distinction is beyond the scope of this course, but you will encounter it in subsequent CS/IT courses.

So, for our purposes, the int 5 is represented in the computer as:

00000000 00000000 00000000 00000101

… where I’ve broken things up into bytes to make it all a little clearer.

A double takes up 8 bytes, or 64 bits. The format is more complex, however, and we will not go over it here, except to say that it is a binary version of the familiar scientific notation. However, instead of a base of 10, it uses a base of two. (Example: 12 is represented as 1.5 x 2^3.) Let it suffice to say that the double 1.0 is represented by the following 64 bits:

00111111 11110000 00000000 00000000  00000000 00000000 00000000 00000000

There are many other numerical types that use more or fewer bits. For example, short int is a 16-bit integer, float is a 32-bit decimal number, and long long int is a 64-bit integer. But please forget about all that for now; we can safely stick to int and double for the entirety of this class!

4 Input/Output and Type Conversions

One thing we will want to do with every type is reading in and writing out, referred to commonly as I/O. Here are the functions provided by the si204.h library for input and output on each type:

  • Integers

    int x;
    x = readnum(stdin);
    writenum(x, stdout);
  • Doubles

    double x;
    x = readnum(stdin);
    writenum(x, stdout);
  • Characters

    char c;
    c = readchar(stdin);
    fputc(c, stdout);
  • Strings

    cstring s;
    readstring(s, stdin);
    fputs(s, stdout);

4.1 I/O Streams

We’ve already seen how to output information from a program using fputs and writenum. In C (and in many other places) we refer to an output stream, the idea being that each thing we write goes out sequentially in the order we write it.

In exactly the same way, we read from an input stream. Just as the standard output stream that prints to the terminal is called stdout, the standard input stream is called stdin.

Code

double x;
cstring s;
x = readnum(stdin);
readstring(s, stdin);

User Types

12.3
poptarts

Effect

x gets the value 12.3, s gets the value "poptarts".

Notice that the syntax of the readstring command is slightly different than readnum. That has to do with the fact that double is a built-in type whereas cstring is only available through the si204.h library.

Both readnum and readstring skip any whitespace (spaces, tabs, and newlines) before they start actually reading. A readstring command will read everything up to the next whitespace, whereas readnum will stop reading as soon as it sees anything that’s not part of a number (such as a letter or a comma).

Here’s a slightly more tricky example:

Code

double x;
cstring s;
double y;
x = readnum(stdin);
readstring(s, stdin);
y = readnum(stdin);

User Types

12.0  25.0
FUN

Effect

x gets the value 12.0, s gets the value "25.0" as a string, but reading y causes an error because the letter F isn’t part of a valid number.

Putting this together, we can construct a very simple program Addition Calculator, which reads in two numbers from the user and adds them together. Notice that the variable that contains the sum of the two numbers input by the user is actually called sum. This is just to enhance the readability of my code. I could’ve called the variable “George” and it would’ve worked just the same.

Let’s look at a more useful example. The following input:

Jones   3:25
Smith   4:11

seems to be spread over several lines and composed of different elements - numbers, strings, and characters. However, it is in fact just one long line of characters, and by reading data we move through this line of characters called an input stream. Suppose, for example, we ran the code

cstring str1, str2;
int m1, m2, s1, s2;
char c1, c2;
readstring(str1, stdin);
m1 = readnum(stdin);
c1 = readchar(stdin);
s1 = readnum(stdin);
readstring(str2, stdin);
m2 = readnum(stdin);
c2 = readchar(stdin);
s2 = readnum(stdin);

with the above as input. Then, because whitespace is skipped and readnum stops when it hits a non-numeric character, the values that end up in each of these variables are as follows:

Variable   Value
str1       "Jones"
m1         3
c1         ':'
s1         25
str2       "Smith"
m2         4
c2         ':'
s2         11

4.2 When types collide - conversion

You might have noticed that the I/O operations for ints and doubles are the same (at least in the si204.h library): readnum and writenum. But how can this be?

Well, technically readnum and writenum only operate on doubles. But since every int can also be represented as a double, the C compiler does the conversion back and forth for us automatically without us even noticing.

We can also practice these kinds of conversions ourselves:

int x;
double y;
x = 3.7;  // now x = 3, truncated
y = 10;   // now y = 10.0
x = y;    // now x = 10
x = 12.1; // now x = 12
y = x;    // y = 12.0, no truncation
y = x/5;  // tricky! y = 2.0, do you see why?

Although a char represents a single character like 'a' or '?', we know that these characters are actually represented by numbers in the range from 0 up to 127. When doing conversions with type char, they get treated as integers equal to their ASCII value. For example, calling

writenum('d', stdout);

would cause 100 to be printed, which is the ASCII value for a lowercase d.

Interestingly, C doesn’t know how to do arithmetic with char types directly, but it is happy to automatically convert them to ints and do integer arithmetic! This turns out to be convenient because the ASCII values are in a meaningful order. Consider 'b' - 'a' for example. Looking at the ASCII table, we see that 'b' corresponds to the number 98, and 'a' to the number 97. The C compiler treats this as the int subtraction problem 98 - 97, which evaluates to 1. In fact, the letters of the alphabet appear in order, so that a is 97, b is 98, …, z is 122.

Things get even more interesting when we try to do arithmetic with different types. What should it mean to multiply a double and an int? The compiler knows how to convert between these two types, but it only knows how to multiply one type by another number of the same type. So if we have code like 7 * 2.3, should it do the multiplication as two doubles or as two ints?

For arithmetic, types are always converted in the direction that gives the most precision - this is referred to as type promotion - which in this case means that the int is converted (or promoted) to a double, and the operation is performed on two doubles. It wouldn’t make nearly as much sense the other way round, would it?

This kind of type conversion is called implicit because it happens automatically behind the scenes, without you doing anything directly. We have seen it in assignment, in calling I/O routines, and in doing arithmetic. The C compiler always assumes (often wrongly!) that the programmer knows what they’re doing and really means what they write. So the compiler will implicitly convert whatever types it knows how in order to make your program work, using the rule of type promotion.

We also have explicit type conversion, where the programmer says exactly what the type should be. Suppose, for example, that m and n are positive ints, n being the larger. We’d like to print out the value of m/n. Well,

writenum(m / n, stdout);

will just print out zero! (Make sure you know why!) We’d like to get some fractional value, in other words, we’d like these values treated as doubles. To explicitly convert them to doubles first we’d write:

writenum((double)m / n, stdout);

The (double) before the value m indicates that we want to convert m to a double first, and then do the division. Now the compiler is dividing a double by an int, so it will promote the other argument, int n, to a double as well, before doing the actual division.

Some Quick Conversion Rules
int → double : This does exactly what you’d expect.
double → int : This simply truncates the value, meaning that whatever’s after the decimal point just gets chopped. You can get in trouble if the double value is too big.
int → char : if the int is in the range 0-127 the char value is determined by the ASCII table;
char → int : the int value is determined by the ASCII table;

4.3 Variable declaration/assignment shortcuts and good ideas

So far, we’ve been declaring all of our variables at the beginning of the main with statements such as

int x;
double y;

then assigning them values later, with statements like

x = 4;
y = 91.3;

In fact this corresponds to what really happens in the computer: when your main starts, space is reserved in memory (allocated) to store x and y. Notice that the compiler really needs to know the types here so it knows how many bytes to allocate for each one! In this case, you would get 4 bytes for x and 8 bytes for y.

But there’s a potential pitfall here, which is that you can access the variables before they are ever assigned a value. For example, the following program:

int x;
writenum(x, stdout); // what will happen here??
x = readnum(stdin);
writenum(x, stdout); // now we know what x will be

This program will actually compile (although it should generate a warning message), but we can’t say what will get printed on the second line. It just depends on whatever happens to be sitting around in your computer’s memory wherever those 4 bytes for x are allocated. It will most likely be 0, but we really can’t say! It’s what’s called undefined behavior — and it means that our program has a bug, a mistake.

As we get into more complicated programs, this kind of mistake is actually a pretty common one to make. So I recommend that you save yourself from undefined behavior and just assign values to variables as soon as they are declared, like so:

int x;
x = 0;
/* ... the rest of your program ... */

In fact, there’s a shortcut to doing this in C, which is that you can declare and assign all in the same line, like so:

int x = 0; // wow, how convenient!
/* ... the rest of your program ... */

However, it’s important to remember that this is really doing two things at once, declaring and assigning.

Another shortcut is that you can declare (and optionally assign) multiple variables all on the same line, separated by commas, as long as they have the same type, such as:

int x, y=3, z=12;
// x is an int that is uninitialized
// y is an int with value 3
// z is an int with value 12

I personally (Dr. Roche) don’t like doing multiple declarations on one line, because of some things that show up later when we get to arrays. But there are other good reasons to program this way also; it’s a personal choice.

I want to emphasize that all of the above about whether to assign right after you declare, and whether to do it on one line or as two separate statements, are examples of some of the options we have when programming. There are always multiple correct ways to write a program, and we as programmers get to decide exactly how we want to do it. Sometimes there isn’t just one “best” way, just some good ways to do it (and usually some bad ways to do it too!). Even the examples on this website are just showing you one good way of solving the problem; it doesn’t mean that there aren’t other equally good solutions as well!

5 Expressions and Statements

Almost everything in a C program is an expression. An expression is a textual piece of source code that can be evaluated to produce an object that has a type and value. The most familiar expressions are arithmetical. For example, if k is an int that’s been assigned the value 4, (k - 2)*7 is an expression of type int and value 14.

Often the value of an expression cannot be determined until the program is run. For example, in the code below

int k = readnum(stdin);
writenum(k + 7, stdout);

… what is the value of the expression k+7? Well, it depends what the user enters for input. That’s something that cannot be known until the program is run. However, the type of the expression k+7 is something we do know before running the program: it’s int. This is always true in the C programming language, and because of it C is said to be a “statically-typed” language.

The library calls we have been making to do I/O are also expressions. For example, readchar(stdin) is an expression of type char. However, note that some library calls, such as writenum(5,stdout), don’t have any type. It depends on the function!

Even assignments in C, such as k = 5, are technically expressions as well. In this case, the type of k=5 is the same as the type of k (int in this case), and the value is the same as whatever it gets assigned to (5 in this case). But usually it’s a good idea not to use assignments as expressions, and just let them stand by themselves. That’s not because it’s incorrect or an error to use assignments as expressions, but it’s something that is easy to get confused about (or to confuse others who may read your code!).

At this point it may seem like everything’s an expression, but that’s not true. For example, anything with a ‘;’ (semicolon) at the end is a statement, not an expression. So while k = 4 is an expression (for example, (k=4)*3 evaluates to 12), the statement k = 4; as a line of code ending in a semicolon is not an expression. Declarations of variables, something like int k for example, are not expressions - regardless of whether the ; is there. Still, most things are expressions, and understanding this fact and being able to identify the types and values of expressions are key to understanding C … and most any other programming language.

5.1 A Note on expressions, precedence and associativity

What happens when an expression like 2 + 3 * 5 is evaluated? Do I get 17 or 25? Well, your math classes should have taught you that 17 is the answer, and indeed that’s true in C as well.

When you have two different operators in an expression and parenthesization does not tell you which operation is performed first, the relative precedence of operators is what determines which operation is performed first.

Since * has a higher precedence than +, the expression 2 + 3 * 5 is evaluated like 2 + (3 * 5). But what about when both operators are the same, or both have the same precedence? What happens with a * b * c?

When you have two identical operators in an expression (or two different operators with the same precedence) and parenthesization does not tell you which operation is performed first, the associativity of the operator(s) is what determines which operation is performed first.

The associativity of * and / (which both have the same precedence) is left-to-right, so a * b * c is evaluated as (a * b) * c. This can matter in C. (For example, what does 3 / 4 * 1.5 evaluate to?) However …

Always use parentheses rather than relying on subtle precedence and associativity rules!

This table lists the operators and their associativities. They are grouped together on lines with operators of the same precedence, and the lines go from highest precedence at the top, to lowest at the bottom. You should know about precedence and associativity, and you should be able to use tables like this to fix precedence and associativity related bugs, but rely on parentheses when you’re unsure.

Associativity matters: Associativity makes expressions like a = b = 0 do what you want, which is to set both a and b to 0. Unlike * and /, though, = is right-to-left associative, so we get a = (b = 0). The key here is that an assignment expression evaluates to the value of the left-hand-side object after the assignment. So b gets assigned value zero, the expression has value zero, and then that’s what is assigned to a.

6 Branching

If you can wait and not be tired by waiting,
  Or being lied about, don’t deal in lies,

Or being hated, don’t give way to hating,
 And yet don’t look too good, nor talk too wise …


- Rudyard Kipling, “If”

Now we’re ready to learn about a very powerful construct in any programming language, the if statement.

The ability to make decisions and react to different input is a key part of the power of computers. For example, we would certainly expect it to be possible to write a program that reads in a number and writes “even” if the number is even, and “odd” if the number is odd. In C (as in English!) “if” is the key to expressing this.

int k;
k = readnum(stdin);
if (k is even) {
  fputs("even\n", stdout);
} else {
  fputs("odd\n", stdout);
}

Of course we’ve got to figure out some C that will do the “k is even” for us. What’s inside the ()’s in an if statement is called the test condition, and it needs to be an expression that evaluates to a number. If the test condition is not equal to zero, then the first block of code (code surrounded by {}’s forms a block) is executed. Otherwise the block following the else is executed.

A number is even if 2 divides it evenly, i.e. if its remainder when divided by 2 is 0. So for k to be even, k % 2 must be zero. We can test this using the == operator. A single “=” in C is used for assigning values to variables, whereas a double “=” (i.e. ==) is used to test whether two values are equal. A “==” expression evaluates to an int value that is either 1 if the two values are equal, or 0 if they are not equal. In general, the C language specifies that 0 can be used to indicate something is “false”, whereas any number other than 0 can be used to indicate something is “true”. (And 1 is the most convenient number other than 0!)

Thus (k % 2) == 0 is the test condition we need to replace “k is even” in the code above and make it work.

6.1 Scope and Blocks

We might consider solving the above problem in a slightly different way: We’ll assign a variable the value "even" if k is even and "odd" otherwise. Then, after the if-statement, we’ll do the printing. We might implement it like this:

if ((k % 2) == 0) {
  cstring s;
  strcpy(s, "even");
}
else {
  cstring s;
  strcpy(s, "odd");
}

fputs(s, stdout);
fputs("\n", stdout);

However, when we try to compile this the compiler complains that s is an “undeclared identifier”, which is exactly what it would say if we’d never defined s at all! Well, as far as the compiler is concerned when it processes the “fputs(s, stdout)” statement, we haven’t defined s. The problem is caused by the scope of variables in C.

In C, a variable only exists from the point at which it is declared to the } that closes off the innermost block in which it was declared. So the s that we define inside the else-block is invisible, is unknown, does not exist outside of that else-block. In particular, this is true for our fputs-statement. The scope of a variable is the portion of the code that “sees” the variable, i.e. that knows it exists. The scope of a variable ends with the innermost block in which it was defined.

To fix up this version of our even/odd program, we simply need to move the declaration of s outside of the if/else-blocks so that its scope extends to the fputs statement. This’ll work:

cstring s;
if ((k % 2) == 0) {
  strcpy(s, "even");
}
else {
  strcpy(s, "odd");
}

fputs(s, stdout);
fputs("\n", stdout);

Code in between {}’s forms a block. So the if is followed by a block (the then block) and the else is followed by a block (the else block). You’ve already seen one example of a block: The block following main(). Anything you can write inside of the main block, you can write inside of any block. This means that our then blocks and else blocks can declare variables, read input, make assignments to variables… anything. So, suppose I wanted to write a program that would compute areas of circles and triangles for the user, first asking the user which kind of object she was interested in. My program might look something like:

int main() {
  // Read type of object: circle or triangle
  cstring s;
  fputs("Do you have a circle or a triangle? ", stdout);
  readstring(s, stdin);

  // declare variable for the area in the main scope
  double area;

  // remember, strcmp(s1,s2) returns 0 if s1 and s2 are the same
  if (strcmp(s, "circle") == 0) {
    // Compute area of circle
  }
  else {
    // Compute area of triangle
  }

  // print out the computed area
  fputs("The area is: ", stdout);
  writenum(area, stdout);
  fputs("\n", stdout);

  return 0;
}

… where we’d have to fill in the then-block with code that gets the radius of the circle and computes its area, and we’d have to fill in the else-block with code that reads in the base and height from the user and computes the triangle’s area. The printing happens after the if statement, using the area variable declared in the main block.

Code for circle area

double Pi = 3.14159265358979324;
double radius;

fputs("Radius of the circle: ", stdout);
radius = readnum(stdin);

area = Pi * radius * radius;

Code for triangle area

double base;
double height;

fputs("Base of the triangle: ", stdout);
base = readnum(stdin);

fputs("Height of the triangle: ", stdout);
height = readnum(stdin);

area = .5 * base * height;

Each of these “miniprograms” can be placed in its appropriate block, and we get the whole program.

6.2 Boolean Operators

The “==” is an example of a comparison operator, also called a relational operator. Relational operators compare their left and right-hand arguments, and evaluate to 1 for true or 0 for false. They are:

== (equal)
!= (not equal)
<  (less than)
>  (greater than)
<= (less than or equal to)
>= (greater than or equal to)

As you can see from the operator precedence table, they have lower precedence than the arithmetic operators, so things like 2*k > k + 5 do what we’d like them to do - they evaluate the arithmetic expressions on the left and right, then they apply the > operator to compare the two values. This means, for example, that instead of writing ((k % 2) == 0) we could write (k % 2 == 0) and get the same result.

So what happens when we compare objects of different type? For example, what happens with k == x, where k is an int, and x is a double? The answer is that the same automatic type conversions are applied as in arithmetic expressions. So, k is implicitly converted to a double, and this double value is compared to x. Thus, 5 == 5.2 evaluates to 0, since 5 is converted to 5.0 prior to the comparison, and we end up actually doing 5.0 == 5.2.

Consider a code fragment that reads in a value from the user and prints the “arcsine” or inverse sine of the value. We’ll use the asin function from the math.h library to compute our value. Now, the arcsine is only defined for values between negative 1 and 1, so if the user enters a value outside of this range we should print an error message. Doing this with what we know so far is ugly:

if (x <= 1.0) {
  if (x >= -1.0) {
    writenum(asin(x), stdout);
    fputs("\n", stdout);
  }
  else {
    fputs("Error! Value outside of [-1,1]\n", stdout);
  }
} else {
  fputs("Error! Value outside of [-1,1]\n", stdout);
}

We want to be able to say in our test condition that x must be less than or equal to 1 AND greater than or equal to -1. In C the operator && is “and”. So our program fragment becomes:

if ((x <= 1.0) && (x >= -1.0)) {
  writenum(asin(x), stdout);
  fputs("\n", stdout);
} else {
  fputs("Error! Value outside of [-1,1]\n", stdout);
}

… which is substantially simpler. More to the point, this does a much better job of reflecting what we’re thinking.

Now at first glance, && looks like something mysterious and new. It’s not. It’s an operator just like +, *, and >, all of which we’ve used a lot already. It takes two objects of type int and evaluates to an object of type int. We have an intuitive idea of what “and” means, and it coincides with C’s technical, exact definition of what && means:

a        b        a && b
zero     zero     0
zero     nonzero  0
nonzero  zero     0
nonzero  nonzero  1

So, something like (x <= 1.0) && (x >= -1.0) makes sense because x <= 1.0 evaluates to an int that is 1 or 0 depending on the comparison, and x >= -1.0 also evaluates to 1 or 0 depending on the second comparison, and then the && operator combines those two comparison results.

There are three boolean operators, the now familiar && (and), the || operator (or), and the ! (not) operator. The || operator is much like &&: it takes two ints and evaluates to 1 if either of those ints is nonzero, or to 0 if both of the ints are 0.

This definition of “or” for the || operator is not quite what we sometimes mean in English by saying “or”. The || operator is what’s known as an inclusive or, meaning that if both parts are true (nonzero), then the || also evaluates to true (1). In English, if you said “I’m eating pizza or a burrito”, no one would expect you to be eating both, but in C that’s very possible!

The ! operator is a unary operator. Instead of operating on left and right-hand values, it operates on a single value - the value following it on the right. It evaluates to 0 if the number following it is nonzero, and it evaluates to 1 if the number following it is 0.

Things get especially interesting when you combine boolean operators. For example: Suppose you want to write a program that reads a character from the user and prints “Letter” or “Not a Letter” depending on whether or not the user entered a character that’s a letter. Your program will, more or less, look like this:

  // Read char
  char c;
  fputs("Enter a letter: ", stdout);
  c = readchar(stdin);

  // Decide: Letter or not a letter?
  if (c is a letter) {
    fputs("Letter\n", stdout);
  } else {
    fputs("Not a letter\n", stdout);
  }

But, of course, the problem is the test condition c is a letter. Letters come in two flavors, uppercase and lowercase. So a refinement of our if statement would be:

  // Decide: Letter or not a letter?
  if ((c is an uppercase letter) || (c is a lowercase letter)) {
    fputs("Letter\n", stdout);
  } else {
    fputs("Not a letter\n", stdout);
  }

From any ASCII table we would see that uppercase letters range from 'A' (65) to 'Z' (90). So c is an uppercase letter boils down to (c >= 'A' && c <= 'Z'). Similarly, because lowercase letters range from 'a' (97) up to 'z' (122), we know that c is a lowercase letter boils down to (c >= 'a' && c <= 'z'). Put it all together and we get:

  // Decide: Letter or not a letter?
  if ((c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z')) {
    fputs("Letter\n", stdout);
  } else {
    fputs("Not a letter\n", stdout);
  }

The binary logical operators && and || follow short circuit evaluation rules. This means that the left-hand operand expression is evaluated first, and if the truth or falsity of the expression can be determined based only on the left-hand side, the right-hand operand expression is never even evaluated. So, for example, in the expression:

(x >= -1) && (x <= 1)

if x is, say, -5, then the first part of the &&, the expression (x >= -1), evaluates to 0. Since 0 && ANYTHING will always evaluate to 0, the right-hand side (x <= 1) is never actually evaluated.

Right now this is more of a curiosity than anything. However, later in the semester, this behavior becomes very important.

Note that in the letter test above I wrapped my uppercase and lowercase tests in parentheses, because I wanted to be sure that the &&’s were evaluated before the ||. Is that necessary? (Think about how you would decide this!) Even if it isn’t, adding the ()’s makes the meaning clear to everyone.

Also observe, we could have used the ASCII integer values like 65 for 'A' or 90 for 'Z' in the code above, but that would make it much less clear, plus we have to make sure to copy from the ASCII table correctly! When you can, it’s always better to use the character literals like 'A' and 'Z' rather than the integer values from the ASCII table.

6.3 If statement shortcuts

Dropping the Else

Sometimes you have a condition which, if it’s true, should cause you to do some extra work, but which, if it’s false, should have no effect on the program. For example, suppose we read an int from the user and we want to change it to its absolute value and print it out. We’d probably write something like this:

int k;
k = readnum(stdin);

if (k < 0) {
  k = -1*k;
} else {
  // nothing to do here!
}

writenum(k, stdout);
fputs("\n", stdout);

When there’s nothing to do in the else-block, we can simply drop it. (Sometimes your if-statement is written so that the then-block is empty and the else-block isn’t. You can’t simply drop the then-block, so first rewrite your if-statement with a new condition so that the then-block contains the work.)

int k;
k = readnum(stdin);

if (k < 0) {
  k = -1*k;
}

writenum(k, stdout);
fputs("\n", stdout);

Single-statement block

If your block is only a single statement, you can drop the ‘{’ and ‘}’ characters, like so:

if (readnum(stdin) > 100)
  fputs("You entered a big number.\n", stdout);
else
  fputs("You entered a small number.\n", stdout);

However, this shortcut is not recommended by us (Dr. Roche and Dr. Albing). The reason is, you are likely to add another line to one of those parts, with the same indentation, and think that it’s part of the same then-block or else-block, but it really isn’t since each block is a single statement only! Trust us, writing those two extra ‘{’ and ‘}’ characters isn’t that much effort, and it makes your code more bullet-proof to future bugs.

The else if

However, the single-statement block does allow us to simplify code in one very common situation, where you have a number of conditions to check and only one of the corresponding blocks should be executed.

For example, let’s say you want to output who won an important football game. Here’s how we might do it right now:

int eagles = 4;
int cowboys = 3;

if (eagles > cowboys) {
  fputs("Eagles win, hooray!\n", stdout);
}
else {
  if (cowboys > eagles) {
    fputs("Cowboys win, boo hiss\n", stdout);
  } else {
    fputs("Tie game, boring\n", stdout);
  }
}

But this is kind of bad looking, and it will get worse and worse (with more and more levels of indentation) if we have more cases than just three. Well, since an if is considered a “single statement”, we can drop the { and } around the second big else block in the code above, like so:

if (eagles > cowboys) {
  fputs("Eagles win, hooray!\n", stdout);
}
else
  if (cowboys > eagles) {
    fputs("Cowboys win, boo hiss\n", stdout);
  } else {
    fputs("Tie game, boring\n", stdout);
  }

Hmm, that still looks bad (worse, actually), but if we move around the spaces a little, it looks very nice:

if (eagles > cowboys) {
  fputs("Eagles win, hooray!\n", stdout);
} else if (cowboys > eagles) {
  fputs("Cowboys win, boo hiss\n", stdout);
} else {
  fputs("Tie game, boring\n", stdout);
}

This is called an “else-if” and it’s a very useful construction. We can add more “else if” cases if we want and they will be checked one at a time until you reach the last “else”, or until one of the conditions is true.

For an even better example of the way this makes code more readable, look at this example of converting a date. If we didn’t use else if for the months checking, it would end up being a nested if that goes 12 levels deep!

7 Switch statements (optional)

This section is optional reading if you want to learn more about C programming. You won’t be required to understand switch statements or to use them in your code. But if you read a lot of C code, you will probably come across switch sooner or later, so you might want to know about it. (You are also free to use switch yourself if you think there’s a good reason to do so.)

A switch statement is an alternative way to write a long if / else if / else statement when all of your conditions have the same form: each one compares the same variable or expression against a different constant. The hitch is that the variable pretty much needs to be an int or a char, which limits when switch can be used. Switch breaks things up into cases. You write

switch(expr) {

… where expr is an expression of type int or char, and then list cases consisting of possible values of expr. These cases must be constants, and each case is followed by a :, then a sequence of statements to be executed, and finally a break;. In other words, each case looks like this:

case constk:
  stmt1;
  stmt2;
  ...
  stmtr;
  break;

You can list as many of these cases as you want. You can also put

default:
  stmt1;
  stmt2;
  ...
  stmtr;
  break;

as one “case”. This is a catch-all that catches every situation in which expr didn’t match one of the other cases, similar to an else condition.

For example, the following two blocks of code are equivalent:

int year;
fputs("Which year are you? ", stdout);
year = readint(stdin);

if (year == 1) {
  fputs("Hi there, Plebe.\n", stdout);
} else if (year == 2) {
  fputs("Hello, Youngster.\n", stdout);
} else if (year == 3) {
  fputs("nameless...\n", stdout);
} else if (year == 4) {
  fputs("Howdy, Firstie.\n", stdout);
} else {
  fputs("Not at USNA!\n", stdout);
}

int year;
fputs("Which year are you? ", stdout);
year = readint(stdin);

switch(year) {
  case 1:
    fputs("Hi there, Plebe.\n", stdout);
    break;
  case 2:
    fputs("Hello, Youngster.\n", stdout);
    break;
  case 3:
    fputs("nameless...\n", stdout);
    break;
  case 4:
    fputs("Howdy, Firstie.\n", stdout);
    break;
  default:
    fputs("Not at USNA!\n", stdout);
    break;
}

Here’s another example program using switch. It reads a date in “mm/dd/yyyy” format and returns the date in “dd monthname, yyyy” format.

8 Practice Problems and Questions

Here is a list of problems and solutions. With even the very basic construction of input, output, variables, and expressions, we can write some useful programs.

Do You Know …

  • The difference between an “object” and a “type”?
  • Why 1/2 is 0 in C?
  • What is a literal?
  • What’s the difference between associativity and precedence?
  • What do we mean by a variable’s “scope”?
  • What is the distinction between a statement and an expression?
  • What are the comparison operators and how do they work?
  • How about the logical boolean operators?