IC210: Types & Expressions III

Reading

Appendix 3 of Problem Solving with C++ has the ASCII tasble.

Everything's an expression ... almost

Almost everything in a C++ program is an expression. An expression is textual piece of source code that can be evaluated to produce an object that has a type and value. The most familiar expressions are arithmetical. For example, if k is an int that's been assigned the value 4, (k - 2)*7 is an expression of type int and value 14.

Often the value of an expression cannot be determined until the program is run. For example, in the code below

int k;
cin >> k;
cout << k+7;

... what is the value of the expressioon k+1? Well, it depends what the user enters for input. That's something that cannot be known until the program is run. However, the type of the expression k+1 is something we do know before running the program: it's int. Most of the time (especially for the subset of C++ we concentrate on) the type of an expression is known at compile time, i.e. when compiling the program, as opposed to running it. Because of this, C++ is said to be a "statically typed" language.

One example of something that is less obviously an expression is k = 5, where k is once again a variable of type int. The type of this expression is int, and the value is 5. In general, an assignment expression has the same type as the object being assigned to, and the same value as that object after the assignment is carried out. Thus, oddly enough, if x is a variable of type double, then (x = 3.4)*2.0 makes perfect sense, it is an expression of type double, and after it is evaluated, it has value 6.8. This is our first explicit example of a side effect. The expression has type and value, but additionally it has the side effect that evaluating the expression changes the value of the variable x.

Another example of something that probably doesn't seem like an expression but is in fact, is cout << x. As we saw last lecture, cout is an object of type ostream. The expression cout << x also has type ostream, and its value is just cout (like multiplying zero by anything, you still get zero). However, there is a side-effect to evaluating this expression, namely that x gets written out to the screen!

What's not an expression?

At this point it may seem like everything's an expression, but that's not true. For example, anything with a ';' (semicolon) at the end is a statement, not an expression. So while k = 4 is an expression as used in (k=4)*3 will evaluate to 12, the statement k = 4; as a line of code ending in a semicolon is not an expression. Declarations of variables, something like int k for example, are not expressions - regardless of whether the ; is there. Still, most things are expressions, and understanding this fact and being able to identify the types and values of expressions are key to understanding C++ ... and most any other programming language.

When types collide - conversion

Things get interesting when expressions involve different types. For example, what is the type and value of the expression x*k, where x is of type double with value 3.3, and k is of type int with value -2? The answer is type double and value -6.6. The explanation is this:

C++ knows how multiply two int objects, and it knows how to multiply two double objects, it doesn't know how to multiply one of each. However, it understands that an int can be converted to a double and vice versa. So it converts one and performs the multiplication on two objects of the same type. But which way should it go? For arithmetic, types are always converted in the direction that gives the most precision - this is referred to as type promotion - which in this case means that the int is converted (or promoted) to a double, and the operation is performed on two doubles. It wouldn't make nearly as much sense the other way round, would it?

This implicit type conversion (implicit meaning that it happens automatically behind the scenes, without you doing anything directly) happens in other cases. The only one affecting us right now is assignments. You can assign an object of one type to an object of a different type, as long as C++ knows how to do the conversion. If it doesn't, the compiler will let you know. So, for example, x is of type double with value 3.3, and k is of type int, then k = x is an expression of type int with value 3. C++ truncates doubles when converting to ints.

Another way to do type conversion amongst the built-in types (int, double, char, bool) is to follow the C-language sytax, which is to preface the expression you are converting with the new type in parentheses. For example, if k is an int and you'd like to convert it to the equivalent double, you'd write:

(double)k

which would "cast" k to type double. C++ actually adds functionality to C, i.e. it is literally C plus some other stuff. So C constructs like (double)k all work.

We also have explicit type conversion. Suppose, for example, that m and n are ints, n being the larger. We'd like to print out the value of m/n. Well,

cout << m/n << endl;

will just print out zero! (Make sure you know why!) We'd like to get some fractional value, in other words, we'd like these values treated as doubles. To explicitly convert them to doubles first we'd write:

cout << double(m)/double(n) << endl;

[Challenge: can you explain what type and value you get with m / double(n)?] Explicit conversion can get tricky later on, but at this stage it's as simple as writing the new type name followed by the old object in ()'s.

Some Quick Conversion Rules

int → double : This does exactly what you'd expect.

double → int : This simply truncates the value, meaning that whatever's after the decimal point just gets chopped. You can get in trouble if the double value is too big.

bool → int : true goes to 1, false goes to 0.

int → bool : 0 goes to false, everything else goes to true;

int → char : if the int is in the range 0-127 the char value is determined by the ASCII table;

char → int : the int value is determined by the ASCII table;

Representing data in a computer

Note: I strongly recommend that you review the class on Digital Data from the si110 website. You are expected to understand about bits and bytes, binary-to-decimal and decimal-to-binary conversion, and how the ASCII table defines a mapping between characters and bytes. What's here in the notes is just a brief overview of that. Here's a link to a full ASCII table.

You've probably heard terms like bits and bytes used in connection with computers, and you've probably heard people say that inside a computer everything is 0's and 1's. If not, I'll say it now: Inside a computer everything is 0's and 1's! (A bit is just a 0/1 value.) But how can all of these things - chars, ints, bools, and doubles - be represented by zeros and ones? Our understanding of types will really depend on being able to answer these questions.

Binary numbers

First we'll look how 0's and 1's suffice to represent any integer number, then we'll look at other types of objects. When we deal with numbers we use the decimal number system, i.e. the base 10 number system. This means that all our numbers (lets look at non-negative integers for now) look like sequences of decimal digits, which are numbers in the range [0,9]. A number like 3027 is short-hand:

3027 → 3*10^3 + 0*10^2 + 2*10^1 + 7*10^0

Or, for another example,

1011 → 1*10^3 + 0*10^2 + 1*10^1 + 1*10^0

In the binary number system we have the same idea, but the base is now 2 rather than 10. So, binary digits are in the range [0,1], and now 1011 has a different interpretation. In binary it is short-hand for:

1011 → 1*2^3 + 0*2^2 + 1*2^1 + 1*2^0 = 2^3 + 2 + 1 = 11 (in decimal)

So, in binary the decimal number 11 is represented as 1011. The binary number 1001 = 2^3 + 1 = 9, for another example. With four bits, i.e. four binary digits, we can represent any number from 0 up to 15 (which is 2^3 + 2^2 + 2^1 + 2^0). With four decimal digits I can represent from 0 up to 9999, i.e. from 0 up to 10000 - 1. So we need more bits than decimal digits, but given enough bits we can represent any number we care to. Using k-bits, we can represent the numbers from 0 up to 2^k - 1.

Bytes - How type depends on interpreting bit-sequences

The memory of a computer is simply one long sequence of bits. However, these bits are organized into chunks of 8 called bytes. To emphasize, a byte consists of 8-bits. In a byte, we can represent the numbers from 0 to 255.

The type bool is just a way of interpreting a byte of memory. If all 8 bits of the byte are zero, the interpretation as a bool is false. Otherwise, the interpretation of a bool is true.

The type char is just a different way of interpreting a byte of memory! For example, the byte 01100001 is interpreted as the character a. This intepretation of bytes as characters is called the ASCII encoding, and this table, for example, shows you the whole thing. Interpreting 01100001 as a number in binary, we get the number 97, and if you look up 97 in the table, you'll see that it corresponds to the character a.

Already we see one of the fundamental ideas behind computing, different types of objects may be represented by treating sequences of 0's and 1's in different ways. That's why C++ needs to keep track of the types of objects, so it knows how to interpret the contents of the chunk of memory associated with each object.

A note on `char`s

In fact, you can look at a char as just being a small integer (I say small because 8-bits only allows us the range [0,255]). This interpretation pretty much tells us what to expect of conversions between chars and ints. One interesting feature of this match-up between characters and numbers is that statements like 'b' - 'a' make perfect sense. Looking at the ASCII table, we see that 'b' corresponds to the number 98, and 'a' to the number 97. So C++ treats this as the int subtraction problem 98 - 97, which evaluates to 1. In fact, the letters of the alphabet appear in order, so that a is 97, b is 98, ..., z is 122. So, char('b' + 3) is the character e.

Other types

A full int on your PC consists of 4 bytes, or 32 bits, so it can represent very large numbers. We're not going to get into the question of how negative numbers are represented in binary. Essentially an int looks like the binary number representation we just talked about, but in 32 bits.

Technically, the int 5 could be represented as

00000000 00000000 00000000 00000101

... or it could be represented as

00000101 00000000 00000000 00000000

... depending on what's referred to as as the "endianness" of the underlying machine. That particular distinction is beyond the scope of this course, but you will encounter it in subsequent CS/IT course.

So, The int 5 is represented in the computer as:

00000000 00000000 00000000 00000101

... where I've broken things up into bytes to make it all a little clearer.

A double takes up 8 bytes, or 64 bits. The format is more complex, however, and we will not go over it here, except to say that it is a binary version of the familiar scientific notation. However, instead of a base of 10, it uses a base of two. (Example: 12 is represented as 1.5 x 2^3.) Let it suffice to say that the double 1.0 is represented by the following 64 bits:

00000011 11111111 11111111 00000000  00000000 00000000 00000000 00000000

Problems

Converting radians to degrees, minutes, and seconds. This requires conversions between types! See if you can identify where they happen!
Ceasar Shift Encryption The Ceasar Shift is an early method of encrypted communication. This program implements it for 4-letter messages. It requires understanding the mod operator (%), and the correspondence between ASCII encoded characters and numbers.
GPA Calculator