Class 4: A C Primer Part I

Reading: APUE, Section 5.11 on printf and scanf gives all the possible placeholders for the various data types. Don't worry about sprintf or sscanf.
Homework: Print out the homework and answer the questions on that paper.

Why are we learning C?

In this course, all of our programming will be in C. Why? Reasonable question, given you just spent a semester learning C++. Here's the short version: UNIX and C share a common history, operating systems are still written in C (even Windows XP), and so are many systems programs. The operating system functions we'll call and the structures we'll manipulate are all C ... so to understand them we'll have to understand the C way of doing things.

In fact, C is to the current world of programming as English is to the current world: it's the language everyone is expected to know ... at least well enough to get around an airport. Ironically, it is the lingua franca of the modern programming world. One of the reasons is that C maps easily to machine instructions, so you can use C syntax to describe high-level programming or even to describe what's going on at pretty near the assembly language level.

What's the difference between C and C++?

C and C++ are closely related: C is (very nearly) a subset of C++, so as you might expect there are some features of C++ that aren't available in C.

Things that are identical, or nearly so (this is not an exhaustive list):

Conditionals (if, if-else)
The main() function
Intrinsic types: int, float, double, char ... (but not string, which is not intrinsic in C++, but rather defined in a library!)
Naming rules
Syntax (with a few exceptions)
Arrays

Things that are different (also not an exhaustive list):

I/O (i.e. input/output): no cin/cout/ifstream/ofstream ...
Strings: no string type
Namespaces: no namespaces
Object-orientation: C is strictly procedural: no classes, inheritance, constructors, etc.
Structs: the syntax of structs is slightly different
Functions: no pass-by-reference, no function overloading
Variable scopes: Under the original C standard (C89/C90), code like "for(int i = 0; ...)" is not allowed; the variable "i" must be declared before the for-loop. (C99 and later do allow it.)

Differences we'll focus on in this course (you'll need to know and understand them):

Output to the screen, and getting user input
File (and file-like) operations
Dynamic memory allocation
Pointer parameters (*) vs. reference parameters (&)
C style struct definitions

Basic I/O With fprintf and fscanf

A simple C program and its C++ equivalent

/* This program reads in int's and prints out
   the average of the values read in. */
#include <stdio.h>

int main()
{
  double total = 0.0;
  int count = 0, next;
  while(fscanf(stdin,"%i",&next) == 1)
  {
    total += next;
    ++count;
  }
  if (count == 0)
  {
    fprintf(stderr,"Error!  No data entered!\n");
    return 1;
  }
  else
  {
    fprintf(stdout,"%f\n",total/count);
    return 0;
  }
}

/* This program reads in int's and prints out
   the average of the values read in. */
#include <iostream>
using namespace std;

int main()
{
  double total = 0.0;
  int count = 0, next;
  while(cin >> next)
  {
    total += next;
    ++count;
  }
  if (count == 0)
  {
    cerr << "Error!  No data entered!" << endl;
    return 1;
  }
  else
  {
    cout << total/count << endl;
    return 0;
  }
}

I/O is done in a fundamentally different manner. First, I/O is done with function calls rather than <</>>-expressions. Second, instead of the system inferring how to interpret input characters or format output characters based on the types of the objects you read/write, the programmer explicitly states in the I/O function calls what interpretation/formatting of characters should be used. Thus, instead of cout << 3.5/6, where the system realizes that 3.5/6 is a double and prints it out as such, we have fprintf(stdout,"%f",3.5/6), where the "%f" tells fprintf that 3.5/6 should be printed out as a double. This example also shows the difference between C++ istreams and ostreams (like cout) and C "file streams": stdout is the stream associated with standard output. The C I/O library is stdio.h, so instead of "#include <iostream>", C programs that do I/O have:

#include <stdio.h>

In C you always have the three streams

stdin
stdout
stderr

which, of course, correspond to standard in, standard out, and standard error (just like cin, cout and cerr in C++). When you open up files, for example, you create more streams, and the first argument for most I/O routines is the stream the operation will act on.

Output with fprintf
When it comes to producing formatted output without a lot of pain, accept no substitute--fprintf is the function to use. The general format is:

int fprintf(FILE *, conversion string, list of arguments);

You must become comfortable with using conversion specifiers like %d, %s, and %lf. Section 5.11 of Advanced Programming in the Unix Environment (starts on p. 149) lists a lot of fprintf's options. Because it's so common to print to stdout, the function printf is available, which is just fprintf with an implied stdout, i.e. fprintf(stdout, ... , ...) = printf(... , ...)

Often fprintf is a lot more convenient than C++'s cout << .... Consider the following example. All I want to do is print out a few decimal numbers to 3 places, along with their signs, and I want them to look nice and right justified. Here's the code difference between C and C++:

io1.cpp: C++ Formatted I/O

#include <iostream>
#include <iomanip>
#include <cstdlib>   // for rand()
using namespace std;

int main()
{
  cout.setf(ios::showpoint); // Show decimal
  cout.setf(ios::showpos);   // Show + signs
  cout.setf(ios::fixed);     // No e-notation
  cout.precision(3);
  for(int i = 0; i < 10; i++)
    cout << setw(10) << (1 - 2*(i%2))*rand()/7.0 << endl;
  return 0;
}

io1.c: C Formatted I/O

#include <stdio.h>
#include <stdlib.h>   /* for rand() */

int main()
{
  int i; /* In C, i can't be declared "in the for" */
  for(i = 0; i < 10; i++)
    printf("%+10.3f\n", (1 - 2*(i%2))*rand()/7.0);
  return 0;
}

Results (same in both cases)

bear[1] [~/]> ./a.out
 +2405.429
  -822.571
 +1444.714
 -2502.143
 +4435.857
  -803.857
 +3287.143
 -1059.857
 +2316.000
  -583.714

Often you print a lot of things together, maybe even the same thing printed several times over:

io2.c

#include <stdio.h>

int main()
{
  int i = 31;
  double f = 31.0;
  char *s = "hello!";

  printf("i = %i (or %x in hex), f = %f (or $%5.2f if you like), s = %s\n",i,i,f,f,s);

  return 0;
}

bear[2] [~/]> gcc io2.c
bear[3] [~/]> ./a.out
i = 31 (or 1f in hex), f = 31.000000 (or $31.00 if you like), s = hello!

The important thing here is that every "%..." in the "conversion string" has a matching argument following it. Like this:

printf("i = %i (or %x in hex), f = %f (or $%5.2f if you like), s = %s\n",i,i,f,f,s);
             |      |               |        |                      |    | | | | |
             `------|---------------|--------|----------------------|----' | | | |
                    |               |        |                      |      | | | |
                    `---------------|--------|----------------------|------' | | |
                                    |        |                      |        | | |
                                    `--------|----------------------|--------' | |
                                             |                      |          | |
                                             `----------------------|----------' |
                                                                    |            |
                                                                    `------------'

Output written by printf (or fprintf(stdout,...)) is line buffered by default when stdout is the terminal, meaning that output characters are collected until a '\n' is seen, and only then are they written out to the screen. (This is like C++'s cout, where you only get output to the screen when you write an endl.) Especially when debugging, forgetting to put '\n's in your printf's can cause confusion! The return value of fprintf is the number of bytes written. This can be useful, especially in sprintf, which is a version of fprintf that writes its output to a string (array of char's) rather than a stream. Enter man fprintf at the command prompt for more information (in that wonderful man-page jargon). NOTE: if you need to actually print a % symbol, it needs to be "escaped" with another percent, e.g. printf("35%%\n").

Just as a side note, since you're taking Java this semester as well: System.out.println() is horrible at this formatting stuff. Luckily, enough people were fed up with it that in Java 5, Sun added System.out.printf(), which acts like the C printf function! It's an oldie, but a goodie!

Input with fscanf
A call to fscanf looks a lot like a call to fprintf, but its purpose is to read data rather than write it. The format string is a template for what to read in, rather than what to write out, and the rest of the arguments are where to store the data that gets read. Here's the general format:

int fscanf(FILE *, conversion string, pointers to arguments);

Return value is the number of input items successfully assigned.

Note the "pointers to arguments" part above! fscanf is supposed to store the data that's read in and, since there's no pass-by-reference in C, the only way to put the data somewhere that'll be available after the call to fscanf returns is to pass pointers to the locations in which the data should be stored. The return value is helpful in detecting errors. Here's example code that reads an integer and a double, separated by a comma, and stores them in the variables x and d.

io3.c

#include <stdio.h>

int main()
{
  /* x and d are initialized so that a failed read leaves known values */
  int x = 0, rv;
  double d = 0.0;

  /* recall that "&" is the "address-of" operator */
  rv = fscanf(stdin, "%i, %lf", &x, &d );
  fprintf(stdout,"rv = %i, x + d = %f.\n",rv,x+d);

  return 0;
}

Sample runs

bear[3] [~/]> gcc io3.c
bear[4] [~/]> ./a.out
3,6.5
rv = 2, x + d = 9.500000.
bear[5] [~/]> ./a.out
3:,6.5                        Notice that the read fails here because
rv = 1, x + d = 3.000000.     of the ":".  The rv = 1 clues us in!
bear[6] [~/]> ./a.out
   3,  6.5
rv = 2, x + d = 9.500000.

Read man fscanf for all the gory details, or pp.151-152 in APUE for a summary. One thing to note, however, is the way whitespace works. Each component of the format string beginning with a % is called a conversion specification, and fscanf skips any whitespace before matching conversion specifications. The last run above shows that in action. Whitespace in the conversion string gets matched by any number of whitespace characters (including zero) in the input that gets read. Everything else must match literally. Thus, in my above code, input " 3, 6.5" matches, but input "3 , 6.5" does not, because the space before the "," is not allowed. If I changed my format string to "%i , %lf", so that there's a space in front of the comma, whitespace in the input before the comma is OK, though not required.

There's a "scanf" function that is exactly like fscanf except that the first argument isn't required because it assumes you're reading from stdin.

C-style strings

There is no string data type in C, really. A string is simply an array of char's, i.e. it's a char*, with the convention that the "string" represented is the sequence of characters in the array up to, but not including, the null character '\0'. In particular, an array must contain a null character to be properly considered a string. The string.h header file contains prototypes for many useful string utilities. Try man strlen to see some of these. Strings can be printed with fprintf using the %s conversion specification. They can be read in with fscanf. Perhaps confusingly, if str is a string (i.e. char *str or char str[]) then fscanf gets passed str, not &str. More discussions on the topic of strings will have to wait 'til we know a bit more about memory management in C. However, here's a simple example:

io4.c

#include <stdio.h>

int main()
{
  char str[10];
  fscanf(stdin,"%s",str);
  fprintf(stdout,"I read \"%s\"\n",str);
  return 0;
}

Sample runs

bear[6] [~/]> gcc io4.c
bear[7] [~/]> ./a.out 
Hello
I read "Hello"
bear[8] [~/]> ./a.out
Hello???
I read "Hello???"
bear[9] [~/]> ./a.out
TheRainInSpainFallsMainlyOnThePlain
I read "TheRainInSpainFallsMainlyOnThePlain"
Segmentation fault                           ← What happened here?

This example shows a potential pitfall with reading into strings with fscanf: buffer overflow. We had a 10-char buffer (which really only holds a 9-character string, since the 10th character is needed for '\0'), and when we read in a really long string fscanf kept packing in the characters off the end of our array and overwrote some important data, which subsequently crashed our program! This kind of bug can be exploited, and is a common source of security breaches.

Grouping data together with structs

Think back to IC210. You were introduced to the concept of a struct as a means to package heterogeneous data into a single object. In C, the syntax of structs is a bit different. The big difference is that if you define a struct Foo, then the name of the new type is not "Foo", but "struct Foo".

struct.c

#include <stdio.h>

struct Point
{
  double x, y;
};

int quad(struct Point p)
{
  int A[2][2] = {{1,4},{2,3}};
  return A[p.x < 0][p.y < 0];
}

int main()
{
  struct Point p;
  fscanf(stdin,"( %lf , %lf )",&p.x,&p.y);
  fprintf(stdout,"That's in quadrant %i.\n",quad(p));
  return 0;
}

Sample runs

bear[284] [~/]> gcc struct.c
bear[285] [~/]> ./a.out
(1,1)
That's in quadrant 1.
bear[286] [~/]> ./a.out
(-1,2)
That's in quadrant 2.
bear[287] [~/]> ./a.out
(-2,-5)
That's in quadrant 3.
bear[288] [~/]> ./a.out
(2,-8)
That's in quadrant 4.

The syntax for accessing the data members of the Point struct is identical to accessing data members of a C++ class: use the arrow operator (->) if you have a pointer to a struct, otherwise use the dot (.) operator.

Dave Stahl

Last modified: Tue Jan 27 16:11:21 EST 2009