Our simple calculator: now using flex to generate the scanner

The files ex1.ypp, ex1.l, and Makefile define a simple calculator program. In fact, it's the same simple program we had for Class 3's homework. The big difference is that now, instead of writing our scanner by hand, we're using the standard tool flex to create it.

/********************************************************
 * ex1.ypp 
 ********************************************************/
%{
#include <iostream>
#include <string>
#include <map>
#include <cstdlib> //-- I need this for atoi
using namespace std;

//-- Lexer prototype required by bison, aka getNextToken()
int yylex(); 
int yyerror(const char *p) { cerr << "Error!" << endl; }
%}

//-- SYMBOL SEMANTIC VALUES -----------------------------
%union {
  int val; 
  char sym;
};
%token <val> NUM
%token <sym> OPA OPM LP RP STOP
%type  <val> exp term sfactor factor res

//-- GRAMMAR RULES ---------------------------------------
%%
run: res run | res    /* forces bison to process many stmts */

res: exp STOP         { cout << $1 << endl; }

exp: exp OPA term     { $$ = ($2 == '+' ? $1 + $3 : $1 - $3); }
| term                { $$ = $1; }

term: term OPM factor { $$ = ($2 == '*' ? $1 * $3 : $1 / $3); }
| sfactor             { $$ = $1; }

sfactor: OPA factor   { $$ = ($1 == '+' ? $2 : -$2); }
| factor              { $$ = $1; }

factor: NUM           { $$ = $1; }
| LP exp RP           { $$ = $2; }

%%
//-- FUNCTION DEFINITIONS ---------------------------------
int main()
{
  yyparse();
  return 0;
}
/********************************************************
 * ex1.l 
 ********************************************************/
%{
#include "ex1.tab.hpp"
#include <iostream>
using namespace std;
%}

%option noyywrap

%%

[0-9]+   { yylval.val = atoi(yytext); return NUM; }
[\+|\-]  { yylval.sym = yytext[0]; return OPA; }
[\*|/]   { yylval.sym = yytext[0]; return OPM; }
"("      { return LP; }
")"      { return RP; }
";"      { return STOP; }
<<EOF>>  { return 0; }
[ \t\n]+ { }
.        { cerr << "Unrecognized token!" << endl; exit(1); }
%%
# Makefile: A simple makefile for ex1.
default:
<TAB>bison -d ex1.ypp
<TAB>flex ex1.l
<TAB>g++ -o mycalc ex1.tab.cpp lex.yy.c
What do flex and bison do?
Look at the Makefile. We start off with the file ex1.ypp that defines the grammar and action rules for our calculator. The call
bison -d ex1.ypp
creates files ex1.tab.cpp and ex1.tab.hpp. The first contains the C++ code for the parser (which is a function called yyparse()), the second is a header file that contains, among other things, the definitions of the token names NUM, OPA, etc., and the variable yylval that we use to pass the bison code the semantic values of tokens. The flex file ex1.l needs those definitions so that the yylex() function it defines for bison can pass back the information it needs to pass back.

The next thing we have is the flex file ex1.l. The call

flex ex1.l
creates the file lex.yy.c that contains, among other things, the definition of the yylex() function. Of course we haven't talked about flex, so this file is new. The basic idea is that you write regular expressions for the different token types, and actions for each regular expression. The flex generated scanner code tries to match characters from the current input stream to these regular expressions, and when a match is found, it executes the associtated action code. The variable yytext contains the string (in the C sense, i.e. '\0' terminated char*.) of characters that were matched. When more than one match is possible it breaks ties by going with the longest match --- i.e. it follows the "maximal munch" rule. When there are several "longest" matches, it goes with the regular expression listed first in the flex file. Of course, you should realize that flex is simply creating the same program implementing a finite automaton that we talked about in class. The difference is that we users specify tokens by regular expressions. So flex has to convert the regular expressions into finite automata for each token type, which are then combined into a single finite automaton. This is then converted, automatically, into a program.


Christopher W Brown
Last modified: Wed Sep 9 14:02:37 EDT 2009