The files ex1.ypp, ex1.l, and Makefile define a simple calculator program. In fact, it's the same simple program we had for Class 3's homework. The big difference is that now, instead of writing our scanner by hand, we're using the standard tool flex to create it.
/********************************************************
* ex1.ypp
********************************************************/
%{
#include <iostream>
#include <string>
#include <map>
#include <cstdlib> //-- I need this for atoi
using namespace std;
//-- Lexer prototype required by bison, aka getNextToken()
int yylex();
int yyerror(const char *p) { cerr << "Error!" << endl; }
%}
//-- SYMBOL SEMANTIC VALUES -----------------------------
%union {
int val;
char sym;
};
%token <val> NUM
%token <sym> OPA OPM LP RP STOP
%type <val> exp term sfactor factor res
//-- GRAMMAR RULES ---------------------------------------
%%
run: res run | res /* forces bison to process many stmts */
res: exp STOP { cout << $1 << endl; }
exp: exp OPA term { $$ = ($2 == '+' ? $1 + $3 : $1 - $3); }
| term { $$ = $1; }
term: term OPM factor { $$ = ($2 == '*' ? $1 * $3 : $1 / $3); }
| sfactor { $$ = $1; }
sfactor: OPA factor { $$ = ($1 == '+' ? $2 : -$2); }
| factor { $$ = $1; }
factor: NUM { $$ = $1; }
| LP exp RP { $$ = $2; }
%%
//-- FUNCTION DEFINITIONS ---------------------------------
int main()
{
yyparse();
return 0;
}
|
/********************************************************
* ex1.l
********************************************************/
%{
#include "ex1.tab.hpp"
#include <iostream>
using namespace std;
%}
%option noyywrap
%%
[0-9]+ { yylval.val = atoi(yytext); return NUM; }
[\+|\-] { yylval.sym = yytext[0]; return OPA; }
[\*|/] { yylval.sym = yytext[0]; return OPM; }
"(" { return LP; }
")" { return RP; }
";" { return STOP; }
<<EOF>> { return 0; }
[ \t\n]+ { }
. { cerr << "Unrecognized token!" << endl; exit(1); }
%%
|
# Makefile: A simple makefile for ex1. default: <TAB>bison -d ex1.ypp <TAB>flex ex1.l <TAB>g++ -o mycalc ex1.tab.cpp lex.yy.c |
bison -d ex1.yppcreates files
ex1.tab.cpp and
ex1.tab.hpp. The first contains the C++ code for
the parser (which is a function called
yyparse()), the second is a header file that
contains, among
other things, the definitions of the token names NUM, OPA,
etc., and the variable yylval that we use to pass
the bison code the semantic values of tokens. The flex file
ex1.l needs those definitions so that the
yylex() function it defines for bison can pass
back the information it needs to pass back.
The next thing we have is the flex file ex1.l.
The call
flex ex1.lcreates the file
lex.yy.c that contains, among
other things, the definition of the yylex()
function. Of course we haven't talked about flex, so this file
is new. The basic idea is that you write regular expressions
for the different token types, and actions for each regular
expression. The flex generated scanner code tries to match
characters from the current input stream to these regular
expressions, and when a match is found, it executes the
associtated action code. The variable yytext
contains the string (in the C sense, i.e. '\0' terminated
char*.) of characters that were matched. When more than one
match is possible it breaks ties by going with the longest
match --- i.e. it follows the "maximal munch" rule. When
there are several "longest" matches, it goes with
the regular expression listed first in the flex file.
Of course, you should realize that flex is simply creating the
same program implementing a finite automaton that we talked
about in class. The difference is that we users specify
tokens by regular expressions. So flex has to convert the
regular expressions into finite automata for each token type,
which are then combined into a single finite automaton. This
is then converted, automatically, into a program.