This is the archived website of SI 413 from the Fall 2013 semester. Feel free to browse around; you may also find more recent offerings at my teaching page.
Pick a pair of two programming languages that you know, and come up with an example of each of the following. As always, you can work together, but everyone must turn in unique examples.
A homograph is a code fragment that is the same syntactically between the two languages, but has different semantics in each.
A synonym is a code fragment that is the same semantically between the two languages, but has different syntax.
C++ and Java support a few different kinds of numerical constants, or "literals". The most basic are regular ints that you know and love like
32. There are also floating-point numbers like
These can be described with the following regular expressions:
INT: [0-9]+ FLOAT: [0-9]*"."[0-9]*
+ means "one or more of the previous thing", and a
* means "zero or more of the previous thing". Square brackets
 indicate a range of characters (or multiple ranges), and anything in quotes
"" means that literal character.)
Draw the DFA for the scanner that accepts
INT tokens. Be sure to label each accepting state with the type of token, and put characters or character ranges on each transition.
Besides int and float literals, C++ and Java also support two kinds of numbers that you may not know about are different bases: to start an octal (base-8) number, you put a leading zero, so
0702 are octal numbers. And to write a number in hexidecmal (base 16), you start with "0x", like
Modify the DFA from the previous problem to incorporate these two new types of tokens. So all in all you should support:
Draw the DFA for the scanner that accepts the following tokens (as specified by regular expressions). Label accepting states with the name of the token.
OCTAL: 0[0-7]* HEX: 0x[0-9a-f]+ INT: [0-9]+ FLOAT: [0-9]*"."[0-9]*
Observe that every octal literal like
034 could also count as an INT according to this specification. Give octal literals higher priority in your scanner, so that whenever there is a conflict, the token type will be OCTAL.
The following grammar is ambiguous (in terms of parsing), but it still defines a language:
S → exp
exp → exp
Write a new grammar for the same language which is unambiguous and would work well for top-down parsing. (In particular, your new grammar should be LL(1).)
Using your modified grammar from the previous problem, draw the parse tree that would be generated by a top-down parser for the following string:
53 + 89 - 103
Give the PREDICT and FOLLOW sets for your modified grammar from two problems ago.
What part(s) of the sets above would be used to create a recursive-descent LL(1) parser?