CS 525 -- Assignment Four
Due Tuesday, June 19th @ 5pm in CS Homework Station
Homework Guidelines
Submission Instructions: Please hand in a hard copy of
your solutions to all problems, with a suitable cover page with your name, the
date, the course, and the number of the assignment. Do each problem on a separate
sheet clearly labelled at the top (e.g., "Problem 3"), and staple
all pages together in the upper-left-hand corner (do NOT submit solutions separately
or update your solutions after handing in your assignment). For any problem
statement that asks multiple questions, answer each in a separate paragraph,
labelled (1), (2), etc. Submit "pencil and paper" problems into the
CS 525 mail slot (not the shelf) in the CS homework station; submit code or
files produced by software using Gsubmit as specified on the course home page.
A small penalty will be taken for any homework that does not follow these guidelines.
Late Policy: Both parts of the homework (pencil and paper,
and electronic submission) are due at 5 pm the due date; late homeworks
will be penalized at the rate of 25% for each weekday late (each day ending
at 5pm, weekends not included).
Policy on Academic Honesty: Please see the end of this
homework for comments on the University policy for academic honesty that I adhere
to.
As the next installment of your project, you are going to write a parser for
a subset of C; for inspiration you may look at grammars in the Kernighan and
Richie book, or use Google. However, you will definitely need to massage any
grammar you find to make it suitable for this project. As in project one, I
have given you a working prototype parser and lexer on the web page to get you
started. As in project one, you will do this alone (starting with project three
you will work in groups).
This project is worth 40 points towards your project score.
Overview
The goal of this project is to write a grammar for a subset of C that will
form the basis for your later projects, and to build the parser for this grammar
using Yacc. You may have slight variations between your language and the standard
language of C, or the language implemented by others in the class, and that
is ok as long as you make reasonable choices and as long as it works on the
positive and negative test files I provide. . Feel free to consult any reference
materials on the C language, but you must document carefully any such use of
references outside your textbook and my lectures. Please check with me if you
have any questions or comments on this.
The relevant files you will need are posted on the web site.
Language for the Project
The language represented by your grammar must have the following features;
any details about the syntax not covered here should be answered by the test
files or your local friendly C reference; or ask me.
- A program consists of a sequence of declarations of variables or functions;
if a function is called main, it must have no return type and no arguments;
functions not called main must have a return type (which can be int, float,
or a struct) and one or more arguments, which can be int, float, or struct
type (no array arguments);
- The only data types are int, float, structs, and one-dimensional
arrays. Structs and arrays can be nested arbitrarily.
- Control structures while, for, if-then, if-then-else
only (no switch or do-while); expressions used in while and if
must be enclosed in parentheses; the three arguments to a for must
be non-empty;
- Arithmetic operators are +, *, /, -, %, ++ and -- plus unary + and -;
- You must include the standard relational operators >, >=, <, <=,
==, !=, plus the Boolean operators&&,||, and !; these are simply considered
to be arithmetic operators, so there is no distinction between arithmetic
and boolean expressions; all operators must follow the standard precedence
rules of C.
- You must include the break, continue, and return statements;
return must always take a single argument enclosed in parentheses;
- Comments will be indicated by the token "//"; any text between
this token and carriage return will be ignored;
- Declarations can occur in file scope, or at the beginning of a compound
statement (enclosed in curly braces); any declarations inside compound statements
must precede all executable statements;
- A compound statement (in curly braces) can be used anywhere a single statement
can be used; conversely, an if, for, or while statement may use single statements,
and a function definition may have a body which is a single statement.
- No (compound) statements separated by comma; no empty statements, no use
of expressions as statements, no empty compound statements (i.e., curly braces
must enclose at least one statement);
- No gotos;
- No typedefs;
- No compiler directives (e.g., #DEFINE, macros, etc.);
- Declarations may NOT initialize a variable.
Changes to Lexer
You will have to make the following changes to your lexer:
- Add the new keyword float and operator % if you did not do
so in the first project;
- Add the period (for struct references);
- Recognize comments (text between "//" and carriage return) and
ignore them (as if they were white space);
- Add floating point constants without initial sign and without exponents
(e.g., 2.3, .4, 2.). Distinguish integer constants and floating-point constants,
perhaps calling them intnum and floatnum;
- You will have to make some changes to get your lexer to interact with your
parser; these are detailed in the two sample files on the web.
You should keep the symbol table that you wrote in project one, but need to
do nothing with it just yet (it will form a big part of project three).
Additional requirements for the project:
- The only shift-reduce conflicts you may leave in place are those arising
from the if-then and if-then-else; you can not have any reduce-reduce
conflicts.
- You must print out rule numbers on each reduce, as in the sample parser;
put a carriage return after each statement found, as a rough way of formatting
the output. Print out "Success" after reduction by the start rule.
- You must give the line number, some simple error message, print out two
lines of context, and underline the offending token when a lexical or syntax
error is found, using the yyerror(...) function discussed in the parsamp2.y
file and on p.215 in the book. For example, here is the output of my version
of the parser for testerror2.1.c:
%a.out < testerror2.1.c
7, 15, 13, 6
Line 6: syntax error
5: int x
6: main ...
^^^^
%a.out < testerror2.2.c
Line 3: illegal token
2: main() {
3: $ ...
^
%
Helpful Advice
Please look at the sample files on the web page; these have comments in them
about how to run the files and what to do; in addition you should read through
the appropirate sections of the Lex & Yacc book carefully. Finally, I will
offer the following advice:
- Develop your grammar in small steps and verify for each feature that you
add that it works as you expect; you can run the parser in interactive mode
by simply running the executable file and feeding it tokens (hit control D
to end).
- Run yacc with the -v (verbose) flag to generate the y.output trace file;
this shows the content of the LR(0) sets of items and can be used to debug
your grammar. Note that underscore "_" corresponds to the dot I
used in items; do NOT use underscore in your non-terminal names if you don't
like to be massively confused.
- Use left recursive rules when generating repeated patterns (such as argument
lists, or lists of statements); this will make things easier when we proceed
to the code generation phase.
- One of the messiest things to get straight is the use of semicolons as statement
terminators. Remember that semicolons do not follow a compound statement and
always follow executable statements (except as the third argument to a FOR
loop). You might find it useful to define two kinds of statements, those which
are followed by semicolons, and those which are not. For example, an if
statement (without else) has the form: if (expr) stmt. The "stmt"
would be followed by a semicolon, but the entire if statement should
NOT be followed by a semicolon (or else you will have to put two semicolons
in!).
- START EARLY! If you do not start on this until the day before it is due,
you will NOT finish!
What to hand in
Put your code in files lexer2.l and parser2.y. Provide an extensive
header comment in your parser2.y file stating any assumptions you make,
any features of the project you want to comment on, and so on. You should run
your code on all the test files, positive and negative, provided on the web page.
Please hand in a hard copy of the two code files plus the script files demonstrating
your parser on the test files as your solution to problem 5, along with your solutions
to problems 1 -- 4, following the usual requirements for a cover pages and etc.
as detailed above. The two code files should be Gsubmitted as usual, into a directory
named "proj2" in your Gsubmit directory.
Policy on Academic Conduct
Whenever you submit a piece of academic work and sign your name to it, you
are verifying that this work is the result of your own intellectual efforts;
it is Plagiarism to submit work solely under your own name in which:
- You collaborated with another student in solving the homework problems;
- You copied the solution from a book, web site, or the work of another student;
or
- You got significant help in solving the problem, but do not acknowledge
this in your solution.
In this course, I will not allow you to submit any joint work, although in
other courses and in other contexts you may be permitted to do this. In general,
you must always provide proper attribution of authorship by naming all persons,
books, or resources that provided intellectual content towards the final result.
In some cases, you will need to describe the extend of the contribution, particularly
when literally copying the words or artistic artifacts of another. To fail to
provide proper attribution is plagiarism. Plagiarism demeans the seriousness
of what we do in class, and does not allow you as a student to obtain a fair
grade for the results you worked hard for.
I will discuss the issue of plagiarism with the grader, and use, when appropriate,
automated tools to check submitted programs and files for copying. I don't like
to be involved in this, but it is necessary in order to provide a fair environment
for your work. In cases of suspected plagiarism, I will discuss the matter with
the student(s) involved and, when warrented, submit the case to the Academic
Conduct Committee (of which I was chairman for three years). Unfortunately,
students in my classes violate these policies several times a year on average;
it is my hope that by being completely clear and straight-forward about my policy
that I can reduce this number. Please take this issue serious and talk to me
if you have any concerns or questions.