The objective of this assignment is to write a code generator from a fragment of the C++ programming language to JVM, Java Virtual Machine. The code generator should produce Java class files, which can be run in the Java bytecode interpreter so that they correctly perform all their input and output actions.
The code generator is partially characterized by compilation schemes in Chapter 6 of the PLT book. More JVM instructions are given in Appendix B. These explanations follow Jasmin assembler; the code generator may emit Jasmin assembly code into text files, which are then processed further by Jasmin to create class files. Jasmin can be downloaded here, but is also included in the test suite.
The recommended implementation is via a BNF grammar processed by the BNF Converter (BNFC) tool. The syntax tree created by the parser is first type checked by using the type checker created in Lab 2. The code generator should then make another pass of the type-checked code.
The approximate size of the grammar is 50 rules, and the code generator should be around 300-700 lines, depending on the programming language used. All BNFC supported languages can be used, but guidance is guaranteed only for Haskell and Java.
Before the work can be submitted, the program has to pass some tests, which are given on the course web page via links later in this document.
To pass the assignment, it is sufficient to treat the grammar of Lab 2 excluding doubles. Thus, operations involving type double need not be covered by the code generator.
However, for the ambitious student, we recommend to extend the code generator also to doubles (after it works for integers and the statements). The extension requires some extra work:
The recommended procedure is two passes:
You can copy your CPP.cf
grammar and the
TypeChecker
module from Lab 2 to the same directory.
The language is the same as in Lab 2, and you can use the
grammar file CPP.cf
.
Also its type system is the same.
There are four built-in functions:
void printInt(int x) // print an integer and a newline in standard output void printDouble(double x) // print a double and a newline in standard output int readInt() // read an integer from standard input double readDouble() // read a double from standard input
These functions can be defined in a separate runtime class, which can be obtained e.g. from writing these functions in Java and compiling to a class file. A ready-made Java file is here.
Boilerplate code, see PLT book, Chapter 6.
Methods in the class, main
special, see Chapter 6.
The semantics is the same as in Lab 3. In other words,
running the generated classes in java
produces the same behaviour
as running the source code in the lab2
interpreter.
File good03.cc
int main () { int arg = readInt() ; int ret = 1 ; int i = 1 ; while (i < arg + 1) { ret = i * ret ; ++i ; } printInt(ret) ; }
could compile to good03.j as follows:
;; Boilerplate: a wrapping class for cc code .class public good03 .super java/lang/Object .method public <init>()V aload_0 invokespecial java/lang/Object/<init>()V return .end method ;; The java-style main method calls the cc main .method public static main([Ljava/lang/String;)V .limit locals 1 invokestatic good03/main()I pop return .end method ;; Program .method public static main()I .limit locals 3 .limit stack 10 ;; int arg = readInt(); invokestatic Runtime/readInt()I istore_0 ;; int ret = 1; iconst_1 istore_1 ;; int i = 1; iconst_1 istore_2 ;; while (i < arg + 1) L0: ;; // beginning of loop, check condition iload_2 ;; i iload_0 iconst_1 iadd ;; arg + 1 if_icmplt L2 ;; test i < arg + 1 iconst_0 goto L3 L2: ;; i < arg + 1 is true iconst_1 L3: ;; i < arg + 1 is false iconst_0 if_icmpeq L1 ;; if last comparison was false, exit while loop ;; ret = i * ret iload_2 iload_1 imul istore_1 iload_1 pop ;; ++i iload_2 iconst_1 iadd istore_2 ;; // i = i + 1 iload_2 pop ;; // continue loop goto L0 ;; printInt(ret) L1: iload_1 invokestatic Runtime/printInt(I)V nop ;; // return 0 iconst_0 ireturn .end method
(The comments are only for seeing the connection between .cc and .j).
The code generator must be a program called lab3
, which is
executed by the command
lab3 <SourceFile>
and produces a class (.class
) file. It may do this by first generating
Jasmin assembly code (a .j
file) and then calling Jasmin on this code.
Jasmin can be called by
java -jar jasmin.jar <File>.j
The generated class file should have the same name and be in the same directory as the original source file:
lab3 ../a/b/c.cc
This should produce a class file ../a/b/c.class
.
The output at failure is a code generator error, or a
TYPE ERROR
as in Assignment 2, or a
SYNTAX ERROR
as in Assignment 1.
The input can be read not only from user typing on the
terminal, but also from standard input redirected from
a file or by echo
. For instance,
./java fibonacci <test-input echo 20 | java fibonacci fibonacci.cc
Source file
// file good.cc int main () { int i = readInt() ; //5 printInt(i) ; //5 printInt(i++) ; //5 printInt(i) ; //6 printInt(++i) ; //7 printInt(i) ; //7 }
Running the compiler
% lab3 good.cc # produces good.class % echo 3 | java good.class 3 3 4 5 5
The compiler is submitted as source files that can be compiled by typing make
.
We also provide a test suite for lab 3. This test suite only contains test files which deal with integers; passing all these tests is the minimum required for passing this lab.
If you are ambitious, you can also try all the source files in the lab2 testsuite. For this you will also need to handle doubles in your compiler.
Run your compiler and the generated code with the testsuite before submitting the assignment.
Download the entire test suite folder or the tarball and compile it using make
.
The test script can then be run as follows:
./progs-test-lab3 ../path/to/solution/
Tip: Make sure your classnames in Jasmin have simple names without slashes or dots. If the first line of your Jasmin file is
.class public x/y/z.cc
then Jasmin will compile it to a file
x/y/z/cc.class
regardless of what the name of the .j
file is.
The easiest option, and also what the test suite expects, is that your class name is just a string
without any slashes or dots (in this example, just "z").
Your program must be compatible with the test suite and pass all the tests.
The solution must be written in an easily readable and maintainable way. In particular, tailoring it for the programs in the test suite is not maintainable!
Submit your lab by using Fire. Please include exactly all the files that are required for building your solution, including a Makefile. Do not however submit any generated files, and kindly avoid using archives (upload each file individually).
If you have any problems getting the test program to run, or if you think that there is an error in the test suite, contact the teachers via the course Google group.