Lecture 8: Implementing Type Checking
Programming Languages Course
Aarne Ranta (aarne@cs.chalmers.se)
%!target:html
%!postproc(html): #NEW
%!postproc(html): #HR
%!postproc(html): #sub1 1
%!postproc(html): #subn n
Book: 6.3, 6.5
#NEW
==Plan==
From typing rules to type checking code.
Type checker implementation in Haskell.
Type checker implementation in Java.
The location of errors
Extra: scoping puzzles in real C++
#NEW
==From typing rules to type checking code==
Basic idea: from a rule of the form
```
J#sub1 ... J#subn
---------- C
J
```
generate the code "upside down"
```
check J =
  check J#sub1
  ...
  check J#subn
  check_condition C
```
Example:
```
Env => exp1 : bool   Env => exp2 : bool      check Env => exp1 && exp2 : bool =
-----------------------------------------      check Env => exp1 : bool
Env => exp1 && exp2 : bool                     check Env => exp2 : bool
```
#NEW
==From typing rules to type checking code: more examples==
Judgements are easy: recursive calls to check.
```
Env => exp : bool   Env => stm valid      check Env => while (exp) stm valid =
------------------------------------        check Env => exp : bool
Env => while (exp) stm valid                check Env => stm valid
```
Side conditions are unlimited code, so you have to think harder.
```
---------------- var : typ is in Env      check Env => var : typ =
Env => var : typ                            check_condition lookup(var,Env) == typ
```
It is ``lookup`` and such conditions that in the end generate the error messages.
```
lookup(var,Env) = message ("variable " var " not found")          // if var is not in Env
check_condition x == y = message ("expected " y " but found " x)  // if not equal
```
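The rules above translate almost literally into Haskell. The following is a minimal sketch, assuming a hypothetical miniature language (``Type``, ``Exp``), a flat association list as environment, and ``Either String`` as a stand-in for an error type. The names ``lookupVar`` and ``checkEqual`` are illustrative and not part of the lab code.

```
-- Hypothetical miniature language, for illustration only.
data Type = TBool | TInt deriving (Eq, Show)
data Exp  = ETrue | EInt Integer | EVar String | EAnd Exp Exp

type Env = [(String, Type)]   -- assumption: a single flat context
type Err = Either String      -- Left carries the error message

-- Side condition "var : typ is in Env", turned into a lookup.
lookupVar :: String -> Env -> Err Type
lookupVar var env = case lookup var env of
  Just typ -> Right typ
  Nothing  -> Left ("variable " ++ var ++ " not found")

-- Side condition "x == y".
checkEqual :: Type -> Type -> Err ()
checkEqual found expected
  | found == expected = Right ()
  | otherwise = Left ("expected " ++ show expected ++ " but found " ++ show found)

-- check Env => exp : typ, one equation per typing rule.
check :: Env -> Exp -> Type -> Err ()
check _   ETrue      typ = checkEqual TBool typ
check _   (EInt _)   typ = checkEqual TInt typ
check env (EVar x)   typ = do
  found <- lookupVar x env
  checkEqual found typ
check env (EAnd a b) typ = do
  check env a TBool          -- premise 1
  check env b TBool          -- premise 2
  checkEqual TBool typ       -- the conclusion has type bool

main :: IO ()
main = do
  let env = [("x", TBool), ("n", TInt)]
  print (check env (EAnd (EVar "x") ETrue) TBool)  -- Right ()
  print (check env (EAnd (EVar "n") ETrue) TBool)  -- Left "expected TBool but found TInt"
  print (check env (EVar "y") TBool)               -- Left "variable y not found"
```

Note that the recursive calls never build a message themselves: every error text comes from one of the two side-condition helpers.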
#NEW
==The need for type inference==
There is a grammar rule saying that expressions can be used as statements:
```
Stm ::= Exp ";"
```
How do we check that such statements are valid?
```
Env => exp : ?
------------------
Env => exp ; valid
```
The problem is that we have no type ``typ`` to check ``exp : typ``.
Solution 1: check ``exp`` with each of the four types
```
check Env => exp ; valid =
  try each typ in [bool,double,int,void]:
    check exp : typ
```
This is inefficient, and does not scale up to infinitely many types.
Solution 2: do type inference with ``exp``. If it succeeds, the statement
is valid, because expressions of any type can be used as statements.
#NEW
==Type inference==
The general scheme is a rule where the conclusion has a type depending in
some way on the premises and the condition:
```
J#sub1 ... J#subn
--------------------------------- C
Env => exp : typ(J#sub1, ..., J#subn, C)
```
We should then use recursive calls of ``check`` and ``infer`` so that
- everything we need for constructing the type is inferred
- everything else is just checked
Often the type is independent of the premises (which still have to be checked, of course!):
```
Env => exp1 : bool   Env => exp2 : bool      infer Env (exp1 && exp2) =
------------------------------------------     check Env => exp1 : bool
Env => exp1 && exp2 : bool                     check Env => exp2 : bool
                                               return bool
```
It can also come from the condition:
```
---------------- var : typ is in Env      infer Env var =
Env => var : typ                            return lookup(var,Env)
```
#NEW
==Type checking overloaded operations==
Arithmetic operations in most languages are **overloaded**.
This means that they apply to many types.
The general rule for ``+ - * /`` is: both operands have the same type as the value,
which must be ``int`` or ``double``.
```
Env => exp1 : typ Env => exp2 : typ
-------------------------------------- typ is int or double
Env => exp1 + exp2 : typ
```
What we do is infer the type of the first operand and check the second.
```
infer Env (exp1 + exp2) =
  typ := infer Env exp1
  check_condition typ == int or typ == double
  check Env => exp2 : typ
  return typ
```
Also the comparison operators are overloaded, but
the return type is of course ``bool``.
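A minimal Haskell sketch of this scheme, assuming a hypothetical expression type without variables (so no environment is needed) and ``Either String`` as a stand-in for an error type. The helper name ``inferNumericOperands`` is illustrative. Arithmetic and comparison share the operand check and differ only in the type they return.

```
-- Hypothetical miniature language, for illustration only.
data Type = TBool | TInt | TDouble deriving (Eq, Show)
data Exp
  = EInt Integer
  | EDouble Double
  | ETrue
  | EAdd Exp Exp   -- exp1 + exp2
  | ELt  Exp Exp   -- exp1 < exp2

type Err = Either String   -- Left carries the error message

infer :: Exp -> Err Type
infer (EInt _)    = Right TInt
infer (EDouble _) = Right TDouble
infer ETrue       = Right TBool
infer (EAdd a b)  = inferNumericOperands a b   -- value has the operand type
infer (ELt a b)   = do
  _ <- inferNumericOperands a b                -- same operand check ...
  return TBool                                 -- ... but the value is bool

-- Infer the first operand, require it to be int or double,
-- then require the second operand to have the same type.
inferNumericOperands :: Exp -> Exp -> Err Type
inferNumericOperands a b = do
  typ <- infer a
  if typ `elem` [TInt, TDouble]
    then Right ()
    else Left ("expected int or double but found " ++ show typ)
  typ2 <- infer b
  if typ2 == typ
    then Right typ
    else Left ("expected " ++ show typ ++ " but found " ++ show typ2)

main :: IO ()
main = do
  print (infer (EAdd (EInt 1) (EInt 2)))            -- Right TInt
  print (infer (ELt (EDouble 1.5) (EDouble 2.0)))   -- Right TBool
  print (infer (EAdd (EInt 1) (EDouble 2.0)))       -- Left "expected TInt but found TDouble"
  print (infer (EAdd ETrue ETrue))                  -- Left "expected int or double but found TBool"
```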
#NEW
==Relating inference and checking==
Now we can check expression statements:
```
check Env => exp ; valid =
  infer Env exp
```
If ``infer`` fails, we get any error message it generates.
If ``infer`` succeeds, we discard the type.
In the same way, we only need to write ``infer`` for expressions.
Then we define ``check`` uniformly,
```
check Env => exp : typ =
  typ2 := infer Env exp
  check_condition typ2 == typ
```
The ``check_condition`` call usually returns a message on failure, e.g.
```
TYPE ERROR
type of exp: expected typ, inferred typ2
```
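The relation between the two functions can be sketched in Haskell as follows. This assumes a hypothetical miniature language, a flat association list as environment, and ``Either String`` as a stand-in for an error type; the message format is only an example. Only ``infer`` is defined by cases on the expression; ``check`` and the expression statement both reuse it.

```
-- Hypothetical miniature language, for illustration only.
data Type = TBool | TInt deriving (Eq, Show)
data Exp  = ETrue | EInt Integer | EVar String | EAnd Exp Exp
newtype Stm = SExp Exp        -- exp ;

type Env = [(String, Type)]   -- assumption: a single flat context
type Err = Either String      -- Left carries the error message

-- The only function that inspects the structure of expressions.
infer :: Env -> Exp -> Err Type
infer _   ETrue      = Right TBool
infer _   (EInt _)   = Right TInt
infer env (EVar x)   =
  maybe (Left ("variable " ++ x ++ " not found")) Right (lookup x env)
infer env (EAnd a b) = do
  check env a TBool
  check env b TBool
  return TBool

-- Checking is defined uniformly: infer, then compare.
check :: Env -> Exp -> Type -> Err ()
check env e typ = do
  typ2 <- infer env e
  if typ2 == typ
    then Right ()
    else Left ("expected " ++ show typ ++ ", inferred " ++ show typ2)

-- An expression statement is valid if inference succeeds; the type is discarded.
checkStm :: Env -> Stm -> Err ()
checkStm env (SExp e) = do
  _ <- infer env e
  return ()

main :: IO ()
main = do
  print (checkStm [("x", TInt)] (SExp (EVar "x")))    -- Right ()
  print (check [] (EInt 1) TBool)                     -- Left "expected TBool, inferred TInt"
  print (checkStm [] (SExp (EAnd (EInt 1) ETrue)))    -- Left "expected TBool, inferred TInt"
```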
#NEW
==The top-level checkers==
To check the whole program,
+ collect the types of each function into the signature
+ check that function names are unique
+ check each function definition using the signature
To check a function definition
+ check that argument variables are unique
+ initialize the topmost context with the argument variables
+ check the body in this context
+ check that there is a ``return``, with an expression
that has the expected return type of the function (or just
a ``return`` if the type is ``void``)
To check a sequence of statements
+ check the validity of the first statement and update the environment
if appropriate
+ check the remaining sequence in the new environment
+ an empty sequence is always valid
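The statement-sequence steps and the uniqueness checks can be sketched in Haskell as follows. This is a minimal sketch under simplifying assumptions: a hypothetical miniature language with only declarations and expression statements, a flat environment without nested blocks, names as plain strings, and ``Either String`` as a stand-in for an error type.

```
-- Hypothetical miniature language, for illustration only.
data Type = TBool | TInt deriving (Eq, Show)
data Exp  = ETrue | EInt Integer | EVar String
data Stm  = SDecl Type String   -- typ x ;
          | SExp Exp            -- exp ;

type Env = [(String, Type)]     -- assumption: one flat context, no nested blocks
type Err = Either String        -- Left carries the error message

infer :: Env -> Exp -> Err Type
infer _   ETrue    = Right TBool
infer _   (EInt _) = Right TInt
infer env (EVar x) =
  maybe (Left ("variable " ++ x ++ " not found")) Right (lookup x env)

-- Check one statement and return the (possibly extended) environment.
checkStm :: Env -> Stm -> Err Env
checkStm env (SDecl typ x) = case lookup x env of
  Just _  -> Left ("variable " ++ x ++ " already declared")
  Nothing -> Right ((x, typ) : env)
checkStm env (SExp e) = do
  _ <- infer env e
  return env

-- Check the first statement, then the rest in the new environment.
checkStms :: Env -> [Stm] -> Err ()
checkStms _   []       = Right ()   -- an empty sequence is always valid
checkStms env (s : ss) = do
  env' <- checkStm env s
  checkStms env' ss

-- Used for function names and for argument variables.
checkUnique :: [String] -> Err ()
checkUnique [] = Right ()
checkUnique (x : xs)
  | x `elem` xs = Left ("duplicate name " ++ x)
  | otherwise   = checkUnique xs

main :: IO ()
main = do
  print (checkStms [] [SDecl TInt "x", SExp (EVar "x")])   -- Right ()
  print (checkStms [] [SExp (EVar "x"), SDecl TInt "x"])   -- Left "variable x not found"
  print (checkUnique ["f", "g", "f"])                      -- Left "duplicate name f"
```

The second call fails because the environment is threaded from left to right: a variable is visible only to the statements after its declaration.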
#NEW
==Type checker in Haskell==
You can copy the contents of
[``laborations/lab2/haskell/`` ../laborations/lab2/haskell]:
```
CPP.cf -- grammar
lab2.hs -- main module
Makefile
TypeChecker.hs -- type checking module
```
You only have to modify ``CPP.cf`` and ``TypeChecker.hs``.
But you can already compile them: just type
```
make
```
and run the type checker with
```
./lab2
```
The rest is "debugging the empty file"!
#NEW
===The Main module===
You don't have to write this - just copy the file
[``laborations/lab2/haskell/lab2.hs`` ../laborations/lab2/haskell/lab2.hs].
This file shows how compiler phases are linked together.
```
check :: String -> IO ()
check s = case pProgram (myLexer s) of
  Bad err -> do
    putStrLn "SYNTAX ERROR"
    putStrLn err
    exitFailure
  Ok tree -> case typecheck tree of
    Bad err -> do
      putStrLn "TYPE ERROR"
      putStrLn err
      exitFailure
    Ok _ -> putStrLn "OK"
```
In other words: call the parser; if it succeeds, call the type checker.
Notice the use of the **error type**,
```
data Err a = Ok a | Bad String
```
The value is either ``Ok`` of the expected type or ``Bad``
with an error message.
#NEW
===Using the Err type===
The ``Err`` type
```
data Err a = Ok a | Bad String
```
is a **monad** - a type of actions returning ``a`` but also doing
other things (in this case: exceptions).
Monad actions can be **sequence**d: if
```
inferExp :: Env -> Exp -> Err Type
```
then you can make several inferences one after the other by using ``do``
```
do inferExp env exp1
   inferExp env exp2
```
You can **bind** variables returned from actions, and **return**
values.
```
do typ1 <- inferExp env exp1
   typ2 <- inferExp env exp2
   return TBool
```
If you are only interested in side effects, use the dummy value type
``()`` (corresponds to ``void`` in C and Java).
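To see what ``do`` means for this type, here is one way the instances could be written. This is a sketch and not necessarily the definition shipped with the lab files; ``Type``, ``inferLit`` and ``bothBool`` are hypothetical stand-ins for the real inference functions. Sequencing stops at the first ``Bad``, which is what gives the exception-like behaviour.

```
import Data.Char (isDigit)

data Err a = Ok a | Bad String deriving (Eq, Show)

instance Functor Err where
  fmap f (Ok a)  = Ok (f a)
  fmap _ (Bad s) = Bad s

instance Applicative Err where
  pure = Ok
  Ok f  <*> x = fmap f x
  Bad s <*> _ = Bad s

instance Monad Err where
  Ok a  >>= f = f a      -- success: continue with the value
  Bad s >>= _ = Bad s    -- failure: skip the rest, keep the message

-- Hypothetical stand-in for inferExp: types a literal given as a string.
data Type = TBool | TInt deriving (Eq, Show)

inferLit :: String -> Err Type
inferLit s
  | s `elem` ["true", "false"]    = Ok TBool
  | not (null s) && all isDigit s = Ok TInt
  | otherwise                     = Bad ("cannot type " ++ s)

-- Bind two results, then return the dummy value ().
bothBool :: String -> String -> Err ()
bothBool a b = do
  t1 <- inferLit a
  t2 <- inferLit b
  if t1 == TBool && t2 == TBool
    then return ()
    else Bad "expected bool operands"

main :: IO ()
main = do
  print (bothBool "true" "false")   -- Ok ()
  print (bothBool "true" "1")       -- Bad "expected bool operands"
  print (bothBool "x" "true")       -- Bad "cannot type x"
```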
#NEW
==Symbol tables==
Environment type
```
type Env = (Sig,[Context])
type Sig = [(Id,([Type],Type))] -- or Map Id ([Type],Type)
type Context = [(Id,Type)] -- or Map Id Type
```
Auxiliary operations on the environment
```
lookVar :: Env -> Id -> Err Type
lookFun :: Env -> Id -> Err ([Type],Type)
updateVar :: Env -> Id -> Type -> Err Env
updateFun :: Env -> Id -> ([Type],Type) -> Err Env
newBlock :: Env -> Err Env
emptyEnv :: Env
```
Keep the datatypes abstract, i.e. use them only via these operations.
Then you can switch to another implementation if needed (more efficient,
more stuff in the environment).
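One possible list-based implementation of these operations, as a sketch. It assumes that ``Id`` is a plain ``String`` and that ``Type`` has four constructors named as below (both hypothetical here; the generated abstract syntax defines its own), and it uses ``Either String`` as a stand-in for ``Err``. The innermost block is the head of the context list.

```
type Id = String                                  -- assumption: names as strings
data Type = TBool | TInt | TDouble | TVoid deriving (Eq, Show)
type Err = Either String                          -- stand-in for the Err type

type Env     = (Sig, [Context])
type Sig     = [(Id, ([Type], Type))]
type Context = [(Id, Type)]

-- No functions, one empty top-level block.
emptyEnv :: Env
emptyEnv = ([], [[]])

-- Search the blocks from the innermost outwards.
lookVar :: Env -> Id -> Err Type
lookVar (_, ctxs) x = go ctxs
  where
    go []       = Left ("variable " ++ x ++ " not found")
    go (c : cs) = maybe (go cs) Right (lookup x c)

lookFun :: Env -> Id -> Err ([Type], Type)
lookFun (sig, _) f =
  maybe (Left ("function " ++ f ++ " not found")) Right (lookup f sig)

-- Add a variable to the innermost block; it must be new in that block.
updateVar :: Env -> Id -> Type -> Err Env
updateVar (_, []) x _ = Left ("no open block for variable " ++ x)
updateVar (sig, c : cs) x typ = case lookup x c of
  Just _  -> Left ("variable " ++ x ++ " already declared in this block")
  Nothing -> Right (sig, ((x, typ) : c) : cs)

updateFun :: Env -> Id -> ([Type], Type) -> Err Env
updateFun (sig, ctxs) f t = case lookup f sig of
  Just _  -> Left ("function " ++ f ++ " already defined")
  Nothing -> Right ((f, t) : sig, ctxs)

-- Entering a block pushes an empty context.
newBlock :: Env -> Err Env
newBlock (sig, ctxs) = Right (sig, [] : ctxs)

main :: IO ()
main = do
  -- An inner block may shadow a variable of an outer block.
  print (updateVar emptyEnv "x" TInt >>= newBlock
           >>= \e -> updateVar e "x" TBool
           >>= \e' -> lookVar e' "x")             -- Right TBool
  print (lookVar emptyEnv "y")                    -- Left "variable y not found"
```

Because callers use only these six names, the lists could later be replaced by ``Map`` without touching the checker.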
#NEW
===The TypeChecker module===
The environment datatypes and operations.
Type signatures of the checking functions
```
typecheck :: Program -> Err () -- required function in lab2
checkDef :: Env -> Def -> Err () -- check a function definition
checkStms :: Env -> Type -> [Stm] -> Err ()
checkStm :: Env -> Type -> Stm -> Err Env
checkExp :: Env -> Type -> Exp -> Err ()
inferExp :: Env -> Exp -> Err Type
```
Some other auxiliaries.
```
checkUnique :: (Ord a, Print a) => [a] -> Err ()
checkCondition :: Bool -> Err ()
```
#NEW
===Some examples of checking===
```
checkStm :: Env -> Type -> Stm -> Err Env
checkStm env val x = case x of
  SExp exp -> do
    inferExp env exp
    return env
  SDecl type' x ->
    updateVar env x type'   -- also check that x is not in context already
  SWhile exp stm -> do
    checkExp env Type_bool exp
    checkStm env val stm

checkExp :: Env -> Type -> Exp -> Err ()
checkExp env typ exp = do
  typ2 <- inferExp env exp
  if typ2 == typ
    then return ()
    else fail $ "type of " ++ printTree exp -- ...
```
#NEW
===Some examples of type inference===
```
inferExp :: Env -> Exp -> Err Type
inferExp env x = case x of
  ETrue           -> return Type_bool
  EInt n          -> return Type_int
  EId id          -> lookVar env id
  EPIncr exp      -> inferNumeric env exp
  ETimes exp0 exp -> inferNumericBin env exp0 exp

inferNumeric :: Env -> Exp -> Err Type
inferNumeric env exp = do
  typ <- inferExp env exp
  if elem typ [Type_int, Type_double]
    then return typ
    else fail $ "type of expression " ++ printTree exp -- ...

inferNumericBin :: Env -> Exp -> Exp -> Err Type
```
#NEW
==Type checker in Java==
You can copy the contents of
[``laborations/lab2/java/`` ../laborations/lab2/java1.5]:
```
CPP.cf -- grammar
lab2 -- script running the type checker
lab2.java -- main program
Makefile
TypeChecker.java -- type checker class
TypeException.java -- exceptions for type checking
```
You only have to modify ``CPP.cf`` and ``TypeChecker.java``.
But you can already compile them: just type
```
make
```
and run the type checker with
```
./lab2
```
The rest is "debugging the empty file"!
Before ``make``, you may have to set your class path so that it finds
java_cup and JLex, as well as the current directory.
```
export CLASSPATH=.:::$CLASSPATH
```
#NEW
===The Main module===
This is given in
[``laborations/lab2/java/lab2.java`` ../laborations/lab2/java1.5/lab2.java],
hence you don't have to write this.
It shows how compiler phases are linked together.
```
try {
  l = new Yylex(new FileReader(args[0]));
  parser p = new parser(l);
  CPP.Absyn.Program parse_tree = p.pProgram();
  new TypeChecker().typecheck(parse_tree);
} catch (TypeException e) {
  System.out.println("TYPE ERROR");
  System.err.println(e.toString());
  System.exit(1);
} catch (IOException e) {
  System.err.println(e.toString());
  System.exit(1);
} catch (Throwable e) {
  System.out.println("SYNTAX ERROR");
  System.out.println("At line " + String.valueOf(l.line_num())
                     + ", near \"" + l.buff() + "\" :");
  System.out.println("     " + e.getMessage());
  System.exit(1);
}
```
#NEW
==Symbol tables==
Environment types
```
public static class FunType {
public LinkedList args ;
public Type val ;
}
public static class Env {
public Map signature ;
public LinkedList