We implement interpretation of expressions and statements via Haskell functions:
eval
: evaluate an expressionexec
: execute a statementexecs
: execute a statement sequenceThe interpreter need some structure for the following tasks:
readInt
, printInt
),Env
),return
a value (type Val
) from the middle of a computation, andSig
) that maps function names to their definition.We can implement these structures as follows:
For input/output, there is the Haskell IO
monad.
State threading: pass in the environment "before", return the environment "after".
Exception: the return value is an handled as exceptional value, propagated through the computation, caught at the function call.
Supply in the function signature as separate parameter; no need to return it, since it does not change. (Alternatively, we could make the function signature part of the environment and just never change it.)
The return
exception can be thrown only when statements are executed.
The considerations above suggest the following type signatures
eval :: Exp -> Sig -> Env -> IO (Val, Env)
exec :: Stm -> Sig -> Env -> IO (Either Val Env)
execs :: [Stm] -> Sig -> Env -> IO (Either Val Env)
and these implementations fragments:
Binary operations
EBinOp op e1 e2) sig env = do
eval (<- eval e1 sig env
(v1, env1) <- eval e2 sig env1
(v2, env2) <- binOp op v1 v2 -- May throw runtime exception in IO (division by 0)
v return (v, env2)
Statement sequences
= return (Right env)
execs [] sig env :ss) sig env = do
execs (s<- exec s sig env
r case r of
Left v -> return (Left v)
Right env1 -> execs ss sig env1
Statement: while
loop
SWhile e s) sig env = do
exec (<- eval e sig env
(v, env1) if isFalse v then return (Right env1)
else execs [SBlock [s], SWhile e s] sig env1
Return statement
SReturn e) sig env = do
exec (<- eval e sig env
(v, _) return (Left v)
Function call
ECall f es) sig env = do
eval (<- evals es sig env
(vs, env') let (Def t _ params ss) = lookupFun f sig
<- execs ss sig (makeEnv params vs)
r case r of
Left v -> return (v, env')
Right _ -> runtimeError "Function did not return anything"
Value lists
evals :: [Exp] -> Sig -> Env -> IO ([Val], Env)
= return ([], env)
evals [] sig env :es) sig env = do
evals (e<- eval e sig env
(v , env1) <- evals es sig env1
(vs, env2) return (v:vs, env2)
Using evals
, we can hide the state threading in the case of binary operators:
EBinOp op e1 e2) sig env = do
eval (<- evals [e1,e2] sig env
([v1,v2], env') <- binOp op v1 v2
v return (v, env')
Remark: The matching [v1,v2]
is total because the output list of evals
has the same length n
as the input list. However, GHC does not see that and might complain. We would need dependent types to communicate the invariant:
evals :: Vector Env n -> Sig -> Env -> IO (Vector Val n, Env)
This is possible in GHC but a bit tricky and clearer in languages like Agda.
The approach so far is a bit pedestrian and slightly error prone. In particular, we need to be very careful to pass the right environment to the recursive call when several environments are in scope.
However, you may be content with this style of implementation, since everything is explicit and the code clearly says what is going on.
The code though exhibits common programming patterns that are used all the time in Haskell and can be automated using standard monads and monad transformers. Using monad transformers, we can remove some clutter from our code and hide it behind the standard monad combinators.
In the remainder of this text, we step-by-step identify the common programming patterns and how to encapsulate them by monad transformers.
A monad m
is a type constructor that implements the Monad
type class.
class Monad m where
return :: a -> m a
(>>=) :: m a -> (a -> m b) -> m b
For example, IO
is a monad, and we assume that the reader is familiar with the IO
monad already.
The return
method embeds a pure computation of type a
into the monadic computation. The bind method ma >>= k
is supposed to run computation ma
and feed its result of type a
to continuation k
. The effects of ma >>= k
are the effects of ma
and k a
combined. We can think of bind in the terms of do
notation:
>>= k = do
m <- m
a k a
In reality, it is the other way round: do
is desugared to uses of bind.
A monad transformer t
takes a monad m
and makes a new monad t m
, usually adding some effects on top of the ones of m
.
This can be explained best with instances of Monad
.
The library Control.Monad.Except
defines the exception monad transformer. In the library, it is a newtype
, here we just use a type
synonym to simplify things.
type ExceptT e m a = m (Either e a)
A computation in ExceptT e m a
will return a result of Either e a
, where Left
alternatives of type e
are considered errors, and Right
alternatives of type a
regular results. Further, the effects of the inner monad m
can happen. For now, let us think of m = IO
, but m
can be any monad.
The monad implementation embeds pure computations into Right
and propagates Left
results---discarding the continuation k
when ma
produced a Left
.
instance Monad m => Monad (ExceptT e m) where
return a = return (Right a)
>>= k = do
ma <- ma
r case r of
Left e -> return (Left e)
Right a -> k a
We saw this pattern already: in our implementation of execs
! Hence, we can simply execs
, making use of the exception monad transformer:
exec :: Stm -> Sig -> Env -> ExceptT Val IO Env
execs :: [Stm] -> Sig -> Env -> ExceptT Val IO Env
= return env
execs [] sig env :ss) sig env = do
execs (s<- exec s sig env
env1 execs ss sig env1
Raising and catching of errors is encapsulated in the MonadError
class defined in Control.Monad.Except
:
class MonadError e m where
throwError :: e -> m a
catchError :: m a -> (e -> m a) -> m a
instance Monad m => MonadError e (ExceptT e m) where
= return (Left e)
throwError e = do
catchError ma h <- ma
r case r of
Left e -> h e
Right a -> return (Right a)
The throwError
method returns a non-regular result (Left
). The catchError
method takes a computation ma
and looks at its result. If it is an exception, the handler h
is invoked. Otherwise, the result is passed on.
We use throwError
in our execution of SReturn
:
SReturn e) sig env = do
exec (<- eval e sig env -- Problem here!
(v, _) throwError v
There is a slight problem remaining here: eval
returns something in the IO
monad, but we are now computing in the ExceptT Val IO
monad! We shall deal with this problem shortly.
But first let us observe that we do not really need to catch and handle errors. Rather, when we call a function, we want to extract the exceptional value, and then continue in a monad without exceptions. This is facilitated by the runExceptT
function defined in Control.Monad.Except
.
runExceptT :: ExceptT e m a -> m (Either e a)
= id runExceptT
In our simplification, the run method is the identity, but in the true implementation, we unwrap the newtype
:
newtype ExceptT e m a = ExceptT { runExceptT :: m (Either e a) }
Evaluation of function calls uses runExceptT
:
ECall f es) sig env = do
eval (<- evals es sig env
(vs, env') let (Def t _ params ss) = lookupFun f sig
<- runExceptT (execs ss sig (makeEnv params vs))
r case r of
Left v -> return (v, env')
Right _ -> runtimeError "Function did not return anything"
We have observed above the problem of accessing an IO
computation inside a ExceptT Val IO
computation. Luckily, this problem has been solved by the MonadIO
class that gives us access to the IO
in any monad m
that supports I/O:
class MonadIO m where
liftIO :: IO a -> m a
instance MonadIO (ExceptT e IO) where
= do
liftIO io <- io
a return (Right a)
The method liftIO
provided by MonadIO
can be generalized to a lift
method that gives access to the functionality of the inner monad m
in a transformed monad t m
. Module Control.Monad.Trans
defines a type class MonadTrans
to this end, which has an instances for ExceptT
:
class MonadTrans t where
lift :: Monad m => m a -> t m a
instance MonadTrans (ExceptT e) where
= do
lift ma <- ma
a return (Right a)
We can then think of MonadIO
being implemented simply as:
instance MonadIO IO where
= id
liftIO
instance (MonadTrans t, MonadIO m) => MonadIO (t m) where
= lift . liftIO liftIO
We can fix our problem now by using liftIO
(or, equivalently here, lift
):
SReturn e) sig env = do
exec (<- liftIO (eval e sig env)
(v, _) throwError v
In general, we may want to think about how to run expression evaluation inside the statement execution.
type Eval = IO
type Exec = ExceptT Val IO
liftEval :: Eval a -> Exec a
= liftIO
liftEval
SReturn e) sig env = do
exec (<- liftEval (eval e sig env)
(v, _) throwError v
While this may look like an over-generalization at the moment and just boilerplate code, we will refine the definition of liftEval
as we introduce more monad transformers.
The threading of state, passing in an environment and returning a modified environment, is a pattern that is encapsulated in the state monad transformer of Control.State.Monad
. Again, dropping the newtype
wrapping, the definition would be:
type StateT s m a = s -> m (a, s)
This equips a monadic computation m a
with reading from an extra parameter of type s
, the input state, and returning an extra result of type s
, the output state.
The state threading is taken care of by the Monad
instance:
instance Monad m => Monad (StateT s m) where
return a = \ s -> return (a, s)
>>= k = \ s -> do
ma <- ma s
(a, s') k a s'
The state s'
returned from ma
(invoked with the original state s
) is passed on to continuation k
.
We hand-threaded the environment in a couple of places. All of these places can now benefit for the encapsulation into the state monad:
type Eval = StateT Env IO
evals :: [Exp] -> Sig -> Eval [Val]
= return []
evals [] sig :es) sig = do
evals (e<- eval e sig
v <- evals es sig
vs return (v:vs)
eval :: Exp -> Sig -> Eval Val
EBinOp op e1 e2) sig = do
eval (<- eval e1 sig
v1 <- eval e2 sig
v2 <- liftIO $ binOp op v1 v2
v return v
The use of liftIO
requires StateT s IO
to implement MonadIO
, or more generally, StateT s
to be a monad transformer:
instance MonadTrans (StateT s) where
= \ s -> do
lift ma <- ma
a return (a, s)
Reading and writing to the state is encapsulated in the MonadState
class.
class MonadState s m where
get :: m s
put :: s -> m ()
modify :: (s -> s) -> m ()
instance MonadState s (StateT s m) where
= \ s -> return (s, s)
get = \ _ -> return ((), s)
put s = \ s -> return ((), f s) modify f
These services of the state monad are needed when we access variables.
EId x) sig = do
eval (<- get
s return $ lookupVar x s
EAssign x e) sig = do
eval (<- eval e sig
v -> updateVar x v s)
modify (\ s return v
StateT
in exec
The form of StateT
does not directly match the pattern of the type of exec
:
exec :: Stm -> Sig -> Env -> ExceptT Val IO Env
We need to formally add a result to statement execution. The result has no information, though; we use the unit type ()
.
exec :: Stm -> Sig -> Env -> ExceptT Val IO ((), Env)
Now it fits the form of StateT
:
type Exec = StateT Env (ExceptT Val IO)
exec :: Stm -> Sig -> Exec ()
execs :: [Stm] -> Sig -> Exec ()
We can use automatic state threading now also in statement execution:
= return ()
execs [] sig :ss) sig = do
execs (s
exec s sig
execs ss sig
SWhile e s) sig = do
exec (<- liftEval (eval e sig)
v if isFalse v then return ()
else execs [SBlock [s], SWhile e s] sig
The definition of liftEval
has to be updated:
liftEval :: Eval a -> Exec a
= \ s -> do
liftEval ev <- lift (ev s)
(a, s') return (a, s')
This is simply:
liftEval :: Eval a -> Exec a
= \ s -> lift (ev s) liftEval ev
However, this definition does not fly with the newtype
wrapping of StateT
in Control.Monad.State
:
newtype StateT s m a = StateT { runStateT :: s -> m (a, s) }
We need to wrap and unwarp StateT
in the right places:
liftEval :: Eval a -> Exec a
= StateT (\ s -> lift (runStateT ev s)) liftEval ev
Point-free:
= StateT (lift . runStateT ev) liftEval ev
Finally, we want to get rid of passing the signature sig
around explicitly. One solution is to make it part of Env
. Or, we can employ the reader monad transformer.
type ReaderT r m a = r -> m a
This is simpler than the state monad: we only pass a state r
in, but do not return one.
instance Monad m => Monad (ReaderT r m) where
return a = \ r -> return a
>>= k = \ r -> do
ma <- ma r
a
k a r
instance MonadTrans (ReaderT r) where
= \ r -> ma lift ma
Bind simply distributes the read-only state r
to both ma
and k
. The embeddings return
and lift
simply ignore r
.
Accessing the state is facilitated through the MonadReader
class:
class MonadReader r m where
ask :: m r
local :: (r -> r) -> m a -> m a
instance MonadReader r (ReaderT r m) where
= \ r -> return r
ask = \ r -> ma (f r) local f ma
The method ask
returns the state, and local f ma
modifies if via f
for a subcomputation ma
. Modifying a read-only state seems paradoxical, but note that the modification only applies to the continuation ma
. Any computation following a local
in sequence will receive the original, unmodified state. We shall not need local
in interpreter nor type-checker.
The official definition of ReaderT
uses a newtype
:
newtype ReaderT r m a = ReaderT { runReaderT :: r -> m a }
The projection runReaderT :: ReaderT r m a -> r -> ma
allows us to unwind the reader component by supplying an initial state r
.
The final definitions of the monads for evaluation and execution are thus:
type Eval = ReaderT Sig (StateT Env IO)
type Exec = ReaderT Sig (StateT Env (ExceptT Val IO))
The lifting of Eval
computations into Exec
computations needs to unwind the reader and state layers and restore them after the lift
for ExceptT
.
liftEval :: Eval a -> Exec a
= ReaderT (\ r -> StateT (\ s -> lift (runStateT (runReaderT ev r) s))) liftEval ev
Its "inverse" is needed to call execs
from the evaluator of function calls:
runExec :: Exec a -> Eval (Either Val a)
The idea is that exceptions thrown in a computation Exec a
should be transformed into a Left
result in the Eval
monad, whereas regular termination of the computation manifests as Right
result. The problem is that exceptions do not carry the state, because
StateT Env (ExceptT Val IO) a = Env -> IO (Either Val (a, Env)).
If we look at the relevant part of Eval (Either Val a)
, this is
StateT Env IO (Either Val a) = Env -> IO (Either Val a, Env).
So runExec
needs a state to continue with after the exception. In our case, we can even reset the state after a regular termination of the Exec a
computation, since we do not need the state of a function's locals after the function returns.
runExecAndReset :: Exec a -> Env -> Eval (Either Val a)
=
runExecAndReset ex env ReaderT (\ r ->
StateT (\ s -> do
<- runExceptT (runStateT (runReaderT ex r) s)
r case r of
Left v -> return (Left v, env)
Right (a, _s) -> return (Right a, env)))
This gives the following implementation of the function call:
ECall f es) = do
eval (<- evals es
vs <- ask
sig let (DFun _ _ params ss) = lookupFun f sig
<- get
env
put (makeEnv params vs)<- runExecAndReset (execs ss) env
r case r of
Left v -> return v
Right() -> runtimeError "Function did not return anything"
We developed Exec
and Eval
to capture the patterns found in our initial solution that used only the IO
monad.
Of course, it is possible to use the Exec
monad also for the evaluator, even though the evaluator never raises an exception. In this case, there is no need for liftEval
nor runExec
. Instead, we use catchError
instead of runExec
in the definition of eval (ECall f es)
.
This simplification is recommended, as the resulting interpreter implementation is easier to understand.
Now that we have completely "monadified" our interpreter,
eval :: Exp -> Eval Val
exec :: Stm -> Exec ()
what have we gained? Well, we have removed clutter from our implementation and the chance of introducing bugs in the boilerplate code. But further, the lifting of eval
and exec
to lists are now completely generic!
evals :: [Exp] -> Eval [Val]
= mapM eval es
evals es
execs :: [Stm] -> Exec ()
= mapM_ exec ss execs ss
It is not even worthwhile anymore to define evals
and execs
, we can simply write mapM eval
and mapM_ exec
directly.