Assignment 2

Description

This assignment consists of 2 parts.

In the first part, you will design, implement and test a Monad for programs that can be replayed.

In the second part use this monad to implement a very simple library for writing CGI programs, implement an optimization to your Replay monad that makes it much more useful, and use your CGI library to implement a simple application.

A monad for replaying computations

Consider the following scenario: A program is running. At some point, the program decides to stop, and produces something called a trace (akin to a core dump).

The user would later on like to restart the program, making it run from the exact same point it previously stopped on. Not only should the program point be the same, but all values of all variables should be precisely the same as when the program decided to stop. The trace should contain all information that is necessary to do this.

A monad, in which we can decide to stop in the middle of a running computation, and later on pick up where we left off, is called a replay monad (because we can replay computations in it). Here is a possible interface to a replay monad:


-- Types
Question
Answer
Trace
Replay a

-- Operations
instance Monad Replay
liftIO :: (Show a, Read a) => IO a -> Replay a
ask  :: Question -> Replay Answer

emptyTrace :: Trace
addAnswer :: Trace -> Answer -> Trace

run :: Replay a -> Trace -> IO (Either (Question, Trace) a)

There are three abstract types in the above, Question, Answer, and Trace. We leave these abstract for now. However, it is important to note that Trace is going to be something that we can represent as a String, in order to be able to write it to a file, for example.

There are two basic operations in the monad. The first one, liftIO, allows us to perform any IO action inside a Replay action.

The second one, ask q, makes the whole program stop! When the program stops, it produces a trace (of everything it has done so far), and a question, q. The user of the program can now examine the question that was produced by the program, and think of a suitable answer. When the user wants to continue the program where it last stopped, and provide the program with an answer, she simply runs the program again, providing it with the generated trace extended with the answer.

In this way, the user can run the program several times, keeping track of different evaluation points, and providing different answers to the same questions.

Lastly, the run function, given a computation in the Replay monad, and a trace (produced by a previous run of the program), re-traces the program according to this trace, and starts running the program from there. When the program stops with a question, the result of the run function is Left (q,t), where q is the question, and t the trace of everything that was done so far. When the program finishes with a result, the result of the run function is Right x, where x is the result of the whole computation.

An Example

Here is an example of a computation in the Replay monad. We have chosen the types Question and Answer to be both Strings here.


example :: Replay Int
example = do
  t0 <- liftIO getTime
  liftIO (putStrLn "Hello!")
  age <- ask "What is your age?"
  liftIO (putStrLn ("You are " ++ age))
  name <- ask "What is your name?"
  liftIO (putStrLn (name ++ " is " ++ age ++ " years old"))
  t1 <- liftIO getTime
  liftIO (putStrLn ("Total time: " ++ show (t1 - t0) ++ " seconds"))
  return (read age)

The example starts with asking the current time, prints a message, and then stops the whole program, with a trace and the question "What is your age?". If the program is rerun from that point, it prints a message and then stops again, asking: "What is your name?". If the program is rerun from that point again, a message is printed, and the total time for the whole process is calculated.

(The example works for a suitably chosen function getTime that returns the current time in seconds since a particular fixed date:


getTime :: IO Integer
getTime = do
  TOD secs _ <- getClockTime
  return secs

getClockTime is defined in System.Time.

To understand how to deal with traces, here is a possible type declaration for Trace:


type Trace = [Item]

data Item
  = Answer Answer
  | Result String
 deriving ( Show, Read )

emptyTrace = []
addAnswer t r = t ++ [Answer r]

When we run the above example, providing it with the empty trace, we get the following result:


GHCi> do x <- run example []; print x
Hello!
Left ("What is your age?",[Result "1164117157",Result "()"])

We can see that the program printed the message, and then stopped, with the question "What is your age?", and a trace. The trace records what has happened in the program so far. When we have found the answer to the question, (for example, 27), we can re-run the computation, augmenting the trace with an extra element:


GHCi> do x <- run example [Result "1164117157", Result "()", Answer "27"]; print x
You are 27.
Left ("What is your name?",[Result "1164117157",Result "()",Answer "27",Result "()"])

Again, the program prints its message, and stops with a question and a trace. Notice that we are running the same program (example) again, but now with a different trace!

Again, we augment that trace with the answer we want to give, running the program again:


GHCi> do x <- run example [Result "1164117157", Result "()", Answer "27", Result "()",
Answer "Starbuck"]; print x
Starbuck is 27 years old
Total time: 19 seconds
Right 27

We can see that the program printed "Starbuck is 27 years old" on the screen, and calculated how long time the whole process took (from the beginning). The program then terminated normally with the result 27.

We can automate the process of re-running the program and augmenting it with the answer every time. This is done by a "super run-function" that re-runs the program with an answer everytime the program stops and asks a question. Here is how:


running :: Replay a -> IO a
running prog = play emptyTrace
 where
  play t = do
    eqa <- run prog t    -- this is the same prog every time!
    case eqa of
      Left (q,t') -> do
        putStr ("Question: " ++ q ++ " ")
        r <- getLine
        play (addAnswer t' r)
      Right x -> return x

There is very little reason for why we would like to use this particular run function, because it does not use the full generality of the Replay monad. Here is how it works on the example:


GHCi> running example
Hello!
Question: What is your age? 59
You are 59
Question: What is your name? Adama
Adama is 59 years old
Total time: 5 seconds

Note that, in the above, the same program is re-run with a new trace, each time the answer to the question is provided by the user.

Part I

Deadline: Wed Feb 13

The first part of this assignment is to implement the Replay monad.

Part I - Task 1 - The Replay monad

First create a Haskell module called Replay in an empty directory, then create a cabal library package called replay by running cabal init and answering all questions. Look through the resulting cabal file and make sure you understand what it does and that your module is included in the package. Add any dependencies that you know you will need. Also you should relax the dependency on base to 4.* or similar. Implement a monad Replay with the following interface:


-- Types
Replay q r a
Trace r

-- Operations
instance Monad (Replay q r)
instance Show r => Show (Trace r)
instance Read r => Read (Trace r)

liftIO :: (Show a, Read a) => IO a -> Replay q r a
ask  :: q -> Replay q r r

emptyTrace :: Trace r
addAnswer  :: Trace r -> r -> Trace r

run :: Replay q r a -> Trace r -> IO (Either (q, Trace r) a)

Notice the following:

  1. We have generalised the type of Questions (q) and Answers (r) in this type, in order for the monad and the trace type to be more general.
  2. You can decide yourself how to implement the type Trace r, but it needs to be a type that can be written to and read from a file! (So, it needs to be an instance of Show and Read.)
Make sure that the examples above can actually be run correctly!

Part I - Task 2 - Testing the Replay monad

Once you have implemented your Replay monad you should make sure that it works properly. Below is a testing framework to help you get started. Put the framework in a file called test.hs in a subdirectory called test, then add this to your .cabal file to create a test suite:


Test-Suite test-replay
    type:            exitcode-stdio-1.0
    hs-source-dirs:  test
    main-is:         test.hs
    build-depends:   base, replay

If done correctly this sequence of commands should execute your test-suite:

  cabal configure --enable-tests
  cabal build
  cabal test
  
The idea is to test replay computations with integers as answers and whose only IO action is to increment a counter a number of times. We check that when running the program on a list of input integers we get the correct result and that the counter has the right value.


import Data.IORef
import Replay
import System.Exit

-- | Runs the test suite for the replay library
main :: IO ()
main = do
  results <- runTests
  if and results
    then return ()
    else exitFailure

-- | Programs are parameterised over a 'tick' action.
--   Questions are () and answers are integers.
type Program = IO () -> Replay () Int Int

-- | A result is a pair of the final result of the program
--   and the number of 'ticks' executed.
type Result  = (Int, Int)
type Input   = [Int]

-- | A test case collects a program and a list of answers together
--   with its expected result.
data TestCase = TestCase
  { testName    :: String
  , testInput   :: Input
  , testResult  :: Result
  , testProgram :: Program
  }

-- | Running a program.
runProgram :: Program -> Input -> IO Result
runProgram p inp = do
    counter <- newIORef 0
    let tick = modifyIORef counter (+1)
    x <- play (p tick) emptyTrace inp
    n <- readIORef counter
    return (x, n)
  where
    play prog t inp = do
      r <- run prog t
      case r of
        Right x      -> return x
        Left (_, t') -> case inp of
          []       -> error "too few inputs"
          a : inp' -> play prog (addAnswer t' a) inp'

-- | Checking a test case. Compares expected and actual results.
checkTestCase :: TestCase -> IO Bool
checkTestCase (TestCase name i r p) = do
  putStr $ name ++ ": "
  r' <- runProgram p i
  if r == r'
    then putStrLn "ok" >> return True
    else putStrLn ("FAIL: expected " ++ show r ++
                  " instead of " ++ show r')
         >> return False


-- | List of interesting test cases.
testCases :: [TestCase]
testCases =
  [ TestCase
    { testName    = "test1"
    , testInput   = [3,4]
    , testResult  = (8, 1)
    , testProgram = \tick -> do
        liftIO tick
        a <- ask () -- should be 3
        b <- liftIO (return 1)
        c <- ask () -- should be 4
        return (a + b + c)
    }
  ]

-- | Running all the test cases.
runTests = mapM checkTestCase testCases

You should write down enough test cases that you are confident that your implementation is correct. Try to think about possible corner cases.

For grade 5

Use QuickCheck to generate random test cases.

Part II

Deadline: Wed Feb 27

In the second part of the assignment you will extend your library with a CGI program DSL on top of the Replay monad and write an example application.

CGI Programming

One way of doing web programming is via CGI (Common Gateway Interface). Normally, a client browser sends a request for a static webpage to a webserver. The webserver then reads this file and sends its content back to the client. A CGI program is a program that is run by the webserver and that produces new content, which is then sent to the client.

A typical way a CGI program is used in a web-based system is as follows. A webpage is sent to the user. The user fills in some information, and sends it to the server. The server looks at it, and sends new content, which is filled by the user, and so forth.

A drawback in CGI programming is that it enforces a completely different programming model on the programmer than ordinary programming. CGI programs in principle do not remember anything from one invocation to the next. This means that the programmer needs to keep track of all important information, especially information that was computed in a previous interaction with the user, and is needed later on.

An example of this is a sequence of webpages where the user fills in different forms with name and address information, and then later this form needs to be processed somehow. There are two solutions to this: (1) This information is stored on the webserver. A drawback here is that it is not clear what to do when the user uses the browser's "Back" to go back to an earlier state, or the browsers "Clone" button, often resulting in webpages with inconsistent information, in the worse case breaking the database on the server. (2) The information is stored in the webpage sent to the user. In this way, when the user submits new information, it will always be paired with the relevant correct information.

Solution (2) is the one we will explore in this lab assignment.

Here is a very simple CGI program in Haskell. It produces a very simple HTML page, which contains a form. The user can provide information in the form and click OK. The program is then run again, with the input provided by the user.


main :: IO ()
main = do
  s <- getContents  -- gets all input from stdin
  putStr $ unlines $
    [ "Content-type: text/html"
    , "" -- This newline is significant.
    , "<html><body>"
    , "Input was: " ++ show s
    , "<p>"
    , "Type something here:"
    , "</p>"
    , "<form method=POST>"
    , "<p>"
    , "<input name=input_example>"
    , "<input type=submit value=OK>"
    , "</p>"
    , "</form>"
    , "</body></html>"
    ]

If you compile this program and call the result example.cgi, add it at an appropriate place on a web server, set configuration parameters to allow execution of cgi scripts, and then access it through a web browser, it should work as intended.

Whatever is filled in in the forms is given as input to the CGI script when the user presses OK. Try experimenting with adding more <INPUT> tags to the form and see how this affects the CGI program.

Make sure that you can run the above program as a CGI script, and understand what is going on before you continue!

For more information on HTML and CGI programming, see below.

Part II - Task 1 - A CGI library

The idea is now that you will now extend your replay library with CGI programming, based on your Replay monad. This library should have the following interface:


type Cgi a = Replay Question Answer a

type Question = ?

type Answer = ?

runCGI :: Cgi () -> IO ()

In other words, the library provides a monad Cgi that is used to communicate with a user, by exchanging web-pages, and receiving answers.

The idea is that the CGI programmer can send HTML forms to the user by using ask. When the user enters data in the form and sends it back, the CGI program is re-run up to the point where ask was used, and continues with the answer that we got from the user.

In order for this to work, you need to do the following.

You may provide any extra functionality, or change the type Cgi if you feel the need for it.

Part II - Task 2 - Optimizing the Replay monad

For grade 4 or 5

The above way of dealing with CGI programs works, but it is not very space-efficient. The Replay monad remembers all results ever computed, even if they are not affecting the current computation anymore.

For example, if we would ask for the user's age like this:


askAge :: Replay String String Int
askAge = do
  birth <- ask "What is your birth year?"
  now   <- ask "What is the current year?"
  return (read now - read birth)

Then every time we use this function, both results would be remembered in the trace forever, although we know we will never actually use the fact that we now know the user's birth year and the current year. Instead, we would like to just remember the age, and shortcut the trace there.

Implement a function:


cut :: (Read a, Show a) => Replay q r a -> Replay q r a

The idea with this function is cut m should produce the same results as m; it only affects the replay behaviour. Namely, when we replay the computation cut m, and it has completed, only the result of m is remembered, and none of the things that happened inside m to make this happen.

Thus the user can use cut to annotate their Replay programs and make them more space-efficient. For example, askAge above could be implemented as follows:


askAge :: Replay String String Int
askAge = cut $ do
  birth <- ask "What is your birth year?"
  now   <- ask "What is the current year?"
  return (read now - read birth)

In short, when you replay cut m, one of three things can happen. (1) The replay-trace has not been here before. In this case, we enter m and look inside. (2) The replay-trace has been here before, but the last time we ran the program, somewhere inside m we stopped at an occurrence of ask. In this case, we enter m and trace normally. (3) The replay-trace has been here before, and completed the computation m. In this case, we never look inside m, but the trace knows what the result of m was, and we short-cut the tracing behavior.

In order to implement cut, you need to adapt your type Trace. There needs to be a way for the trace to keep track of the fact if we should enter m, or if we already know the result of m.

Don't forget to add test cases for cut. Again, think about corner cases such as cut (return 0).

Part II - Task 3 - A CGI Program

Implement a more realistic example that uses your Cgi library. Add it as an executable called cgi in your cabal package, similarly to how the example in lab 1 was implemented. The following elements should occur in your program: Hint: remember that your CGI-monad supports IO actions, so your CGI can interact with the server machine in any way. For instance you could make a little command shell that first takes a commans such as ls, cd and cat. For some commands it will then require parameters to be supplied (i.e. a filename). It then displays the contents of directories and files.

Another cool script would be one that fetches a given URL and transforms it in some way, for instance by asking the user for a two words and replacing one with the other. It then displays the modified page or writes it to a file on the webserver and displays a link.

Both these are considered ambitious to very-ambitious for the purpose of grading, the minimal requirements are lower.

More Information

(This list might grow. Feel free to send me suggestions.)

Extra Assignments

These are completely voluntary, you can get grade 5 on the assignment without doing any of these. But having done some of these is not going to hurt your grade :-)

Submission

Deadline

The deadline for part 1 is Wed Feb 13 and for part 2 Wed Feb 27. The final deadline is Sun Mar 24. (Please read the rules on what first and final deadline mean.)

Clean Code

Before you submit your code, Clean It Up! Submitting clean code is Really Important, and simply the polite thing to do. After you feel you are done, spend some time on cleaning your code; make it simpler, remove unneccessary things, etc. We will reject your solution if it is not clean. Clean code:

Submission

Your submission needs to include the following information:

Before you submit, please read the note on cheating.

You are supposed to submit your solution using the Fire system.

Good luck!

[Chalmers]
[GU]