Advanced Functional Programming

Description

This assignment consists of 2 parts.

In the first part, you will design, implement and test a Monad for programs that can be replayed.

In the second part use this monad to implement a very simple library for writing CGI programs, implement an optimization to your Replay monad that makes it much more useful, and use your CGI library to implement a simple application.

A monad for replaying computations

Consider the following scenario: A program is running. At some point, the program decides to stop, and produces something called a trace (akin to a core dump).

The user would later on like to restart the program, making it run from the exact same point it previously stopped on. Not only should the program point be the same, but all values of all variables should be precisely the same as when the program decided to stop. The trace should contain all information that is necessary to do this.

A monad, in which we can decide to stop in the middle of a running computation, and later on pick up where we left off, is called a replay monad (because we can replay computations in it). Here is a possible interface to a replay monad:

-- Types Question Answer Trace Replay a -- Operations instance Monad Replay liftIO :: (Show a, Read a) => IO a -> Replay a ask :: Question -> Replay Answer emptyTrace :: Trace addAnswer :: Trace -> Answer -> Trace run :: Replay a -> Trace -> IO (Either (Question, Trace) a)

There are three abstract types in the above, Question, Answer, and Trace. We leave these abstract for now. However, it is important to note that Trace is going to be something that we can represent as a String, in order to be able to write it to a file, for example.

There are two basic operations in the monad. The first one, liftIO, allows us to perform any IO action inside a Replay action.

The second one, ask q, makes the whole program stop! When the program stops, it produces a trace (of everything it has done so far), and a question, q. The user of the program can now examine the question that was produced by the program, and think of a suitable answer. When the user wants to continue the program where it last stopped, and provide the program with an answer, she simply runs the program again, providing it with the generated trace extended with the answer.

In this way, the user can run the program several times, keeping track of different evaluation points, and providing different answers to the same questions.

Lastly, the run function, given a computation in the Replay monad, and a trace (produced by a previous run of the program), re-traces the program according to this trace, and starts running the program from there. When the program stops with a question, the result of the run function is Left (q,t), where q is the question, and t the trace of everything that was done so far. When the program finishes with a result, the result of the run function is Right x, where x is the result of the whole computation.

An Example

Here is an example of a computation in the Replay monad. We have chosen the types Question and Answer to be both Strings here.

example :: Replay Int example = do t0 <- liftIO getTime liftIO (putStrLn "Hello!") age <- ask "What is your age?" liftIO (putStrLn ("You are " ++ age)) name <- ask "What is your name?" liftIO (putStrLn (name ++ " is " ++ age ++ " years old")) t1 <- liftIO getTime liftIO (putStrLn ("Total time: " ++ show (t1 - t0) ++ " seconds")) return (read age)

The example starts with asking the current time, prints a message, and then stops the whole program, with a trace and the question "What is your age?". If the program is rerun from that point, it prints a message and then stops again, asking: "What is your name?". If the program is rerun from that point again, a message is printed, and the total time for the whole process is calculated.

(The example works for a suitably chosen function getTime that returns the current time in seconds since a particular fixed date:

getTime :: IO Integer getTime = do TOD secs _ <- getClockTime return secs

getClockTime is defined in System.Time.

To understand how to deal with traces, here is a possible type declaration for Trace:

type Trace = [Item] data Item = Answer Answer | Result String deriving ( Show, Read ) emptyTrace = [] addAnswer t r = t ++ [Answer r]

When we run the above example, providing it with the empty trace, we get the following result:

GHCi> do x <- run example []; print x Hello! Left ("What is your age?",[Result "1164117157",Result "()"])

We can see that the program printed the message, and then stopped, with the question "What is your age?", and a trace. The trace records what has happened in the program so far. When we have found the answer to the question, (for example, 27), we can re-run the computation, augmenting the trace with an extra element:

GHCi> do x <- run example [Result "1164117157", Result "()", Answer "27"]; print x You are 27. Left ("What is your name?",[Result "1164117157",Result "()",Answer "27",Result "()"])

Again, the program prints its message, and stops with a question and a trace. Notice that we are running the same program (example) again, but now with a different trace!

Again, we augment that trace with the answer we want to give, running the program again:

GHCi> do x <- run example [Result "1164117157", Result "()", Answer "27", Result "()", Answer "Starbuck"]; print x Starbuck is 27 years old Total time: 19 seconds Right 27

We can see that the program printed "Starbuck is 27 years old" on the screen, and calculated how long time the whole process took (from the beginning). The program then terminated normally with the result 27.

We can automate the process of re-running the program and augmenting it with the answer every time. This is done by a "super run-function" that re-runs the program with an answer everytime the program stops and asks a question. Here is how:

running :: Replay a -> IO a running prog = play emptyTrace where play t = do eqa <- run prog t -- this is the same prog every time! case eqa of Left (q,t') -> do putStr ("Question: " ++ q ++ " ") r <- getLine play (addAnswer t' r) Right x -> return x

There is very little reason for why we would like to use this particular run function, because it does not use the full generality of the Replay monad. Here is how it works on the example:

GHCi> running example Hello! Question: What is your age? 59 You are 59 Question: What is your name? Adama Adama is 59 years old Total time: 5 seconds

Note that, in the above, the same program is re-run with a new trace, each time the answer to the question is provided by the user.

Part I

Deadline: Wed Feb 13

The first part of this assignment is to implement the Replay monad.

Part I - Task 1 - The Replay monad

First create a Haskell module called Replay in an empty directory, then create a cabal library package called replay by running cabal init and answering all questions. Look through the resulting cabal file and make sure you understand what it does and that your module is included in the package. Add any dependencies that you know you will need. Also you should relax the dependency on base to 4.* or similar. Implement a monad Replay with the following interface:

-- Types Replay q r a Trace r -- Operations instance Monad (Replay q r) instance Show r => Show (Trace r) instance Read r => Read (Trace r) liftIO :: (Show a, Read a) => IO a -> Replay q r a ask :: q -> Replay q r r emptyTrace :: Trace r addAnswer :: Trace r -> r -> Trace r run :: Replay q r a -> Trace r -> IO (Either (q, Trace r) a)

Notice the following:

We have generalised the type of Questions (q) and Answers (r) in this type, in order for the monad and the trace type to be more general.
You can decide yourself how to implement the type Trace r, but it needs to be a type that can be written to and read from a file! (So, it needs to be an instance of Show and Read.)

Make sure that the examples above can actually be run correctly!

Part I - Task 2 - Testing the Replay monad

Once you have implemented your Replay monad you should make sure that it works properly. Below is a testing framework to help you get started. Put the framework in a file called test.hs in a subdirectory called test, then add this to your .cabal file to create a test suite:

Test-Suite test-replay type: exitcode-stdio-1.0 hs-source-dirs: test main-is: test.hs build-depends: base, replay

If done correctly this sequence of commands should execute your test-suite:

  cabal configure --enable-tests
  cabal build
  cabal test

The idea is to test replay computations with integers as answers and whose only IO action is to increment a counter a number of times. We check that when running the program on a list of input integers we get the correct result and that the counter has the right value.

import Data.IORef import Replay import System.Exit -- | Runs the test suite for the replay library main :: IO () main = do results <- runTests if and results then return () else exitFailure -- | Programs are parameterised over a 'tick' action. -- Questions are () and answers are integers. type Program = IO () -> Replay () Int Int -- | A result is a pair of the final result of the program -- and the number of 'ticks' executed. type Result = (Int, Int) type Input = [Int] -- | A test case collects a program and a list of answers together -- with its expected result. data TestCase = TestCase { testName :: String , testInput :: Input , testResult :: Result , testProgram :: Program } -- | Running a program. runProgram :: Program -> Input -> IO Result runProgram p inp = do counter <- newIORef 0 let tick = modifyIORef counter (+1) x <- play (p tick) emptyTrace inp n <- readIORef counter return (x, n) where play prog t inp = do r <- run prog t case r of Right x -> return x Left (_, t') -> case inp of [] -> error "too few inputs" a : inp' -> play prog (addAnswer t' a) inp' -- | Checking a test case. Compares expected and actual results. checkTestCase :: TestCase -> IO Bool checkTestCase (TestCase name i r p) = do putStr $ name ++ ": " r' <- runProgram p i if r == r' then putStrLn "ok" >> return True else putStrLn ("FAIL: expected " ++ show r ++ " instead of " ++ show r') >> return False -- | List of interesting test cases. testCases :: [TestCase] testCases = [ TestCase { testName = "test1" , testInput = [3,4] , testResult = (8, 1) , testProgram = \tick -> do liftIO tick a <- ask () -- should be 3 b <- liftIO (return 1) c <- ask () -- should be 4 return (a + b + c) } ] -- | Running all the test cases. runTests = mapM checkTestCase testCases

You should write down enough test cases that you are confident that your implementation is correct. Try to think about possible corner cases.

For grade 5

Use QuickCheck to generate random test cases.

Part II

Deadline: Wed Feb 27

In the second part of the assignment you will extend your library with a CGI program DSL on top of the Replay monad and write an example application.

CGI Programming

One way of doing web programming is via CGI (Common Gateway Interface). Normally, a client browser sends a request for a static webpage to a webserver. The webserver then reads this file and sends its content back to the client. A CGI program is a program that is run by the webserver and that produces new content, which is then sent to the client.

A typical way a CGI program is used in a web-based system is as follows. A webpage is sent to the user. The user fills in some information, and sends it to the server. The server looks at it, and sends new content, which is filled by the user, and so forth.

A drawback in CGI programming is that it enforces a completely different programming model on the programmer than ordinary programming. CGI programs in principle do not remember anything from one invocation to the next. This means that the programmer needs to keep track of all important information, especially information that was computed in a previous interaction with the user, and is needed later on.

An example of this is a sequence of webpages where the user fills in different forms with name and address information, and then later this form needs to be processed somehow. There are two solutions to this: (1) This information is stored on the webserver. A drawback here is that it is not clear what to do when the user uses the browser's "Back" to go back to an earlier state, or the browsers "Clone" button, often resulting in webpages with inconsistent information, in the worse case breaking the database on the server. (2) The information is stored in the webpage sent to the user. In this way, when the user submits new information, it will always be paired with the relevant correct information.

Solution (2) is the one we will explore in this lab assignment.

Here is a very simple CGI program in Haskell. It produces a very simple HTML page, which contains a form. The user can provide information in the form and click OK. The program is then run again, with the input provided by the user.

main :: IO () main = do s <- getContents -- gets all input from stdin putStr $ unlines $ [ "Content-type: text/html" , "" -- This newline is significant. , "<html><body>" , "Input was: " ++ show s , "<p>" , "Type something here:" , "</p>" , "<form method=POST>" , "<p>" , "<input name=input_example>" , "<input type=submit value=OK>" , "</p>" , "</form>" , "</body></html>" ]

If you compile this program and call the result example.cgi, add it at an appropriate place on a web server, set configuration parameters to allow execution of cgi scripts, and then access it through a web browser, it should work as intended.

Whatever is filled in in the forms is given as input to the CGI script when the user presses OK. Try experimenting with adding more <INPUT> tags to the form and see how this affects the CGI program.

Make sure that you can run the above program as a CGI script, and understand what is going on before you continue!

For more information on HTML and CGI programming, see below.

Part II - Task 1 - A CGI library

The idea is now that you will now extend your replay library with CGI programming, based on your Replay monad. This library should have the following interface:

type Cgi a = Replay Question Answer a type Question = ? type Answer = ? runCGI :: Cgi () -> IO ()

In other words, the library provides a monad Cgi that is used to communicate with a user, by exchanging web-pages, and receiving answers.

The idea is that the CGI programmer can send HTML forms to the user by using ask. When the user enters data in the form and sends it back, the CGI program is re-run up to the point where ask was used, and continues with the answer that we got from the user.

In order for this to work, you need to do the following.

Decide how to represent the type Question. This somehow needs to be a representation of a web-form, that later on will be sent to the user. Note that you should support forms with an arbitrary number of fields--it's not enough to just give a web-interface to the simple string based examples from the introduction.
Decide how to represent the type Answer. This needs represent the answer you get from the user, so it needs to be some kind of table holding the values filled in by the user.
Decide how to implement the function runCGI. The implementation will be very similar to that of the function running, but it also needs to take care of some CGI stuff. In particular:
- Run the CGI monad with the right trace, and grabbing the result, which will be a question (web-form) and a trace.
- Store the trace as "hidden" information on the generated web-form. Hidden information can be introduced by adding a tag like <INPUT TYPE=hidden NAME=name VALUE=value> in a form.
- After the webpage is generated, the CGI program terminates!
- When the answer comes back from the user, the CGI program is restarted, with the input from the user, including the hidden information storing the previous trace.
- The function runCGI needs to retrieve all this information, and re-run the CGI monad with the trace, providing the answer to the question that was asked.
- Repeat this process as many times as necessary.

You may provide any extra functionality, or change the type Cgi if you feel the need for it.

Part II - Task 2 - Optimizing the Replay monad

For grade 4 or 5

The above way of dealing with CGI programs works, but it is not very space-efficient. The Replay monad remembers all results ever computed, even if they are not affecting the current computation anymore.

For example, if we would ask for the user's age like this:

askAge :: Replay String String Int askAge = do birth <- ask "What is your birth year?" now <- ask "What is the current year?" return (read now - read birth)

Then every time we use this function, both results would be remembered in the trace forever, although we know we will never actually use the fact that we now know the user's birth year and the current year. Instead, we would like to just remember the age, and shortcut the trace there.

Implement a function:

cut :: (Read a, Show a) => Replay q r a -> Replay q r a

The idea with this function is cut m should produce the same results as m; it only affects the replay behaviour. Namely, when we replay the computation cut m, and it has completed, only the result of m is remembered, and none of the things that happened inside m to make this happen.

Thus the user can use cut to annotate their Replay programs and make them more space-efficient. For example, askAge above could be implemented as follows:

askAge :: Replay String String Int askAge = cut $ do birth <- ask "What is your birth year?" now <- ask "What is the current year?" return (read now - read birth)

In short, when you replay cut m, one of three things can happen. (1) The replay-trace has not been here before. In this case, we enter m and look inside. (2) The replay-trace has been here before, but the last time we ran the program, somewhere inside m we stopped at an occurrence of ask. In this case, we enter m and trace normally. (3) The replay-trace has been here before, and completed the computation m. In this case, we never look inside m, but the trace knows what the result of m was, and we short-cut the tracing behavior.

In order to implement cut, you need to adapt your type Trace. There needs to be a way for the trace to keep track of the fact if we should enter m, or if we already know the result of m.

Don't forget to add test cases for cut. Again, think about corner cases such as cut (return 0).

Part II - Task 3 - A CGI Program

Implement a more realistic example that uses your Cgi library. Add it as an executable called cgi in your cabal package, similarly to how the example in lab 1 was implemented. The following elements should occur in your program:

The interaction with the user should go in several stages, and information from previous interactions should be used later on.
There should be a way for the user to make mistakes in filling out the forms, and a way for your program to point this out and to give the user a chance to fix the mistakes. For instance, when asking the user about her age you should check that the answer is a number and if it's not scold the user appropriately and let her try again.
For grade 4 or 5
You should use cut at the right places, to minimize the size of your traces.

Hint: remember that your CGI-monad supports IO actions, so your CGI can interact with the server machine in any way. For instance you could make a little command shell that first takes a commans such as ls, cd and cat. For some commands it will then require parameters to be supplied (i.e. a filename). It then displays the contents of directories and files.

Another cool script would be one that fetches a given URL and transforms it in some way, for instance by asking the user for a two words and replacing one with the other. It then displays the modified page or writes it to a file on the webserver and displays a link.

Both these are considered ambitious to very-ambitious for the purpose of grading, the minimal requirements are lower.

More Information

(This list might grow. Feel free to send me suggestions.)

For more information on HTML and CGI, there are lots of tutorials to be found on the web. Here is a reasonable one: HtmlCodeTutorial.
There are some Haskell libraries that do much of the CGI stuff and HTML stuff for you. Take a look at Network.CGI.Protocol and Text.Html for example. The package urlencoded may also help. You do not have to use these, but you can.
If you do not have access to a webserver where you can run CGI programs, run your own server! There are a number of possibilities:
- Install Apache Webserver, configure it to run CGI programs, run it, and point your browser to http://localhost:port/. Be sure to switch CGI programming on in the configuration, and be sure to set the port number right. (Recommended for Windows.)
- Run a Haskell webserver, we recommend mohws for this course. Get it here.
Don Stewart's guide on How to write a Haskell Program covers cabal in detail, but is a few years old.

Extra Assignments

These are completely voluntary, you can get grade 5 on the assignment without doing any of these. But having done some of these is not going to hurt your grade :-)

Add a layer for dealing with HTML forms to your CGI monad. For example, it would be nice to not have give explicit names (as strings) to the form-components. Instead, one can imagine "creating" form-components (such as input boxes and buttons) in the CGI monad, and being able to ask for their value after the user has entered the form. This would lead to a programming style similar to WxHaskell.
Add a form of compression for traces, to avoid excessive amounts of data being sent back and forth. Simply using show leads to lots of extra clutter. One simple thing to do is to avoid printing the full constructor names from the trace type. Another thing is to capture repetition; many traces will have lots of repetitions of 'Result "()"' for example. Being able to say 'Repeat 24 (Result "()")' would bring down this cost enormously. Lastly, a zip-like compression method would work wonders.
To protect your CGI program from being fooled by hackers, add a form of encryption in your traces. In this way, hackers will not be able to change the traces and fool the CGI program.
Add a way of dealing with errors in the trace formats. What happens when the CGI program gets a trace that is not really a trace?
Turn your Replay monad into a monad transformer, so that any underlying monad can be used, instead of only the IO monad. (Unfortunately, you can not make your transformer an instance of MonadTrans -- do you see why?).

Submission

Deadline

The deadline for part 1 is Wed Feb 13 and for part 2 Wed Feb 27. The final deadline is Sun Mar 24. (Please read the rules on what first and final deadline mean.)

Clean Code

Before you submit your code, Clean It Up! Submitting clean code is Really Important, and simply the polite thing to do. After you feel you are done, spend some time on cleaning your code; make it simpler, remove unneccessary things, etc. We will reject your solution if it is not clean. Clean code:

Does not have long lines (< 80 characters)
Has a consistent layout
Has type signatures for all top-level functions
Has good comments
Has no junk (junk is unused code, commented code, unnecessary comments)
Has no overly complicated function definitions
Does not contain any repetitive code (copy-and-paste programming)

Submission

Your submission needs to include the following information:

Your cabal package containing: a library with at least two modules (one for the Replay monad and one for the CGI DSL), an executable example CGI, a test suite for the Replay monad. Use "cabal sdist" to generate the source tarball. Make sure the tarball is working by extracting it in another directory and running "cabal configure --enable-tests", "cabal build", "cabal test" and "cabal haddock" and checking that everything looks right.
report.txt or report.pdf, a file containing documentation of what you have done. Also give the motivations you were asked to give in the assignments, answers to questions, and how to use your language.

Before you submit, please read the note on cheating.

You are supposed to submit your solution using the Fire system.

Good luck!

Assignment 2

Description

A monad for replaying computations

An Example

Part I

Part I - Task 1 - The Replay monad

Part I - Task 2 - Testing the Replay monad

Part II

CGI Programming

Part II - Task 1 - A CGI library

Part II - Task 2 - Optimizing the Replay monad

Part II - Task 3 - A CGI Program

More Information

Extra Assignments

Submission

Deadline

Clean Code

Submission