Part 3: Solving the "Please take me out of here!" with Dynamic Programming

Back to labs page.

The third task requires you to solve the "Please take me out of here!" problem by working out a Dynamic Programming solution.

Recap of Dynamic Programming

Dynamic programming can be applied when the original problem P can be divided in several smaller subproblems {P1,...,Pn} in such a way that:

they are clearly "smaller" than P, and all of them can be solved with the same strategy;
optimal solutions to these subproblems can be combined together to achieve optimal solution to the original problem P. This property is known as "optimal substructure", and it is said that P satisfies the "Principle of Optimality".

When a problem P meets these conditions, a Dynamic Programming algorithm can be designed to solve it.
A Dynamic Programming strategy always has two clearly identifiable components:

base case(s), the smallest instance(s) of the problem for which the answer is easy to provide without applying further decomposition.
an inductive step, in which subproblems are solved recursively and results, one for each of them, are combined according to problem's definition to provide the solution for the original problem.

These components of your strategy are typically defined by a recurrence relation. A straight-forward implementation of this recurrence is almost certainly inefficient. The trick with dynamic programming is to (drastically) improve efficiency through use of memoization, which stores solutions to subproblems, to alleviate the need to recompute them when they are used several times.

Warning

The good news with Dynamic Programming is once you have the strategy working, it is almost straightforward to implement it.

The bad news is that the most difficult aspect of dynamic programming is getting the problem decomposition right, and writing the corresponding recursive strategy.

Fortunately, this assignment is TOTALLY about algorithms, and not being code masters. You are supposed to focus mostly on this analysis and modeling part of the task, and most -if not all- the credits for your work will be assigned on the quality of your report, not on your ability of making your solution running in your favourite programming language (as this is straight-forward, once you get your strategy right).

In case you do not know where to begin, here is a hint towards one way to approach this problem (there are plenty others).

Submission Requirements

The submission for this part will contain exactly two files, one for the report and one for the program.

Here we collected the requirements we expect from your files, but given the importance of the report in this part, we swap the presentation order we had previously.

Report Requirements

Due to the specific issues of Dynamic Programming, this section is totally different than the corresponding section of Part1.

Your report will be a document written in ENGLISH with the following structure:

First page: put there your group number on the fire system, your names, personal numbers and email addresses.
Chapter 1: Dynamic Programming Strategy
- Write your dynamic programming strategy to solve the problem. Keep it simple and explain IN GREAT DETAILS:
  - Which base case(s) you have, which inductive step(s) you have;
  - How the return type of your function looks like, and why;
  - Present execution on instance 1, showing that the strategy indeed returns the right solution (expected latency AND the path producing it);
  - Explain WHY the strategy is the right one for all instances of this problem. In other words, you need to prove that for your proposed solution to the problem, "Principle of Optimality" holds in general.
- Perform a complexity analysis on your solution.
Chapter 2:Improving Efficiency
- Check whether your solution in Chapter 1 contains repeated patterns that you can reuse many times (there should be, otherwise your strategy can be improved!) and extend your strategy in such a way it uses memoization.
- Explain which data you are storing in the memoization memory, and why. Remember: the full answer to the problem contains both expected latency AND the path producing it.
- Show, through the execution of instance 1, that memoization is indeed working (a result is fetched from the memory several times).
- Perform a complexity analysis on your solution (and ensure it is better than the previous one!)
Chapter 3: Pseudocode
Give here the complete pseudocode of your dynamic programming strategy, including the efficiency improvements from Chapter 2. This should be very similar to the notation you already have in Chapter 2, but more detailed in the use of data structures, temporary variables and control flow.
Remember pseudocode is a mixture of programming language, mathematics, english and other suitable notation. You can use abstract data types (for instance lists...) and sentences like "for every edge e in E loop"
Chapter 4: Complexity
Give here the time- and space-complexity of your pseudocode from Chapter 3. This result as well should be pretty similar to what you already stated in Chapter 2, and it is mostly supposed to be a way to guarantee your implementation did not add useless, extra costs.
Remember it is not enough to say what complexity should be, you must motivate it. It is not enough to analyze the problem, for instance to say that "since we are traversing a list it must be O(the number of elements in the list)". This only gives a lower bound. You must analyze your algorithm, refer to your pseudocode/code. Also state clearly which operations you count as elementary (unit time) operations.
Chapter 5: Performance Measurements
Our expectation is the Dynamic Programming solution is better than the Exaustive Search one. To show this, implement the pseudocode in Chapter 3 and measure its performance.
- Run your program on several instances and record the running time.
  Each instance is characterized by the following features:
  - Different number of nodes.
  - Different edge costs.
  - Different probability distributions.
  What we expect here is a sequence of experiments to show how variations of graph's features affect performances. One obvious experiment is to test the behavior of your program when the number of nodes is increased. But one might ask what happens when number of nodes and distribution is fixed, while edge costs are changing.
  For all the experiments you perform, try to interpret the result you see and relate it to the theoretical estimation of performance you gave in Chapter 4.
  Collect your results in tables like the following one:
  
  Nodes Execution time
  
  ... ...
  
  and then draw Cartesian graphs to visualize the relation between the parameters you are considering.
  Avoid testing the application on instances that take more than 5 minutes to be completed.
- Import the results from the first lab (if you solved it) and compare them with the one you got for your Dynamic Programming solution. Did it go any better?
Appendix
References to all sources you used, if any. And an example about how to compile and run your program.

Program Requirements

They are ABSOLUTELY the same as in part one.

INPUT: Your program must be able to read input data from files. Two instances (which define the standard input format we will use to check your program) are available as an example:
- instance 1
- instance 2

The first line contains a natural number bigger than 0, the number of nodes; let us call it n.
Line does not contain spaces and it is terminated by a newline character.
Following there are n lines containing probabilities for each node. Please note all those probabilities must be bigger than 0, and the sum of them must be exactly 1. All lines are spaces free and terminated by a newline character.
Then a matrix for edge costs is printed. Since the graph is complete, the matrix has n lines and in each line n natural numbers (bigger than 0) are presented, separated by one space character. All lines are terminated by a newline character.

INTERNALS: Since this assignment focuses on algorithms, the programming related overhead should be as low as possible. Complex class hierarchies, awkward programming style and all other distractions can only hide your ideas and result in a negative feedback.
OUTPUT: your program must output both the value and the composition (i.e. the set of edges) of the optimal path
- Running instance 1 should give this
- Running instance 2 should give this
Please remember a solution always starts from node 1.

No correctness proof?

The only requirement wrt. demonstrating correctness is proving the Principle of Optimality. When it holds, the strategy itself always guarantees to achieve optimal solution.

Submission

To submit you will use the Fire system.

Deadline for this assignment is 7 October, and the deadline is strict.

Nodes	Execution time
...	...