Recap of Dynamic Programming
Dynamic programming can be applied when
the original problem P can be divided in several smaller subproblems
{P1,...,Pn} in such a way that:
- they are clearly "smaller" than P, and
all of them can be solved with the same strategy;
- optimal solutions to these subproblems can be combined together to achieve optimal solution to the original problem P.
This property is known as "optimal substructure", and it is said that P satisfies the "Principle of Optimality".
When a problem P meets these conditions, a Dynamic Programming
algorithm can be designed to solve it.
A Dynamic Programming strategy always has two clearly
identifiable components:
- base case(s), the smallest instance(s) of the
problem for which the answer is easy to provide without applying further decomposition.
- an inductive step, in which subproblems are solved recursively
and results, one for each of them, are combined according to
problem's definition to provide the solution for the original problem.
These components of your strategy are typically defined by a recurrence relation. A straight-forward implementation of this recurrence is almost certainly inefficient. The trick with dynamic programming is to (drastically) improve efficiency through use of memoization, which stores solutions to subproblems, to alleviate the need to recompute them when they are used several times.
Warning
The good news with Dynamic Programming is once you have the strategy
working, it is almost straightforward to implement it.
The bad news is that the most difficult aspect of dynamic programming is getting the problem decomposition right,
and writing the corresponding recursive strategy.
Fortunately, this assignment is TOTALLY about
algorithms, and not being code masters. You are supposed to focus
mostly on this analysis and modeling part of the task, and most -if
not all- the credits for your work will be assigned on the quality of
your report, not on your
ability of making your solution running in your favourite programming language (as this is straight-forward, once you get your strategy right).
In case you do not know where to begin, here is a hint towards one way to approach this problem (there are plenty others).
Submission Requirements
The submission for this part will contain exactly two files, one
for the report and one for the program.
Here we collected the requirements we expect from your
files, but given the importance of the report in this part, we swap
the presentation order we had previously.
Report Requirements
Due to the specific issues of Dynamic Programming, this section is
totally different than the corresponding section of Part1.
Your report will be a document written in ENGLISH with the following structure:
- First page: put there your group number on the fire system,
your names, personal numbers and email addresses.
- Chapter 1: Dynamic Programming Strategy
- Write your dynamic programming strategy to solve the
problem. Keep it simple and explain IN GREAT DETAILS:
- Which base case(s) you have, which inductive step(s) you have;
- How the return type of your function looks like, and why;
- Present execution on instance 1,
showing that the strategy indeed returns the right solution (expected
latency AND the path producing it);
- Explain WHY the strategy is the right one for all instances of this problem. In other words, you need to prove that for your proposed solution to the problem, "Principle of Optimality" holds in general.
- Perform a complexity analysis on your solution.
- Chapter 2:Improving Efficiency
- Check whether your solution in Chapter 1 contains repeated patterns that
you can reuse many times (there should be, otherwise your strategy
can be improved!) and extend your strategy in such a way it uses
memoization.
- Explain which data you are storing in the memoization memory,
and why. Remember: the full answer to the problem contains
both expected latency AND the path producing it.
- Show, through the execution of instance 1, that memoization is
indeed working (a result is fetched from the memory several times).
- Perform a complexity analysis on your solution (and ensure it
is better than the previous one!)
- Chapter 3: Pseudocode
Give here the complete pseudocode of your dynamic programming
strategy, including the efficiency improvements from Chapter 2.
This should be very similar to the notation you already have in Chapter 2,
but more detailed in the use of data
structures, temporary variables and control flow.
Remember pseudocode is a mixture of programming language, mathematics, english and
other suitable notation. You can use abstract data types (for instance lists...)
and sentences like "for every edge e in E loop"
- Chapter 4: Complexity
Give here the time- and space-complexity of your pseudocode from Chapter 3.
This result as well should be pretty similar to what you already
stated in Chapter 2, and it is mostly supposed to be a way to
guarantee your implementation did not add useless, extra costs.
Remember it is not enough to say what complexity should be, you must motivate it.
It is not enough to analyze the problem, for instance to say that "since we are
traversing a list it must be O(the number of elements in the list)". This only gives a
lower bound. You must analyze your algorithm, refer to your pseudocode/code.
Also state clearly which operations you count as elementary (unit time) operations.
- Chapter 5: Performance Measurements
Our expectation is the Dynamic Programming solution is better than
the Exaustive Search one. To show this, implement the pseudocode in Chapter 3 and measure its performance.
- Run your program on several instances and record the running
time.
Each instance is characterized by the following features:
- Different number of nodes.
- Different edge costs.
- Different probability distributions.
What we expect here is a sequence of experiments to show how
variations of graph's features affect performances.
One obvious experiment is to test the behavior of your program when
the number of nodes is increased. But one might ask what happens
when number of nodes and distribution is fixed, while edge costs are
changing.
For all the experiments you perform, try to interpret the result you
see and relate it to the theoretical estimation of performance you gave
in Chapter 4.
Collect your results in tables like the following one:
Nodes |
Execution time |
... |
... |
and then draw Cartesian graphs to visualize the relation between the
parameters you are considering.
Avoid testing the application on instances that take more
than 5 minutes to be completed.
- Import the results from the first lab (if you solved it) and compare them with
the one you got for your Dynamic Programming solution. Did it go
any better?
- Appendix
References to all sources you used, if any. And an
example about how to compile and run your program.
Program Requirements
They are ABSOLUTELY the same as in part one.
- INPUT: Your program must be able to read input data from files. Two
instances (which define the standard input format we will use to
check your program) are
available as an example:
Please note files are structured in this way:
- The first line contains a natural number bigger than 0, the
number of nodes; let us call it n.
Line does not contain spaces and it is terminated by a newline character.
- Following there are n lines containing probabilities for each
node. Please note all those probabilities must be bigger
than 0, and the sum of them must be exactly 1. All lines
are spaces free and terminated by a newline character.
- Then a matrix for edge costs is printed. Since the graph is
complete, the matrix has n lines and in each line n natural
numbers (bigger than 0) are presented, separated by one space
character. All lines are terminated by a newline character.
- INTERNALS: Since this assignment focuses on algorithms, the
programming related overhead should be as low as possible. Complex
class hierarchies, awkward programming style and all other
distractions can only hide your ideas and result in a negative feedback.
- OUTPUT: your program must output both the value
and the composition (i.e. the set of edges) of the optimal path
Please remember a solution always starts from node 1.
No correctness proof?
The only requirement wrt. demonstrating correctness is proving the Principle of Optimality. When it holds, the strategy itself always guarantees to achieve optimal solution.
Submission
To submit you will use the Fire system.
Deadline for this assignment is 7 October, and the deadline is
strict.