Problem 1

The statement we are proving does not have the form

∀n ∈ ℕ. P(n)

for some predicate P, so the form of induction used in the argument is not applicable.

The Θ notation is possibly confusing. If we use a higher-order variant of Θ instead, then the statement to be proved can be reformulated as follows:

The equations have a solution T, which has the property T ∈ Θ(λn.n).

Here we can see that we cannot perform "induction on n", because n is not in scope.

As an aside, we could ask ourselves whether the proof can be fixed. The answer is no, because T(n) = n(n + 1)/2 ∈ Θ(n²) (a short calculation is given at the end of this problem). Seeing a failed attempt could be instructive, though. Let us try to prove

∀n ∈ ℕ. T(n) = n

by induction on n:

The last couple of steps look suspicious, right?
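
For reference, the closed form used above can be derived by unfolding the recurrence. Here I assume that the equations in question are T(0) = 0 and T(n) = T(n − 1) + n; if the original base case is T(1) = 1 the calculation is the same.

  T(n) = T(n-1) + n
       = T(n-2) + (n-1) + n
       = \dots
       = \sum_{i=1}^{n} i
       = \frac{n(n+1)}{2} \in \Theta(n^2)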

Problem 2

Worst-case time complexity

Each pop takes constant time, so multi-pop(n, s) is, in the worst case, linear in n (Θ(n)).
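
For concreteness, here is a small Haskell sketch of multi-pop on a list-based stack (my own illustration, not part of the original solution). Each recursive call performs one constant-time pop, so popping k elements takes Θ(k) time.

  -- multiPop n s pops at most n elements from the stack s, returning
  -- the popped elements together with the remaining stack.
  multiPop :: Int -> [a] -> ([a], [a])
  multiPop 0 s        = ([], s)
  multiPop _ []       = ([], [])
  multiPop n (x : xs) = (x : popped, rest)
    where (popped, rest) = multiPop (n - 1) xs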

Amortised time complexity

The goal is to prove that all operations, including multi-pop, take constant amortised time. The key observation is that we can only multi-pop n elements after we have already pushed n elements.

Assume that the stack ADT only includes the following operations:

We can see every operation as being one of those or a multi-pop.

Let us choose the time unit so that

Let us further assign the following amortised time complexities to the operations:

We need to prove that the sum of the amortised time complexities for a sequence of operations (for a given stack, starting with its creation) is an upper bound of the sum of the worst-case time complexities:

  ∑_{i=1}^{n} t_i ≤ ∑_{i=1}^{n} a_i,

where the sequence contains n operations, t_i stands for the actual (idealised) time complexity of the i-th operation, and a_i is its amortised time complexity.

The inequality above follows from the following inequality:

  ∑_{i=1}^{n} t_i + s_n ≤ ∑_{i=1}^{n} a_i.

Here s_i is the size of the stack after i operations (or 0 if the stack has not been created yet). Because s_n ≥ 0, the first inequality follows from this one. We can prove the inequality by induction on n:

Case 0: The left and the right hand sides both evaluate to 0.

Case 1+n: We proceed by case analysis on the (1+n)-th operation:

We get that all operations take constant amortised time.
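
To illustrate how the inductive cases can go, here are the push and multi-pop cases under one possible choice of constants (these constants are my assumptions, chosen to be consistent with the argument above; any choice satisfying the same inequalities works): create, push and pop take at most one time unit each, a multi-pop of k elements takes at most 1 + k time units, and the amortised time complexities are 1 for creation, 2 for push, 0 for pop and 1 for multi-pop. Both ≤ steps below use the inductive hypothesis.

  % Case push (operation number 1+n): t_{1+n} = 1, s_{1+n} = s_n + 1.
  \sum_{i=1}^{1+n} t_i + s_{1+n}
    = \Bigl(\sum_{i=1}^{n} t_i + s_n\Bigr) + 2
    \le \Bigl(\sum_{i=1}^{n} a_i\Bigr) + 2
    = \sum_{i=1}^{1+n} a_i

  % Case multi-pop of k <= s_n elements: t_{1+n} = 1 + k, s_{1+n} = s_n - k.
  \sum_{i=1}^{1+n} t_i + s_{1+n}
    = \Bigl(\sum_{i=1}^{n} t_i + s_n\Bigr) + 1
    \le \Bigl(\sum_{i=1}^{n} a_i\Bigr) + 1
    = \sum_{i=1}^{1+n} a_i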

Is this the "best possible amortised time complexity of multi-pop" (up to constant factors)? A slightly more refined analysis shows that one can assign multi-pop the amortised time complexity zero whenever at least one element is popped. However, in the worst case we pop nothing. Consider the operation sequence consisting of the creation of an empty stack followed by n multi-pops of nothing. This sequence takes 1 + n time units. Furthermore creation can only be assigned a constant amortised time complexity (because it does not take any input). Hence we cannot, in general, assign multi-pop a "sub-constant" amortised time complexity.

Some notes:

Problem 3

Program

Basic idea: Maintain a set which contains all words seen so far. For each word, check if its mirror image is a member of the set.

allocate a set, implemented as a hash table with separate chaining,
  where the chaining is implemented using linked lists

do the following for each word w in the file:
  if reverse(w) exists in the set
    print "yes"
    exit the program
  else
    insert w into the set

print "no"

Time complexity

Assuming a uniform cost model.

Let us assume that the hash function does constant work per character, i.e. that the time needed to compute a hash value for the word w is Θ(|w|) time units, where |w| is the length of w.

The number of characters in the file is denoted by N.

Perfect hash function

Let us now assume that we have a perfect hash function, and consider an iteration of the loop body:

The last three operations take O(|w|) time units (amortised in the case of insertion, due to potential resizing/rehashing). We get that the loop body is amortised linear in the number of processed characters.

Each character in the file is processed only in the loop iteration for its own word, and there only a constant number of times, so the total time complexity is O(N). In the worst case the program must read all the input, so this bound is tight; we get the worst-case time complexity Θ(N).

Note that in this scenario most of the time is likely to be spent reading the program's input.

Worst-case scenario

If all words hash to the same bucket, then looking up the word w takes O(i·|w|) time units, where i is the current number of words stored in the hash table. (Note that comparing two words w₁ and w₂ for equality takes O(min(|w₁|, |w₂|)) time units.) The other operations have the same time complexity as in the previous case, so the loop body's time complexity is O(i·|w| + s) (amortised).

Assume now that we have a worst-case scenario where the file does not contain any duplicated words and no two words are mirror images of each other. (It is not immediately obvious that such an input can be constructed for all alphabet sizes and all N, but this assumption will at least give us an upper bound on the time complexity.) In this case the worst-case time complexity is

  O(∑_{i=0}^{n−1} (i·|w_i| + s_i)) = O(cn² + N),

where n is the number of words, and every word has at most c characters. If we have both c = Θ(N) and n = Θ(N), then we get the time complexity O(N³).

Is the scenario above reasonable? Can we find a constant alphabet size A such that, for all (large) file sizes N, there is a file without duplicated or mirrored words, satisfying c = Θ(N) and n = Θ(N), and requiring Θ(c) character comparisons in order to see that any two distinct words are not equal? No: if Θ(N) character comparisons are required, then every word has Θ(N) characters; but the file only contains N characters in total, so there can only be O(1) such words, not Θ(N).

Challenge: Improve the analysis above (either prove that the bound is tight, or come up with a better bound).

Problem 4

Graph representation

Let us represent graphs as follows: every node is represented by a node object containing the node's identifier (a string) and an adjacency list with references to the node objects of its successors.

Note that this representation ensures that one can find the successors of a node without having to look up strings.

Program

Basic idea:

  1. Parse the file into a list of lists of strings.
  2. Go through the list once to find all node identifiers, allocate node objects for them (with empty adjacency lists) and construct the hash table mapping node identifiers to node objects.
  3. Go through the list again to populate the adjacency lists.

Program:

l ← parse the file into a list of lists of strings, where
    * the outer list contains one inner list per line in the file,
      and
    * every inner list contains one node identifier (string) per
      word on the line, in the order in which the words occurred

h ← an empty hash table mapping strings (node identifiers) to node
      objects

for each inner list l′ of l do
  id ← the first element of l′
  n  ← new node object: {identifier = id, adjacent = empty list}
  h.insert(id, n)

for each inner list l′ of l do
  id  ← the first element of l′
  adj ← h.lookup(id).adjacent
  for each element id′ of l′, except for the first do
    adj.insert(h.lookup(id′))
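
The same two-pass construction, sketched in Haskell (my illustration, not part of the original solution; it uses Data.Map instead of a hash table and integer node indices instead of references to node objects):

  import           Data.Array (Array, listArray)
  import qualified Data.Map.Strict as Map

  -- Node i's identifier and the indices of its successors.
  data Graph = Graph
    { identifiers :: Array Int String
    , successors  :: Array Int [Int]
    }

  -- Assumes that every line contains at least one word and that the
  -- first words of the lines are distinct.
  buildGraph :: [[String]] -> Graph
  buildGraph l = Graph ids adj
    where
      names = map head l
      n     = length names
      -- Pass 1: map every identifier to a node index.
      table = Map.fromList (zip names [0 ..])
      ids   = listArray (0, n - 1) names
      -- Pass 2: translate every line into an adjacency list of indices.
      adj   = listArray (0, n - 1)
                [ map (table Map.!) adjacent | (_ : adjacent) <- l ]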

Problem 5

Algorithm

Add a new node to the graph, connected to all the other nodes. Return the result of running the Hamiltonian cycle algorithm on the extended graph.
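
A minimal Haskell sketch of the reduction (my addition; hamiltonianCycle stands for the given linear-time algorithm, which is assumed rather than implemented here, and the graph is a plain adjacency-list representation over node indices):

  -- Adjacency lists for the nodes 0, ..., n-1.
  type Graph = [[Int]]

  -- Adds a new node (index n), adjacent to all existing nodes, and
  -- makes every existing node adjacent to it.
  addUniversalNode :: Graph -> Graph
  addUniversalNode g = map (n :) g ++ [[0 .. n - 1]]
    where n = length g

  -- Reduction: the input graph has a Hamiltonian path iff the
  -- extended graph has a Hamiltonian cycle.
  hamiltonianPath :: (Graph -> Bool) -> Graph -> Bool
  hamiltonianPath hamiltonianCycle = hamiltonianCycle . addUniversalNode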

Correctness

Denote the node added to the input graph by v.

The algorithm is correct if we can prove that the input graph has a Hamiltonian path iff the extended graph has a Hamiltonian cycle. If the input graph has a Hamiltonian path v₁, …, vₙ, then v, v₁, …, vₙ, v is a Hamiltonian cycle in the extended graph, because v is adjacent to every other node. Conversely, if the extended graph has a Hamiltonian cycle, then removing v (together with its two incident cycle edges) leaves a Hamiltonian path in the input graph.

Time complexity

If the input graph is G = (V, E), then it takes O(|V|) time units to add the extra node (if a suitable adjacency-list representation is used). The linear Hamiltonian cycle algorithm is then applied to the larger graph, which has |V| + 1 nodes and |E| + |V| edges, so its running time is O((|V| + 1) + (|E| + |V|)) = O(|V| + |E|). Hence the total running time is also O(|V| + |E|), provided that our assumption about the graph's representation is correct (the Hamiltonian cycle algorithm might use some non-standard representation, in which case adding the node could be more expensive).

Problem 6

Assume that a graph G = (V, E) is represented using an array of length |V|, where position n contains the adjacency list for the node with node index n (n ∈ {0,…,|V|-1}), and the adjacency lists contain node indices.

We need to check if a given graph is reflexive, symmetric, and transitive. Checking for reflexivity and symmetry is easy:

reflexive(G):
  for n ∈ {0,…,number of nodes in G - 1} do
    if n does not occur in its own adjacency list then
      return false

  return true

symmetric(G):
  M ← adjacency matrix for G

  for m ∈ {0,…,number of nodes in G - 1} do
    for n ∈ {0,…,number of nodes in G - 1} do
      if M[m,n] ≠ M[n,m] then
        return false

  return true

(The code for symmetric can be optimised.)

A reflexive graph G = (V, E) is transitive iff all shortest paths between any two nodes contain at most one edge. Proof:

Hence we can check transitivity as follows:

// Precondition: G must be reflexive.
transitive(G):
  for every node v in G do
    p ← an array containing the lengths of the shortest paths from
          v to every node in G (using -1 to mark the absence of a
          path)

    if any element in p is ≥ 2 then
      return false

  return true
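
A Haskell sketch of the transitivity check (my addition). Instead of computing shortest-path lengths it checks the equivalent condition that, in a reflexive graph, every node reachable from v must be a direct successor of v; reachability is computed with a depth-first search.

  import           Data.Array (Array, indices, (!))
  import qualified Data.IntSet as IntSet

  -- Adjacency lists, indexed by node index.
  type Graph = Array Int [Int]

  -- The set of nodes reachable from v (including v itself).
  reachable :: Graph -> Int -> IntSet.IntSet
  reachable g v = go IntSet.empty [v]
    where
      go seen []       = seen
      go seen (x : xs)
        | x `IntSet.member` seen = go seen xs
        | otherwise              = go (IntSet.insert x seen) (g ! x ++ xs)

  -- Precondition: g must be reflexive.
  transitive :: Graph -> Bool
  transitive g =
    and [ reachable g v `IntSet.isSubsetOf` IntSet.fromList (g ! v)
        | v <- indices g ]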

Finally the pieces above can be combined as follows:

equivalence-relation(G):
  return (reflexive(G) and symmetric(G) and transitive(G))

Time complexity

Assuming a uniform cost model.

Assume that the input is G = (V, E). The different functions have the following time complexities:

We see that the time complexity is dominated by transitive. The total time required is O(|V|(|V|+|E|)).

Problem 7

Algorithm

Basic idea: Find a minimum spanning tree, return the weight of (one of) its most expensive edge(s).

This can be implemented by modifying Kruskal's algorithm so that the weight of the last edge added to the minimum spanning tree is returned.
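
A Haskell sketch of the modified algorithm (my addition). It uses a naive union-find without union by rank or path compression, so it does not achieve the bound quoted below; it is only meant to illustrate returning the weight of the last edge added.

  import           Data.List (sortOn)
  import qualified Data.Map.Strict as Map

  type Node   = Int
  type Weight = Int
  type Edge   = (Weight, Node, Node)

  -- Naive union-find: every node maps to a parent (itself for roots).
  type UnionFind = Map.Map Node Node

  find :: UnionFind -> Node -> Node
  find uf x = case Map.lookup x uf of
                Just p | p /= x -> find uf p
                _               -> x

  union :: Node -> Node -> UnionFind -> UnionFind
  union x y uf = Map.insert (find uf x) (find uf y) uf

  -- Kruskal's algorithm, modified to return the weight of the last
  -- (i.e. heaviest) edge added to the minimum spanning tree.
  heaviestMSTEdge :: [Node] -> [Edge] -> Maybe Weight
  heaviestMSTEdge nodes edges =
    go (Map.fromList [ (v, v) | v <- nodes ]) Nothing
       (sortOn (\(w, _, _) -> w) edges)
    where
      go _  heaviest []               = heaviest
      go uf heaviest ((w, u, v) : es)
        | find uf u == find uf v = go uf heaviest es             -- would form a cycle
        | otherwise              = go (union u v uf) (Just w) es -- tree edge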

Time complexity

Assuming a uniform cost model.

This algorithm has the same time complexity as Kruskal's algorithm: O(|E|log|V|), if we assume that we can compare two weights in constant time.

Correctness

More

You may want to try to generalise the algorithm to partitions with more than two components.

This exercise is related to cluster analysis; see for instance the book Algorithm Design by Kleinberg and Tardos. Some examples can be found online.

Problem 8

No solution given.

Problem 9

Part 1, solution 1

Perform a postorder traversal of the tree, swapping left and right subtrees.

Time complexity analysis

We do a constant amount of work per tree node, so the algorithm is linear in the size of the tree.

Correctness proof (sketch)

We want to prove that the resulting tree is the reverse of the input. We do this by induction on the structure of the tree.

The empty tree case: The algorithm does nothing, which is fine, because an empty tree is its own reverse.

The node case: Assume that we start with a node ⟨l, x, r⟩, where l is the left subtree, x is the node element, and r is the right subtree. We know that all elements in l are smaller than x, and that x is smaller than all elements in r.

We recursively reverse l and r, ending up with l′ and r′, respectively. By the inductive hypothesis we know that l′ and r′ are reversals of l and r, respectively. The final tree is ⟨r′, x, l′⟩, which is sorted in reverse order because the elements of r′ (all larger than x) come first, in reverse order, followed by x, followed by the elements of l′ (all smaller than x), also in reverse order.

Part 1, solution 2

Haskell code:

  data BinaryTree a = Empty | Node (BinaryTree a) a (BinaryTree a)

  reverseTree :: BinaryTree a -> BinaryTree a
  reverseTree Empty        = Empty
  reverseTree (Node l x r) = Node (reverseTree r) x (reverseTree l)

Time complexity analysis

As before. (If we assume that strict evaluation is used.)

Correctness proof

We want to prove, for all (total, finite) trees t, that

  toList (reverseTree t) = reverse (toList t),

where reverse reverses a list and toList converts a tree into a list:

  toList :: BinaryTree a -> [a]
  toList Empty = []
  toList (Node l x r) = toList l ++ [x] ++ toList r

Proof by induction on t:

Case Empty:

  toList (reverseTree Empty)
    =
  toList Empty
    =
  []
    =
  reverse []
    =
  reverse (toList Empty)

Case Node l x r:

  toList (reverseTree (Node l x r))
    =
  toList (Node (reverseTree r) x (reverseTree l))
    =
  toList (reverseTree r) ++ [x] ++ toList (reverseTree l)
    = { Inductive hypothesis (twice). }
  reverse (toList r) ++ [x] ++ reverse (toList l)
    = { Property of reverse. }
  reverse (toList l ++ [x] ++ toList r)
    =
  reverse (toList (Node l x r))

Part 2

We can store one boolean in the tree (not one boolean per tree node), which indicates whether the tree is reversed or not. Other operations (lookup, delete, etc.) need to consult this bit at most once per visited node, so their asymptotic worst-case time complexities do not change.
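
A hedged Haskell sketch of this idea (my addition), reusing the BinaryTree type from above. The flag is consulted once per visited node, so the asymptotic complexity of each traversal is unchanged, while reversal itself becomes a constant-time operation.

  data BinaryTree a = Empty | Node (BinaryTree a) a (BinaryTree a)

  -- A search tree together with a flag telling whether it should be
  -- read in reverse order.
  data FlippableTree a = FlippableTree
    { isReversed :: Bool
    , tree       :: BinaryTree a
    }

  -- Constant-time reversal: flip the flag.
  reverseFlippable :: FlippableTree a -> FlippableTree a
  reverseFlippable t = t { isReversed = not (isReversed t) }

  -- Example of an operation that consults the flag: membership test.
  member :: Ord a => a -> FlippableTree a -> Bool
  member x (FlippableTree rev t) = go t
    where
      go Empty = False
      go (Node l y r)
        | x == y         = True
        | (x < y) /= rev = go l  -- smaller elements are to the left…
        | otherwise      = go r  -- …unless the tree is reversed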

Problem 10

This solution may be hard to understand for those who are not familiar with functional programming.

Definition of binary trees:

  data BinaryTree a = Empty | Node (BinaryTree a) a (BinaryTree a)

We need to check if a given binary tree is a search tree satisfying the AVL tree balancing invariant:

  isAVL : BinaryTree a → Bool
  isAVL t = searchTree t ∧ balanced t

Here I assume that the type a comes with a constant-time comparison operator

  _<_ : a → a → Bool

which implements a strict total order.

We can check if a tree is a search tree by turning the tree into a list using an inorder traversal, and then checking if this list is strictly increasing:

  searchTree : BinaryTree a → Bool
  searchTree t = strictlyIncreasing (inorderList t)

  inorderList : BinaryTree a → [a]
  inorderList Empty        = []
  inorderList (Node l x r) = inorderList l ++ [x] ++ inorderList r

  strictlyIncreasing : [a] → Bool
  strictlyIncreasing []           = True
  strictlyIncreasing [x]          = True
  strictlyIncreasing (x ∷ y ∷ xs) =
    x < y ∧ strictlyIncreasing (y ∷ xs)

The call inorderList t is linear in the size of t, assuming that an implementation of lists which supports constant-time concatenation (++) is used.¹ Furthermore, strictlyIncreasing xs is, in the worst case, linear in the length of xs (because x < y executes in constant time). Hence searchTree t is linear in the size of t (in the worst case).
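
With ordinary Haskell lists one can obtain the same linear bound without a special list implementation, by using an accumulating parameter (a standard trick; this variant is my addition, using the BinaryTree type above):

  -- Equivalent to the definition above, but avoids nested ++:
  -- inorderTo t acc prepends the inorder traversal of t to acc.
  inorderList :: BinaryTree a -> [a]
  inorderList t = inorderTo t []
    where
      inorderTo Empty        acc = acc
      inorderTo (Node l x r) acc = inorderTo l (x : inorderTo r acc)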

When checking if the tree is balanced it seems reasonable to avoid recomputing heights. The following function accomplishes this by checking if a tree is balanced and computing its height, both in a single postorder traversal:

  balancedAndHeight : BinaryTree a → Bool × ℤ
  balancedAndHeight Empty        = (True, -1)
  balancedAndHeight (Node l x r) =
    (bl ∧ br ∧ |hl - hr| ≤ 1, 1 + max hl hr)
    where
    (bl, hl) = balancedAndHeight l
    (br, hr) = balancedAndHeight r

Because a single traversal is used and we do a constant amount of processing per tree node,² we get that balancedAndHeight is linear in the size of the tree. We can wrap up by throwing away the computed height:

  balanced : BinaryTree a → Bool
  balanced t = b
    where (b, _) = balancedAndHeight t

The total time needed by isAVL t is linear in t (in the worst case). Is this optimal? Yes, one cannot be sure that an arbitrary tree is an AVL tree without inspecting every node.

Problem 11

Let N(h) denote the smallest possible number of nodes in an AVL tree of height h. An AVL tree of a given height with as few nodes as possible must be maximally unbalanced, so we can define N recursively as follows (assuming that the height of an empty tree is -1):

  N(-1) = 0
  N(0) = 1
  N(h+1) = 1 + N(h) + N(h-1), h ∈ ℕ

This recursive definition is well-founded, so it must have a unique solution.

For all h ∈ ℕ we have N(h) ≥ ϕ^h, where ϕ is the golden ratio (1+√5)/2. Proof by induction on h:
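
A sketch of the induction (my own write-up of the standard argument, based on the identity ϕ² = ϕ + 1):

  % Base cases.
  N(0) = 1 = \varphi^0,
  \qquad
  N(1) = 1 + N(0) + N(-1) = 2 \ge \varphi.

  % Inductive step (h in ℕ), using the hypothesis for h and h+1
  % together with the identity \varphi^2 = \varphi + 1.
  N(h+2) = 1 + N(h+1) + N(h)
         \ge \varphi^{h+1} + \varphi^{h}
         = \varphi^{h}(\varphi + 1)
         = \varphi^{h}\varphi^{2}
         = \varphi^{h+2}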

Now let H(n) denote the largest possible height of an AVL tree with n nodes. For any AVL tree with n nodes and height h we have N(h) ≤ n (by the definition of N). In particular, this applies to a tree with n nodes and height H(n). If we assume that n ≥ 1, then we get

  n
    ≥
  N(H(n))
    ≥ { n ≥ 1 implies H(n) ≥ 0 }
  ϕ^{H(n)}.

Both sides of this inequality are positive, so we can apply the base-ϕ logarithm (which is monotone, since ϕ > 1) to both sides and obtain the following inequality:

  ∀n ≥ 1. H(n) ≤ log_ϕ n,

i.e. the height of an AVL tree with n nodes is O(log n).

Problem 12

I would expect you to find that the "efficient" implementation is much more efficient than the quadratic one for large input, and that the optimised library implementation is quite a bit faster than your own "efficient" implementation, but not wildly so.

As for testing, a good idea seems to be to test that the three implementations always give the same result (up to reordering of duplicates).

Problem 13

Two possible answers:

If the input is likely to be partially sorted, then one can also consider using an adaptive sorting algorithm.

Problem 14

Assume that the input consists of n elements, out of which n/c have the same key k. Then quicksort using this partitioning strategy (and no optimisations, such as switching to other sorting algorithms) requires Ω(n²/c²) steps.

Proof: Whenever the input is partitioned, all elements with key k end up in the same one of the two subsets, except perhaps for one (the pivot). Quicksort does not terminate until every element has been selected as a pivot once, so the elements with key k give rise to at least n/c partitioning steps. Counting from the last of these steps, the i-th step still has to process at least i elements with key k, because at most one of them (the pivot) is removed per step. Computing these partitions therefore takes at least

  ∑_{i=1}^{n/c} i = Ω(n²/c²)

steps.


  1. I assume that strict evaluation is used.

  2. Assuming a uniform cost model.