Deadlines: first deadline 16th November, final deadline 30th November. Submissions by Fire.

In this lab you will implement an ADT (abstract data type) for sets. You will do this by implementing a data structure, dynamic arrays, and two algorithms, insertion into a sorted array and binary search. You will get practice with using Java classes, interfaces and generics as well as the standard interface Comparable. You will measure your code’s execution time empirically and use these measurements to deduce the code’s time complexity.

Don’t forget that you should work in pairs. If you have a good reason to work alone, please get in touch with one of the course assistants.

Part 1: Sets of integers

In this part you will implement a data structure that represents a set of integers. The data structure should provide methods for adding an element, removing an element and checking if a given element is a member of the set. You should write a class that implements the following interface, which you can find in IntSet.java:

public interface IntSet {
    public void add(int element);
    public boolean contains(int element);
    public void remove(int element);
}

The interface has three methods, which are similar in functionality to the standard Java interface Set:

  • add(element) adds the integer element to the set if it is not already there. If it is already there, the set remains unchanged.

  • contains(element) returns true if the integer element is a member of the set, and false otherwise. The set itself is not changed.

  • remove(element) removes the integer element from the set. If the integer is not in the set, the set remains unchanged.

Your implementation should store the set as a dynamic array of integers, with the array having type int[]. When add is called, if the array is full, its size should be doubled (or multiplied by some other constant greater than 1); in other words, it should use the same algorithm that you saw in the first lecture. You should not attempt to sort the array. You do not need to reduce the size of the array when calling remove, but you may do if you want. Because the array is unsorted, contains should use a linear search.

You should define a class called LinSearchIntSet which implements the IntSet interface, and which has a constructor with no arguments (LinSearchIntSet()). This constructor should create a LinSearchIntSet which represents the empty set. You are not allowed to import any standard Java libraries or classes; you should implement everything yourself. In particular, you must not use ArrayList or Set in your implementation.

Here are some hints to get you started:

  • In add you will need to check if the element is already present in the set. You can do this by calling contains.

  • In order to implement remove, you first need to find the index where a given element is found in the array (in order to remove it). It might be helpful to pull out this functionality into a separate method, int indexOf(int element), which returns the index where a given element is found, or a special value (such as -1) if it is not found. This method can be used in remove and can also be used to easily define contains.

  • There is no built-in operation in Java to remove a value from an array. You will need to do it yourself. If you are stuck, try drawing a picture of the array before and after a remove and think how you could get from the before picture to the after picture.

  • The initial capacity of the array should be a small value. Using an initial capacity of 0 will complicate your code, so 1 might be a better choice.

  • Don’t forget that the size of the set is not always the same as the capacity of the array. You will need to use both in your code.

Here is an example program that illustrates how LinSearchIntSet can be used:

class TestProgram {
    public static void main(String[] args) {
        IntSet set = new LinSearchIntSet();
        set.add(1);
        set.add(2);
        set.add(1);
        set.remove(3);
        set.remove(1);
        System.out.println(set.contains(1)); // prints false
        System.out.println(set.contains(2)); // prints true
        System.out.println(set.contains(3)); // prints false
    }
}

Automatic testing

The file Test.java is a program that can automatically test your implementation of IntSet. To test your LinSearchIntSet class, run the program as follows:

java Test LinSearchIntSet

The program will run tests for five seconds. If a bug is found, the sequence of operations that led to the bug will be shown. You can copy this sequence and run it as a manual test (see below), which may help you understand what has gone wrong.

Please make sure that your program passes the automatic tester before going on.

Suppose that the set is modified only rarely, but that contains is called frequently. Then it makes sense to make contains as quick as possible. One way to do this is to use a sorted array, so that contains can be implemented using binary search. This is what you will do in this part.

You should copy your file LinSearchIntSet.java to a new file BinSearchIntSet.java (don’t overwrite your code from part 1 as you will need to submit it!). Then modify BinSearchIntSet.java so that:

  • contains uses a binary search to find the element.

  • add makes sure to keep the array sorted, by inserting the new element at the right location in the array, rather than always adding it at the end.

  • remove also makes sure not to disturb the order of the array. You may find that your implementation of remove from part 1 already works!

Just as before, you may not use the Java standard libraries in your solution.

Automatic testing

Again, your solution should pass the automatic tester. You can run the tester as follows:

java Test BinSearchIntSet

Part 3: A generic set

Finally, you should implement a generic data structure for sets, i.e., a data structure where the element type is not limited to integers. Your job is to write a class BinSearchGenSet that implements the following interface, found in GenSet.java:

public interface GenSet<E> {
    public void add(E element);
    public boolean contains(E element);
    public void remove(E element);
}

The class should have a constructor with no arguments that creates an empty set. Just as before, your class should use binary search and a dynamic array. You may want to copy BinSearchIntSet.java to use as a starting point.

In order to keep the array sorted, the element type E needs to be a type that supports comparison operations. The way we express this in Java is by declaring that E must implement the interface Comparable. To do this, declare the BinSearchGenSet class as follows:

public class BinSearchGenSet<E extends Comparable<? super E>> implements GenSet<E> {
       // Your code goes here
}

(The complicated declaration <E extends Comparable<? super E>> is explained in the course book, section 1.5.5 in the second edition or secton 1.5.6 in the third.) Having done this, you can use the standard Java method compareTo in your code to compare two objects of type E.

In this implementation, you should not use a primitive array, because it works a bit awkwardly with generics. Instead, use the standard class ArrayList. This happens to make your code quite a bit simpler:

  • You will not need to worry about resizing the array.

  • You can use ArrayList.add(int index, E element) to insert into the array.

  • You can use ArrayList.remove(int index) to delete from the array.

By building on existing data structures you can often save quite a lot of work!

You may not import any other Java classes, apart from ArrayList.

Here is an example snippet of code which illustrates how to call your BinSearchGenSet. Make sure you understand it:

GenSet<String> kurser = new BinSearchGenSet<>();
kurser.add("DAT037");
System.out.println(kurser.contains("DAT037")); // prints true

Automatic testing

Again, your solution must pass the automatic tester. The file BinSearchGenSetAsIntSet.java is a wrapper class that converts your class BinSearchGenSet to an IntSet. You can use it to run the automatic tester as follows:

java Test BinSearchGenSetAsIntSet

Final part: Empirical measurement and analysis of execution time

The file Benchmark.java is a program that measures the average execution time of the method contains in an implementation of IntSet, for several different sizes of sets.

Use the following commands to measure the performance of your three implementations:

java Benchmark LinSearchIntSet
java Benchmark BinSearchIntSet
java Benchmark BinSearchGenSetAsIntSet

You may find that the execution sometimes times out for larger inputs; that’s OK.

Think about how LinSearchIntSet and BinSearchIntSet differ in execution time when the set becomes large. Then write a text file with the name complexity.txt containing the following:

  • The time complexity that you expect the three implementations to have (you do not need to justify why).

  • An explanation of how the empirical results you obtained agree with the complexity you wrote down.

Submission

Labs will be submitted through Fire. If you have not used Fire before, you might want to read the separate page on how to use the Fire system. If you have permission to work alone, you will still need to create a group in Fire, but that group will only contain yourself.

You should send in the following files:

  • LinSearchIntSet.java

  • BinSearchIntSet.java

  • BinSearchGenSet.java

  • complexity.txt

In order to pass the lab, your implementations must pass the test program, as described above. That is, the test program must run to completion and say that no errors were found. Your code should also be of a reasonable quality: if we can’t understand it, we can’t pass it!