## Introduction

In this lab, you’ll create a timing test for the `SLList`

and `AList`

implementations of the `List61B`

interface. You’ll also create a randomized comparison test for a new interface called a `Lab5FloorSet`

.

The ideas in this lab will be very important for when we get to project 2a and project 2b in mid October. These two projects will be similar to proj1a, except that we will not be providing an autograder! That is, you’ll be responsible for verifying the correctness and speed of your code.

## Timing Tests for List61B

For project 1A, you were given a suite of extensive autograder tests that validated your code’s accuracy and speed. However, in the real world, you’ll be responsible for ensuring the correctness and efficiency of your code. While we’ve seen some correctness testing (in the testing lecture, lab, and in the optional gold points for project 1), we have not yet discussed timing tests.

### Timing the construction of an AList with a bad resize strategy

As discussed in lecture, a multiplicative resizing strategy will result in fast add operations / good performance, whereas an additive resizing strategy will result in slow add operations / bad performance.

For this lab, we’ve provided the `AList`

class created in lecture with the bad resizing strategy below:

```
public void addLast(Item x) {
if (size == items.length) {
resize(size + 1);
}
items[size] = x;
size = size + 1;
}
```

Your goal for this part of the lab is to write code that tabulates the amount of time needed to create a `AList`

of various sizes using the `addLast`

method above. The output of this timing test will look something like this:

```
Timing table for addLast
N time (s) # ops microsec/op
------------------------------------------------------------
1000 0.00 1000 0.20
2000 0.01 2000 0.20
4000 0.01 4000 1.20
8000 0.04 8000 4.30
16000 0.10 16000 10.00
32000 0.50 32000 49.70
64000 1.15 64000 114.80
128000 3.74 128000 374.30
```

The first column `N`

gives the size of the data structure (how many elements it contains). The second column `time (s)`

gives the time required to complete all operations. The third column `# ops`

gives the number of calls to `addLast`

made during the timing experiment. And finally the fourth column `microsec/op`

gives the number of microseconds it took on average to complete each call to `addLast`

. Note that for this experiment, `N`

and `# ops`

is redundant, since the result of making 128,000 calls to `addLast`

will result in an `N`

of 128,000.

The important thing to notice here is that `addLast`

is not “constant time”. That is, the time to takes each `addLast`

call to complete varies significantly with the size of the list: 374.30 microseconds when the list is long, and only 0.20 microseconds when the list is short. This is essentially how our autograder tests worked for your `LinkedListDeque`

and `ArrayDeque`

classes, i.e. we made sure that the time was constant for operations that should have been constant.

You might notice that the time per `addLast`

operation is the same for N = 1000 and N = 2000. This is common for timing tests. For small inputs, results are unreliable for two reasons: The variance in runtime is high (due to issues like caching, process switching, branch prediction, etc. which you’ll learn about if you take 61C), and the accuracy of our timer (milliseconds) is insufficient to resolve the difference between N = 1000 and N = 2000. This can even lead to the timing of N = 1000 being greater than that of N = 2000.

Now that you understand the table above, add a function `public void timeAListConstruction`

to the class `TimeAList`

that generates the table above for an `AList`

. Note: If your computer is a little slow, you might want to stop at 64,000 instead of 128,000. Make sure to add a function call to `timeAListConstruction`

to the `main`

method of `TimeAList`

class.

For your convenience, we’ve provided a method called `printTimingTable(List<Integer> Ns, List<Double> times, List<Integer> opCounts)`

that will print the table above, where `Ns`

is the first column, `times`

is the second column, and `opCounts`

is the third column. The fourth column (`microsec/op`

) is automatically computed for you. Your times should be in seconds. You should use the `Stopwatch`

class. See `stopwatchDemo`

for an example.

### Timing the construction of an AList with a good resize strategy

Now modify the `AList`

class so that the resize strategy is multiplicative instead of additive and rerun `timeAListConstruction`

. Your `AList`

objects should now be constructed nearly instantly, even for N = 128,000, and each add operation should only take a fraction of a microsecond.

Optional: Try increasing the maximum N to larger values, e.g. 10 million. You should see that the time per add operation remains constant.

Optional: Try experimenting with different resizing factors and see how the runtimes change. For example, if you resize by a factor of 1.01, you should still get constant time addLast operations!

```
public void addLast(Item x) {
if (size == items.length) {
resize((int) (size * 1.01));
}
items[size] = x;
size = size + 1;
}
```

### Timing the getLast method of SLList

In your `LinkedListDeque`

, you were supposed to have `addLast`

operations that were fast, or as the spec put it: “`add`

and `remove`

operations must not involve any looping or recursion. A single such operation must take “constant time”, i.e. execution time should not depend on the size of the deque. This means that you cannot use loops that go over all/most elements of the deque.”

Above, we showed how we can time the construction of a data structure. It is also common to compute the time per operation on a data structure that is pre-built before the test begins. That is, suppose we want to compute the time per operation for `getLast`

for an `SLList`

and want to know how this runtime depends on N. To do this, we need to follow the procedure below:

- Create an
`SLList`

. - Add N items to the
`SLList`

. - Start the timer.
- Perform M getLast operations on the
`SLList`

. - Check the timer. This gives the total time to complete all M operations.

It’s important that we do not start the timer until after step 2 has been completed. Otherwise the timing test is including something other than the `getLast`

operations.

In the `TimeSLList`

class, edit the function `timeGetLast`

to perform the procedure above, and generate a table similar to the one shown below:

```
Timing table for getLast
N time (s) # ops microsec/op
------------------------------------------------------------
1000 0.02 10000 1.70
2000 0.03 10000 3.10
4000 0.06 10000 6.20
8000 0.13 10000 12.50
16000 0.25 10000 25.00
32000 0.53 10000 52.80
64000 1.35 10000 135.30
128000 2.57 10000 257.30
```

Note that the `N`

and `# ops`

columns are no longer the same. This is because we are always calling `getLast`

the same number of times regardless of the size of the list, i.e. `M = 10000`

for step 4 of the procedure described above.

Note that the operations are again not constant time! This means that as the list gets bigger, the `getLast`

operation becomes slower. This would be a serious problem in a real world application. For example, suppose the list is of ATM transactions, and the `getLast`

operation was being called in order to get the most recent transaction to print a receipt. Every time the ATM is used, the next receipt would take a little bit longer to print. Eventually over many months or years, the list would become so large that the `getLast`

operation would be unusably slow. While this is a contrived example, similar problems have plagued real world systems!

Optional question to ponder: Why is `getLast`

so slow? Was your `LinkedListDeque`

`getLast`

function also slow?

## Randomized Comparison Tests

For many interfaces, we’ll often find that there are two extremes:

- A simple but inefficient implementation.
- A complex but efficient implementation.

For example, building an `ArrayDeque`

where the list is always stored as an array of length corresponding to the items in the list would be easy, but very slow, whereas what you did in project 1a was much more complex, but also very fast.

For example, the slow but simple way would store `[9, 15, 31, 35]`

as a length 4 array containing only `[9, 15, 31, 35]`

. Your approach project 1A might have stored this list as `[0, 0, 9, 15, 31, 35, 0, 0]`

, `nextFirst = 1`

, `nextLast = 6`

.

One way you might have tested your code if you were doing this in a real world setting would be to first implement a `SlowArrayDeque`

and then `ArrayDeque`

, and then compare the output of the two after applying random operations to make sure your `ArrayDeque`

was correct.

For this part of the lab, we’ll try to validate an implementation of a new interface called `Lab5FloorSet`

class, which has two methods:

`add(double x)`

adds x to the set. If x is already present, it has no effect.`floor(double x)`

gives the largest value in the set that is less than or equal to x. If no values are smaller than x, it should return negative infinity (`Double.NEGATIVE_INFINITY`

).

For example, if we call `add(2.5)`

, `add(10.0)`

, and `add(11.2)`

, then `floor(9)`

would return 2.5, since 2.5 is the largest value in the list that is smaller than 9. `floor(0)`

would return negative infinity.

We’ve provided an implementation `RedBlackFloorSet`

. Your goal in this part of this lab will be to first write a correct but inefficient solution, then to use a randomized test to determine whether or not `RedBlackFloorSet`

is correct. You can use the `StdRandom`

class in order to get random numbers.

#### Implementing AListFloorSet (Optional)

Create a simple but correct implementation of `Lab5FloorSet`

called `AListFloorSet`

. Your `AListFloorSet`

should have exactly one instance variable: an `AList`

. Do not modify the `AList`

class.

Or if you’d rather skip this exercise, you can find the solution here.

#### Using AListFloorSet to verify RedBlackFloorSet correctness

There is a bug somewhere in `RedBlackFloorSet`

. **You don’t need to understand this implementation of RedBlackFloorSet**. To show that this bug exists, fill in the JUnit test in the `TestRedBlackFloorSet`

file so that it follows the following procedure:

- Generate 1,000,000 random doubles between -5000 and 5000, and add them to both an
`AListFloorSet`

and a`RedBlackFloorSet`

. - Repeats the same random call 100,000 times to the
`floor`

method of the`AListFloorSet`

and`RedBlackFloorSet`

. Use`assertEquals`

to compare the results. Note that since we’re using doubles, you’ll need to specify a tolerance, e.g. if you pick`0.000001`

, then two doubles will be considered equal so long as they are within`0.000001`

of each other.

To generate a random number between a and b, use `StdRandom.uniform(a, b)`

.

Note: Since your `AListFloorSet`

provides the expected output, make sure to use this as your left argument to `assertEquals`

.

## Conclusion

Now that you’ve run timing tests and randomized tests, you are ready to use these tools when you do project 2a and 2b. In the real world, both of these methods are used very often since there is no autograder usually available with the ground truth solution. Time tests are important on larger scale projects since most real world software is expected to run on a large scale, making up to billions of calls at a time. Randomized testing is very useful in finding out whether an unlikely series of calls breaks your program. When you write your randomizwed tests for a larger project, its a good idea to save a log of the calls made somewhere so that when the program makes, you are able to trace it back (otherwise you won’t know what errored).

## Submission

To summarize, here’s what you need to submit to the autograder:

`TimeAList.java`

- With a`timeAListConstruction()`

function that should generate a time table for`addLast`

operations on`AList`

.`AList.java`

- With a better resize strategy than what is given.`TimeSLList.java`

- With a`timeGetLast()`

function that should generate a time table for`getLast`

operations on`SLList`

.`TestRedBlackFloorSet.java`

- With a`randomizedTest()`

function that executes randomized operations to check that`RedBlackFloorSet`

matches`AListFloorSet`

. This lab is very dependent on random numbers (your computer’s load, the random numbers you generate), so if you fail a test with a number that is close to the bound try resubmitting once before looking at your code.