Building better programs (Part 3)

Black box tests

Instead of relying solely on error messages, we can get information about the correctness of our code by writing tests.

Definition

Black box testing checks the correspondence between possible inputs and correct outputs.

A black box test treats the code as a “black box”. Namely, we focus on what goes in and what comes out of the box, not the inner workings of the box. The goal of the tests is not to check all possibilities, but to test representatives of all possibilities. To do this, you need to have a test for each possible case.

Make sure there is at least one test for each possible case (such as empty strings).
Test boundary cases.
Each test should have a specific purpose.
Keep tests simple.
Figure out the output by hand.

The possible cases might include empty and nonempty strings, even and odd numbers, and many other possibilities. If there are values right on the boundary between two cases, those are also good values to test. Just as with comments, more is not necessarily better: if you include a test, you should have a reason for doing so. Does it test a case that hasn't already been tested? If not, don't include it. Simple tests are clearer and easier to understand, so try to construct the simplest test that does the job. Most importantly, make sure your tests are really tests by working out the expected output by hand. How can you know whether the computer made the correct calculation if you don't know what the correct calculation is? If your tests are based on the assumption that the program is correct, why bother testing at all?

Examples of black box testing

Example 1

Suppose we wish to test a function that produces the smaller of two integers:

The function smaller produces the smaller of integers x and y.

What are the possible different cases? The output could be the first integer, if the first integer is smaller, or the second integer, if the second integer is smaller. We should also check that the function performs correctly when the two inputs are equal. Here we have three tests for the three possible cases. Each test has a specific purpose and they are all simple.

Test cases: smaller(1, 2), smaller(2, 1), and smaller(1, 1).
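
The three tests can also be written as executable checks. The following is a minimal Python sketch: the body of smaller is an assumed implementation (only its description is given here), and each expected value was worked out by hand.

```python
def smaller(x, y):
    """Produce the smaller of integers x and y (assumed implementation)."""
    if x < y:
        return x
    return y


# One test per case; the expected outputs were determined by hand.
assert smaller(1, 2) == 1  # the first integer is smaller
assert smaller(2, 1) == 1  # the second integer is smaller
assert smaller(1, 1) == 1  # the two inputs are equal
```

If any assertion fails, Python stops with an AssertionError that names the failing line, which points directly at the case that went wrong.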

Example 2

Consider the function mid as defined below.

The function mid produces the floor of half the length of the nonempty string word.

For a function that produces the position of the middle character in a string, we need to check that the floor is computed properly for both even and odd length inputs. So we have two tests for the two cases.

Test cases: mid("at") and mid("cat")

Again, they are simple.
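
As a rough Python sketch of these two tests, assume mid computes the floor of half the length, which is what "the position of the middle character" suggests. The implementation below is an assumption made for illustration only.

```python
def mid(word):
    """Produce the floor of half the length of the nonempty string word."""
    return len(word) // 2  # integer division takes the floor


# Expected values worked out by hand.
assert mid("at") == 1   # even length: 2 / 2 = 1
assert mid("cat") == 1  # odd length: floor(3 / 2) = 1
```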

Example 3

As our third example, we have a Boolean function.

The function discount produces True if age is less than 21 or greater than or equal to 65.

At a minimum, we should have one test for the output True and one for the output False. Since there are two ranges of age for which the output is True, we choose a test for each. That is, 10 represents ages less than 21, 70 represents ages greater than or equal to 65, and 50 represents ages for which the output will be False. It is a good idea to check the boundary cases 21 and 65, just to make sure that the function works properly at the boundaries between ranges.

Test cases: discount(10), discount(70), discount(50), discount(21), discount(65).
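
The five tests might look as follows in Python. The function body is an assumed implementation of the stated description. The expected outputs come from reading that description, not from running the code.

```python
def discount(age):
    """Produce True if age is less than 21 or greater than or equal to 65."""
    return age < 21 or age >= 65


# Representatives of each range.
assert discount(10) is True   # less than 21
assert discount(70) is True   # greater than or equal to 65
assert discount(50) is False  # neither range

# Boundary cases.
assert discount(21) is False  # 21 is not less than 21
assert discount(65) is True   # 65 is greater than or equal to 65
```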

White box tests

In addition to black box testing, we include tests that view the function as a white box, that is, one in which we can see the inner workings.

Definition

White box testing checks the use of all the code, including all branches and all settings of Boolean expressions.

These tests make sure that all of the code has been exercised, including every branch and every way in which a Boolean expression can be true or false. In the following example, we want to test both branches of the if.

define minimum(first, second)
    if first <= second
        smaller ← first
    else
        smaller ← second
    return smaller

We come up with one test for which the condition is true and one for which it is false.

Test for first branch: minimum(1, 2)
Test for second branch: minimum(2, 1)
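
For readers who want to run the two branch tests, here is one possible Python translation of the pseudocode. The comments mark which branch each test is meant to exercise.

```python
def minimum(first, second):
    """Produce the smaller of first and second."""
    if first <= second:
        smaller = first   # first branch
    else:
        smaller = second  # second branch
    return smaller


assert minimum(1, 2) == 1  # condition is true: first branch runs
assert minimum(2, 1) == 1  # condition is false: second branch runs
```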

Example: white box tests

Let's construct white box tests for the example of determining if a time is eligible for offpeak rates. We've omitted the comments to make more space for the tests.

define offpeak(hour)
    return is_integer(hour) and ((hour < 9) or (hour > 17))

To consider all possible cases, we need to consider all the ways of making the expression true or false. All possibilities are covered by the following four cases:

Not an integer: "hour"
Integer less than 9: 4
Integer greater than 17: 20
Integer between 9 and 17: 12

If the input is not an integer, is_integer(hour) will be false and, due to short-circuit evaluation, False will be returned without the comparisons being evaluated. We use a string to test this case. For our other tests, we use integers and focus on the expressions hour < 9 and hour > 17 joined by an or. In this case, it isn't possible for both expressions to be true, but if it were, we would need to test that possibility. So, we use 4 to test the case in which the first expression is true and the second false, 20 to test the case in which the first expression is false and the second true, and 12 to test the case in which both expressions are false.
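
The four white box tests can be sketched in Python as shown below. The pseudocode's is_integer has no direct Python counterpart, so isinstance(hour, int) is used here as an assumed stand-in.

```python
def offpeak(hour):
    """Produce True if hour is an integer eligible for offpeak rates."""
    # isinstance(hour, int) stands in for the pseudocode's is_integer.
    return isinstance(hour, int) and (hour < 9 or hour > 17)


assert offpeak("hour") is False  # not an integer: the comparisons are skipped
assert offpeak(4) is True        # hour < 9 is true, hour > 17 is false
assert offpeak(20) is True       # hour < 9 is false, hour > 17 is true
assert offpeak(12) is False      # both comparisons are false
```

The first test also shows why short-circuit evaluation matters: comparing the string "hour" with 9 would raise a TypeError in Python if that comparison were ever evaluated.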

Debugging techniques

Suppose a test fails. Now what do you do?

Tracing and visualization

Go through the code step by step, drawing pictures of what is in memory.

One strategy is to use tracing and pictures to figure out what is going on in memory at each step of the program. If the program's result matches your pictures, then the computer did what you thought it would, and you can go through the steps by hand to figure out where you made an error.

Check values

Use print statements to take “snapshots” of values of variables to make sure the code matches your pictures.

If instead there is a difference between what the computer did and what you thought it did, you can figure out where the problem is by using print statements to verify whether the values stored match the ones you thought were being stored. Good places to take “snapshots” of values are at conditions, to see if the correct branch was taken, and at function calls, to see if the inputs and output are what you expected.
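
A small Python illustration of taking snapshots, using an assumed implementation of the discount function from the black box examples. The variable names under and over are introduced here only so that each part of the condition can be printed.

```python
def discount(age):
    """Produce True if age is less than 21 or greater than or equal to 65."""
    under = age < 21
    over = age >= 65
    # Temporary snapshot at the condition; remove once the error is found.
    print("discount: age =", age, "under =", under, "over =", over)
    return under or over


# Snapshot at the function call: check both the input and the output.
result = discount(65)
print("discount(65) returned", result)
```

Comparing the printed values with the values in your pictures shows the first step at which the two disagree.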

Narrow focus

Narrow down the area where the error occurred by temporarily making lines into comments.

You don't have to deal with the whole program at once. You can temporarily make lines into comments so that only part of the program is being examined.
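
As a hedged illustration in Python, the sketch below disables two later steps so that the counting loop can be checked on its own. All of the names are hypothetical and are not taken from any program in these notes.

```python
def count_discounts(ages):
    """Count the ages that are less than 21 or at least 65."""
    count = 0
    for age in ages:
        if age < 21 or age >= 65:
            count = count + 1
    # Temporarily made into comments while the loop above is examined.
    # adjust_for_holidays and print_summary are hypothetical later steps.
    # count = adjust_for_holidays(count)
    # print_summary(count)
    return count


# With the later steps disabled, only the loop can affect this result.
assert count_discounts([10, 50, 70]) == 2
```

Once the remaining code behaves as expected, restore the commented lines one at a time to find which one introduces the error.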

Testing recipe

The following is a summary of ways of testing a program.

Order functions by dependency:
You can test your program by first ordering the functions by what depends on what.
Test in order of dependency:
By testing the least dependent function first, you can test each subsequent function knowing that what it relies upon is correct. For each, use both black box and white box testing.
Find sources of errors:
Once you've found that there is an error, use whichever techniques work best to track down the problem: drawing pictures (tracing and visualization) to figure out what the inner workings should be, using print statements to determine what the inner workings actually are, and using comments to narrow down the focus of your investigation.
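
The first two steps of the recipe can be sketched in Python with two functions, one of which depends on the other. Both function bodies are assumptions, and smallest_of_three is a hypothetical function introduced only for this illustration.

```python
def smaller(x, y):
    """Produce the smaller of integers x and y (assumed implementation)."""
    if x < y:
        return x
    return y


def smallest_of_three(x, y, z):
    """Produce the smallest of three integers (hypothetical, uses smaller)."""
    return smaller(smaller(x, y), z)


# Test the least dependent function first.
assert smaller(1, 2) == 1
assert smaller(2, 1) == 1
assert smaller(1, 1) == 1

# Then test the function that relies on it.
assert smallest_of_three(1, 2, 3) == 1  # smallest value is first
assert smallest_of_three(2, 1, 3) == 1  # smallest value is second
assert smallest_of_three(3, 2, 1) == 1  # smallest value is third
```

If a test of smallest_of_three fails after the tests of smaller have passed, the error is more likely to be in smallest_of_three itself.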