Common coverage criteria in white-box testing

White-box testing is a common approach to software testing. Testers have access to the program's code and design test cases from information such as its control flow and data flow. The actual results of running those test cases are then used to judge whether the program behaves as expected, whether it contains bugs, and how good the test cases themselves are.

Code coverage is commonly used to measure whether a set of test cases exercises the program under test sufficiently. Generally speaking, the more code the test cases execute and the more events they trigger, the more thorough the testing is. If no obvious bugs appear under thorough testing, we can have more confidence in the quality of the program and in its stability (robustness) when it receives real user input.
A program's code usually contains special structures such as loops, conditions, and function calls. Different coverage criteria are obtained by defining different rules for how these structures must be covered. Common ones include:

Statement coverage
Branch coverage (decision coverage)
Condition coverage
Path coverage

Take the following program as an example:

Statement Coverage

Statement coverage measures coverage purely by the number of lines of code executed. The denominator is the total number of lines in the source file of the program under test, and the numerator is the number of lines executed during the run. The example program has 10 lines in total, and the first and tenth lines are always executed. With the input a = 2, b = 3, c = 2, the second, third, sixth, and seventh lines are also executed, so the statement coverage is 6/10 = 60%. Because the fourth and eighth lines are mutually exclusive, no single set of inputs can reach 100% statement coverage. Two sets of inputs can be constructed, a = 2, b = 3, c = 4 and a = 2, b = 3, c = 1, which together achieve 100% statement coverage.
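
The ratio described above can be written as a small helper. This is only a minimal sketch, not a real coverage tool: the name `statement_coverage` is hypothetical, and the executed line numbers are copied from the worked example in the text rather than measured.

```python
def statement_coverage(executed_lines, total_lines):
    """Return statement coverage as a fraction in [0, 1].

    executed_lines: iterable of 1-based line numbers hit by at least one run
    total_lines:    number of lines in the program under test
    """
    if total_lines <= 0:
        raise ValueError("total_lines must be positive")
    # Count each line once and ignore numbers outside the file.
    hit = {n for n in executed_lines if 1 <= n <= total_lines}
    return len(hit) / total_lines


# Lines reported in the text for the input a = 2, b = 3, c = 2.
run_1 = {1, 2, 3, 6, 7, 10}
coverage = statement_coverage(run_1, total_lines=10)
print(f"{coverage:.0%}")  # 60%
```

When several inputs are run, their executed line sets would be merged (set union) before the ratio is computed, since a line counts as covered if any run reaches it.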

Branch Coverage (Decision Coverage)

Branch coverage, also known as decision coverage, requires that each branch of statements with conditional decisions be covered. If the logic of the code is represented as a flowchart, branch coverage can be intuitively understood as covering all the lines and arrows in the graph:

If the continuously executed code is merged into one node, the graph can be simplified as follows:

Although in principle every line and arrow must be covered, in practice only the branches created by conditional statements and loop statements are counted. For example, for the conditional statement on the second line, the branches 2->3 and 2->6 need to be covered; for the conditional statement on the fourth line, the branches 4->5 and 4->6 need to be covered.
The example program has 4 conditional statements in total, so each run yields up to 4 truth values. For the input a = 2, b = 3, c = 2, the conditional statements evaluate to T, F, T, F, corresponding to the branches 2->3, 3->6, 6->7, 7->10. In principle, one more set of inputs producing F, T, F, T would then be enough to reach 100% branch coverage. In the example program, however, the second condition is evaluated only when the first is true, so no input can make the first condition false and the second true. The following 3 sets of inputs achieve 100% branch coverage:

a = 2, b = 3, c = 2 => T, F, T, F
a = 10, b = 3, c = 4 => T, T, F, _
a = 1, b = 1, c = 1 => F, _, T, T

Each condition takes both T and F, satisfying the requirements of branch coverage.
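
The check that every decision takes both T and F can be sketched in a few lines. The function name `branch_coverage` and the tuple encoding are assumptions made for illustration; the truth values are the ones listed above, with `None` standing for "_" (decision not reached).

```python
def branch_coverage(runs):
    """Return the fraction of (decision, outcome) pairs that were exercised.

    runs: list of tuples, one per test input; each entry is True, False,
          or None when the decision was never reached ("_" in the text).
    """
    n_decisions = len(runs[0])
    seen = set()
    for outcomes in runs:
        for index, outcome in enumerate(outcomes):
            if outcome is not None:
                seen.add((index, outcome))
    # Every decision contributes two branches: one for True, one for False.
    return len(seen) / (2 * n_decisions)


T, F, _ = True, False, None
runs = [
    (T, F, T, F),  # a = 2,  b = 3, c = 2
    (T, T, F, _),  # a = 10, b = 3, c = 4
    (F, _, T, T),  # a = 1,  b = 1, c = 1
]
print(branch_coverage(runs[:1]))  # 0.5
print(branch_coverage(runs))      # 1.0
```

The first input alone covers half of the eight branches; adding the other two fills in the missing outcomes.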

Condition Coverage

A decision is often composed of several conditions (the smallest expressions that produce a Boolean value) combined with Boolean operators such as AND, OR, and NOT. To measure changes in control flow at a finer granularity, condition coverage replaces the decision used in branch coverage with the value of each individual condition. In the example program, for the second line of code, we no longer ask whether the whole decision a > 1 and b > 2 is true or false, but whether a > 1 is true or false and whether b > 2 is true or false. Condition coverage only requires each condition to take both T and F. For example, if one input produces T, T and another produces F, F, this decision reaches 100% condition coverage. In the 3 sets of inputs given in branch coverage:

a = 2, b = 3, c = 2 => T, F, T, F
a = 10, b = 3, c = 4 => T, T, F, _
a = 1, b = 1, c = 1 => F, _, T, T

The case of a < 1 is missing, so the third set of inputs can be replaced with a = 0, b = 1, c = 1.
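
Because condition coverage looks at each atomic condition separately, it does not by itself guarantee branch coverage. The sketch below illustrates this for the decision on the second line, a > 1 and b > 2. The helper names and the two (a, b) pairs are chosen purely for illustration and are not taken from the original example.

```python
def condition_values(inputs):
    """Collect the truth values each atomic condition takes over the inputs."""
    seen = {"a > 1": set(), "b > 2": set(), "decision": set()}
    for a, b in inputs:
        seen["a > 1"].add(a > 1)
        seen["b > 2"].add(b > 2)
        seen["decision"].add(a > 1 and b > 2)
    return seen


def fully_covered(values):
    """True when both outcomes, True and False, were observed."""
    return values == {True, False}


# (a, b) pairs chosen for illustration; c does not appear in this decision.
pairs = [(2, 1), (1, 3)]
seen = condition_values(pairs)
print(fully_covered(seen["a > 1"]))     # True
print(fully_covered(seen["b > 2"]))     # True
print(fully_covered(seen["decision"]))  # False
```

Both conditions take T and F, yet the decision as a whole is never true, so the 2->3 branch would stay uncovered with these two inputs.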

Path Coverage

Path coverage is the finest-grained criterion. It treats every sequence of statements from the entry point (usually the main function) to an exit point (exit, panic, or a return statement in the main function) as a path; two sequences that differ in even one statement count as different paths. The paths that exist in the example program are (hopefully I haven't missed any):

1,2->3,4->5->6->7,8->9->10
1,2->3,4->5->6->7,8->10
1,2->3,4->5->6->10
1,2->3,4->6->7,8->9->10
1,2->3,4->6->7,8->10
1,2->3,4->6->10
1,2->6->7,8->9->10
1,2->6->7,8->10
1,2->6->10
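
The list above can be re-derived mechanically. In this sketch the edge table is read off the listed paths (node names such as "1,2" are the merged nodes used in the list), and `enumerate_paths` is a hypothetical helper that does a depth-first walk; it assumes the graph has no loops.

```python
# Edges of the simplified control-flow graph, read off the paths listed above.
# Node names follow the text: "1,2" is the merged node for lines 1 and 2.
EDGES = {
    "1,2": ["3,4", "6"],
    "3,4": ["5", "6"],
    "5": ["6"],
    "6": ["7,8", "10"],
    "7,8": ["9", "10"],
    "9": ["10"],
    "10": [],
}


def enumerate_paths(graph, node, exit_node):
    """Yield every path from node to exit_node in an acyclic graph."""
    if node == exit_node:
        yield [node]
        return
    for successor in graph[node]:
        for rest in enumerate_paths(graph, successor, exit_node):
            yield [node] + rest


paths = list(enumerate_paths(EDGES, "1,2", "10"))
for path in paths:
    print("->".join(path))
print(len(paths))  # 9
```

The walk finds nine paths, the same number as in the list: three ways to get from node 1,2 to node 6, times three ways to get from node 6 to node 10.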

Note that the list above contains every path obtained by analyzing the program code. In practice, some of these paths are infeasible. For example, 3,4->5 and 7,8->9 cannot appear in the same path. Therefore, for the example program, 100% path coverage cannot be achieved.
Path coverage is the most demanding of these coverage criteria. Once path coverage is achieved, all the other criteria are satisfied as well (apart from cases that are infeasible in practice). Real programs, however, contain not just a large number of conditional statements but also loops, and combinations of conditions and loops. The number of paths grows exponentially with the number of conditional and loop statements, which makes high path coverage extremely difficult to achieve. Computing the total number of paths in a given program is almost impossible, so path coverage is usually reported as a count of the distinct paths exercised rather than as a ratio. Inputs that trigger more distinct paths are considered higher-quality test cases.
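
To get a feel for the growth, consider the simplest possible case: n two-way decisions placed one after another, with no nesting and no loops. This is an assumption made for illustration, not a model of the example program, and `count_paths` is a hypothetical helper. Each path then corresponds to one combination of T/F outcomes.

```python
from itertools import product


def count_paths(n_decisions):
    """Count paths through n sequential, independent two-way decisions."""
    # Each path corresponds to one tuple of True/False outcomes.
    return sum(1 for _ in product((True, False), repeat=n_decisions))


for n in (1, 2, 10, 20):
    print(n, count_paths(n))
```

Twenty such decisions already give 2^20 = 1,048,576 paths, and a loop whose iteration count depends on the input adds a separate path for every possible count.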