Second edition of the CT Competition

This page reports the information on, and the results obtained by, each participant in the second edition of the CT competition.

Participants

ACTS: a Java implementation of IPO, one of the most widely used combinatorial test generation tools. Executed with default settings (IPOG with MFT) - tool description;
CAgen: a multithreaded FIPOG implementation written in Rust - tool description;
CAopt: a tool combining a sampling phase and an optimization phase, based on a SAT solver - tool description;
KALI: a multithreaded Java tool exploiting SMT solvers - tool description;
MEDICI: a C++ tool for combinatorial test generation based on Multi-valued Decision Diagrams - tool description;
pMEDICI: a multithreaded Java implementation of the MEDICI tool, based on Multi-valued Decision Diagrams - (tool description out soon with the IWCT proceedings).

Benchmarks and execution rules

For this edition of the CT competition:

240 benchmark models have been generated
Strengths from 2 to 6 have been used
Each benchmark has been executed 3 times for each tool and strength (the list of all the execution results is available here), and the best run has been selected for scoring (see here for the list of best executions)

Results

All the results reported in this section derive from the list of best executions.

The file containing all the covering arrays (CAs) analyzed to score each tool can be found here.

Finally, the slides used during IWCT 2023 for presenting the results of the second edition of the CT competition are available here.

Validity and timeouts

Every tool reported at least one timeout or produced at least one invalid test suite (though not in every category or at every strength). The list of timed-out instances for each category and strength can be found here, and the list of invalid instances can be found here.

Score - Time and Size

The competition score weights generation time and test suite size equally, as described below. For each benchmark, with n competing tools, the score has been assigned as follows:

Tools not completing the benchmark received 0 points
Tools producing an invalid test suite received 0 points
The other k tools have been sorted in ascending order of the considered metric (generation time or test suite size). The first received k points, the second k-1, and so on. In this way, producing a valid test suite always earned at least 1 point for the considered benchmark.

The described process has been repeated for each benchmark, first considering the generation time and then the test suite size. The final score for each tool (and, where applicable, each category) is the mean of the score based solely on the generation time and the score based solely on the test suite size.
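The scoring rules above can be sketched in a few lines of Python. This is only an illustrative sketch, not the organizers' scoring script: the function names and the input layout are hypothetical, and because the page does not say how ties are handled, tied tools are simply ranked in input order here. Averaging the two rankings per benchmark and then summing gives the same total as averaging the summed time and size scores.

```python
from typing import Dict, Optional, Tuple

# Hypothetical input layout: (generation_time, suite_size) for a valid run,
# or None if the tool timed out or produced an invalid test suite.
Result = Optional[Tuple[float, int]]


def rank_points(values: Dict[str, float]) -> Dict[str, int]:
    """Give k points to the smallest value, k-1 to the next, and so on."""
    # Ascending order; ties keep input order (an assumption, not from the source).
    ordered = sorted(values, key=values.get)
    k = len(ordered)
    return {tool: k - i for i, tool in enumerate(ordered)}


def benchmark_score(results: Dict[str, Result]) -> Dict[str, float]:
    """Score one benchmark: mean of the time-based and size-based points."""
    valid = {t: r for t, r in results.items() if r is not None}
    time_pts = rank_points({t: r[0] for t, r in valid.items()})
    size_pts = rank_points({t: r[1] for t, r in valid.items()})
    # Tools with no valid result get 0 points for both metrics.
    return {t: (time_pts.get(t, 0) + size_pts.get(t, 0)) / 2 for t in results}


example = {"A": (1.0, 10), "B": (2.0, 12), "C": (3.0, 8), "D": None}
print(benchmark_score(example))
# {'A': 2.5, 'B': 1.5, 'C': 2.0, 'D': 0.0}
```

In this made-up example three tools complete the benchmark, so k = 3: tool A is fastest (3 points for time) and second smallest (2 points for size), giving a mean of 2.5, while tool D, which did not produce a valid result, receives 0.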

For completeness, the scores based only on the generation time and only on the test suite size are reported separately below, broken down by strength.

Time

Time score

Size

Size score

OVERALL ranking

This section reports the overall ranking, considering the aggregated score of each category. The detailed data can be found here for all the strengths (or, if interested in a specific strength, you can look at the specific file - 2, 3, 4, 5, 6).

CAgen (3666.0 pts)
ACTS (3056.5 pts)
CAopt (2104.0 pts)
MEDICI (1997.0 pts)
pMEDICI (1560.5 pts)
KALI (537.0 pts)

For completeness, we report here the graph with the score given for each strength.

Overall score

UNIFORM_BOOLEAN ranking

This section reports the ranking for the UNIFORM_BOOLEAN category. The detailed data can be found here for all the strengths (or, if interested in a specific strength, you can look at the specific file - 2, 3, 4, 5, 6).

CAgen (410.5 pts)
ACTS (363.0 pts)
CAopt (324.5 pts)
MEDICI (280.0 pts)
KALI (223.5 pts)
pMEDICI (222.0 pts)

For completeness, we report here the graph showing the score given for each strength for this category.

Overall score

UNIFORM_ALL ranking

This section reports the ranking for the UNIFORM_ALL category. The detailed data can be found here for all the strengths (or, if interested in a specific strength, you can look at the specific file - 2, 3, 4, 5, 6).

CAgen (289.5 pts)
ACTS (235.5 pts)
CAopt (141.5 pts)
pMEDICI (131.0 pts)
MEDICI (126.0 pts)
KALI (104.5 pts)

For completeness, we report here the graph showing the score given for each strength for this category.

Overall score

MCA ranking

This section reports the ranking for the MCA category. The detailed data can be found here for all the strengths (or, if interested in a specific strength, you can look at the specific file - 2, 3, 4, 5, 6).

CAgen (591.5 pts)
ACTS (468.0 pts)
pMEDICI (232.0 pts)
CAopt (211.0 pts)
MEDICI (172.0 pts)
KALI (157.5 pts)

For completeness, we report here the plot showing the score given for each strength for this category.

Overall score

BOOLC ranking

This section reports the ranking for the BOOLC category. The detailed data can be found here for all the strengths (or, if interested in a specific strength, you can look at the specific file - 2, 3, 4, 5, 6).

CAgen (741.5 pts)
ACTS (627.5 pts)
MEDICI (600.5 pts)
pMEDICI (431.5 pts)
CAopt (380.0 pts)
KALI (61.5 pts)

For completeness, we report here the plot showing the score given for each strength for this category.

Overall score

MCAC ranking

This section reports the ranking for the MCAC category. The detailed data can be found here for all the strengths (or, if interested in a specific strength, you can look at the specific file - 2, 3, 4, 5, 6).

CAgen (286.5 pts)
ACTS (263.0 pts)
MEDICI (178.5 pts)
CAopt (172.0 pts)
pMEDICI (100.0 pts)
KALI (9.5 pts)

For completeness, we report here the plot showing the score given for each strength for this category.

Overall score

NUMC ranking

This section reports the ranking for the NUMC category. The detailed data can be found here for all the strengths (or, if interested in a specific strength, you can look at the specific file - 2, 3, 4, 5, 6).

For this category, only CAgen, ACTS, CAopt and KALI have been considered, since the other tools were declared unable to support the constraints available in the NUMC category.

CAgen (272.0 pts)
ACTS (216.0 pts)
CAopt (131.0 pts)
KALI (0.0 pts)

For completeness, we report here the plot showing the score given for each strength for this category.

Overall score

INDUSTRIAL ranking

This section reports the ranking for the INDUSTRIAL category. The detailed data can be found here for all the strengths (or, if interested in a specific strength, you can look at the specific file - 2, 3, 4, 5, 6).

CAgen (262.0 pts)
ACTS (217.5 pts)
CAopt (148.5 pts)
MEDICI (147.5 pts)
pMEDICI (140.5 pts)
KALI (8.5 pts)

For completeness, we report here the plot showing the score given for each strength for this category.

Overall score

FM ranking

This section reports the ranking for the FM category. The detailed data can be found here for all the strengths (or, if interested in a specific strength, you can look at the specific file - 2, 3, 4, 5, 6).

CAgen (333.5 pts)
ACTS (271.0 pts)
CAopt (264.0 pts)
MEDICI (220.5 pts)
pMEDICI (169.5 pts)
KALI (11.5 pts)

For completeness, we report here the plot showing the score given for each strength for this category.

Overall score

CNF ranking

This section reports the ranking for the CNF category. The detailed data can be found here for all the strengths (or, if interested in a specific strength, you can look at the specific file - 2, 3, 4, 5, 6).

CAgen (473.5 pts)
ACTS (364.0 pts)
CAopt (210.0 pts)
MEDICI (166.5 pts)
pMEDICI (133.0 pts)
KALI (0.0 pts)

For completeness, we report here the plot showing the score given for each strength for this category.

Overall score

HIGHLY_CONSTRAINED ranking

This section reports the ranking for the HIGHLY_CONSTRAINED category. The detailed data can be found here for all the strengths (or, if interested in a specific strength, you can look at the specific file - 2, 3, 4, 5, 6).

CAgen (389.0 pts)
ACTS (308.5 pts)
CAopt (212.5 pts)
MEDICI (209.5 pts)
pMEDICI (135.0 pts)
KALI (5.5 pts)

For completeness, we report here the plot showing the score given for each strength for this category.

Overall score
