Objective
The area of Combinatorial Interaction testing has seen tremendous progress over the last years. Many tools have been developed but a comparison among algorithms and techniques is difficult to carry on. With this competition, we want to motivate implementors to present their work to a broader audience and to compare it with that of others.
Call for Participation — Procedure
The competition compares state-of-the-art tools for generating combinatorial test suites with respect to the generation time and test suite size.
The competition consists of two phases:
- a training phase, in which example benchmarks are given to the tool developers (starting from end-Dec 2022)
- and an evaluation phase, in which all participating CT tools will be executed on benchmark test tasks, and their performances are measured. The competition is performed (some days before the workshop) and presented during the IWCT workshop.
Researchers from both academia and industry are invited to submit their tools. In order to easily include in the competition both open source and commercial tools, participants have to submit only the executable and no submission of the source code is required.
Benchmarks characteristics
Benchmarks used for tool comparison will be randomly generated, both in terms of parameters, domains and constraints, or taken from well known test sets. However, the random generation will be guided by setting the number of variables (included between a lower and an upper limit) and their types, and the number (included between a lower and an upper limit) and characteristics of constraints (like depth of logical operators, type of operators, …).
The code of the benchmark generator is available here, together with benchmarks used during the previous editions of the competition.
Categories/Tracks
Different generators can compete in different categories, and the participants may choose the category in which the tool competes (depending on the capabilities of the tool). We identify the following categories:
- Models with no constraints
- With only boolean parameters (
UNIFORM_BOOLEAN
) - MCA
- Uniform with n > 2 (
UNIFORM_ALL
)
- With only boolean parameters (
- Models containing constraints
- With boolean parameters and logical operators in constraints (
BOOLC
) - With also enumerative parameters (MCA), and logical and equal operators in constraints (
MCAC
) - With also integer parameters, and logical, mathematical, and relational operators in constraints (
NUMC
) - Models deriving from feature models translation (
FM
) - Models deriving from industrial case studies (
INDUSTRIAL
) - Highly constrained models, with validity ratio < 1% (
HIGHLY_CONSTRAINED
) - Models with constraints in CNF (
CNF
)
- With boolean parameters and logical operators in constraints (
During tools evaluation, test models will be distributed as in the following table:
Category Name | Parameters | Constraints | Control variables | Boundaries | # Tests |
---|---|---|---|---|---|
UNIFORM |
Only booleans | NO | k: number of parameters | k: random in the interval [6, 30] | 15 |
UNIFORM |
Uniform | NO | k: number of parameters v: number of elements for each parameter |
k: random in the interval [6, 30] v: random in the interval [2, 15] |
15 |
MCA |
MCA | NO | k: number of parameters v[]: array containing the number of elements for each parameter |
k: random in the interval [6, 30] each element of v[]: random in the interval [2,15] |
30 |
BOOLC |
Only booleans | randomly chosen between AND, OR, <=>, NOT, => | k: number of parameters c: number of constraints d[]: array containing the complexity of each of the c constraints |
k: random in the interval [6, 30] c: random in the interval [1, 100] each element of d[]: random in the interval [1, 20] |
30 |
MCAC |
MCA | randomly chosen between AND, OR, <=>, NOT, =>, = (both x=C and x=y, where x and y are parameters and C a constant of x), != | k: number of parameters v[]: array containing the number of elements for each parameter c: number of constraints d[]: array containing the complexity of each of the c constraints |
k: random in the interval [6, 30] each element of v[]: random in the interval [2,15] c: random in the interval [1, 100] each element of d[]: random in the interval [1, 20] |
30 |
NUMC |
Booleans, Enumeratives and Integer ranges | randomly chosen between AND, OR, <=>, NOT, =>, = (both x=C and x=y, where x and y are parameters and C a constant of x), !=, mathematical and relational operators | k: number of parameters v[]: array containing the number of elements for each parameter c: number of constraints d[]: array containing the complexity of each of the c constraints |
k: random in the interval [6, 30] each element of v[]: random in the interval [2,15] c: random in the interval [1, 100] each element of d[]: random in the interval [1, 20] |
30 |
FM |
MCA | As required in the corresponding feature model from which the benchmark derives | – | – | 15 |
INDUSTRIAL |
As required by the industrial case study from which the benchmark derives | As required by the industrial case study from which the benchmark derives | – | – | 15 |
HIGHLY_CONSTRAINED |
MCA | randomly chosen between AND, OR, <=>, NOT, =>, = (both x=C and x=y, where x and y are parameters and C a constant of x), != | k: number of parameters v[]: array containing the number of elements for each parameter c: number of constraints d[]: array containing the complexity of each of the c constraints |
k: random in the interval [6, 30] each element of v[]: random in the interval [2,15] c: random in the interval [1, 100] each element of d[]: random in the interval [1, 20] |
30 |
CNF |
MCA | in CNF randomly chosen between AND, OR, NOT, = (both x=C and x=y, where x and y are parameters and C a constant of x), != | k: number of parameters v[]: array containing the number of elements for each parameter c: number of constraints d[]: array containing the complexity of each of the c constraints |
k: random in the interval [6, 30] each element of v[]: random in the interval [2,15] c: random in the interval [1, 100] each element of d[]: random in the interval [1, 20] |
30 |
Input and output formats
The benchmark models will be distributed in the CTWedge, ACTS and PICT formats. The tools must be able to process models in one of these formats (you must specify whether your submission supports CTWedge, ACTS, or PICT) and produce its output in CSV on the standard output file descriptor (stdout). Examples of the inputs and outputs, together with the full grammar of CTWedge, can be found here.
Tool execution
Generators will be executed inside a Docker container provided by the competition organizers, on the same Linux machine with the following specs:
- 2 CPUs Intel(R) Xeon(R) E5-2620 v4 @ 2,10 GHz
- RAM DDR4, 2400 MHz, 4x32 Gb
- 2xSSD Samsung 850 (256GB each) in RAID1
- OS: Ubuntu 18.04.6 LTS
The results (size, generation time, completeness, and validity) will be gathered through the generation of test suites from 50 test models for each category, randomly generated.
Your submission will be invoked as follows:
toolExecutable strength modelFileName
For example:
hyperspeed_ca_generator 4 input.ctwedge
The test model will be processed with a maximum execution time of 300 seconds each. Note that multiple executions for the same test model (details about the number of executions will follow) will be done, and the considered result will be the one of the fastest execution.
Submission format and dependencies
Your submission can be in one of two formats:
- A Linux (ELF) executable
- A docker-compose file that points towards a Docker container provided by you, plus the path to your executable in this container
If you are facing issues with providing any of these formats, please contact the organizers. We are unable to set up custom environments.
If you submit an executable, you must not make any assumptions about libraries or versions thereof available in the Docker container. This means that if your submission requires any shared libraries besides libc (e.g. boost or GMP), you should submit a statically linked version (details on this process depend on your compiler and build system).
To work around issues with more complex dependencies (such as Java), you can instead prepare a Docker container, upload it to a repository and submit a docker-compose file along with the path to the executable in the Docker container (if your docker-compose file includes multiple images, please also specify which container holds the executable). We will manually audit submitted docker-compose files.
All invocations of your executable will follow the format described above. This means that if you require additional command line flags (e.g. java -Xmx 65500 -jar my_ca_generator.jar -Ddoi=5
), you must write a wrapper that takes the arguments described above (strength and input file) and invokes your actual executable.
Regardless of the submission format, your submission must not make network connections or execute malicious code. Failure to adhere to these conditions will result in immediate disqualification and possibly legal action.
Tools evaluation
Each tool will be evaluated by considering:
- Test suite size (50% of the final score)
- Test suite generation time (50% of the final score)
- Test suite completeness and validity (required for all the test suites)
Note that the test suite validity and completeness will be mandatory for the evaluation of how the tool performs over a benchmark model: an invalid or incomplete test suite produced for a model will be marked as not correct and its score will be considered like the tool has been unable to complete the generation of the test suite for that model.
The tools will be ranked
- For the total size of the test suites, in a decreasing order
- For the total time, in an increasing order Supposing that there will be n tools competing, the first tool in the rank will receive n points, the second n-1, and so on.
NEW: we plan to evaluate addional measures (such as the t+1 coverage), but these will not contribute in the actual tool ranking.
Having fixed the timeout (300 seconds), some tools may not complete the computation of the test suite for certain models. In this case the size and the time (for the ranking) will be considered as follows: if a tool X does not complete the benchmark Y, the greatest time required by the other tools for Y (+1) and the greatest size for Y (+1) will be assigned to X.
For the strength, we will execute the tool with strength t starting from 2 to maximum 6.
If no tool is able to generate a full covering array for a given strength and model, that strength will be skipped in the evaluation for this model.
Publication and Presentation of the Competition Candidates
Participants may submit a tool for the 2nd edition of the CT-Competition in two different (mutually exclusive) ways:
- By presenting their tool at IWCT2023 with a full or short paper, containing a description of the tool and the performance obtained with the models given as examples by the competition organizers.
- By sending to one of the competition organizers the tool and a document consisting of one or two pages describing the tool and the performance obtained with the models given as examples by the competition organizers.
In the former case, the paper describing the tool will be peer reviewed and, if accepted, will be a part of the Workshop proceedings. In the latter, the document will not be peer reviewed and will not be part of the regular proceedings. Moreover, in this second scenario, no novel approaches are required to participate.
We emphasize that tools presented in papers that will not be accepted as a part of workshop proceedings can be submitted by following the second path.
- The results of the 2nd edition of the CT Competition are published here.
Important Dates
- End-Dicember 2022, the release of the benchmarks for training
- January 20th 2023, submission of paper (short/full) for the IWCT workshop
- February 24th 2023, submission of the tools and, if the paper has not been previously submitted and accepted, documents describing the tools with the results over the benchmarks
- April 2023, competition with new benchmarks and comparison among all the competing tools
Organization
If you want to know more, or need clarification, do not hesitate to contact us:
- For the University of Bergamo, Andrea Bombarda andrea.bombarda@unibg.it
- for SBA Research, Michael Wagner MWagner@sba-research.org
Sponsors/prize
If you are interested to support the competition, please contact us.