ct-competition

Welcome to the combinatorial interaction testing competition

View project on GitHub

Objective

The area of Combinatorial Interaction testing has seen tremendous progress over the last years. Many tools have been developed but a comparison among algorithms and techniques is difficult to carry on. With this competition, we want to motivate implementors to present their work to a broader audience and to compare it with that of others.

Call for Participation — Procedure

The competition compares state-of-the-art tools for generating combinatorial test suites with respect to the generation time and test suite size.
The competition consists of two phases:

  • a training phase, in which example benchmarks are given to the tool developers (starting from end-Dec 2022)
  • and an evaluation phase, in which all participating CT tools will be executed on benchmark test tasks, and their performances are measured. The competition is performed (some days before the workshop) and presented during the IWCT workshop.

Researchers from both academia and industry are invited to submit their tools. In order to easily include in the competition both open source and commercial tools, participants have to submit only the executable and no submission of the source code is required.

Benchmarks characteristics

Benchmarks used for tool comparison will be randomly generated, both in terms of parameters, domains and constraints, or taken from well known test sets. However, the random generation will be guided by setting the number of variables (included between a lower and an upper limit) and their types, and the number (included between a lower and an upper limit) and characteristics of constraints (like depth of logical operators, type of operators, …).

The code of the benchmark generator is available here, together with benchmarks used during the previous editions of the competition.

Categories/Tracks

Different generators can compete in different categories, and the participants may choose the category in which the tool competes (depending on the capabilities of the tool). We identify the following categories:

  • Models with no constraints
    • With only boolean parameters (UNIFORM_BOOLEAN)
    • MCA
    • Uniform with n > 2 (UNIFORM_ALL)
  • Models containing constraints
    • With boolean parameters and logical operators in constraints (BOOLC)
    • With also enumerative parameters (MCA), and logical and equal operators in constraints (MCAC)
    • With also integer parameters, and logical, mathematical, and relational operators in constraints (NUMC)
    • Models deriving from feature models translation (FM)
    • Models deriving from industrial case studies (INDUSTRIAL)
    • Highly constrained models, with validity ratio < 1% (HIGHLY_CONSTRAINED)
    • Models with constraints in CNF (CNF)

During tools evaluation, test models will be distributed as in the following table:

Category Name Parameters Constraints Control variables Boundaries # Tests
UNIFORM Only booleans NO k: number of parameters k: random in the interval [6, 30] 15
UNIFORM Uniform NO k: number of parameters

v: number of elements for each parameter
k: random in the interval [6, 30]

v: random in the interval [2, 15]
15
MCA MCA NO k: number of parameters

v[]: array containing the number of elements for each parameter
k: random in the interval [6, 30]

each element of v[]: random in the interval [2,15]
30
BOOLC Only booleans randomly chosen between AND, OR, <=>, NOT, => k: number of parameters

c: number of constraints

d[]: array containing the complexity of each of the c constraints
k: random in the interval [6, 30]

c: random in the interval [1, 100]

each element of d[]: random in the interval [1, 20]
30
MCAC MCA randomly chosen between AND, OR, <=>, NOT, =>, = (both x=C and x=y, where x and y are parameters and C a constant of x), != k: number of parameters

v[]: array containing the number of elements for each parameter

c: number of constraints

d[]: array containing the complexity of each of the c constraints
k: random in the interval [6, 30]

each element of v[]: random in the interval [2,15]

c: random in the interval [1, 100]

each element of d[]: random in the interval [1, 20]
30
NUMC Booleans, Enumeratives and Integer ranges randomly chosen between AND, OR, <=>, NOT, =>, = (both x=C and x=y, where x and y are parameters and C a constant of x), !=, mathematical and relational operators k: number of parameters

v[]: array containing the number of elements for each parameter

c: number of constraints

d[]: array containing the complexity of each of the c constraints
k: random in the interval [6, 30]

each element of v[]: random in the interval [2,15]

c: random in the interval [1, 100]

each element of d[]: random in the interval [1, 20]
30
FM MCA As required in the corresponding feature model from which the benchmark derives 15
INDUSTRIAL As required by the industrial case study from which the benchmark derives As required by the industrial case study from which the benchmark derives 15
HIGHLY_CONSTRAINED MCA randomly chosen between AND, OR, <=>, NOT, =>, = (both x=C and x=y, where x and y are parameters and C a constant of x), != k: number of parameters

v[]: array containing the number of elements for each parameter

c: number of constraints

d[]: array containing the complexity of each of the c constraints
k: random in the interval [6, 30]

each element of v[]: random in the interval [2,15]

c: random in the interval [1, 100]

each element of d[]: random in the interval [1, 20]
30
CNF MCA in CNF randomly chosen between AND, OR, NOT, = (both x=C and x=y, where x and y are parameters and C a constant of x), != k: number of parameters

v[]: array containing the number of elements for each parameter

c: number of constraints

d[]: array containing the complexity of each of the c constraints
k: random in the interval [6, 30]

each element of v[]: random in the interval [2,15]

c: random in the interval [1, 100]

each element of d[]: random in the interval [1, 20]
30

Input and output formats

The benchmark models will be distributed in the CTWedge, ACTS and PICT formats. The tools must be able to process models in one of these formats (you must specify whether your submission supports CTWedge, ACTS, or PICT) and produce its output in CSV on the standard output file descriptor (stdout). Examples of the inputs and outputs, together with the full grammar of CTWedge, can be found here.

Tool execution

Generators will be executed inside a Docker container provided by the competition organizers, on the same Linux machine with the following specs:

  • 2 CPUs Intel(R) Xeon(R) E5-2620 v4 @ 2,10 GHz
  • RAM DDR4, 2400 MHz, 4x32 Gb
  • 2xSSD Samsung 850 (256GB each) in RAID1
  • OS: Ubuntu 18.04.6 LTS

The results (size, generation time, completeness, and validity) will be gathered through the generation of test suites from 50 test models for each category, randomly generated.

Your submission will be invoked as follows:

toolExecutable strength modelFileName

For example:

hyperspeed_ca_generator 4 input.ctwedge

The test model will be processed with a maximum execution time of 300 seconds each. Note that multiple executions for the same test model (details about the number of executions will follow) will be done, and the considered result will be the one of the fastest execution.

Submission format and dependencies

Your submission can be in one of two formats:

  • A Linux (ELF) executable
  • A docker-compose file that points towards a Docker container provided by you, plus the path to your executable in this container

If you are facing issues with providing any of these formats, please contact the organizers. We are unable to set up custom environments.

If you submit an executable, you must not make any assumptions about libraries or versions thereof available in the Docker container. This means that if your submission requires any shared libraries besides libc (e.g. boost or GMP), you should submit a statically linked version (details on this process depend on your compiler and build system).

To work around issues with more complex dependencies (such as Java), you can instead prepare a Docker container, upload it to a repository and submit a docker-compose file along with the path to the executable in the Docker container (if your docker-compose file includes multiple images, please also specify which container holds the executable). We will manually audit submitted docker-compose files.

All invocations of your executable will follow the format described above. This means that if you require additional command line flags (e.g. java -Xmx 65500 -jar my_ca_generator.jar -Ddoi=5), you must write a wrapper that takes the arguments described above (strength and input file) and invokes your actual executable.

Regardless of the submission format, your submission must not make network connections or execute malicious code. Failure to adhere to these conditions will result in immediate disqualification and possibly legal action.

Tools evaluation

Each tool will be evaluated by considering:

  • Test suite size (50% of the final score)
  • Test suite generation time (50% of the final score)
  • Test suite completeness and validity (required for all the test suites)

Note that the test suite validity and completeness will be mandatory for the evaluation of how the tool performs over a benchmark model: an invalid or incomplete test suite produced for a model will be marked as not correct and its score will be considered like the tool has been unable to complete the generation of the test suite for that model.

The tools will be ranked

  • For the total size of the test suites, in a decreasing order
  • For the total time, in an increasing order Supposing that there will be n tools competing, the first tool in the rank will receive n points, the second n-1, and so on.

NEW: we plan to evaluate addional measures (such as the t+1 coverage), but these will not contribute in the actual tool ranking.

Having fixed the timeout (300 seconds), some tools may not complete the computation of the test suite for certain models. In this case the size and the time (for the ranking) will be considered as follows: if a tool X does not complete the benchmark Y, the greatest time required by the other tools for Y (+1) and the greatest size for Y (+1) will be assigned to X.

For the strength, we will execute the tool with strength t starting from 2 to maximum 6.

If no tool is able to generate a full covering array for a given strength and model, that strength will be skipped in the evaluation for this model.

Publication and Presentation of the Competition Candidates

Participants may submit a tool for the 2nd edition of the CT-Competition in two different (mutually exclusive) ways:

  • By presenting their tool at IWCT2023 with a full or short paper, containing a description of the tool and the performance obtained with the models given as examples by the competition organizers.
  • By sending to one of the competition organizers the tool and a document consisting of one or two pages describing the tool and the performance obtained with the models given as examples by the competition organizers.

In the former case, the paper describing the tool will be peer reviewed and, if accepted, will be a part of the Workshop proceedings. In the latter, the document will not be peer reviewed and will not be part of the regular proceedings. Moreover, in this second scenario, no novel approaches are required to participate.

We emphasize that tools presented in papers that will not be accepted as a part of workshop proceedings can be submitted by following the second path.

  • The results of the 2nd edition of the CT Competition are published here.

Important Dates

  • End-Dicember 2022, the release of the benchmarks for training
  • January 20th 2023, submission of paper (short/full) for the IWCT workshop
  • February 24th 2023, submission of the tools and, if the paper has not been previously submitted and accepted, documents describing the tools with the results over the benchmarks
  • April 2023, competition with new benchmarks and comparison among all the competing tools

Organization

If you want to know more, or need clarification, do not hesitate to contact us:

Sponsors/prize

If you are interested to support the competition, please contact us.

History: Other editions of the combinatorial testing competition

  • The rules and general information about the first edition of the combinatorial testing competitions are published here.
    • The results of the 1st edition of the CT Competition are published here.