Model Checking Contest @ Petri Nets 2014
Tunis, Tunisia, June 24, 2014
Results for the MCC’2014
Last Updated: Sept. 1, 2014

Introduction

This page summarizes the results of the 2014 edition of the Model Checking Contest. It is divided into three parts:

The Participants in 2014

Nine tools were submitted this year. Data about these submissions are summarized in the table below. For each tool, you may download the disk image that was provided with all its data. You may use these images to reproduce the measures locally and to compare your own tool on the same benchmark.

Summary of the Participating Tools (alphabetical order)
Tool name | Representative Author | Origin | Reported Techniques
Cunf | C. Rodriguez | ENS de Cachan & Univ. Oxford | Net Unfolding, SAT/SMT
GreatSPN | M. Beccuti | Univ. Torino | Decision Diagrams, Symmetries
Helena | S. Evangelista | Univ. Paris 13 | Explicit, Stubborn Sets
LoLA | N. Lohmann | Univ. Rostock | Explicit, State Compression, Stubborn Sets, Symmetries, Topological
Marcie | C. Rohr | Univ. Cottbus | Decision Diagrams
PNMC | A. Hamez | ISAE | Decision Diagrams
PNXDD | E. Paviot-Adet | Univ. P. & M. Curie | Decision Diagrams, Topological
Stratagem | E. Lopez | Univ. Geneva | Decision Diagrams, Topological
Tapaal | J. Srba | Univ. Aalborg | Explicit, Structural Reductions

The results

We first provide access to the winner pages for the categories of examinations, and then report some facts about this edition. You may also access the detailed results (model per model) for a deeper analysis.

Winners

We set up four categories of examinations to rate the tools:

Access to detailed results

Results of the MCC'2014 are summarized as follows:

For those who want very detailed information ;-), you have:

Facts about the MCC'2014

Each tool was submitted to 8040 executions in various conditions (670 model/instances and 12 examinations per model/instance), for which it could report: DNC (do not compete), CC (cannot compute), or the result of the query. These executions were handled by BenchKit, which was developed in the context of the MCC for the massive testing of software. Then, from the raw data provided by BenchKit, post-analysis scripts consolidated the results and computed the rankings. The table below shows some data about the machines involved and their contribution to the computation of these results; a small cross-check of these figures follows the table.

Involved Machines and Execution of the Benchmarks
                                      | Quadhexa-2    | Rented machine | Total
Cores                                 | 24 @ 2.66 GHz | 40 @ 2.5 GHz   |
Memory (GB)                           | 128           | 256            |
Used cores                            | 21            | 39             |
Number of runs                        | 48 395        | 34 913         | 83 308
Total computation time                | 157 days, 0 hours, 10 minutes and 51 seconds
Time spent to complete the benchmarks | 8 days, 21 hours, 50 minutes and 29 seconds
Boot time of the VMs (overhead)       | approx. 19 days, 4 hours (estimated)
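
As a quick cross-check of the figures reported above, here is a minimal Python sketch that recomputes the number of executions per tool from the number of model/instances and examinations, and the total number of runs from the per-machine counts (all numbers are taken from the text and the table above):

    # Cross-check of the benchmark figures reported on this page.
    instances = 670            # model/instance couples in the benchmark
    examinations = 12          # examinations per model/instance
    per_tool_executions = instances * examinations
    assert per_tool_executions == 8040

    runs_quadhexa2 = 48_395    # runs executed on Quadhexa-2
    runs_rented = 34_913       # runs executed on the rented machine
    assert runs_quadhexa2 + runs_rented == 83_308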

Please note that Quadhexa-2 was made available by colleagues at Université Paris Ouest Nanterre. The other machine was rented thanks to the ImPro ANR blanc project.

Interpreting the correctness mask

Each result provided by a tool is compared to those of the other tools. The valid result (first blue column in Figure 1) is decided by majority (second blue column in Figure 1). In our example, the examination involved 10 formulas whose result is either true (T) or false (F).

When no majority is found for a result, we cannot decide and a dedicated sign (?) is used.

Please note that, at this stage, our algorithm considers the table line by line only. For the next editions of the MCC, further improvements are planned to weight the confidence granted to a tool according to its global results.
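
To illustrate this majority rule, here is a minimal Python sketch (a hypothetical reimplementation for illustration, not the scripts actually used for the MCC) that computes the reference value of each formula from the answers reported by the tools; it assumes that a strict majority of the expressed answers is required, and emits '?' otherwise:

    from collections import Counter

    def reference_values(answers_per_tool):
        """Compute, formula by formula, the value retained by majority vote.

        answers_per_tool maps a tool name to its list of answers,
        each answer being 'T', 'F', or None when no result was produced.
        Returns one entry per formula: 'T', 'F', or '?' when no
        strict majority is found among the expressed answers.
        """
        tools = list(answers_per_tool)
        n_formulas = len(answers_per_tool[tools[0]])
        reference = []
        for i in range(n_formulas):
            votes = Counter(answers_per_tool[t][i] for t in tools
                            if answers_per_tool[t][i] is not None)
            if not votes:
                reference.append('?')   # no tool answered this formula
                continue
            (value, count), = votes.most_common(1)
            majority = count > sum(votes.values()) / 2
            reference.append(value if majority else '?')
        return reference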

Figure 1: snippet from a correctness mask

Then, for each tool and for each result, a mask is computed. It shows where the tool was right, where it was wrong, and where we cannot decide. The masks are displayed in the sub-columns of the "Tool's results" section in Figure 1. Each digit must be read as follows:
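
As a hypothetical continuation of the previous sketch, the helper below derives a tool's mask from its answers and the reference values; for illustration it assumes '1' for an answer matching the reference, '0' for a mismatch, and '?' where nothing can be decided, following the description above (this may not be the exact encoding used by the MCC scripts):

    def correctness_mask(tool_answers, reference):
        """Build the per-formula mask of one tool against the reference values."""
        mask = []
        for answer, ref in zip(tool_answers, reference):
            if ref == '?' or answer is None:
                mask.append('?')   # nothing can be decided for this formula
            elif answer == ref:
                mask.append('1')   # the tool agrees with the majority
            else:
                mask.append('0')   # the tool disagrees with the majority
        return ''.join(mask)

    # Typical use: reference = reference_values(answers)
    #              mask = correctness_mask(answers['SomeTool'], reference)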

Such a table can help tool developers compare their results against the majority of results in the contest, which should help them detect bugs. They can also investigate the reasons why their tool did not compute any result:

About score computation

Scores are computed as follows:

Let us illustrate this with the snippet of Figure 1 (we do not consider the bonus for scrambled or surprise models):

All other tools did not compete in this examination (or could not compute any result).
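
As a purely illustrative sketch, and under simplified assumptions of our own (one point per formula on which the tool agrees with the reference, nothing for wrong or undecided answers, and no bonus for scrambled or surprise models; this is not the actual MCC'2014 scoring), an examination could be scored from its mask as follows:

    def score(mask):
        """Hypothetical score of one examination: one point per '1' digit in the mask."""
        return sum(1 for digit in mask if digit == '1')

    # Under these simplified assumptions, a mask such as '111?110101'
    # (10 formulas) would be worth 7 points.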