Software systems have natural parameters, like the number of replicated parts and the size of parametric data types, which can take infinitely many values. Consequently, all software systems can be considered as multiparametrised finite-state machines where the parameters determine the (maximum) number of replicated components (like objects), their relationships and the size of basic data types. That is why the question on the correctness of a software system is naturally expressed as the parametrised verification problem: given a parametrised system implementation and specification, determine whether the implementation is correct with respect to the specification for all parameter values.
Bounds2 is a tool that enables parametrised verification by establishing upper bounds, i.e., cut-offs, for the values of parameters such that if there is a bug in an implementation instance with a parameter value greater than the cut-off, then there is an analogous bug in an implementation instance where the values of the parameters are within the cut-offs. When using Bounds2, implementations and specifications are composed of labelled transitions systems (LTSs) with explicit input and output events by using parallel composition and hiding. One can also use several kinds of parameters: process types represent the sets of the identifiers of replicated components, data types denote the sets of data values, typed variables refer to the identities of individual components or data values and relation symbols represent binary relations over process types. Correctness is understood either as a refinement, which can be trace inclusion [1] or alternating simulation [2], or the compatibility of the components of the implementation [2].
The tool consists of two parts. The instance generator determines cut-offs for the types, computes the allowed parameter values up to the cut-offs and outputs the corresponding finite state verification tasks. After that, the outputted instances are verified by an instance checker specific to the notion of correctness. Trace refinement and compatibility checkers exploit the refinement checker FDR2 [1] to verify the instances, the alternating simulation checker makes use of MIO Workbench [3] refinement checker, too.
Download Bounds v2.2 (Linux) here. To install and use Bounds you need a number of other software packages.
Having prepared all the auxiliary software as explained above, run
to compile and install Bounds.
The input language of Bounds is called Communicating Parameterised Systems (CPS) according to its role model CSP (Communicating Sequential Processes) [1]. CPS is basically the formalism presented in [4] with some syntactic sugar to enable the compact and comfortable representation of large processes.
To introduce the input language of Bounds, CPS, and the tool itself, you should consider a shared resource system (SRS) with an arbitrary number of users competing for an access to an arbitrary number of shared resources arranged in the form of a forest (Fig. 1). A user gets a read (write) access to a subtree of resources after successfully requesting a read lock sr (write lock sx) on the root of the subtree and a weaker read intention lock ir (write intention lock ix) on all its ancestors. The purpose of the read (write) intention lock is to indicate the existence of the read (write) lock deeper in the tree. Several users can hold a lock on a resource simultaneously only if all of them have either intention locks or read related locks (Fig. 2). Our goal is to formally model the system implementation and the specification and prove that in our construction it is not possible for a user to access a resource if somebody else is writing to it.
There are two kinds of components, resources and users, so we introduce types R and U to represent the sets of their identities, respectively.
The types R and U denote sets {R1,R2,…,Rk}, where k ∈ ℤ+, and {U1,U2,…,Un}, where n ∈ ℤ+, respectively.
The relationship between the resources is represented using a relation symbol Anc.
The relationship between the users (the empty relation) and the relationship between the users and the resources (the maximal bipartite topology) are trivial and can be handled without the use of relation symbols.
We also need some variables to refer to the identities of individual users and resources when describing system and specification behaviour, and the values of Anc.
In CPS, communications are called events and they consists of a channel and a data part. For SRS, we define six channels for locking events and two channels for both read and write events.
The locking events consists of a user identifier and a resource identifier. For example, srlock(u,r) indicates that the user u puts a subtree read lock sr on the resource r.
The read and write events, in turn, consists of a user identifier and two resource identifiers. For example, rdbeg(u,r2,r1) indicates that the user u starts reading a resource r2 based on the lock it has on the resource r1.
The system implementation and specification will be first modelled from the viewpoint of two resources, where one resource is an ancestor of the other. That is why we define Anc as a proper ancestor relation, i.e. the transitive closure of a forest. (Here, an ancestor of a resource can also be the resource itself and a proper ancestor refers to an ancestor other than the resource itself.)
The values of variables and relation symbols are expressed in the universal fragment of first order logic (FOL), where terms are restricted to variables and predicated to relation symbols. (Also existential quantification can be used but then the tool is not able to compute the cut-offs automatically.) In CPS, formulae of this kind are called valuation constraints and specified using vc keyword.
Variables introduced earlier will not occur free in the final system and specification, i.e. they have no special system level meaning like the root of a tree, but they are always bound by the universal quantifier (\/), the replicated parallel composition or the generalised set union ((_)). That is why there is no need to restrict their values here.
To represent parameterised systems in our formalism, you need to use compositional modelling style [5]. That is, the behaviour of the system implementation is first captured in finitely LTSs, each of which represents the behaviour of the system from the viewpoint of finitely many components. After that, propositional conditions are used to restrict the values of variables and the parts are put to together using standard and replicated parallel compositions. The specification is modelled similarly. Finally, the events irrelevant to the specification are hidden in the system.
We build the system model from three LTSs each of which represents the behaviour of the implementation from the viewpoint of two or three users and resources.
In CPS, parameterised LTSs are are specified using the plts keyword.
To formalise the specification, note that every illegal behaviour can be traced back to two different users u1, u2 that write to a resource r3 simultaneously based on the locks they have on respectively some ancestors r1, r2. Therefore, we first capture the specification in an LTS Prop2 from the viewpoint of u1, u2 accessing r3 based on the locks on respectively r1, r2. However, to correctly present the specification in the presence of a single user only, we also introduce an LTS Prop1, which captures the specification from the viewpoint of u1 accessing r2 based on the lock on r1.
You can also use fixed sets and relations/functions over these sets to represent LTSs in a compact and convenient way. To see how to use these constructs, see taDOM2+ examples in examples/tadom2+_repeatable.cps [6, 7].
To build the model of the system implementation and the formal specification, you should enclose the guarded LTSs within replicated parallel compositions such that every variable, which does not have a specific system level meaning, becomes bound. Finally, you compose the parts together using the standard parallel composition.
In CPS, the sets of parameterised events are specified using pset keyword. In our case, we hide the events related to locking, denoted by LockAct. Hence, the system implementation gets the form Sys \ LockAct.
The correctness of the SRS implementation can be now stated as the question whether Sys \ LockAct is a traces refinement of Spec whenever ForestTopo is satisfied. In CPS, this is specified in the following way.
To reduce a parameterised verification task into (finitely many) finite-state verification tasks, just give the CPS file as a parameter to the Bounds instance generator.
After reading the file, the instance generator reports the cut-off sizes for the types. As types are used to specify the domains of variables and relation symbols, these cut-offs implicitly limit the values of other parameters, too.
Hence, in the case of SRS, it suffices to consider instances of the system and the specification with at most three resources and two users.
If there were no relation symbols, determining the instances up to the cut-offs would be easy. The problem with relation symbols is that not all the values are allowed and there are lots of isomorphic values, which result in the instances of the parameterised verification task with the same answer, too.
Bounds computes the allowed values for parameters one by one. By default, the parameters are treated in the following order:
However, a type T is always handled before variables and relation symbols of the type T. Therefore, the values of some type are always computed first. For example, in the case of our example, the order in which the values of the parameters are computed is 1. R, 2. Anc, 3. U.
As the computation may take a long time, Bounds shows the status of the computation as a percentage value. However, this value does not increase linearly in time, but gives only a rough idea how long the computation will take. Bounds shows also some other statistics that reflect the status of the computation: the total number of valuations computed so far, the number of canonical forms computed and the number of valuations currently stored. The last value converges to the number of instances output.
After computing the valuations, the instance generator outputs the corresponding instances of the parameterised verification task. In the case of SRS, there are 14 of them. However, some of them contain redundant information and can be discarded. That is why Bounds analyses each instance in more detail and keeps only those instances which must be refinement checked. Due to parallel computing, the instances are processed in more or less random order. The remaining instances are stored in files named xxxx_instance_k.csp, where xxxx is the name of the input file without the suffix and k is a value from 0 to the number of valuations computed. The syntax of output files is machine-readable CSP.
The instances are verified by running boundsTraceRefinementChecker script. The script just calls FDR2 for each instance and reports whether all of them are correct, i.e. the whole parameterised system is correct in the sense of the parameterised specification. Alternatively, an error reported by FDR2 is shown, which implies that the system is incorrect.
The statistics at the end is the output of time command. In this case, it takes 2 minutes and 12 seconds to check the 6 instances using FDR2 on a laptop with 4GB of memory and Intel Core i5 processor running Ubuntu Gnu Linux 12.04.
There are two ways a user can affect the performance of the tool: by using as many relation-negated conjuncts in valuation contraints as possible and by changing the parameter order.
Use Relation-Negated Conjuncts in Valuation Constraints You should specify the values of variables and relation symbols using as many conjuncts, where relations occur within an odd number of negations, as possible. Formulae of this kind are called relation-negated. The reason why you should do so is that Bounds searches the allowed values of a relation symbols in the subset of order starting from the smallest possible value, the empty relation. If a value, i.e., a relation, that does not satisfy a relation-negated conjunct is encountered, the search tree can be immediately pruned, because the relation-negated formulae cannot be made true by enlarging the values of relation symbols.
In the case of SRS, there is one such a conjunct, irrefl, namely. It is clear that if some relation is not irreflexive, then all the larger relations are not irreflexive either. Hence, the search for the values of Anc can be pruned as soon as a non-irreflexive value is encountered. However, it does not hold for other conjuncts, tran and ancprop, which are not relation-negated.
To improve the computation of valuations, we can try to introduce new relation-negated conjuncts in ForestTopo such that the model class of the valuation constraint remains the same. For example, we may add the requirement of asymmetry explicitly in ForestTopo.
This improves the search which appears in the reduced number of valuations generated and stored.
Here, the speed-up is not observable but in the case of larger examples the influence of using relation-negated conjuncts may be dramatic.
Change the Parameter Order if Needed The other possibility to improve the search is to try to change the order in which the values of parameters are computed. A user can change the order in pursuance of the definition by giving an insertion value for a parameter. For example,
results in
However, in the case of SRS, the default order is the optimal one.
Next, we consider a host configuration protocol (HCP), where each host repeatedly picks a network address until it finds one that is not used by other hosts. This is done by broadcasting address queries and replies to other hosts in the style of ARP (Address Resolution Protocol). Our goal is to formally model the protocol with an arbitrary number of hosts and an arbitrarily large address space and prove that in our construction, each address is possessed by at most one host.
In HCP, there are two kinds of replicated objects: hosts and addresses. Hence, we use a type T to represent the set of the identifiers of hosts and a type T to denote the set of available addresses. We also introduce some variables which we use to refer to the hosts and addresses.
The events used in HCP come in three forms: ihave(h,a) denotes that the host h has the address a, timeout denotes timeout and whohas(h,a) means that the host h wants to know whether someone has the address a.
In order to formalise the specification of HCP, we first capture its behaviour from the viewpoint of two hosts (and all addresses) in an LTS DifAdr. Initially, DifAdr allows the host h2 to report having any address but after the host h has picked an address a, the host h2 is no longer allowed to report having a. The model of the full specification is obtained by letting h and h2 to range over all pairs of different host identifiers and by composing all the resulting instances of DifAdr in parallel. Hence, DifAdr allows for each host to report only a single unique address.
Above, the variables a and a2 of the type A are used in replicated choices, i.e., structures of the form S1 = []a:chan(a) -> S2(a). That is why Bounds classifies A as a data type. Respectively, H is considered a process type, since the variables h and h2 of the type H are used in replicated parallel compositions, i.e., structures of the form ||h:P.
Bounds is able to compute cut-offs only when each type is classified either as a data type or a process type, not both. If some type is used both as a data type and a process type, then Bounds asks for a user to input a cut-off. In this case, the result of the verification is of course unsound.
The implementation of HCP is created in a similar way as the specification. First, we capture it in an LTS Host mainly from the viewpoint of a host h and secondarily from the viewpoint of a host h2. Initially, Host just ignores all timeouts and the messages sent by h2. The host h can also pick an address a and broadcast a query in order to see whether the address is already in use. If the other host h2 says that it has the address or is about to take the same address, then Host returns to the initial state. However, if the query timeouts, then the host h can either keep it or discard it and return to the initial state. If the host h decides to keep the address, then Host goes to a state where h listens to broadcasts from other hosts and replies when needed. Specifically, if h2 queries whether the address a is in use, then h replies that it already has it.
Again, as we let h and h2 to range over all pairs of different host identifiers and compose the instances of Host in parallel, we obtain the model of the implementation from the viewpoint of all hosts and addresses.
Finally, we hide the events irrelevant to the specification and ask whether the implementation is a trace refinement of the specification.
In this case, the structural cut-offs computed by the instance generator are two for the number of hosts and 16 for the number of addresses. However, the further analysis performed by the instance generator reveals that in order to prove the HCP implementation correct for any number of hosts and addresses it is sufficient to check six instances. After running the instance checker, we see that all six instances are correct, which implies that also the HCP implementation is correct for any number of hosts and addresses. In other words, each address can be possessed by at most one host. The whole process of verifying HCP took less than a second. The HCP model can be found in the file examples/hcp.cps.
So far, we have considered only the verification of safety properties via trace refinement. However, Bounds lends support to two other notions of correctness, too, compatibility and alternating simulation, namely [2]. For this purpose, we need to distinguish between two kinds event, inputs and outputs. By default, the events are treated as outputs and inputs are marked by putting a question mark in front of the channel name. One can also use an exclamation mark to explicitly mark an event as an output.
When using inputs and outputs, we often talk about interface automata [2] instead of LTSs. The use of inputs and outputs affects the parallel composition, too, such that when an input chan(a1,...,am) is synchronised with the output chan(a1,...,am), the event becomes invisible, tau. Consequently, there is no need for explicit hiding when working with interface automata. Actually, the use of hiding is prohibited when checking for compatibility or alternating simulation.
As an example, we consider a simple variant of the shared resource system (SRS) where the resources are not ordered and we do not make a distinction between read and write accesses. Consequently, there is no need for multiple lock modes either. That is why we model the system by using the following events:
Two interfaces are compatible if there is some environment process such that when the interfaces and the environment are composed in parallel, it is not possible to reach a state where one interface is willing to execute an output event chan(a1,...,am) but the other interface blocks the input event chan(a1,...,am). This is an optimistic view to compatibility since it is sufficient that there is at least one environment where outputs can be consumed immediately.
Our one goal is to model a user, resource and lock (mutex) interface and show that they are compatible. Since the SRS system is modelled in an analogous way as earlier, we do not go into modelling details here but just give the CPS code.
Finally, we state the verification task: we want to know whether the components of ISRS are compatible.
The model is processed by boundsInstanceGenerator as earlier. However, this time, the output of the instance generator is processed by boundsCompatibilityChecker which calls FDR2 in order to verify the instances. In this case, all the instances are generated and found to be correct within a second, which implies that the user, resource and lock interfaces are compatible for any number of users and resources. This is example is found in the file examples/miniSRS_comp.cps.
Intuitively, a concrete interface (or implementation) is an alternating simulation refinement of (or simply refines, implements or is correct with respect to) an abstract interface (a specification) if the concrete interface does not have stronger input assumptions nor weaker output guarantees than the abstract one.
Our other goal is to show that the resource interface can be safely implemented without concurrency control. For that purpose, we introduce the following CPS code:
Again, the model is processed by boundsInstanceGenerator but the output is processed by boundsAlternatingSimulationChecker. This program uses FDR2 for computing the parallel compositions and MIO Workbench [3] for establishing alternating simulation. This time, the instance generator is finished within a second but running the instance generator takes roughly six seconds. Most of the time is spent on starting Eclipse under which MIO Workbench runs. This example can be found in the file examples/res_io.cps.
You should note that Bounds is not able to compute cut-offs for compatibility and alternating simulation when relation symbols are being used.
For further information on the theory behind CPS and the operation of Bounds, see my Ph.D. thesis [5] and the conference papers [8, 9, 4]. Of course, you can always contact me personally, too.
[1] Roscoe, A.W.: Understanding Concurrent Systems. Springer (2010)
[2] De Alfaro, L., Henzinger, T.: Interface automata. ACM SIGSOFT Software Engineering Notes 26(5) (2001) 109–120
[3] Bauer, S., Mayer, P., Schroeder, A., Hennicker, R.: On weak modal compatibility, refinement, and the MIO Workbench. In: TACAS ’10, Springer (2010) 175–189
[4] Siirtola, A., Heljanko, K.: Parametrised compositional verification with multiple process and data types. In: ACSD ’13, IEEE (2013)
[5] Siirtola, A.: Algorithmic Multiparameterised Verification of Safety Properties. Process Algebraic Approach. PhD thesis, University of Oulu (2010)
[6] Haustein, M., H?der, T.: Optimizing lock protocols for native XML processing. Data Knowl. Eng. 65(1) (2008) 147–173
[7] Siirtola, A., Valenta, M.: Verifying parameterized taDOM+ lock managers. In: SOFSEM ’08. Volume 4910 of LNCS. Springer (2008) 460–472
[8] Siirtola, A., Kortelainen, J.: Algorithmic verification with multiple and nested parameters. In: ICFEM ’09. Volume 5885 of LNCS. Springer (2009) 561–580
[9] Siirtola, A.: Automated multiparameterised verification by cut-offs. In: ICFEM 2010. Volume 6447 of LNCS. Springer (2010) 321–337