Inference Mechanisms for a Separation and Numerical Domain

EPSRC Funded Project EP/G042322

Principal Investigator: Prof. Shengchao Qin

in collaboration with Prof. Wei-Ngan Chin (NUS)

 

 

Summary of the project:

The proliferation of software across all aspects of our life means that software failure can have significant economic and social impact. It is therefore highly desirable to be able to develop software that is formally verified as correct with respect to its expected specification. This has also been identified as a key objective in one of the UK Grand Challenges (GC6). Although research on formal verification has a long history, dating back to the 1960's, it remains a challenging problem to automatically verify programs written in mainstream imperative languages such as C, C++, C# and Java. This is in part due to the prolific use of (recursive) shared mutable data structures which are difficult to keep track of statically and in a precise and concise way.

The emergence of separation logic promotes scalable reasoning via explicit separation of structural properties over the memory heap where recursive data structures are dynamically allocated. Using separation logic, progress has been made on automated verification for pointer safety in the separation/shape domain. To verify the more general memory safety and functional correctness, it will require the combination of both separation (structural) and numerical (e.g. size) information. Therefore, advanced analysis and verification techniques are needed in the combined separation and numerical domain to verify memory safety and functional correctness. Nevertheless, this remains a clear challenge for program analysis research.

As a first step to tackle the challenge, Our development on program verification using a combined separation and numerical domain also allows user-specified inductive predicates to appear in program specifications for better expressivity. Based on this specification mechanism, a verification system called HIP/SLEEK has been built by a team led by our collaborator Prof Wei-Ngan Chin to conduct the automated verification and proof search [SCP12,FM11b]. The experimental results have confirmed the viability of this approach. One issue with this system is that it is a liability for the users to supply all loop invariants and method pre/post-conditions prior to the verification. This can be very demanding and challenging for the users.

As the second phase towards tackling the challenge, we propose to develop advanced inference mechanisms in the combined separation and numerical domain with user-defined predicates so that loop invariants and method pre/post-conditions can be automatically synthesised, where possible. Achieving this goal means that a much higher level of automation will be achieved, therefore a significant advance can be made in automated verification on memory safety and functional correctness.

In this project work, we reduce the need of user annotations by automatically inferring loop invariants over an abstract domain with both separation and numerical information [ICFEM10a,JSC13]. Our loop invariant synthesis is conducted automatically by a fixpoint iteration process, equipped with newly designed abstraction mechanism, and join and widening operators. Initial experiments have confirmed that we can synthesise loop invariants with non-trivial constraints.

To tackle the problem of method pre-/post-conditions inference, we have proposed two solutions. In one proposal, we allow users to provide only partial specifications to methods/functions (usually in the form of pre/post shape templates), a lightweight static analysis is then employed to automatically discover missing non-trivial constraints in pre/post to obtain complete pre/post specifications for methods/functions [FM11a,SCP14]. This proposal for refining partial specification is aimed at harnessing the synergy between human's insights and machine's capability at automated program analysis. In particular, human's guidance can help narrow down on the most important of the different specifications that are possible with each program code, while automation by machine is important for minimising on the tedium faced by users. In another proposal [ICFEM13a], we have designed a compositional analysis framework in the presence of user-defined predicates, which derive the pre/post summary for each method in the expressive abstract domain, independently from its callers. We have also proposed a novel abstraction method with a biabduction technique in the combined domain to discover pre-/post-conditions that could not be automatically inferred before. The analysis does not only prove the memory safety properties, but also finds relationships between pure and shape domains towards full functional correctness of programs. A prototype of the framework has been implemented and initial experiments have shown that our approach can discover interesting properties for non-trivial programs. In a recent work along this line [CAV14] (which takes place after the project period) we have made a significant step forward by proposing an advanced shape analysis via second-order bi-abduction to automatically discover pre/post shapes for heap-manipulating programs. This recent achievement complements well with either of the two pieces of work mentioned above. In another more recent work [PLDI15], we have applied the bi-aduction idea to help guide case-splits in a unified program analysis for termination and non-termination where specifications corresponding to termination and non-termination can be inferred by the same analysis.

During this project period, we have also proposed a novel method to verify memory safety of heap-manipulating programs with unknown procedure calls [JSC10,ICFEM10b]. We have conducted a bigger experiment by verifying the FreeRTOS scheduler [TASE12,STTT14b].

In summary, a key objective in the proposed research has been to find a systematic approach to abstraction construction in the combined domain, so that appropriate abstractions can be employed by the inference process. Abstractions are required in the analysis and verification for various reasons, such as termination and scalability. Appropriate abstraction mechanisms are crucial in maintaining a desirable scalability/precision trade-off. Apart from the abstraction mechanisms, we have also designed analysis algorithms for loop invariant synthesis, method post-condition inference and method pre-condition discovery for the combined domain with arbitrary user-defined predicates. We have built prototype tools to implement these analyses and evaluated them on various programs.

Our research outcomes have further improved the level of automation, and therefore significantly extended the viability and applicability of automated verification on memory safety as well as functional correctness for imperative (heap-manipulating) programs.

References: