CADE-30 Accepted Papers with abstracts
Exploiting Instantiations from Paramodulation Proofs in Isabelle/HOL
Abstract: Metis is an ordered paramodulation prover built into the Isabelle/HOL proof assistant. It attempts to close the current goal using a given list of lemmas. Typically these lemmas are found by Sledgehammer, a tool that integrates external automatic provers. We present a new tool that analyzes successful Metis proofs to derive variable instantiations. These increase Sledgehammer’s success rate, improve the speed of Sledgehammer-generated proofs, and help users understand why a goal follows from the lemmas.
Computing Witnesses Using the SCAN Algorithm
Abstract: Second-order quantifier-elimination is the problem of finding, given a formula with second-order quantifiers, a logically equivalent first-order formula.
While such formulas are not computable in general, there are practical algorithms and subclasses with applications throughout computational logic.
One of the most prominent algorithms for second-order quantifier elimination is the SCAN algorithm, which is based on saturation theorem proving.
In this paper we show how the SCAN algorithm on clause sets can be extended to solve a more general problem: namely, finding an instance of the second-order quantifiers that results in a logically equivalent first-order formula.
In addition we provide a prototype implementation of the proposed method.
This work paves the way for applying the SCAN algorithm to new problems in application domains such as modal correspondence theory, knowledge representation, and verification.
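As a small illustration of the witness problem (a textbook-style example of ours, not taken from the paper): applying SCAN to $\exists P \forall x ((P(x) \lor q(x)) \land (\lnot P(x) \lor r(x)))$ resolves away $P$ and yields the first-order equivalent $\forall x (q(x) \lor r(x))$; the extension described here additionally returns an instantiation such as $P(x) := \lnot q(x)$, which, substituted for the second-order quantifier, again gives a formula logically equivalent to that first-order result.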
Lexicographic Combination of Reduction Pairs
Abstract: We present a simple criterion for combining reduction pairs lexicographically. The criterion is applicable to arbitrary classes of reduction pairs, such as the polynomial interpretation, the matrix interpretation, and the Knuth-Bendix order. In addition, we investigate a variant of the matrix interpretation where the lexicographic order is employed instead of the usual component-wise order. Effectiveness is demonstrated by experiments and examples, including Touzet's Hydra Battle.
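For orientation, the standard lexicographic combination behind such criteria (our rendering, which need not match the paper's exact formulation) takes reduction pairs $(\succsim_1, \succ_1)$ and $(\succsim_2, \succ_2)$ and sets $s \succ t$ iff $s \succ_1 t$, or $s \succsim_1 t$ and $s \succ_2 t$; the question a combination criterion answers is under which conditions the result is again a reduction pair.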
A Theorem Prover Based Approach for SAT-Based Model Checking Certification
Abstract: In the field of formal verification, certifying proofs serve as compelling evidence to demonstrate the correctness of a model within a deductive system. These proofs can be automatically generated as a by-product of the verification process and are key artefacts for high-assurance systems. Their significance lies in their ability to be independently verified by proof checkers, which provides a more convenient approach than certifying the tools that generate them.
Modern model checking algorithms adopt deductive methods and usually generate proofs in terms of inductive invariants, assuming that these apply to the original system under verification. Model checkers, though, often make use of a range of complex pre-processing simplifications and transformations to ease the verification process, which add another layer of complexity to the generation of proofs.
In this paper, we present a novel approach for certifying model checking results, exploiting a theorem prover and a theory of temporal deductive rules that can support various kinds of transformations and simplifications of the original circuit.
We implemented and experimentally evaluated our approach on invariants generated by two state-of-the-art model checkers, nuXmv and PdTRAV, defining a set of rules within a theorem prover to validate each certificate.
Infinite State Model Checking by Learning Transitive Relations
Abstract: We propose a new approach for proving safety of infinite state systems. It extends the analyzed system by transitive relations until its diameter D becomes finite, i.e., until constantly many steps suffice to cover all reachable states, irrespective of the initial state. Then we can prove safety by checking that no error state is reachable in D steps. To deduce transitive relations, we use recurrence analysis. While recurrence analyses can usually find conjunctive relations only, our approach also discovers disjunctive relations by combining recurrence analysis with projections. An empirical evaluation of the implementation of our approach in our tool LoAT shows that it is highly competitive with the state of the art.
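As a toy illustration (ours, not from the paper): for the transition relation $x' = x + 1$, recurrence analysis yields the transitive relation $x' \geq x + 1$, which covers any positive number of steps at once. Adding it caps the diameter at $D = 1$, so safety of, say, the error condition $x < x_0$ for initial value $x_0$ follows from a single bounded check that no error state is reachable within one step.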
The Computability Path Ordering for Beta-Eta-Normal Higher-Order Rewriting
Abstract: We lift the computability path ordering and its extensions from plain higher-order rewriting to higher-order rewriting on beta-eta-normal forms where matching modulo beta-eta is employed. The resulting order NCPO is shown to be useful on practical examples. In particular, it can handle systems where its cousin NHORPO fails even when it is used together with the powerful transformation technique of normalization. We also argue that automating NCPO efficiently is straightforward using SAT/SMT solvers whereas this cannot be said about the transformation technique of neutralization. Our prototype implementation supports automatic termination proof search for NCPO and is also the first one to automate NHORPO with neutralization.
Simplified and Verified: A Second Look at a Proof-Producing Union-Find Algorithm
Abstract: Using Isabelle/HOL, we verify a union-find data structure with an explain operation due to Nieuwenhuis and Oliveras. We devise a simpler, more naive version of the explain operation whose soundness and completeness are easy to verify. Then, we prove the original formulation of the explain operation equivalent to our version. Finally, we refine this data structure to Imperative HOL, enabling us to export efficient imperative code. The formalisation provides a stepping stone towards the verification of proof-producing congruence closure algorithms, which are a core ingredient of Satisfiability Modulo Theories (SMT) solvers.
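The following is a minimal, unverified Python sketch in the spirit of the naive explain operation described above (the paper's actual artifact is the Isabelle/HOL formalisation; names and details here are ours). Each union records the input equation's label in a proof forest, and explain cuts the two root paths at their first common ancestor:

    class ExplainUnionFind:
        def __init__(self):
            self.parent = {}   # union-find structure, used by find
            self.proof = {}    # proof forest: node -> (neighbour, label) or None

        def _add(self, x):
            if x not in self.parent:
                self.parent[x] = x
                self.proof[x] = None

        def find(self, x):
            self._add(x)
            while self.parent[x] != x:
                x = self.parent[x]
            return x

        def _reroot(self, a):
            # Reverse proof-forest edges on the path from a to its tree root,
            # so that a becomes the root and a new edge can be attached to it.
            prev, edge = a, self.proof[a]
            self.proof[a] = None
            while edge is not None:
                nxt, label = edge
                edge = self.proof[nxt]
                self.proof[nxt] = (prev, label)
                prev = nxt

        def union(self, a, b, label):
            # Record the input equation a = b under the name `label`.
            ra, rb = self.find(a), self.find(b)
            if ra == rb:
                return
            self.parent[ra] = rb
            self._reroot(a)
            self.proof[a] = (b, label)

        def _path(self, x):
            nodes, labels = [x], []
            while self.proof[x] is not None:
                x, lab = self.proof[x]
                nodes.append(x)
                labels.append(lab)
            return nodes, labels

        def explain(self, a, b):
            # Naive explain; assumes a and b are already known to be equal.
            na, la = self._path(a)
            nb, lb = self._path(b)
            common = set(na) & set(nb)
            ia = next(i for i, n in enumerate(na) if n in common)
            ib = next(i for i, n in enumerate(nb) if n in common)
            return set(la[:ia]) | set(lb[:ib])

    uf = ExplainUnionFind()
    uf.union("a", "b", "eq1")
    uf.union("c", "d", "eq2")
    uf.union("b", "c", "eq3")
    assert uf.explain("a", "d") == {"eq1", "eq2", "eq3"}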
Sort-Based Confluence Criteria for Non-Left-Linear Higher-Order Rewriting
Abstract: Powerful confluence criteria for higher-order rewriting exist for left-linear systems, even in the presence of non-termination. On the other hand, confluence criteria that allow for mixing non-termination and non-left-linearity are either extremely limited or hardly usable in practice. In this paper, we study confluence criteria which explore sort information to make proving higher-order confluence possible, even in the presence of non-termination and non-left-linearity. We give many interesting examples of systems covered by our results, including a (confluent) variant of Klop's counterexample, and a calculus issuing from a dependent type theory with cumulative universes.
Confluence of Almost Parallel-Closed Generalized Term Rewriting Systems
Abstract: *Generalized Term Rewriting Systems* (GTRSs) extend Conditional Term Rewriting Systems by (i) restricting rewritings on arguments of function symbols and (ii) allowing for more general conditions in rules, namely, atoms defined by a set of Horn clauses. They are useful to model and analyze properties of computations with sophisticated languages like Maude. Huet proved that left-linear and parallel closed Term Rewriting Systems (TRSs) are confluent. In this paper, we generalize and extend Huet's results to GTRSs *without requiring left-linearity*. This improves Huet's result, as we show with some examples. Also, Huet's result entails confluence of weakly orthogonal TRSs, thus providing a syntactic criterion for proving confluence *without requiring termination*. We similarly introduce *weakly V-orthogonal* GTRSs, which are confluent. These results, implemented in the confluence tool CONFident, improve its ability to deal with context-sensitive and conditional term rewriting systems. We briefly report on this.
Efficient Neural Clause-Selection Reinforcement
Abstract: Clause selection is arguably the most important choice point in saturation-based theorem proving. Framing it as a reinforcement learning task is a way to challenge the human-designed heuristics of state-of-the-art provers and to instead automatically evolve—just from prover experiences—their potentially optimal replacement.
In this work, we present a neural network architecture for scoring clauses for clause selection that is powerful yet efficient to evaluate. Following reinforcement learning principles in our design decisions, we integrate the network into the Vampire theorem prover and train it from successful proof attempts. An experiment on the diverse TPTP benchmark shows that the neurally guided prover improves by 20% over the baseline strategy from which it initially learns, in terms of the number of in-training-unseen problems solved under a practically relevant, short CPU instruction limit.
Relatively Complete and Efficient Partial Quantifier Elimination
Abstract: Quantifier elimination is used in various automated reasoning tasks, including quantified SMT solving, Exists/Forall solving, program synthesis, model checking, and constrained Horn clause (CHC) solving. Complete quantifier elimination, however, is computationally intractable for many theories. The recent algorithm QEL shows a promising approach to approximate quantifier elimination, which has resulted in improvements in solver performance. QEL performs partial quantifier elimination with a completeness guarantee that depends on a certain semantic property of the given formula. Considerably generalizing the previous approach, we identify a subclass of local theories in which partial quantifier elimination can be performed efficiently. We present T-QEL, a sound extension of QEL, and an efficient algorithm for this class of theories. The algorithm utilizes the proof-theoretic characterization of these theories, which is based on restricted derivations. Finally, we prove that T-QEL is sound in general and relatively complete with respect to the identified class of theories.
Uniform Interpolation and Forgetting for Large-Scale Ontologies with Application to Semantic Difference in SNOMED CT
Abstract: In this paper, we present a practical method for computing uniform interpolation (UI) and forgetting in ELIH-ontologies, addressing fundamental operations critical for detecting semantic differences in ontology evolution. Our method extends previous work to accommodate inverse roles while maintaining termination and soundness guarantees. Empirical evaluation across industry benchmark datasets Oxford ISG and NCBO BioPortal demonstrates 100% success rates with significant improvements in computational efficiency compared to state-of-the-art systems. The practical impact of our approach is validated through application to SNOMED CT, the world's largest clinical terminology supporting healthcare information systems globally. This enables terminologists and developers to systematically track semantic differences, detect unintended consequences, and validate change safety across SNOMED CT releases.
Being polite is not enough (and other limits of theory combination)
Abstract: In the Nelson–Oppen combination method for satisfiability modulo theories, the combined theories must be stably infinite; in gentle combination, one theory has to be gentle, and the other has to satisfy a similar yet weaker property; in shiny combination, only one has to be shiny (smooth, with a computable minimal model function and the finite model property); and for polite combination, only one has to be strongly polite (smooth and strongly finitely witnessable). For each combination method, we prove that if any of its assumptions are removed, then there is no general method to combine an arbitrary pair of theories satisfying the remaining assumptions. We also prove new theory combination results that weaken the assumptions of gentle and shiny combination.
Ground Truth: Checking Vampire Proofs via Satisfiability Modulo Theories
Abstract: The Vampire automated theorem prover is extended to output proofs in such a way that each inference is represented by a quantifier-free SMT instance. If every instance is unsatisfiable, the proof can be considered verified by an external SMT solver. This pragmatic form of proof checking places only a very light burden on the SMT solver, and can easily handle inferences that other systems may find difficult, such as theory inferences or extensive ground reasoning. The method is considerably easier to implement than proof formats based on small kernels and covers a greater variety of modern-day inferences.
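The checking principle fits in a few lines (a sketch of ours using the z3 Python API; Vampire's actual proof output format and integration are not shown): an inference with premises $P_1, \ldots, P_n$ and conclusion $C$ is accepted if $P_1 \land \ldots \land P_n \land \lnot C$ is unsatisfiable.

    from z3 import Solver, Bools, Or, Not, unsat

    def check_inference(premises, conclusion):
        # The inference is sound iff the premises together with the
        # negated conclusion form an unsatisfiable instance.
        s = Solver()
        s.add(*premises)
        s.add(Not(conclusion))
        return s.check() == unsat

    # Example: a binary resolution step  p | q,  ~p | q  ==>  q
    p, q = Bools("p q")
    assert check_inference([Or(p, q), Or(Not(p), q)], q)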
Computing Ground Congruence Classes
Abstract: Congruence closure on ground equations is a well-established and efficient algorithm for deciding ground equalities. It constructs an explicit representation of ground equivalence classes based on a given set of input equations, allowing ground equalities to be decided by membership. In many applications, these ground equations originate from grounding non-ground equations.
We propose an algorithm that directly computes a non-ground representation of ground congruence classes for non-ground equations. Our approach is sound and complete with respect to the corresponding ground congruence classes. Experimental results demonstrate that computing non-ground congruence classes often outperforms the classical ground congruence closure algorithm in efficiency.
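For reference, a naive version of the classical ground congruence closure that the paper takes as its baseline (our simplified sketch: terms are nested tuples, and the congruence rule is propagated to a fixpoint over a fixed term universe):

    from itertools import combinations

    def congruence_closure(equations, terms):
        parent = {t: t for t in terms}

        def find(t):
            while parent[t] != t:
                t = parent[t]
            return t

        def union(s, t):
            rs, rt = find(s), find(t)
            if rs != rt:
                parent[rs] = rt

        for s, t in equations:
            union(s, t)
        changed = True
        while changed:   # congruence: f(s1..sn) ~ f(t1..tn) if all si ~ ti
            changed = False
            for s, t in combinations(terms, 2):
                if (isinstance(s, tuple) and isinstance(t, tuple)
                        and s[0] == t[0] and len(s) == len(t)
                        and find(s) != find(t)
                        and all(find(x) == find(y) for x, y in zip(s[1:], t[1:]))):
                    union(s, t)
                    changed = True
        return find

    # From f(a) = a we can decide f(f(a)) = a by class membership:
    a, fa = "a", ("f", "a")
    ffa = ("f", fa)
    find = congruence_closure([(fa, a)], [a, fa, ffa])
    assert find(ffa) == find(a)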
Interoperability of Proof systems with SC-TPTP (System Description)
Abstract: We introduce SC-TPTP, an extension of the TPTP derivation format that encompasses the sequent formalism, enabling seamless proof exchange between interactive theorem provers and first-order automated theorem provers.
We provide a way to represent non-deductive steps---Skolemization, clausification, and Tseitin normal form---as deductive steps within the format.
Building upon the existing support in the Lisa proof assistant and the Goéland theorem prover, the SC-TPTP ecosystem is further enhanced with proof output interfaces for Egg and Prover9, as well as proof reconstruction support for HOL Light, Lean, and Rocq.
Cut Elimination for Negative Free Logics with Definite Descriptions
Abstract: We present a sequent calculus GNFL for negative free logic with definite descriptions in the classical and intuitionistic version, with empty and nonempty domains. It is shown constructively that GNFL satisfies the cut elimination theorem and its cut-free version satisfies the subformula property.
Learning Conjecturing from Scratch
Abstract: We develop a self-learning approach for conjecturing induction predicates on a dataset of 16197 problems derived from the OEIS. These problems are hard for today's SMT and ATP systems because they require a combination of inductive and arithmetical reasoning.
Starting from scratch, our approach consists of a feedback loop that iterates between
(i) training a neural translator to learn the correspondence between the problems solved so far and the induction predicates useful for them,
(ii) using the trained neural system to generate many new induction predicates for the problems,
(iii) fast runs of the Z3 prover attempting to prove the problems using the generated predicates,
(iv) using heuristics such as predicate size and solution speed on the proved problems to choose the best predicates for the next iteration of training.
The algorithm discovers on its own many interesting induction predicates, ultimately solving over 5500 problems, compared to 2265 problems solved by CVC5, Vampire or Z3.
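To make step (iii) of the loop above concrete, here is a toy version of how a candidate induction predicate can be discharged with Z3's Python API (an illustration of ours; the paper's OEIS problems and predicate language are more involved):

    from z3 import (Int, Function, IntSort, ForAll, Implies, And, Not,
                    Solver, unsat)

    def proves(claim):
        s = Solver()
        s.set("timeout", 1000)   # fast runs, as in the feedback loop
        s.add(Not(claim))
        return s.check() == unsat

    n = Int("n")
    a = Function("a", IntSort(), IntSort())
    # Toy sequence: a(0) = 0 and a(n+1) = a(n) + 2 for n >= 0.
    defs = And(a(0) == 0, ForAll(n, Implies(n >= 0, a(n + 1) == a(n) + 2)))
    phi = lambda k: a(k) == 2 * k   # candidate induction predicate

    base = Implies(defs, phi(0))
    step = ForAll(n, Implies(And(defs, n >= 0, phi(n)), phi(n + 1)))
    assert proves(base) and proves(step)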
Concrete Domains Meet Expressive Cardinality Restrictions in Description Logics
Abstract: Standard Description Logics (DLs) can encode quantitative aspects of an application domain through either *number restrictions*, which constrain the number of individuals that are in a certain relationship with an individual, or *concrete domains*, which can be used to assign concrete values to individuals using so-called features. These two mechanisms have been extended towards very expressive DLs, for which reasoning nevertheless remains decidable. Number restrictions have been generalized to more powerful comparisons of sets of role successors in ALCSCC, while the comparison of feature values of different individuals in ALC(D) has been studied in the context of ω-admissible concrete domains D. In this paper, we combine both formalisms and investigate the complexity of reasoning in the thus obtained DL ALCOSCC(D), which additionally includes the ability to refer to specific individuals by name. We show that, in spite of its high expressivity, the consistency problem for this DL is ExpTime-complete, assuming that the constraint satisfaction problem of D is also decidable in exponential time. It is thus not higher than the complexity of the basic DL ALC. At the same time, we show that many natural extensions to this DL, including a tighter integration of the concrete domain and number restrictions, lead to undecidability.
A Fresh Inductive Approach to Useful Call-by-Value
Abstract: Useful evaluation is an optimised evaluation mechanism for functional programming languages, introduced by Accattoli and Dal Lago.
The key to useful evaluation is to represent programs with sharing and to implement substitution of terms only when this contributes to the progress of the computation. Initially defined in the framework of call-by-name, useful evaluation has since been extended to call-by-value. The definitions of usefulness in the literature are complex and lack inductive structure, which makes it challenging to (formally) reason about them.
In this work, we define useful call-by-value evaluation inductively, proceeding in two stages. First, we refine the well-known Value Substitution Calculus so that the substitution operation becomes linear, yielding the lcbv calculus. The two calculi are observationally equivalent. We then further refine lcbv by allowing linear substitution only when it contributes to the progress of the computation, yielding the ucbv strategy. This new substitution notion is sensitive to the surrounding evaluation context, so it is non-trivial to capture it inductively. Moreover, we formally show that the resulting ucbv is a sound and complete implementation of lcbv, optimised to implement useful evaluation.
As a further contribution, we show that the ucbv strategy can be implemented by an existing lower-level abstract machine called GLAMoUr with polynomial overhead in time. This entails, as a corollary, that ucbv is time-invariant, i.e., that the number of reduction steps to normal form in ucbv can be used as a measure of time complexity.
Our ucbv strategy is part of the preliminary work required to develop semantic interpretations of useful evaluation, for which its inductive formulation is more suitable than the (non-inductive) existing ones.
SMT and Functional Equation Solving over the Reals: Challenges from the IMO
Abstract: We use SMT technology to address a class of problems involving uninterpreted functions and nonlinear real arithmetic. In particular, we focus on problems commonly found in mathematical competitions, such as the International Mathematical Olympiad (IMO), where the task is to determine all solutions to constraints on an uninterpreted function. Although these problems require only high-school-level mathematics, state-of-the-art SMT solvers often struggle with them. We propose several techniques to improve SMT performance in this setting.
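As a tiny illustration of the setting (our example, not one of the paper's techniques): the Cauchy-style constraint below determines $f$ at specific points, but refuting the negated consequence requires instantiating the quantifier at $x = y = 1$, the kind of step off-the-shelf solvers often miss; supplying that ground instance by hand makes the check immediate.

    from z3 import Function, RealSort, Reals, ForAll, Not, Solver, unsat

    f = Function("f", RealSort(), RealSort())
    x, y = Reals("x y")

    s = Solver()
    s.add(ForAll([x, y], f(x + y) == f(x) + f(y)))  # the functional equation
    s.add(f(1) == 2)
    s.add(Not(f(2) == 4))        # negated consequence; expect unsat
    s.add(f(2) == f(1) + f(1))   # manual ground instance (x = y = 1)
    assert s.check() == unsat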
More is Less: Adding Polynomials for Faster Explanations in NLSAT
Abstract: To check the satisfiability of (non-linear) real arithmetic formulas, modern satisfiability modulo theories (SMT) solving algorithms like NLSAT depend heavily on single cell construction, the task of generalizing a sample point to a connected subset (cell) of $\mathbb{R}^n$ that contains the sample and over which a given set of polynomials is sign-invariant.
In this paper, we propose to speed up the computation and simplify the representation of the resulting cell by extending the considered set of polynomials with further linear polynomials.
While this increases the total number of (smaller) cells generated throughout the algorithm, our experiments show that it can pay off when using suitable heuristics due to the interaction with Boolean reasoning.
Anti-pattern templates
Abstract: We introduce anti-pattern templates, which extend both anti-pattern matching (by abstracting ground terms to variables) and unification (by negative constraints). For anti-pattern templates, we study the satisfiability problem, which asks whether a given template can be instantiated to a valid anti-pattern matching instance.
We show that the satisfiability problem for anti-pattern templates is NP-complete, but it becomes tractable if the number of negation symbols is bounded. Next, we consider anti-pattern templates modulo an equational theory and discuss its relations with other variants of equational frameworks with negative constraints. In particular, we conclude that the satisfiability problem for anti-pattern templates considered modulo associativity becomes undecidable, while it is decidable modulo associativity and commutativity.
Verified Path Indexing
Abstract: Indexing of syntactic terms is a key component for the efficient implementation of automated theorem provers. This paper presents the first verified implementation of a term indexing data structure, namely a formalization of path indexing in the proof assistant Isabelle/HOL. We define the data structure, maintenance operations, and retrieval operations, including retrieval of unifiable terms, instances, generalizations and variants. We prove that maintenance operations preserve the invariants of the structure, and that retrieval operations are sound and complete.
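A minimal, unverified Python sketch of the underlying idea (ours; the paper's contribution is the Isabelle/HOL formalization with verified invariants): a path index maps each (path, symbol) pair to the set of stored terms carrying that symbol at that path, and retrieval intersects these sets into a candidate set that a final exact check then filters. Only instance retrieval is sketched here.

    from collections import defaultdict

    # Terms: variables are strings, applications are tuples ('f', t1, ..., tn).
    def paths(term, prefix=()):
        if isinstance(term, str):       # variable
            yield prefix, "*"
        else:
            f, *args = term
            yield prefix, f
            for i, t in enumerate(args):
                yield from paths(t, prefix + ((f, i),))

    class PathIndex:
        def __init__(self):
            self.entries = defaultdict(set)   # (path, symbol) -> term ids
            self.terms = {}

        def insert(self, tid, term):
            self.terms[tid] = term
            for p, sym in paths(term):
                self.entries[(p, sym)].add(tid)

        def instance_candidates(self, query):
            # Candidate instances of `query`: every non-variable position
            # of the query must carry the same symbol in the stored term.
            result = set(self.terms)
            for p, sym in paths(query):
                if sym != "*":
                    result &= self.entries[(p, sym)]
            return result

    idx = PathIndex()
    idx.insert(1, ("f", "x", ("g", "y")))        # f(x, g(y))
    idx.insert(2, ("f", ("a",), ("g", ("b",))))  # f(a, g(b))
    assert idx.instance_candidates(("f", "x", ("g", "y"))) == {1, 2}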
Term Ordering Diagrams
Abstract: The superposition calculus for reasoning in first-order logic with equality relies on simplification orderings on terms. Modern saturation provers use the Knuth-Bendix order (KBO) and the lexicographic path order (LPO) for discovering redundant clauses and inferences. Implementing term orderings is however challenging. While KBO comparisons can be performed in linear time and LPO checks in quadratic time, using the best known algorithms for these orders is not enough. Indeed, our experiments show that for some examples term ordering checks may use about 98% of the overall proving time. The reason for this is that some equalities that cannot be ordered can become ordered after applying a substitution (post-ordered), and we have to check for post-ordering repeatedly for the same equalities.
In this paper, we show how to improve post-ordering checks by introducing a new data structure called term ordering diagrams (TODs), which creates an index for these checks. We achieve efficiency by lazy modifications of the index and by storing and reusing information from previously performed checks to speed up subsequent checks. Our experiments demonstrate the efficiency of TODs.
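For background, here is a textbook ground Knuth-Bendix order with unit weights (our sketch as a reference point; the paper's TODs, which cache and reuse such comparisons under substitutions, are not shown):

    # Ground terms are tuples: constants ('a',), applications ('f', t1, ...).
    def weight(t):
        f, *args = t
        return 1 + sum(weight(a) for a in args)

    def kbo_greater(s, t, prec):
        # Ground KBO: compare weights, then head symbols by precedence,
        # then arguments lexicographically.
        ws, wt = weight(s), weight(t)
        if ws != wt:
            return ws > wt
        if s[0] != t[0]:
            return prec[s[0]] > prec[t[0]]
        for a, b in zip(s[1:], t[1:]):
            if a != b:
                return kbo_greater(a, b, prec)
        return False

    # With precedence f > b > a:  f(a) > b (by weight), and f(b) > f(a).
    prec = {"a": 0, "b": 1, "f": 2}
    fa, fb = ("f", ("a",)), ("f", ("b",))
    assert kbo_greater(fa, ("b",), prec) and kbo_greater(fb, fa, prec)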
Partial Redundancy in Saturation
Abstract: Redundancy elimination is one of the crucial ingredients of efficient saturation-based proof search. We improve redundancy elimination by introducing a new notion of redundancy, based on partial clauses and redundancy formulas, which is more powerful than the standard notion: there are both clauses and inferences that are redundant when we use our notions and not redundant when we use standard notions. In a way, our notion blurs the distinction between redundancy at the level of inferences and redundancy at the level of clauses. We present a superposition calculus PaRC on partial clauses. Our calculus is refutationally complete and is strong enough to capture some standard restrictions of the superposition calculus. We discuss the implementation of the calculus in the theorem prover Vampire. Our experiments show the power of the new approach: we were able to solve 24 TPTP problems not previously solved by any prover, including previous versions of Vampire.
Faithful Logic Embeddings in HOL – A recipe to have it all: deep and shallow, automated and interactive, heavy and light, proofs and counterexamples, meta and object level
Abstract: Deep and shallow embeddings of non-classical logics in classical higher-order logic have been explored, implemented, and used in various automated reasoning tools in recent years. This paper presents a recipe for the simultaneous deployment of different forms of deep and shallow embeddings in classical higher-order logic, enabling not only flexible interactive and automated theorem proving and counterexample finding at meta and object level, but also automated faithfulness proofs between the logic embeddings. The approach, which is fruitful for logic education, research and application, is deliberately illustrated here using simple propositional modal logic. However, the work presented is conceptual in nature and not limited to such a simple logic context.
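As a reminder of what the shallow side of such an embedding looks like for the illustrating logic (the standard relational translation, not specific to this paper): modal formulas become HOL predicates over worlds, with $\lfloor p \rfloor = \lambda w.\, P\,w$, $\lfloor \varphi \to \psi \rfloor = \lambda w.\, \lfloor\varphi\rfloor w \to \lfloor\psi\rfloor w$, $\lfloor \Box\varphi \rfloor = \lambda w.\, \forall v.\, R\,w\,v \to \lfloor\varphi\rfloor v$, and validity $\mathrm{valid}\,\varphi = \forall w.\, \lfloor\varphi\rfloor w$; a deep embedding instead represents formulas as an inductive datatype with an explicit evaluation function, and the recipe relates the two by automated faithfulness proofs.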
Entailment vs. Verification for Partial-assignment Satisfiability and Enumeration
Abstract: Many procedures for SAT-related problems, in particular those requiring the complete enumeration of satisfying truth assignments, rely for their efficiency and effectiveness on the detection of (possibly small) partial assignments satisfying an input formula. Surprisingly, there seems to be no unique, universally agreed definition of formula satisfaction by a partial assignment in the literature.
In this paper we analyze in depth the issue of satisfaction by partial assignments, highlighting some ambiguities and subtleties of this concept, and investigating their practical consequences. We identify two alternative notions that are implicitly used in the literature, namely verification and entailment, which coincide if applied to CNF formulas but differ and present complementary properties if applied to non-CNF or existentially-quantified formulas. We show that, although the former is easier to check and as such is implicitly used by most current search procedures, the latter has better theoretical properties, and can improve the efficiency and effectiveness of enumeration procedures.
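For instance (a standard example, ours): for the non-CNF formula $\varphi = (x \land y) \lor (x \land \lnot y)$ and the partial assignment $\{x \mapsto \top\}$, every total extension satisfies $\varphi$, so the assignment entails $\varphi$; yet substituting and evaluating does not reduce $\varphi$ to true, since neither disjunct is determined, so verification fails.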
A Stepwise Refinement Proof that SCL(FOL) Simulates Ground Ordered Resolution
Abstract: Recently, it has been demonstrated that SCL(FOL) can simulate ground ordered resolution. We revisit this result and provide a new formal proof in Isabelle/HOL. The existing pen-and-paper proof is monolithic and challenging to comprehend. In order to improve clarity, we develop an alternative proof structured as eleven (bi)simulation steps between the two calculi, transitioning from ordered resolution to SCL(FOL). A key simulation lemma ensures that, under certain conditions, one simulation direction can be automatically lifted to the other. Consequently, for each of the eleven steps, it suffices to establish only one direction of simulation. The complete proof is included in the Isabelle Archive of Formal Proofs.
What's Decidable About Arrays With Sums?
Abstract: The theory of arrays, supported by virtually all SMT solvers, is one of the most important theories in program verification. The standard theory of arrays, which provides \emph{read} and \emph{write} operations, has been extended in various different ways in past research, among others by adding extensionality, constant arrays, function mapping (resulting in combinatorial array logic), counting, and projections. This paper studies array theories extended with \emph{sum constraints,} which capture properties of the sum of all elements of an integer array. The paper shows that the theory of extensional arrays extended with constant arrays and sum constraints can be decided in non-deterministic polynomial time. The decision procedure works both for finite and infinite index sorts, as long as the cardinality is fixed a priori. In contrast, adding sum constraints to combinatorial array logic gives rise to an undecidable theory. The paper concludes by studying several fragments in between standard arrays with sums and combinatorial arrays with sums, aiming at providing a complete characterization of decidable and undecidable fragments.
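One example of the interplay such a decision procedure must handle (our illustration, not a rule from the paper): over arrays with a fixed finite index sort, the sum operator satisfies $\mathit{sum}(\mathit{store}(a, i, v)) = \mathit{sum}(a) - \mathit{select}(a, i) + v$, so writes can be propagated through sum constraints and the residue discharged by linear integer arithmetic.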
On Symbol Elimination and Uniform Interpolation in Theory Extensions
Abstract: We study the links between symbol elimination in local theory extensions and possibilities of computing uniform interpolants for ground formulae w.r.t. such theories. In contrast to approaches which only consider the problem of eliminating existentially quantified variables, we distinguish between interpreted and uninterpreted function symbols and analyze possibilities of obtaining uniform interpolants which can contain arbitrary interpreted function symbols, and uninterpreted function symbols in a given, specified set. We identify situations in which a hierarchical form of symbol elimination can be successfully used for this, and also investigate the limitations of the method we propose.
Boosting MCSat Modulo Nonlinear Integer Arithmetic via Local Search
Abstract: The Model Constructing Satisfiability (MCSat) approach to the SMT problem extends the ideas of CDCL from the SAT level to the theory level. Like SAT, its search is driven by incrementally constructing a model by assigning concrete values to theory variables and performing theory-level reasoning to learn lemmas when conflicts arise. Therefore, the selection of values can significantly impact the search process and the solver's performance. In this work, we propose guiding the MCSat search by utilizing assignment values discovered through local search. First, we present a theory-agnostic framework to seamlessly integrate local search techniques within the MCSat framework. Then, we highlight how to use the framework to design a search procedure for Nonlinear Integer Arithmetic (NIA), utilizing accelerated hill-climbing and a new operation called feasible-sets jumping. We implement the proposed approach in the MCSat engine of the Yices2 solver, and empirically show its performance using the NIA benchmarks of SMT-LIB.
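To convey the flavour of the value-selection idea (a generic toy hill-climber of ours; the paper's accelerated hill-climbing and feasible-sets jumping inside Yices2 are not reproduced):

    import random

    def local_search(variables, constraints, steps=10000, seed=0):
        # Hill-climb an integer assignment, minimizing the number of
        # violated constraints; accept improving and sideways moves.
        rng = random.Random(seed)
        assign = {v: 0 for v in variables}
        cost = lambda a: sum(not c(a) for c in constraints)
        best = cost(assign)
        for _ in range(steps):
            if best == 0:
                break
            v = rng.choice(variables)
            old = assign[v]
            assign[v] = old + rng.choice([-2, -1, 1, 2])
            new = cost(assign)
            if new <= best:
                best = new
            else:
                assign[v] = old     # revert worsening move
        return assign, best

    # Toy NIA instance: x*y = 12 and x + y = 7 (e.g., x = 3, y = 4); harder
    # instances may need more steps or restarts.
    cs = [lambda a: a["x"] * a["y"] == 12, lambda a: a["x"] + a["y"] == 7]
    print(local_search(["x", "y"], cs))   # the values then guide MCSat decisions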
Equational Reasoning Modulo Commutativity in Languages with Binders
Abstract: In computer science and mathematical logic, formal languages abstract away concrete syntax details, focusing on abstract syntax to capture structural properties. However, conventional abstract syntax falls short for languages with variable-binding constructs. In this paper, we work with the nominal language, a general formal language for representing binders, freshness conditions and $\alpha$-renaming. Rather than relying on the usual freshness constraints, we introduce a nominal algebra that employs a generalisation of permutation fixed-point constraints in $\alpha$-equivalence judgements, seamlessly integrating commutativity into the reasoning process.
We investigate the proof-theoretical properties of the framework and provide a sound and complete semantics in the setting of nominal sets. Additionally, we propose a novel algorithm for nominal unification modulo commutativity, which we prove terminating and correct. Moreover, thanks to the use of fixed point constraints, the resulting unification theory is finitary, unlike the standard nominal unification modulo commutativity using freshness constraints. This extension provides a robust foundation for structural induction and recursion over syntax with binding constructs and commutative operators, such as first-order logic and the $\pi$-calculus, enabling reasoning modulo both $\alpha$-equivalence and commutativity.
From Modal Sequent Calculi to Modal Resolution
Abstract: We establish a systematic correspondence between sequent calculi and resolution calculi for a broad class of modal logics. Our main result is that soundness and completeness transfer from sequent to resolution calculi as long as cut and weakening are admissible. We first construct generative calculi that essentially import modal rules to a resolution setting, and then show how soundness and completeness transfer to absorptive calculi, where modal rules are translated into generalised resolution rules. We discuss resolution calculi that establish validity, and then introduce local clauses to generate calculi for inconsistency. Finally, for modal rules of a certain shape, we show how to construct layered resolution calculi, a technique that has so far only been established for normal modal logics. Our work directly yields new sound and complete resolution calculi, and layered resolution calculi, for a large number of modal logics.
A Real-Analytic Approach to Differential-Algebraic Dynamic Logic
Abstract: This paper introduces a proof calculus for real-analytic differential-algebraic dynamic logic, a variant of differential-algebraic dynamic logic. Our calculus allows correct reduction of differential-algebraic equations to ordinary differential equations through axiomatized index reduction. Additionally, it ensures compatibility between the differential-algebraic equation proof calculus and (differential-form) differential dynamic logic for hybrid systems. One key contribution is ghost switching, which establishes precise conditions to decompose multi-modal systems into hybrid programs, thereby correctly hybridizing sophisticated differential-algebraic dynamics. We demonstrate our calculus on a Euclidean pendulum and formally verify its energy conservation.
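To illustrate the index-reduction idea on the pendulum setting (our simplified rendering, not the paper's exact axioms): the Euclidean pendulum constrains the position $(x, y)$ by the algebraic equation $x^2 + y^2 = l^2$; differentiating it along the dynamics $x' = u$, $y' = v$ yields the hidden constraint $x u + y v = 0$, turning part of the differential-algebraic system into ordinary differential equations plus an invariant that the calculus can track.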
Unfolding Boxes with Local Constraints
Abstract: We consider the problem of finding and enumerating polyominos that can be folded into multiple non-isomorphic boxes. While several computational approaches have been explored, including SAT, randomized algorithms, and decision diagrams, none has been able to perform at scale.
We argue that existing SAT encodings are hindered by the presence of global constraints (e.g., graph connectivity or acyclicity), which are generally hard to encode effectively, and hard for solvers to reason about. In this work, we propose a new SAT-based approach that replaces these global constraints with simple local constraints that have substantially better propagation properties. Our approach dramatically improves the scalability of both computing and enumerating common box unfoldings: (i) while previous approaches could only find common unfoldings of two boxes up to area 88, ours easily scales up to 150, and (ii) while previous approaches were only able to enumerate common unfoldings up to area 30, ours scales up to 60. This allows us to rule out 46, 54, and 58 as the smallest areas allowing a common unfolding of three boxes, thereby refuting a conjecture of Xu et al. (2017).
Cazamariposas: Automated Instability Debugging in SMT-based Program Verification
Abstract: Program verification languages such as Dafny and F⋆ often rely heavily on Satisfiability Modulo Theories (SMT) solvers for proof automation. However, SMT-based verification suffers from instability, where semantically irrelevant changes in the source program can cause spurious proof failures. While existing mitigation techniques emphasize preemptive measures, we propose a complementary approach that focuses on diagnosing and repairing specific instances of instability-induced failures. Our key technique is a novel differential analysis to pinpoint problematic quantifiers in an unstable query. We instantiate this technique in Cazamariposas, a tool that automatically identifies such quantifiers and suggests fixes. We evaluate Cazamariposas on multiple large-scale systems verification projects written in three different program verification languages. Our results demonstrate Cazamariposas' effectiveness as an instability debugger. In the majority of cases, Cazamariposas successfully isolates the issue to a single problematic quantifier and provides a stabilizing fix.