kp9 - Soundness and Completeness

The Crisis of Ambiguity

Natural language is beautiful for poetry, but lethal for processors. When we describe a language loosely, we leave room for "undefined behavior", which is the dark matter of the programming universe where anything can happen.

We humans often work with both natural and programming languages, sometimes this causes ambiguity to slip-in. For example, below java code compiles, but fails at runtime.


/**
* This compiles but fails at runtime.
*/
public class Main {
    public static void main(String[] args) {
        // Type erasure in action!
        Particle electron = new QuantumParticle();

        // Exception in thread "main" java.lang.ClassCastException: 
        // class java.lang.String cannot be cast to class java.lang.Integer
        electron.setSpin("up");
    }
}

class Particle {
    public T spin;

    public void setSpin(T spin) {
        this.spin = spin;
    }
}

class QuantumParticle extends Particle {
    @Override
    public void setSpin(Integer spin) {
        super.setSpin(spin);
    }
}

Type erasure in Java generics are well documented but their pitfalls are not. The compiler will allow the code to compile due to generics erasure, but the cast at runtime leads to a ClassCastException.

Counterexamples in Type Systems is a great resource for finding such examples.

Mapping the Mind of the Machine

General purpose programming languages are built on the foundations of computational models. These models are powerful enough to compute anything "worthy of computing".

Church-Turing thesis is a statement in computability theory which asserts that any function that can be computed by an effective method can be computed by a Turing machine.

Both lambda calculus and Turing machines proves that we can compute anything "worthy of computing". However, a model that can do "anything" can also do "everything wrong".

The usage of "Everything wrong" is also unsound and only exists as a moral subjective statement in the eyes of the programmer.

Therefore, it is important to lay down the rules for the machine to follow. The goal of these rules is to enforce a certain level of soundness and completeness.

Type system and type checking rely on logical rules such as,

if these premises hold, then this conclusion must follow

Or,

the mathematical certainty of a compiler to look at your code and say, "I have checked every possible branch, and I can prove that in this environment, a 'String' will never be treated as an 'Integer'"

Formally, we can express these rules in various frameworks like the Hindley-Milner type system, Dependent types, the Calculus of Constructions, and the 'Propositions as Types' principle.

The Deeper Connection and the Limits of Our Constructs

Even before the advent of type systems, humans (really intelligent ones) have been trying to reason about the truth of their statements, particularly in the foundations of mathematics.

David Hilbert and it's program called upon to translate all the mathematics with it's axioms into a formal and consistent system. All the true statements must be provable from the axioms (completeness) and no contradictions should be possible (consistency).

But Hilbert's program suffered coup de grâce when Kurt Gödel proved the incompleteness theorem.

Standford Encyclopedia of Philosophy on Hilbert's Program.

Standford Encyclopedia of Philosophy on Gödel's Incompleteness Theorem.

Perhaps its not surprising, "Gödel himself remarked that it was largely Turing's work"

Our programming tools are bound by the same mathematical limits. They suffer from undecidability and incompleteness.

Correspondingly, Rice's theorem made it clear that it's impossible for any program to check and output yes or no non-trivial semantic properties of another program. Such a decision is "unknowable".

H. G. Rice's Classes of recursively enumerable sets and their decision problems

These limits are not meant to discourage us. Instead, they help us build better programming tools, where compilers assist humans in proving how accurate and complete our programs are and the weaknesses they suffer from.

Proof assistants rely on this foundational connection between logic and computer science to ensure high-reliability proof construction and although LLMs' proving Erdős' problems has been a bit controversial, it's still an interesting development in the field of automated reasoning.

Terence Tao's post on Mathstodon and AI contributions to Erdős problems

Can we advance mathematics without compilers that prove theorems? Regardless, this is reason enough for my fondness for compilers.

The Semantic Police: Stop or my compiler will prove a theorem!

Table of contents

The Crisis of Ambiguity

Mapping the Mind of the Machine

The Deeper Connection and the Limits of Our Constructs