# Library motive

## Elimination with a Motive (Conor McBride)

McBride's slogan is that an elimination should be directed by the goal it serves: the goal is the *motive* for the elimination. In other words, proofs in a refinement setting (backwards reasoning) should use their goals to guide elimination.

# Motivation

## Conjunction and Disjunction

    Lemma and_project_l : forall {A B : Prop}, A /\ B -> A.
    tauto. Qed.

    Lemma and_project_r : forall {A B : Prop}, A /\ B -> B.
    tauto. Qed.

    Lemma or_elim : forall {A B C : Prop}, A \/ B -> (A -> C) -> (B -> C) -> C.
    destruct 1; tauto. Qed.

Students usually grasp the and_project rules easily, but find or_elim
frightening. Edward: We can make this clear by attempting to do proofs in the forward
style (here False is a stand-in for some goal which the student doesn't know
they are proving yet):

    Goal forall (A B : Prop), A /\ B -> False.
    intros. pose proof (and_project_l H).
    Abort.

    Goal forall (A B : Prop), A \/ B -> False.
    intros. try pose proof (or_elim H).
    (* Error: Cannot infer the implicit parameter C of or_elim. *)
    Abort.

Coq has the same difficulty as our hypothetical student: the trouble is that
C appears from nowhere. Students struggle to dream up C's which will eventually
lead to their goal.
When they learn proof by refinement (backwards reasoning), or_elim finally
makes sense: given A \/ B, we can instantiate C with our goal, splitting it
into cases.
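To make this concrete (my example, not in the original): in the refinement style, apply lets unification read C off the goal, and the elimination that failed forwards goes through.

```coq
Goal forall (A B : Prop), A \/ B -> B \/ A.
intros A B H.
apply (or_elim H).  (* C is instantiated with the goal, B \/ A *)
- intro a. right. assumption.
- intro b. left. assumption.
Qed.
```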

The need to prove the goal is why we are eliminating A \/ B: the goal
is the motive for the elimination. We can choose the appropriate motive
precisely because 'C appears from nowhere': or_elim is parametric in its motive.
Elimination rules whose conclusion is a parameter--the motive variable--allow us
to exploit a hypothesis whatever the goal, just as 'left rules' in sequent calculus
analyse hypotheses regardless of what stands right of the turnstile. (Edward: To refresh your
sequent calculus, check out http://logitext.mit.edu/logitext.fcgi/tutorial, scrolling down to
"Summary. Here are all the inference rules for first-order logic.")
In this light, the 'simplicity' of the and-project rules is less attractive:
we may only exploit A /\ B when we want to know A or we want to know B. I join the many
advocates of the 'Gentzenized' alternative (in reference to so-called 'Gentzen systems',
also known as sequent calculi), exploiting A /\ B, whatever our motive C.
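In the style of the lemmas above, the Gentzenized conjunction rule might be stated as follows (a sketch; the name and_elim is mine, not the original's):

```coq
(* Exploits A /\ B whatever the motive C, like a sequent-calculus left rule. *)
Lemma and_elim : forall {A B C : Prop}, A /\ B -> (A -> B -> C) -> C.
tauto. Qed.
```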

Edward: As an aside, on the other side of the Curry-Howard correspondence, the
projection rules are fst/snd, whereas the Gentzenized rule is uncurry.
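A minimal sketch of that aside in Gallina (uncurry here is my own definition, written out for illustration):

```coq
(* Projections: usable only when we want exactly A or exactly B. *)
Check @fst. (* forall A B : Type, A * B -> A *)
Check @snd. (* forall A B : Type, A * B -> B *)

(* The Gentzenized rule: exploit A * B whatever the motive C. *)
Definition uncurry {A B C : Type} (f : A -> B -> C) (p : A * B) : C :=
  f (fst p) (snd p).
```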
'Mathematical Induction' is another common example of elimination with a motive.

## Structural Induction and Recursion

    Lemma nat_induction : forall (P : nat -> Prop),
      P 0 ->
      (forall n, P n -> P (S n)) ->
      forall n, P n.
    exact nat_ind. Qed.

Here, P stands for a family of propositions *indexed* by a number. Not even the
most ardent 'forwardist' is bold enough to suggest that we should search our
collection of established facts for a pair related by P in order to add
forall n : nat, P n to that collection. nat_induction needs a motive not only because,
like or_elim, it splits the proof into cases, but also because the abstract n being
eliminated is instantiated in each case. The point of induction is not just to
decompose a hypothesis but to simplify the goal: where constructor symbols appear,
computation can happen.

If we allow P to stand for a nat-indexed family of *types* and supply the
appropriate computation behavior, induction becomes dependently typed primitive
recursion, supporting functions on n whose return type depends on n. The explicit
indexing of P by numbers makes a strong connection to pattern matching and
structural recursion. We can expose these patterns even in simply typed examples
by making a definition:

    Definition Plus := fun (x y : nat) => nat.

    Goal forall x y : nat, Plus x y.
    induction x.
    (* forall y : nat, Plus 0 y *)
    Focus 2.
    (*
    x : nat
    IHx : forall y : nat, Plus x y
    ============================
    forall y : nat, Plus (S x) y
    *)
    Abort.

The return type of the goal reads like the left-hand side of a functional program
'under construction'. Induction splits our programming problem in two: we can
read off the instantiated patterns and, in the S case, the legitimate class of recursive
calls.

An elimination rule with an indexed motive variable P justifies a kind of pattern
analysis, 'matching' P's arguments in the conclusion (the goal patterns) against
P's arguments in the premises (the subgoal patterns); P's arguments in inductive
hypotheses (the recursion patterns) allow the corresponding recursive calls. To equip
an elimination rule with a computational behavior is to give its associated pattern
matching and structural recursion an operational semantics.

    (* Illustrated, we have:

    forall (P : nat -> Prop),
    P 0 ->                        (* subgoal patterns *)
    (forall n, P n -> P (S n)) -> (* --"-- *)
                                  (* \__ recursion patterns *)
    forall n, P n.                (* goal patterns *)

    Or, more concretely:

    (forall y, Plus 0 y) -> (* subgoal patterns *)
    (forall x, (forall y, Plus x y) -> forall y, Plus (S x) y) -> (* --"-- *)
                                                                  (* \__ recursion patterns *)
    forall x y, Plus x y (* goal patterns *) *)
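Reading the subgoal patterns as a program, we can finish the 'program under construction' (my completion, not in the original): each case supplies a term of the stated return type, and the induction hypothesis plays the role of the recursive call.

```coq
Goal forall x y : nat, Plus x y.
induction x.
- exact (fun y => y).          (* Plus 0 y: return y itself *)
- exact (fun y => S (IHx y)).  (* Plus (S x) y: recursive call, then successor *)
Qed.
```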

## Relation Induction

    Inductive le : nat -> nat -> Prop :=
    | le_refl : forall n, le n n
    | le_succ : forall m n, le m n -> le m (S n).

    Hint Constructors le.

    Definition le_induction :
      forall (P : nat -> nat -> Prop),
        (forall n, P n n) ->
        (forall m n, le m n -> P m n -> P m (S n)) ->
        forall {m n}, le m n -> P m n.
    exact le_ind.
    Qed.

Relation induction is easy to apply if the eliminated hypothesis, le m n, ranges
over the entire domain of the relation: we can choose P by naive 'undergraduate' textual
matching.

    Goal forall (P : nat -> nat -> Prop) {m n}, le m n -> P m n.
    intro P. apply (le_induction (fun m n => P m n)).
    Abort.

However, if the hypothesis is *instantiated*, we still need a P indexed over the
whole domain, so we employ a well-known goal transformation which I learned from
James McKinna: use a general le m n, but constrain m and n with equations.

    Goal forall {m}, le m 0 -> m = 0.
    intro. apply (le_induction _). (* not obvious *)
    trivial.
    (* In fact, Coq gets it wrong, picking (fun m n => m = n).
    This is not provable:

    m : nat
    ============================
    forall m0 n : nat, le m0 n -> m0 = n -> m0 = S n *)
    Abort.

    Goal forall {m}, le m 0 -> m = 0.
    intro. induction 1. trivial.
    Abort.

We should generalize and add an equation.

Goal forall {m n}, le m n -> n = 0 -> m = 0.

apply (le_induction (fun m n => n = 0 -> m = 0)); try inversion 3; trivial.

(* Note: there is a typo in the paper: m <= n should not be included in P *)

Qed.

More generally, an elimination rule for an inductive relation, R : forall (x : X) ..., Prop
typically requires some P : forall (x : X) ..., Prop as motive.

    (*
    Goal forall (y : Y) ..., R (t y) ... -> P y ... .

    assert (forall (x : X) (y : Y) ..., R x ... -> x = t y -> ... -> P x ...).
    apply (induction (fun x ... => forall (y : Y) ..., x = t y -> ... -> P y ...)).
    *)

Plugging this P and the proof of R (t y) ... into the rule delivers a proof
of P y ... subject to the trivial equations t y = t y .... This technique gives
a slightly clumsier P than the one we chose for m <= 0, which only constrains one
argument and so needs only one equation. It is not hard to see how to remove the
unnecessary equation.

Note that our chosen P x ... resembles the goal, but with some equations inserted
and R (t y) ... missing. P is not indexed over the proof of R x ..., so elimination
tells us nothing about it: we can safely omit it from the motive.

The datatype analogue of inductively defined relations is dependent families
of datatypes, such as vectors--lists of a given length--defined as follows:

## Induction for Dependent Datatypes

    Inductive Vect (A : Type) : nat -> Type :=
    | vnil : Vect A 0
    | vcons : forall {n}, A -> Vect A n -> Vect A (S n).

    Lemma Vect_elim : forall {A : Type} (P : forall {n : nat}, Vect A n -> Type),
      P (vnil A) ->
      (forall {n} (x : A) {xs}, @P n xs -> @P (S n) (vcons A x xs)) ->
      forall {n} (xs : Vect A n), @P n xs.
    exact Vect_rect.
    Qed.

    (* NB: Conor wants A to be inferred for constructors but explicit for the type;
       Coq doesn't support this. *)

Proof terms for relations are interesting only for what they say about the
indices--their structure is unimportant. The terms in dependent datatypes are
the actual data. Correspondingly, the motive P of Vect_elim is indexed not only over
the length n, but also over the vector itself: we care whether a vector is vnil or vcons. On
the other hand, P is not indexed over the element type A, which is parametric to
the entire inductive definition.
'Constraint by equation' also works for instantiated datatypes. For example:

    Definition VTail := fun {A : Type} {m : nat} (xxs : Vect A (S m)) => Vect A m.

    Require Import JMeq.
    Infix "==" := JMeq (at level 70, no associativity).

    Goal forall {A : Type} {m : nat} (xxs : Vect A (S m)), VTail xxs.
    intro A.
    assert (forall {n : nat} (xs : Vect A n) {m : nat} (xxs : Vect A (S m)),
            n = S m -> xs == xxs (* !! *) -> VTail xxs) as X.
    apply (Vect_elim (fun {n : nat} (xs : Vect A n) =>
            forall {m : nat} (xxs : Vect A (S m)),
            n = S m -> xs == xxs (* !! *) -> VTail xxs)).

Solving the equations refutes the base case premises...

    (*
    A : Type
    ============================
    forall (m : nat) (xxs : Vect A (S m)),
    0 = S m -> vnil A == xxs -> VTail xxs
    *)

    inversion 1.

...and reduces the step case to:

    (*
    A : Type
    ============================
    forall (n : nat) (x : A) (xs : Vect A n),
    (forall (m : nat) (xxs : Vect A (S m)), n = S m -> xs == xxs -> VTail xxs) ->
    forall (m : nat) (xxs : Vect A (S m)),
    S n = S m -> vcons A x xs == xxs -> VTail xxs
    *)

    inversion 2; subst. intro eq; rewrite <- eq.

The return type again shows the one pattern possible, and xs
is the tail we seek.

    (*
    ============================
    VTail (vcons A x xs)
    *)

    exact xs.

    (* ungeneralize the hypothesis... *)
    intros; apply (X (S m) xxs); auto.
    Qed.

Unlike with <=, our chosen P did quantify over the eliminated xxs--it
is matched to the xs in the goal patterns. Omitted this time is A, parametric to the
definition, so kept parametric in the elimination. We must be sensitive to these
distinctions in order to deliver appropriate behaviour, whatever the elimination rule.

By now, the eagle-eyed will have noticed that I write batched equations like
x = t y ... without worrying about type safety. Indeed, in the above example, I
wrote xs == xxs where xs : Vect A n and xxs : Vect A (S m). The conventional
Martin-Löf definition of = forbids such heterogeneous equations, relating elements of
different types. You can thus deduce that I am using an unconventional definition.
I define == as follows:

# Equational Constraints and Dependent Types

    Print JMeq.
    (*
    Inductive JMeq (A : Type) (x : A) : forall B : Type, B -> Prop :=
        JMeq_refl : x == x
    *)

    Check JMeq_ind.
    (*
    JMeq_ind
         : forall (A : Type) (x : A) (P : A -> Prop),
           P x -> forall y : A, x == y -> P y
    *)
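As a quick sanity check (my addition, not in the original): JMeq_ind's premises show that elimination only applies when both sides share a type, and in that homogeneous case the stdlib's JMeq_eq axiom recovers an ordinary equation.

```coq
Require Import JMeq.

Goal forall (n m : nat), JMeq n m -> n = m.
intros n m H. apply JMeq_eq. assumption.
Qed.
```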

This == can compare anything to a, even if it is not in A. Correspondingly, we
may form heterogeneous sequences s == t .... However, the introduction and
elimination rules follow the conventional *homogeneous* definition: we shall only
treat something as an equal of a if its type really is A. I call this 'John Major'
equality, because it widens aspirations to equality without affecting the
practical outcome.

If s ... and t ... are vectors in the same telescope (Edward: a telescope is a string of
abstractions, i.e. a generalized binding), then the leftmost equation s1 == t1 is
homogeneous and thus vulnerable to elimination.

'John Major' equality is equivalent to extending Martin-Löf equality with Altenkirch
and Streicher's 'uniqueness of identity proofs' axiom, often referred to as 'axiom K'.
It is clear that the new equality subsumes the old. On the other hand, we can write
a heterogeneous equation a = b as a homogeneous equation between pairs (A, a) = (B, b)
in the type exists (T : Type), T. Clearly (A, a) equals itself. The elimination rule
follows if a = a' is a consequence of (A, a) = (A, a'), and this is a well-known
variant of axiom K. The details of this construction can be found in Conor McBride's thesis.

Edward: In the rest of the paper, Conor McBride describes the tactic 'BasicElim', which
automatically performs the procedure we've seen alluded to above.
Fortunately for the average Coq user, a variant of this tactic has been implemented,
called 'dependent induction'. We can see how it solves both of our previous problems.

# The rest

    Require Import Equality.

    Goal forall {m}, le m 0 -> m = 0.
    intros m H. dependent induction H; trivial.
    Qed.

    Lemma vtail : forall {A : Type} {m : nat} (xxs : Vect A (S m)), VTail xxs.
    intros A m xxs. dependent induction xxs; trivial.
    Qed.

    Print Assumptions vtail.

As you can see, dependent induction uses JMeq internally.
There is more discussion of JMeq in Adam Chlipala's book CPDT:
http://adam.chlipala.net/cpdt/html/Equality.html

JMeq (and axiom K) are very convenient assumptions to make, although in the light of
new developments such as Homotopy Type Theory, there may be interesting mathematics
to be done in type theories whose models do not validate axiom K. So while
some of the more recent, practically oriented dependently typed languages (e.g. Idris)
offer heterogeneous equality as the default, there are also many reasons why Coq developments
will try to avoid using it. Another downside of 'dependent induction' is that it is not
very fast, which can cause problems for large developments.