# Re: INTERNATIONAL STANDARD FOR LOGIC: CSMF

Chris Menzel <cmenzel@kbssun1.tamu.edu>
From: Chris Menzel <cmenzel@kbssun1.tamu.edu>
To: dam@ai.mit.edu
Cc: tony@ontek.com, fritz@rodin.wustl.edu, jksharp@sandia.gov,
sowa@turing.pacss.binghamton.edu, msingh@bcr.cc.bellcore.com,
bhacker@nara.gov, scott@ontek.com, tony@ontek.com, skperez@mcimail.com,
genesereth@cs.stanford.edu, duschka@cs.stanford.edu,
E.Hunt@cgsmtp.comdt.uscg.mil, cg@cs.umn.edu, interlingua@ISI.EDU,
srkb@cs.umbc.edu
Subject: Re: INTERNATIONAL STANDARD FOR LOGIC: CSMF
Message-id: <94Sep10.162039cdt.9769@kbssun1.tamu.edu>
Date: 	Sat, 10 Sep 1994 16:20:31 -0500
Sender: owner-srkb@cs.umbc.edu
Precedence: bulk

David McAllester writes:
I have not generally taken part in interlingua discussions but have
been generally concerned by the emphasis on first order semantics.  It
is well known, and stated in previous messages, that you can't even
define the concept of "connected to" in a graph in first order logic.
This has important PRACTICAL implications.  Let Arc and Con be two
binary relations and Phi(Arc, Con) be a formula which claims to state
that Con(x,y) is true if and only if there is a path of arcs form x to
y where Arc(x,y) means there is an arc from x to y.  I challenge the
supporters of a purely first order system to write this definition.
When most people write Phi(Arc,Con) in first order logic they get
something such that

forall x P(x) ^ Arc(x,y) => P(y)
plus Phi(Arc,Con)

does NOT imply

forall x P(x) ^ Con(x,y) => P(y).

It can be done in first order ZF set theory (see below).  Of course,
as Professor McAllester points out, it will require notions like
induction that are "naturally second order" in the sense that if you
want definitions that are insusceptible to unintended interpretations,
you'll need full second-order model theory.  However, these notions
are all definable in first-order ZF in the weaker sense noted, i.e.,
they pick out the right things in the real world of sets.  (And it is
enough, I think (though I would be happy to be convinced otherwise)
that they *in fact* pick out the right things (we're describing the
real world, aren't we?), even though there are models in which they
don't.)  In particular, the notions of finitude and natural number are
definable in ZF in this sense, and the principle of mathematical
induction is a theorem of ZF, in the following form:

(0 \in x & (n)(n \in x -> n' \in x)) -> (n)(n \in x),

where "n" is a sorted variable ranging over the natural numbers, i.e.,
the finite ordinals as defined in ZF a la von Neumann.  (Cf. the texts
by, e.g., Drake, Levy, Devlin, etc.)

A concept language which can not define the concept of connected in
a graph would seem to be of limited utility.

Agreed.

[...]

The problem, in practice, is mathematical induction.  Many truths can
only be proved if we have some form of induction principle.  Induction
principles are naturally second order.  First order induction schemes
are HARDER to compute with than higher order principles invoked with
higher order matching and unification.  Higher order unification may be
bad, but wait till you try to use a first order induction scheme!

As noted, mathematical induction in ZF is a straight object language
theorem, not a schema.  Induction is a schema in PA just because it is
a much weaker first-order theory.  It only talks about numbers, not
sets (except out of the corner of its mouth via open formulas).

In summary, induction principles seem to be a necessity.  Higher order
induction principles seem more computationally tractable than first
order induction schemes.

I think that the concern that Professor McAllester expresses here,
though very important, is not my concern.  A computational reasoning
system that uses a higher order language does not know whether it is
being interpreted with a "real" higher-order model theory or a general
model.  (Indeed, it is not exactly clear what role model theory plays
in a such a system beyond guiding the designers.)  However, it may
well be true that a higher-order *formalism* is, as Professor
McAllester notes, computationally more tractable than a first-order
formalism for the purpose of automated reasoning.  Important as it is,
this is a wholly distinct issue.  I have no doubt, and have myself
argued at length, that a higher-order *language* is vastly preferable
to a first-order language for the representation of the logical forms
of natural language.  Mikhail Zeleny, in a typically informative
posting, provides several examples in support of this claim.  And the
success of HOL no doubt provides strong evidence that the same is true
with respect to automated reasoners.  *BUT it is another question
entirely whether a full blown higher-order semantics is required for
any or all of these purposes.* (Indeed, as I understand it, and as
John Sowa seems to have confirmed, the semantics behind HOL is
explicitly *not* genuinely higher-order.)  It is *this* claim I have
been questioning, not the virtues of higher-order *formalisms*.  These
are two issues that must be kept distinct in the higher-order/
first-order debate.

**Re the definability of connectedness in first-order ZF**

I think the following will meet Professor McAllester's challenge.
(And even if there is a flaw below, there is surely a way to do it
along the same lines.  Given, as Mikhail points out, the well known
fact that all of classical mathematics is representable in ZF, this is
not purported to be a deep result, but just an illustration of how a
lot of "naturally higher-order" theorems have robust first-order
counterparts in ZF.  I imagine the higher order proof of McAllester's
theorem looks pretty much the same as the one below.)

Let "i" and "n" be sorted variables ranging over the natural numbers
(i.e., finite ordinals as defined in ZF).  First define a path within
a graph g to be simply a function f on some initial segment of the
numbers such that for all i on which f is defined (save the largest,
if there is one), Arc(f(i),f(i+1)).  (Or we could define a path as a
sequence of arcs; macht nichts.)  Again, note that f, despite playing
the role of a function here, is just a certain kind of set (a set of
pairs) and hence is a perfectly legitimate first-order object in set
theory.  Now define the auxiliary predicate C such that

D1:  C(x,y,n) =df (Ef)(path-in(f,g) & length(f)=n & f(1)=x & f(n)=y).

That is, C(x,y,n) iff there is a path of length n from x to y.  Now
let p be the property P above (represented, as usual, as a set); I
switch to lower case to emphasize the first-order character of things.
In particular, I'll write "x \in p" instead of "P(x)".  (But note that
we *could* retain the higher-order syntax for aesthetic or
computational reasons.)  Recall that we are given

(*)  (x \in p & Arc(x,y)) -> y \in p.

Lemma:  (n)(x \in p & C(x,y,n) -> y \in p).

Pf: By induction on n.  For n=1, given (*), D1, and the definition of
a path, the result is immediate.  So suppose the lemma holds for n,
and suppose x \in p & C(x,y,n+1), i.e., there is a path f from x to y
of length n+1.  Let z be y's predecessor in f.  Then there is a path
f'=<f(1),...,f(n)> of length n from x to z, so by the induction
hypothesis, z \in p.  But by the definition of a path, we have
Arc(z,y), and so by (*), y \in p, as required, and the lemma is
proved.

Now, define Con such that

D2:  Con(x,y) =df (En)C(x,y,n).

By first-order logic, the Lemma is equivalent to

(**)	(En)(x \in p & C(x,y,n)) -> y \in p,

which is equivalent to

(***)	(x \in p & (En)C(x,y,n)) -> y \in p,

which by D2 is what we want, viz.,

(****)	(x \in p & Con(x,y)) -> y \in p.

--Chris

=================================================================
Christopher Menzel		  |  Internet -> cmenzel@tamu.edu
Philosophy, Texas A&M University  |  Phone ---->   (409) 845-8764
College Station, TX  77843-4237   |  Fax ------>   (409) 845-0458
=================================================================