KB interchange standards

Tracy Schwartz <schwartz@surya.cyc-west.mcc.com>
Date: Wed, 27 Nov 1991 09:56-0800
From: Tracy Schwartz <schwartz@surya.cyc-west.mcc.com>
Subject: KB interchange standards 
To: interlingua@isi.edu, kr-advisory@isi.edu, SRKB@isi.edu, krd@ai.mit.edu,
        james@cs.rochester.edu, davis@ai.mit.edu,
        feigenbaum@sumex-aim.stanford.edu, forbus@ils.nwu.edu,
        rkahn@nri.reston.va.us, pkarp@ai.sri.com, kunz@intellicorp.com,
        jin@eagle.mit.edu, luu@isi.edu, malone@eagle.mit.edu,
        overt@prc.unisys.com, porter@cs.utexas.edu, dan_russell.parc@xerox.com,
        bwilliam@parc.xerox.com, hewitt-srkb@ai.mit.edu, mars@cs.utwente.nl,
        cleary@corwin.ccs.northeastern.edu, doug@csi.uottawa.ca,
        john@atc.boeing.com, roger@ci.deere.com, gio@darpa.mil,
Cc: lenat@mcc.com, guha@mcc.com
Message-id: <19911127175638.7.GUHA@SURYA.CYC-WEST.MCC.COM>

We think that the time may be right, now, for this sort of push on
knowledge-sharing.  In some ways, our experiences with Cyc can serve as
a microcosm for this sort of inter-group interaction.  One interesting
result is that many of the problems you're talking about still remain,
even when the cooperating groups all use exactly the same representation
system (syntax, vocabulary/terms, and inference machinery).

Let us explain that "microcosm" remark.  Superficially, this occurs
because we have many different groups using and helping to build Cyc,
located around the country.  The analogy holds at a deeper level as
well, even within a single site:  Our knowledge enterers work in small
teams, often for weeks at a time, building up a "micro-theory" of some
topic.  They must have some level of interaction with other groups, and
already-entered micro-theories, but the less interaction the faster they
can work.

Much of the attention and controversy of the Standards for KB
Interchange effort seems to focus on sharing syntax and semantics of a
language, and a little about sharing vocabulary.  Well, let's consider
Cyc's knowledge-enterer teams, since they do share these things.  Does
that sharing solve the problem?  If not, what else is/was needed?

One of the recurring problems during 1984-1989 was "divergence" ---
DESPITE the aforementioned sharing.  Different groups would use a term
slightly differently in their new micro-theory (compared to the way it
had been used before in other theories, sometimes even by themselves at
an earlier time).

The standard solution to this would be to pick a small set of primitives,
and lock in their meanings.  The problems in our case -- and yours -- are
(a) there is no small set, and (b) it's almost impossible to nail down the
meaning of most interesting terms, because of the inherent ambiguity in
whatever set of terms is "primitive."

So what did we do?  
    (1) For one thing, we insist only on local coherence.  I.e., groups
share most of the meaning of most of the terms with other groups, but
within a group (working on a particular micro-theory) they strive for
complete sharing.
    (2) For another thing, both kinds of sharing are greatly facilitated
by the existing KB content --- i.e., if the terms involved are already
used in many existing axioms.

While (2) can be achieved through massive straightforward effort, (1) is
more subtle, and has required certain significant extensions to the
representation framework. More specifically, we had to introduce the
whole machinery of contexts/micro-theories into Cyc (which is why
"divergence" has been much less of a problem since 1990).

Each group enters its micro-theory into a context.  Different contexts
may use different vocabularies, may make different assumptions, may
contradict assertions made in other contexts, etc.  (Each context is a
first-class object in our language, and instead of being either
universally true or universally false, a formula can be true in some
contexts and false (or even unstatable) in others.)
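
To make the idea concrete, here is a minimal sketch in Python.  It is
an illustration only, not Cyc's language or implementation: the names
Context, Status, statable and assert_formula, and the example contexts
and terms, are all assumptions made up for this sketch.  A formula is
modelled as a plain tuple of terms, and "unstatable" is taken to mean
that the formula uses a term outside the context's vocabulary.

```python
from enum import Enum


class Status(Enum):
    TRUE = "true"
    FALSE = "false"
    UNSTATABLE = "unstatable"  # uses a term outside the context's vocabulary
    UNKNOWN = "unknown"        # statable, but nothing asserted either way


class Context:
    """A context (micro-theory) treated as an ordinary first-class object."""

    def __init__(self, name, vocabulary):
        self.name = name
        self.vocabulary = set(vocabulary)
        self.assertions = {}  # formula (tuple of terms) -> True or False

    def statable(self, formula):
        # A formula is statable only if every term is in this vocabulary.
        return all(term in self.vocabulary for term in formula)

    def assert_formula(self, formula, truth=True):
        if not self.statable(formula):
            raise ValueError(f"{formula} is not statable in {self.name}")
        self.assertions[formula] = truth

    def status(self, formula):
        if not self.statable(formula):
            return Status.UNSTATABLE
        if formula not in self.assertions:
            return Status.UNKNOWN
        return Status.TRUE if self.assertions[formula] else Status.FALSE


# Hypothetical contexts and terms, chosen only for illustration.
surveying = Context("LocalSurveying", {"isFlat", "Ground"})
astronomy = Context("Astronomy", {"isFlat", "Ground", "Planet"})
cooking = Context("Cooking", {"Flour", "Oven"})

flat = ("isFlat", "Ground")
surveying.assert_formula(flat, True)    # true in one context...
astronomy.assert_formula(flat, False)   # ...contradicted in another

for ctx in (surveying, astronomy, cooking):
    print(ctx.name, ctx.status(flat).value)
```

The same formula comes out true in the first context, false in the
second, and unstatable in the third, and no global contradiction
arises because truth is always asked of a particular context.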

Both knowledge entering and problem solving go on in a context.  Axioms
external to a context are imported (lifted) from other contexts, using
articulation rules.  So the question of `what to share' is partially
decided at knowledge-entering time, by humans, and partially at
inference time, by the system.
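
As a rough sketch of lifting (again in Python, and again an
illustration under our own assumptions rather than Cyc's actual
mechanism), an articulation rule can be modelled as a function that
either translates an axiom into the target context's vocabulary or
declines to import it.  The names renaming_rule and lift, and the
example axioms, are hypothetical.  Real articulation rules can be far
richer than a term-by-term renaming.

```python
def renaming_rule(mapping):
    """Build an articulation rule from a source-term -> target-term mapping.

    The rule translates an axiom (a tuple of terms) only when every one
    of its terms is covered by the mapping; otherwise it returns None.
    """
    def rule(axiom):
        if all(term in mapping for term in axiom):
            return tuple(mapping[term] for term in axiom)
        return None  # not liftable under this rule
    return rule


def lift(source_axioms, rule):
    """Apply one articulation rule to every axiom of a source context."""
    lifted = set()
    for axiom in source_axioms:
        translated = rule(axiom)
        if translated is not None:
            lifted.add(translated)
    return lifted


# Hypothetical source context; a human wrote the rule at entry time.
garage_axioms = {
    ("partOf", "Wheel", "Car"),
    ("paintedBy", "Car", "Fred"),
}
garage_to_design = renaming_rule({
    "partOf": "physicalPart",
    "Wheel": "Wheel",
    "Car": "Automobile",
})

# The system applies the rule when the target context needs the axioms.
design_axioms = {("physicalPart", "Engine", "Automobile")}
design_axioms |= lift(garage_axioms, garage_to_design)
print(sorted(design_axioms))
```

Here the partOf axiom is imported under the target context's own
terms, while the paintedBy axiom stays behind because the rule does
not cover it.  The split mirrors the one described above: a person
decides what the rule says, and the system decides when to apply it.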

From this, it seems that an optimal knowledge-sharing effort should
attempt to build on a significant (large, broad) existing base KB, and
it should incorporate some sort of context mechanism, so that the
sharing can be flexible and, if necessary, reasoned about by the system.

If there is sufficient call for it, we'd like to try to find some way to
share Cyc -- its content and context mechanism, as well as the
less-important syntax and vocabulary of its language -- with you.  Think
of it either as a seed, or as scaffolding, but in any case we feel that
something like it (in both breadth and size, which is currently over a
million axioms) is going to be needed to serve as the semantic glue to
enable the sort of knowledge sharing we all have in mind.


Doug Lenat  and   R. V. Guha