RE: pun in ontolingua KB
Tom Gruber <gruber@hpp.stanford.edu>
Date: Mon, 20 Jun 94 14:44:10 PDT
Message-id: <9406202144.AA04078@hpp-ssc-1.Stanford.EDU.ksl>
To: "Benjamin J. Kuipers" <kuipers@cs.utexas.edu>
Cc: <ontolingua@hpp.stanford.edu>, srkb@cs.umbc.edu
From: Tom Gruber <gruber@hpp.stanford.edu>
Sender: gruber@smi.stanford.edu
Subject: RE: pun in ontolingua KB
In-reply-to: Benjamin J. Kuipers's message of Thu, 16 Jun 1994 15:19:28 -0500: <199406162019.PAA02636@archimedes.cs.utexas.edu>
Reply-To: gruber@hpp.stanford.edu
Ben Kuipers wrote (to the ontolingua list):
> While browsing in the automatically-generated network of Ontolingua
> html files, I was reading the theory for Physical-Quantities, and
> discovered that the pointer for "length" goes not to the
> physical-dimension length, but to the length-of-list function and its
> recursive definition.
The example is from the documentation string on the theory called physical-quantities, URL
http://ksl-web.stanford.edu/knowledge-sharing/ontologies/html/physical-quantities/physical-quantities.html
This is the nature of a flat namespace and a context-free syntax. The term
spelled "LENGTH" was defined by the KIF spec to mean length-of-list. For that
reason, I used the spelling "LENGTH-DIMENSION" for the physical dimension
length.  The pointer Ben found was in free-floating natural language text,
which depends on the context and the reader's background knowledge for
interpretation. Ontolingua indexes both the formal axioms and the
documentation strings. For the strings, Ontolingua used a purely syntactic
heuristic for deciding which tokens to link to formal definitions; the
sequence of characters "length" that occurred in the documentation string for
the theory physical quantities was parsed as a token (because it was
surrounded by whitespace) and matched the lexical form of the LENGTH (of list)
function (ignoring case). So ontolingua linked it to the formal definitions
of the LENGTH function.
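
To make the heuristic concrete, here is a minimal sketch in Python of the kind of purely syntactic matching described above. It is not the actual Ontolingua code; the table DEFINED_TERMS and the function link_tokens are hypothetical names made up for illustration, and only the two terms mentioned in this message are listed.

```python
# Hypothetical index of formally defined terms and the ontology that
# defines each one (only the two terms discussed in this message).
DEFINED_TERMS = {
    "LENGTH": "KIF-LISTS",
    "LENGTH-DIMENSION": "STANDARD-UNITS",
}


def link_tokens(doc_string, defined_terms=DEFINED_TERMS):
    """Return (token, term, ontology) for every whitespace-delimited
    token whose spelling matches a defined term, ignoring case.

    No context or background knowledge is consulted, which is exactly
    why "length" in English prose gets linked to the list function.
    """
    links = []
    for token in doc_string.split():
        key = token.upper()  # case-insensitive comparison
        if key in defined_terms:
            links.append((token, key, defined_terms[key]))
    return links


doc = "every quantity has a physical dimension such as length or time"
print(link_tokens(doc))
```

On this sample string the only link produced is from "length" to the LENGTH function in KIF-LISTS, because the spelling LENGTH-DIMENSION never appears in the prose.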
> By itself, this is a minor bug in the documentation, and easily
> corrected, but ...
>
> Q: Does the bug in the automatically-generated documentation reflect
> a bug in the KB?
>
So this isn't a bug, it's a "feature" of our knowledge-free indexing trick
used on free text documentation. Such mistakes in interpretation do NOT occur
for the formal part of the specification (e.g., the axioms and slot values).
In the semantics of KIF, an object constant "FOO" always denotes the same
object, no matter what words or phrases surround it (this is not the unique
names assumption; it is the standard way of defining the semantic value
interpretation function).
> Q: How can you check automatically for such bugs (a) in the
> documentation, and (b) in the KB itself? It is at least plausible
> to me that this cannot be done automatically, although there may
> be heuristic signs of potential semantic incoherence.
If, in a formal definition, someone used LENGTH (which is a function in the
KIF-LISTS ontology) and really meant LENGTH-DIMENSION (which is an object in
the STANDARD-UNITS ontology), then ontolingua would tell them it is an error.
So the answer is, for (b) we can do some checking based on simple things like
constant type, arity, which ontologies it is defined in, etc. For (a), one
would have to know enough about the formal terms (and the English language) to
do unambiguous natural language parsing.
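
As a rough illustration of the kind of checking meant in (b), here is a small Python sketch. It is an assumption-laden toy, not how Ontolingua is implemented: the SYMBOLS table, the check_use function, and the arity of 1 given for LENGTH are all hypothetical choices made for the example.

```python
# Hypothetical symbol table: constant -> (kind, arity, defining ontology).
# The arity of 1 for LENGTH is an assumption for this example.
SYMBOLS = {
    "LENGTH": ("function", 1, "KIF-LISTS"),
    "LENGTH-DIMENSION": ("object", None, "STANDARD-UNITS"),
}


def check_use(name, used_as, n_args=None, included=()):
    """Return a list of problems found when `name` is used in a formal
    definition as `used_as` ("function" or "object") with `n_args`
    arguments, inside a theory that includes the ontologies `included`.
    """
    if name not in SYMBOLS:
        return [f"{name} is not defined anywhere"]
    kind, arity, ontology = SYMBOLS[name]
    problems = []
    if ontology not in included:
        problems.append(f"{name} is defined in {ontology}, which is not included")
    if kind != used_as:
        problems.append(f"{name} is a {kind} but is used as a {used_as}")
    elif kind == "function" and n_args != arity:
        problems.append(f"{name} takes {arity} argument(s), got {n_args}")
    return problems


both = ("KIF-LISTS", "STANDARD-UNITS")
# Someone wrote LENGTH where an object constant was expected.
print(check_use("LENGTH", "object", included=both))
# The intended constant passes the same checks.
print(check_use("LENGTH-DIMENSION", "object", included=both))
```

The first call reports a kind mismatch and the second returns an empty list. Nothing comparable is available for (a), since a token in free text carries no type, arity, or ontology to check against.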
The difference in the ambiguity of natural language and formal language is one
of the reasons we are trying the formal approach to specification
(ontologies), instead of just sharing text documents. If we could do a better
job at specifying shared conceptualizations with videotape or Lisp programs,
I'd use them.
I'm sorry if this analysis is painfully obvious, but I think this gets at a
deep issue for knowledge sharing.
tom