Words, Concepts, Representations

The Foundation Ontology as a Basis
for Semantic Interoperability
Patrick Cassidy
MICRA, Inc., Plainfield, NJ
cassidy@micra.com
1
Outline
• Accurate, automatic, and broad general Semantic Interoperability requires
that different systems represent their knowledge using a common
Foundation Ontology (FO). The common Foundation Ontology provides a
common set of concept representations that can be used to logically describe
the intended meanings of any of the more complex concepts not in the FO.
• Basing the common FO on the full set of fundamental conceptual elements
(sometimes called “semantic primitives”) provides a means to limit the need
for agreement; viewing the FO as an interlingua to enable translation among
different representations allows complete freedom for locally optimal
representation.
• A common foundation ontology that can be widely adopted should be as
small as possible, to maximize agreement and make using the FO as easy as
possible.
• To make usage easier, utilities are necessary, for example to extract a
subontology for specific domains; a good Natural Language interface may be
required.
2
The Problem of Independent Applications
• Different groups develop their own databases, terminologies, and ontologies. Local
communities want to do their own thing, not be forced to conform.
• When there is a need to communicate information among independently developed
databases or applications, for automatic use (without human intervention)
information needs to be communicated and interpreted accurately – i.e., systems
need to interoperate at the semantic level.
• Data structures can be created and used effectively in multiple applications without
reference to a common vocabulary standard for information transfer within a
community that has internal interaction; but such local interoperability cannot scale to
the general situation where information must be in a form interpretable by any other
system that can use it.
• The absence of a common vocabulary for data elements makes accurate
interoperation impossible without translation.
• Semantic Interoperation requires a common standard of meaning; if any undefined
terms are used in the description of a term or concept, accurate interpretation is
impossible. Only an agreed common vocabulary will support accurate interpretation
within a community. The FO supplies the basic “vocabulary” for logical
representation of many domains, allowing translation of any domain ontology into
any other.
3
A Language is More Than a Grammar
• The term “language” has been used to mean qualitatively different things,
such as computer “languages” like FORTRAN, C, or JAVA.
• A language in the sense relevant to the interoperability problem must have
both a grammar and a vocabulary.
• Ontology “languages” such as OWL or CL consist primarily of a grammar,
with a minimal semantics defining the logical operations.
• Regardless of how widely any grammar such as OWL is used, accurate
communication will be impossible without some agreement on a vocabulary.
• The FO tactic makes that agreement easier by requiring agreement only on
the basic concepts used to describe all others, leaving application developers
free to invent an unlimited number of terms and concepts for local use
without any contact with other groups, and still have their information
accurately interpreted by any other system using the basic vocabulary.
• The goal of the FO project is to find agreement on that limited vocabulary of
basic ontology elements, and demonstrate that this is adequate to support
general interoperability.
4
A Solution for The Problem
• Locally developed applications can use small, specialized ontologies,
idiosyncratic ontologies, or no ontology at all and still perform their
work perfectly, and share information using local agreements for the
meaning of the data.
• BUT When local applications need to share complex information with
many other systems, a common, expressive standard of meaning (i.e.
a common language) is essential for communication.
• The Solution – a common Foundation Ontology to provide a standard
for Content to complement the existing standards for Format and
basic-level Semantics (such as OWL or FOL).
• There is a widespread assumption that getting some broad agreement
on a common Foundation Ontology is impossible. This assumption is
largely based on the ambiguity of language, and a simplified view of
the function of the Foundation Ontology. There is no technical, social,
or psychological barrier – what has been missing is a proper
interpretation of the function of the FO, and an adequate test.
5
Overview of the FO Project
(1) The goal is to find a means to translate assertions in one ontology language
(grammar + vocabulary) into another ontology language. The translations will use
axioms having elements in common between the two ontologies.
(2) There are some ontology elements whose intended meanings cannot be expressed
solely as an FOL combination of other ontology elements without circular links.
These are called the "primitives" in this discussion.
(3) For any given group of domain ontologies, there will necessarily be some set of
such primitives that will be sufficient to logically specify by FOL combination the
intended meanings of the non-primitive elements of all of the other ontologies in
the study. These constructed meanings will not necessarily be complete
descriptions of the intended real-world referents; they will be sufficient to perform
the computations desired for the applications supported by the domain ontologies.
(4) To *accurately* translate logical assertions among those domain ontologies, the
most parsimonious tactic (and probably the fastest) would be to identify the
primitives in common among those domain ontologies, include them in an FO, and
use them to create translations of assertions between the domain ontologies.
Those translations will use "bridging axioms" to convert assertions from the form in
one ontology to the form in another ontology.
6
Overview of the FO Project (2)
(5) To minimize the changes in the FO as new domain ontologies are linked to
(mapped to or logically expressed by) the FO, it is advisable to try to identify as
many of the possible primitives as can be identified, at the earliest stages of
testing of the FO. This should reduce the number of new primitives that need to
be created as new domain ontologies are linked to the FO. Since the test has
never been done, we do not know whether or how quickly the need for new
primitives will drop for each new domain. That can only be determined by testing
the FO process. It is possible that new primitives will need to be continually
added; even so, this method promises to be the most effective to achieve the
maximum and most accurate semantic interoperability that is possible at any given
time, and to do it with the least cost.
(6) The COSMO project is intended to identify a plausible starting candidate for a
common foundation ontology based on primitives by identifying the logical
primitives needed to represent the most common and basic concepts.
7
Overview of the FO Project (3)
(7) As a possible inventory of primitives to include in a *starting* FO, and to
aim for the broadest coverage as quickly as possible, COSMO uses the senses
associated with the Longman dictionary defining vocabulary: 2148 words, and
probably over 4000 senses. Longman has been tested for its ability to linguistically
define all other words in the dictionary, but whether there could be a similar small
inventory of primitive ontology elements that can combine to specify *all* other
ontology elements is unknown and may be impossible. The more relevant
question is whether a set of primitive ontology elements can be found that will not
need *significant* supplementation as new domains are linked to the FO; if little
supplementation is needed, the FO should be stable enough for most practical
tasks requiring semantic interoperability. This question can only be answered by
testing multiple domain ontologies versus some common FO.
(8) Other possible sources of essential primitives could be the 3000 most frequent
Chinese characters (covering 98.9% of modern text) and the 4000 most common
signs of AMESLAN. But these symbols have not been tested as a "defining
vocabulary".
8
The Principle of Semantic Primitives
• For any given group of domain ontologies, it is possible to identify
some set of basic ontology elements (the Foundation Ontology, or FO)
that can be combined to form the more complex ontology elements in
the domain ontologies. Those basic ontology elements can be viewed
as representing the “semantic primitives” for that group of ontologies.
These primitive elements can be used to translate information from its
original form in any one of those ontologies to its form in any of the
others. Any logical contradictions among the linked ontologies can be
recognized and represented, within local “theories” or modules.
• As the number of ontologies linked increases, the number of new
primitive elements that need to be added to link a new ontology to the
existing ones will decrease. At some point the FO will be stable
enough to serve as a reliable standard of meaning for accurate
semantic interoperability.
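The expected decline in new primitives can be tracked with a simple tally as each ontology is linked. The sketch below is only an illustration: the domain names and primitive names are hypothetical toy data, and each ontology is reduced to the set of primitives it needs rather than to real axioms.

```python
from typing import Iterable

def link_ontologies(ontologies: Iterable[tuple[str, set[str]]]) -> list[tuple[str, int, int]]:
    """Link ontologies to a growing FO one at a time.

    Each ontology is given as (name, primitives it needs). Returns, for each
    ontology, (name, number of new primitives added, FO size afterwards).
    """
    fo: set[str] = set()
    history = []
    for name, needed in ontologies:
        new = needed - fo   # primitives the FO does not yet contain
        fo |= new           # extend the FO so this ontology can be linked
        history.append((name, len(new), len(fo)))
    return history

# Hypothetical toy data, for illustration only.
toy = [
    ("finance",  {"Agent", "Event", "Time", "Obligation", "Quantity"}),
    ("medicine", {"Agent", "Event", "Time", "Quantity", "BodyPart"}),
    ("shipping", {"Agent", "Event", "Time", "Quantity", "Location"}),
    ("legal",    {"Agent", "Event", "Time", "Obligation"}),
]
history = link_ontologies(toy)
for name, added, size in history:
    print(f"{name:9s} new primitives: {added}  FO size: {size}")
```

Whether real domain ontologies show the same flattening is exactly the open question that testing the FO process would answer; the toy data merely shows what would be measured.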
9
What are the “intended meanings”?
The intended meaning of an ontology element reflects
two criteria:
• The meaning that the creator of the ontology element intends to
capture by the logical specification: ideally, this will be
unambiguously described by the linguistic documentation as well as
reflected in the logic.
• The behavior of the programs that use that ontology element must
correctly reflect the behavior (insofar as it is affected by that
ontology element) that the ontologist and programmer both intend
for that program.
– Any change in an ontology that affects the logical specification of an ontology
element must not change either the meaning as intended by the ontologist, or
the behavior of the ontology-based application, unless that change is
understood and accepted by the ontologist and programmer.
10
Mapping versus Translation
• Automated mapping without a common foundation ontology
is too inaccurate for mission-critical automated decisions
because the intended meanings often differ, overlap, merge,
or are absent between the same terms in different
applications.
• Semiautomated mapping without a common interlingua
ontology is too expensive (order of $n^2$ effort); however,
mapping to a common ontology reduces the effort to
integrate multiple ontologies, including those initially
developed without reference to the common ontology.
• Domain ontologies developed from the start by using the
common foundation Ontology to describe the domain terms
will be automatically translatable into each other, with no
need for post-hoc mapping to any other ontology.
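The difference in effort can be made concrete by counting mappings. This is a minimal sketch that assumes one mapping per unordered pair of ontologies in the pairwise case and one mapping per ontology when a common FO is used; the function names are illustrative only.

```python
def pairwise_mappings(n: int) -> int:
    """Mappings needed when every ontology is mapped directly to every other."""
    return n * (n - 1) // 2

def hub_mappings(n: int) -> int:
    """Mappings needed when every ontology is mapped once to a common FO."""
    return n

for n in (5, 20, 100):
    print(f"{n:3d} ontologies: pairwise = {pairwise_mappings(n):5d}, via FO = {hub_mappings(n):3d}")
```

For 100 ontologies this gives 4950 pairwise mappings against 100 mappings to the FO.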
11
The Translation Tactic:
Everybody Gets Everything They Want
By supporting Translation among different local
knowledge representations
• Nobody has to stop doing anything they want to,
they can do it exactly the way they want to do it
• When applications need to communicate, the
developers only need to learn the common defining
language (or collaborate with someone who
already knows it) and map to it
• Learning and using the common language of the
foundation ontology is time-consuming, but can be
made easier by utility programs – commercial and
open-source, and a Natural Language interface.
12
Alternatives to a Common Foundation Ontology?
Mapping post-hoc vs. ab initio
• The relations between types in two different
ontologies may be:
– Synonymy (same intended meaning)
– Specialization (one type may be a subtype of the
other)
• The added constraint(s) by which the
specialized type differs from the parent type
must be specified, and that specification may
require adding new types or relations
– Overlap (there may be parts of the meaning of
one type similar, and other parts different)
13
Difficulties with Mapping Ontologies Developed
Completely Independently
– Representations often combine fundamental components of
meaning in different ways
– Elements of different ontologies may overlap, rather than map
directly or be in a hierarchical relation
• The areas of overlap and non-overlap may require creation of new
types or relations, more basic than the composite types
• Dissecting the components of each overlapping representation
requires human-level intelligence, AND access to the original creators of
the different ontologies to verify interpretations
• Creation of new basic types and relations requires human-level
intelligence; cannot be done automatically
– The documentation alone rarely has sufficient information
even for a human to resolve the ambiguities
– Mapping legacy ontologies to a common Foundation
Ontology will reduce the effort from order of $n^2$ to order of $n$.
14
Triage for Mapping to a Common FO
• Ontologies or Database Schemas newly created
using the basic “conceptual vocabulary” of the FO
will be automatically interoperable from their
creation, using the translation utility
• Retroactive integration of ontologies or DBs can
benefit from mapping (semiautomated or by
hand) to a common FO. This is expensive, and
the benefit must justify the cost.
• Where the cost of mapping to an FO cannot be
justified, statistical pattern-matching may be used
to obtain an approximate relation.
15
How Can Incompatible Theories be Included?
• The basic concepts that are required to specify meanings are generally
agreed on. Differing viewpoints will usually be expressible in terms of a
common vocabulary, and assertions in different terminologies or syntaxes
will be directly translatable into each other.
• One example: the often mentioned ‘incompatibility’ between 3-D
(endurantism) and 4-D (perdurantism) views of objects in time. The actual
assertions of each viewpoint are accurately translatable into assertions in
the other viewpoint:
• Pat Hayes (email to UOM-forum Aug. 8, 2009):
“But, for the record, I reach the conclusion from the observation that
anything that can be said in a 4D ontological framework can be
transcribed into a 3D framework based on the continuant/occurrent
distinction, and vice versa. The differences between them, I have
concluded, are really nothing more than a matter of notational choice.”
16
3D-4D Translation Axioms (Pat Hayes)
From Pat Hayes:
(forall (x (t Time) P)(iff (P x t)(P (x during t)) ))
“Think of this as a 'bridging' axiom, part of a translation specification, if you like”
. . . And later;
• There are a variety of notational options in combining a simple timeless assertion with a
temporal parameter. One is to treat the time as a context, in effect attaching it to the entire
sentence (or in IKL, proposition):
(ist t (P x y))
(ist t (that (P x y)))
• another is as an extra relational argument, giving the 'fluent' style which goes naturally with
continuants:
(P x y t)
• and a third is to connect it to the object(s) being related, the relation then being naturally
understood as a relation between time-slices:
(P (x at t)(y at t))
• But in fact, these are really all just notational variations on a single theme. They amount to
choosing where in the parse tree of the simple expression to attach the parameter, is all. If we
simply FORGET the philosophy for a second, then we can treat this as an arbitrary conventional
choice, and think of them as all meaning exactly the same thing, and therefore equivalent.
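The three notational options can be treated as mechanical rewrites of one another. The sketch below is an assumption-laden illustration, not Hayes's or COSMO's implementation: assertions are plain Python tuples, the marker names "ist" and "at" follow the examples above, and the function names are hypothetical.

```python
def fluent_to_context(assertion):
    """(P x y t)  ->  (ist t (P x y))"""
    pred, *args, t = assertion
    return ("ist", t, (pred, *args))

def context_to_fluent(assertion):
    """(ist t (P x y))  ->  (P x y t)"""
    _, t, (pred, *args) = assertion
    return (pred, *args, t)

def fluent_to_slices(assertion):
    """(P x y t)  ->  (P (x at t) (y at t)), written here as ("at", x, t)."""
    pred, *args, t = assertion
    return (pred, *(("at", x, t) for x in args))

def slices_to_fluent(assertion):
    """(P (x at t) (y at t))  ->  (P x y t); all slices must share one time."""
    pred, *slices = assertion
    times = {t for (_, _, t) in slices}
    if len(times) != 1:
        raise ValueError("time slices refer to different times")
    return (pred, *(x for (_, x, _) in slices), times.pop())

fluent = ("isLocatedAt", "PH", "IHMC", "t1")
print(fluent_to_context(fluent))
print(fluent_to_slices(fluent))
```

Each conversion is invertible, which is the sense in which the notations carry the same information: the time parameter is only attached at a different place in the parse tree.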
17
3D-4D Translation Axioms (COSMO)
• {PH isanInstanceOf Object}
• {PH4D isanInstanceOf Object4D}
• {t1 isanInstanceOf TimePoint}
• {t2 isanInstanceOf TimePoint}
• {t1t2 isanInstanceOf TimeInterval}
• {t1t2 hasStartingTimePoint t1}
• {t1t2 hasEndingTimePoint t2}
• {PH4D isTheWholeLife4dVersionOf PH}
• {PHt1t2 isaTimeSliceOf PH4D from t1 to t2}
;; If we included a ‘during’ similar to the one Pat Hayes uses, it might look like:
• {(PH during t1t2) isIdenticalTo PHt1t2}
;; The bridging axiom for a specific assertion would be:
• {{PH isLocatedAt IHMC from t1 to t2} iff {PHt1t2 isLocatedAt IHMC}}
;; And, redundantly, given the above:
• {{PH isLocatedAt IHMC from t1 to t2} iff {(PH during t1t2) isLocatedAt IHMC}}
The above explicitly has a 4D entity PH4D as TheWholeLife4dVersionOf the ‘dimension neutral’ object PH.
NOTE: The bridging axioms can be generalized by using row variables.
18
Bridging Axioms in General
• Bridging axioms for various scenarios that translate
between different styles of representation are
presented in more detail in:
IKRIS Scenarios Inter-Theory (ISIT)
– Jerry Hobbs with the IKRIS Scenarios Working Group
– http://nrrc.mitre.org/NRRC/Docs_Data/ikris/ISIT_spec.pdf
• Mirrored at:
http://micra.com/COSMO/HobbesEtalBridgingAxioms.pdf
19
How Can Incompatible Theories be Included?
(continued)
• When representation of genuinely logically
incompatible theories, not merely different
viewpoints, is desirable in the FO or in some
extension, the theories can be represented as
theories using the defining elements of the
FO. The assertions in theories are not
themselves directly part of the ontological
commitment of the FO, and describing
incompatible theories does not make the FO
itself inconsistent.
20
Representation of Incompatible Theories
Does not Make the FO Self-contradictory
• A logical contradiction in the FO would have some
pair of statements of the form:
– (P ?x) and (not (P ?x))
• But theories are represented in the FO as separate
contexts:
– (isTrueIn (P ?x) Theory1) and (isTrueIn (not (P ?x))
Theory2)
• Logically contradictory theories can be described in
the FO but not asserted to be true in the FO itself.
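The separation of contexts can be sketched in a few lines. This is a minimal illustration under stated assumptions: statements are tuples, negation is written ("not", s), only direct syntactic contradictions are detected (no inference is attempted), and the class and method names are hypothetical.

```python
from collections import defaultdict

class TheoryStore:
    """Keeps statements asserted in the FO apart from statements true in a theory."""

    def __init__(self):
        self.fo = set()                    # asserted as true in the FO itself
        self.theories = defaultdict(set)   # theory name -> statements true in it

    def tell(self, statement, theory=None):
        target = self.fo if theory is None else self.theories[theory]
        target.add(statement)

    @staticmethod
    def has_direct_contradiction(statements):
        # Looks only for a pair s and ("not", s) in the same set.
        return any(("not", s) in statements for s in statements)

    def fo_consistent(self):
        return not self.has_direct_contradiction(self.fo)

    def theory_consistent(self, theory):
        return not self.has_direct_contradiction(self.theories[theory])

store = TheoryStore()
store.tell(("P", "a"), theory="Theory1")
store.tell(("not", ("P", "a")), theory="Theory2")
print(store.fo_consistent())   # True: neither statement is asserted in the FO itself
```

Asserting both statements directly, with no theory argument, would place them in the same set and the check would then report a contradiction.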
21
Similar Approaches
• H. Wache, T. Vogele, U. Visser, H. Stuckenschmidt, G. Schuster, H. Neumann, and S.
Hübner, "Ontology-based Integration of Information -- a Survey of Existing
Approaches," Proceedings of the IJCAI-Workshop Ontologies and Information
Sharing, Seattle, WA: 2001, pp. 108-117.
Accessed at: http://www.let.uu.nl/~Paola.Monachesi/personal/papers/wache.pdf
• H. Wache, "Towards Rule-Based Context Transformation in Mediators," in
Proceedings of the International Workshop on Engineering Federated Information
Systems (EFIS), 1999, pp. 107-122.
http://citeseer.ist.psu.edu/cache/papers/cs/9658/http:zSzzSzwww.informatik.unibremen.dezSz~wachezSzPaperszSzefis-99-wache.pdf/wache99towards.pdf
22
Will Translation Among Logically Incompatible
Ontologies Always be Possible?
• Not necessarily. BFO, for example, is a single-inheritance ontology,
and it is possible that trying to translate assertions from multiple-inheritance
ontologies would cause a logical contradiction.
• This might be avoided if the single-inheritance axioms of BFO are
used only during the development of the classes of BFO-dependent
ontologies, and not during data (instance) entry or at query time.
• There may or may not be workarounds for other cases of logical
incompatibility. The Foundation Ontology (FO – see below) project
would have to determine whether there are practical workarounds
for true irreconcilable inconsistencies.
• Groups that develop ontologies too inconsistent with the FO to use
the translation mechanism may develop a special FO for their own
community, with which they must interoperate.
23
Integration of Knowledge Sources
Via Semantic Interoperability
• Representation of knowledge using a logic-based
ontology allows automated inferences using multiple
data sources – “connecting the dots” rapidly and
accurately, based on rules created by the domain
experts
• Automated reasoning that is reliable enough to be
trusted to make important decisions without human
intervention requires accurate information.
• Information transferred from other systems can be
used reliably only if the information is interpreted
accurately. 99% accuracy is insufficient.
• Accurate automated interpretation requires a common
foundation ontology among information sources.
24
Why is 99% Accuracy Insufficient?
The number of inferences deduced in the course of
proving a test theorem can be greater than 10,000.
(See: Owen L. Astrachan and Mark E. Stickel, Caching
and Lemmaizing in Model Elimination Theorem
Provers,
http://www.cs.duke.edu/~ola/papers/cade92.pdf.)
If the likelihood of error in each step is as low as 1%,
and errors in different steps are independent, the chance
of reaching a correct conclusion after 10,000 steps is
$0.99^{10000} \approx 2 \times 10^{-44}$
If the number of inference steps in solving a problem is
only 68, there is still just a 50-50 chance of arriving at the
correct conclusion with 99% accuracy in translation.
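The arithmetic can be checked directly. The sketch below assumes that each inference step is independently correct with the stated probability, which is a simplification; the function names are illustrative.

```python
import math

def chain_success(per_step_accuracy: float, steps: int) -> float:
    """Probability that every step in a chain of inferences is correct."""
    return per_step_accuracy ** steps

def max_steps_for(per_step_accuracy: float, target: float) -> int:
    """Largest number of steps that keeps the success probability at or above target."""
    return math.floor(math.log(target) / math.log(per_step_accuracy))

print(f"{chain_success(0.99, 10_000):.2e}")   # on the order of 1e-44
print(max_steps_for(0.99, 0.5))               # 68
print(f"{chain_success(0.99, 68):.3f}")       # just above 0.5
```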
25
Foundation Ontology
• Generically, a Foundation Ontology is an ontology containing
logical representations of the most general (abstract) entities
(types, relations) that are used in constructing more
specialized or domain-specific representations. Existing
examples are OpenCyc, SUMO, BFO, DOLCE, ISO15926 and
others.
• For practical convenience, more specific extensions can be
maintained to avoid unnecessary recreation of existing
ontology elements; these extensions can form a hierarchy of
ontologies (logical theories)
• If logically inconsistent ontologies are included in the set of
reference ontologies, they may be represented as a lattice of
theories.
• An FO used to support interoperability of any given set of
domain ontologies will have all of the basic concepts required
to represent any domain concept in that set as a combination
of the FO representations.
26
What A Common Foundation Ontology
Isn’t
≠ A controlled vocabulary
Each community can choose its own words to refer to
concepts, and map those to the FO
≠ A mandated standard
Users can use any common ontology or none, as their own
needs dictate. A common FO is required only for accurate
communication among multiple independent applications.
≠ A Restriction on expressiveness
An individual user can use any local application with any
language or technology. What must be expressed using the
FO is only that information that needs to be shared with
other communities.
27
Primitive Concept Representations
As used here, a primitive ontology element is an
ontology element whose intended meaning
cannot be represented as a FOL combination of
other elements in the FO, without some cycle.
An FO that is intended to function to translate
elements from one ontology to another should
have all of those primitives that are used to
represent the ontology elements (ontelms) in either of the ontologies;
the primitives may be in the FO itself, or in some
mid-level or domain-level ontology used by both
communicating ontologies.
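One rough way to look for candidate primitives is to examine the dependency graph of definitions. The sketch below is a simplification under stated assumptions: each definition is reduced to the set of terms it uses (not real FOL), a term counts as a candidate primitive if it has no definition or if its definition depends on itself through some cycle, and the example terms are hypothetical.

```python
def find_primitives(definitions: dict[str, set[str]]) -> set[str]:
    """Return terms that are undefined or that lie on a definitional cycle."""

    def reaches_itself(term: str) -> bool:
        stack, seen = list(definitions.get(term, ())), set()
        while stack:
            current = stack.pop()
            if current == term:
                return True
            if current not in seen:
                seen.add(current)
                stack.extend(definitions.get(current, ()))
        return False

    all_terms = set(definitions) | {t for used in definitions.values() for t in used}
    return {t for t in all_terms if t not in definitions or reaches_itself(t)}

# Hypothetical toy definitions: term -> terms used to define it.
toy_definitions = {
    "Bachelor": {"Man", "Unmarried"},
    "Man": {"Person", "Male"},
    "Before": {"After"},   # Before and After are defined only via each other
    "After": {"Before"},
}
print(sorted(find_primitives(toy_definitions)))
```

In practice an ontologist would probably keep only one member of such a cycle as the primitive and define the other from it; the sketch flags every member as a candidate.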
28
Is There a Fixed Number of Primitives?
• It is possible that there may be no limit to the number of
primitive ontology elements required to construct other
ontology elements in all other domains. For the FO principle to
serve for translating multiple ontologies, it is only necessary that
all of the primitive elements required to construct all those
ontologies are represented in the FO or in some extension
common to the communicating ontologies.
• Therefore it is not necessary that there be a fixed number of
primitive ontology elements (ontelms) in order for the FO tactic
to support accurate interoperability.
• However, evidence from linguistic experience suggests that the
number of primitives required for broad applicability of an FO
may be small. This evidence may provide some participants with
additional motivation to explore the FO tactic. Primitives are
discussed further below.
29
Is There a Fixed Number of Primitives?
Arguments from communication
• The number of primitive concepts people use internally for their own
thinking cannot be easily determined.
• What is important for interoperability is the number of primitives
used for communication.
• Accurate communication depends on agents using symbols whose
meaning is understood by all – this in turn depends on the meanings
being associated with common perceptual experiences. The number
of such distinguishable common experiences is limited.
• By age 18, most people can understand definitions of new terms
based on the fundamental concepts they have already learned.
• This is the basis for the use of a limited defining vocabulary in
dictionaries like Longman’s.
30
COSMO: Current Status
OWL version (September 2014)
• Types (classes): 7930
• Relations (OWL object properties): 973
• Restrictions: 2790
• All Longman Terms have some representation,
and all but 100 have been mapped to the
WordNet semantic network (currently
ongoing).
31
How Is Semantic Interoperability Achieved by
a Common Foundation Ontology? (overview)
• The elements of domain ontologies or databases are represented
as First-Order-Logic (FOL) combinations of ontology elements
(types, relations, axioms, functions – for short, “ontelms”) already
present in the Foundation Ontology.
• When information is to be communicated between systems using
different domain ontologies, each system communicates, in
addition to the data, the logical descriptions (axioms) for ontelms
not already in the Foundation Ontology (or public extensions) that
are required to understand the meanings of the data.
• Each system, able to interpret both FOL and the ontelms used to
describe the meanings, will be able to produce the same inferences
from the same data, when both use the same or a functionally
equivalent FOL inferencing engine.
• If the reasoning used in each local system is restricted to logical
inference on the represented knowledge, interoperability will be
optimal. Local procedures may be created for efficiency purposes,
provided that they use the knowledge in ways compatible with the
logical meaning.
32
Semantic Interoperability via an FO (more detail: 1)
The goal: An FO that can support the goal of “broad, general, accurate
semantic interoperability” can be viewed as: a system of agreed data
structures and programs that allow *any* local group using this common
system to place information **on any topic** on the internet or some
other public place, or to transmit it directly to another system, and have
the information interpreted in the sense intended by its creators,
regardless of whether the transmitting and receiving systems have any
prior contact. Proper interpretation requires that both transmitting and
receiving systems reach the same inferences from the same data, and
have the same real-world referents for each term.
Any system that has more relations will, of course, be able to reach
additional inferences, but these will not be logically contradictory to the
inferences that the less complete system reaches.
A system that has more data may also reach additional inferences, but the
inferences should not be logically contradictory to the inferences reached
by the less-informed system, unless the additional data itself is
contradictory to that in the less-informed system.
33
Semantic Interoperability via an FO (more detail: 2)
The FO would be used in this manner:
1. The ontelms in the FO all have a meaning agreed to by the users, and
the logical specifications and linguistic documentation are unambiguous
enough to satisfy all participants that they agree on the intended
meanings.
2. The ontelms in domain ontologies or upper ontologies are identical to or
logically specified as FOL combinations of ontelms in the FO (or in
extensions of the FO).
3. The computations performed with ontology-specified data in
applications (other than simple input-output, or computations not
affecting data communicated among applications) are performed either
(a) using an agreed common implementation of FOL; or (b) using the
procedural code that is part of some element in the FO. Thus the
calculations performed on data in communicating systems should be
identical, and produce identical inferences.
34
Semantic Interoperability via an FO (more detail: 3)
4. When any two programs that want to interoperate and have
separately developed domain ontologies need to communicate,
then in addition to the data that is to be transmitted, the
transmitting system must send all the logical descriptions of the
domain elements needed to describe the data that are not
already in the FO or in some extension used in common between
those two applications. There then needs to be an integrating
program that (on the receiving side) takes the new descriptions of
previously unknown elements, and integrates them into the local
ontology, to arrive at an ad-hoc (temporary) merged ontology
that is sufficient to properly interpret the data
communicated. The merger should be accurate because all of the
new ontology descriptions use only FO elements in FOL
combinations, and the FOL implementation is common among all
communicating systems.
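The exchange described in step 4 can be sketched as two functions, one on each side. This is only an illustration under stated assumptions: the FO is reduced to a set of term names, a definition is reduced to the set of terms it uses (a real system would send FOL axioms), and all term names, function names, and data are hypothetical.

```python
# Hypothetical shared FO vocabulary; a real FO would hold axioms, not just names.
FO_TERMS = {"Agent", "Obligation", "Money", "Time", "hasDebtor"}

def definitions_to_send(data, local_defs, shared=FO_TERMS):
    """Sender side: collect definitions of every term the receiver cannot know."""
    pending = [term for term, _instance in data if term not in shared]
    needed = {}
    while pending:
        term = pending.pop()
        if term in needed:
            continue
        if term not in local_defs:
            raise KeyError(f"{term!r} is not defined in terms of the FO")
        needed[term] = local_defs[term]
        # A definition may itself use local terms, which must be sent as well.
        pending.extend(t for t in local_defs[term] if t not in shared)
    return needed

def merge_on_receipt(local_ontology, received_defs, shared=FO_TERMS):
    """Receiver side: build a temporary merged ontology for this exchange."""
    known = set(shared) | set(received_defs)
    for term, uses in received_defs.items():
        unknown = set(uses) - known
        if unknown:
            raise ValueError(f"{term!r} uses undefined terms: {sorted(unknown)}")
    merged = dict(local_ontology)   # the receiver's own ontology is left untouched
    merged.update(received_defs)
    return merged

sender_defs = {
    "Invoice": {"Obligation", "hasDebtor", "Money"},
    "OverdueInvoice": {"Invoice", "Time"},
}
data = [("OverdueInvoice", "inv-42")]
sent = definitions_to_send(data, sender_defs)
merged = merge_on_receipt({"Patient": {"Agent"}}, sent)
print(sorted(sent))     # ['Invoice', 'OverdueInvoice']
print(sorted(merged))   # ['Invoice', 'OverdueInvoice', 'Patient']
```

The receiver rejects any definition that bottoms out in a term outside the shared vocabulary, which mirrors the requirement that new descriptions use only FO elements.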
35
Semantic Interoperability via an FO (more detail: 4)
5. Any application that can properly interpret elements of the FO
should be able to properly and consistently interpret
elements described as FOL combinations of those elements.
6. Therefore the computations performed by all applications
using the FO should all arrive at the same inferences from
the same data. That is all one can demand for
programs that are intended to be interoperable.
7. If any procedural code is used locally that manipulates the
data other than for input/output or presentation, there may
be a risk of misinterpretation. The local programmers need
to be aware of the risk, and avoid using the data in ways that
change its intended meaning.
36
Semantic Interoperability via an FO (more detail: 5)
8. For information not transmitted to other systems, of course local systems
have complete freedom to use it as they consider optimal. It is only
the information transmitted to other systems that has to be
interpretable by means of the FO specification.
9. Recall that the FO will be able to have procedural code labeled as
functions. Any systems that require procedural code for proper
interpretation of transmitted data, that is not adequately mimicked by
FOL, can add it as a primitive function to the FO or to a domain
extension ontology used within some community.
10. The FO, in order to accommodate newly mapped systems that require
new primitives, should have an expeditious procedure for adding
new primitives once review by the technical committee confirms that the
new element is not FOL-specifiable using existing FO ontelms, and is not
redundant with or logically contradictory to the existing FO.
37
Semantic Interoperability via an FO (more detail: 6)
Potential issues (1):
There is one potential problem in the manner of using newly
specified ontology elements required to interpret transmitted
information. It may not always be possible to recognize when the
intended meanings of elements in separately developed domain
ontologies are identical. Since the FO allows alternate structures to
represent the same meanings, but has translation axioms among
them, the various alternatives can in principle be calculated and
compared for identity. But unless the system can develop some
normal form into which all elements can be converted, identical
meanings may not always be recognized as such. It will have to be
investigated by the FO developers whether it is possible to develop
a normal form for the FO, or if not, whether failure to recognize
identity would have significant negative effects.
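A very limited form of normalization shows what such a comparison would involve. The sketch below is not a proposal for the FO's normal form: it handles only nested conjunctions (flattening, ordering, and removing duplicates), definitions are hypothetical nested tuples, and recognizing equivalence of arbitrary FOL expressions is a far harder problem.

```python
def normalize(expr):
    """Flatten nested 'and' expressions and order the conjuncts canonically."""
    if isinstance(expr, str):
        return expr
    op, *args = expr
    args = [normalize(a) for a in args]
    if op != "and":
        return (op, *args)
    flat = []
    for a in args:
        if isinstance(a, tuple) and a[0] == "and":
            flat.extend(a[1:])   # lift the conjuncts of a nested 'and'
        else:
            flat.append(a)
    unique = sorted(set(flat), key=repr)   # repr gives a total order over mixed types
    return unique[0] if len(unique) == 1 else ("and", *unique)

# Two hypothetical definitions built separately from the same elements.
d1 = ("and", "Person", ("and", "Male", "Unmarried"))
d2 = ("and", ("and", "Unmarried", "Person"), "Male")
print(normalize(d1) == normalize(d2))   # True
```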
38
Semantic Interoperability via an FO (more detail: 7)
Potential issues (2):
An additional issue is whether newly added axioms could change the
interpretations of existing FO ontelms. To minimize that potential, it would
seem important to try to identify all axioms necessary to specify the intended
meanings of the FO primitives as fully as possible at the earliest stage, so that
few if any need to be added after the initial shake-down period of a few
years. Additions of new subtypes or relations that are only conservative
extensions of the FO may not be problematic in the same way.
For stability, it is important that the intended meanings of FO ontelms remain
constant, so that the logical interpretations of elements do not change over
time. When elements representing new concepts are needed, they are added
to the FO or some extension. Extensions that conserve meaning (e.g.
definitions using basic FO elements) should not create any logical conflict with
the FO.
Systems that have local data or local ontology extensions may derive additional
inferences from the same data, but these should not be logically contradictory
to those derivable by other systems.
39
Other Advantages of the Common FO
• The FOL rules that can be created within a domain ontology
can represent not only the data combining operations that are
performed in procedural programming, but can also
implement checking for consistency and accuracy of the input
and results.
• If some procedural code is nevertheless required for local
data processing, the use of an FO will still reduce the number
of data elements that need to be interpreted carefully and
processed according to the common interpretation; as a
result, the chance of inadvertently using an interpretation
different from that used by others will be reduced.
• Procedural code that is useful for more than a few local uses,
if representable as a function, may be included in the FO or a
mid-level extension as a new primitive element.
40
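The Integrating Function of the
Foundation Ontology
[Diagram: the Foundation Ontology element GenericObligation is linked by SameAs relations to the element Obligation in one domain ontology and the element Duty in the other (Domain Ontology 1 and Domain Ontology 2).]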
The Ontology for Integrating Databases
[Diagram: the Foundation Ontology (FO) provides defining concepts to specify conceptual message content. A Knowledge Base uses the FO for definitions. Database Translating Interfaces sit between the FO and the databases. Database labels: Customers, General, Commercial, Patient, Inventory, Regulations, Products, Transactions. Other labels: Data, Knowledge, Data Collection Interfaces.]
42
The Foundation Ontology for Integrating Applications
[Diagram: the Common Foundation Ontology provides defining concepts to specify the content of messages passed among modules. Information Store(s) use the FO for definitions. Task Control selects processes to solve the current problem and iterates to more specific problems. Module labels: Interface to User, Analysis Support, Linguistic Information, Information Retrieval, Extraction, Case-Based Reasoning, Sensors and Robotics, Probabilistic Reasoning, Spatial Reasoning, Interfaces.]
43
In Multi-Agent Architectures Like Cougaar, the
Ontology Can Be the Backbone of The Communication
Language (http://www.cougaar.org/)
[Diagram: two components, each labeled "Uses Ontology".]
44
Special Agent Communication Protocols?
• Certain agents in a multi-agent architecture may only need to
communicate with a small number of other agents (a point
emphasized by John Sowa on the Ontolog forum Jan 2010).
These communications may take advantage of protocols much
simpler than the FO for accurately sharing information among
them. If any one of these agents can express its information
according to the FO, this local community can also
interoperate accurately with the wider world of FO users.
• Such specialized protocols could be used for independently
developed agents. The FO could include a means to make
such protocols publicly accessible.
• Such specialized communications protocols may reduce the
overall effort to create certain multi-agent systems that can
communicate with other FO-aware systems.
45
Global Semantic Integration; Combined use of Local
Interfaces and Protocols with FO for Global Communication
[Diagram: the Foundation Ontology provides the grammar and vocabulary for communicating among local ontologies or DBs. Participants shown: commercial organizations communicating privately; a single large corporation (manufacturing, personnel, sales, advertising, inventory) with internal local interfaces not using the FO; individuals or communities; the general public; government agencies with interagency protocols; research and academic groups; small businesses, which may have special internal or local interfaces to other businesses; commercial ontologies and databases; visualization tools; reasoning tools.]
Toward the Future
• The potential for widespread agreement on a common Foundation
Ontology presents an opportunity to develop a tool that can
substantially accelerate progress in developing intelligent
applications by allowing multiple processes or applications to
communicate accurately. This will allow rapid evolution of any
application that consists of multiple modules.
• Development of new computational techniques for information
processing will be accelerated by allowing more effective reuse of
routines that use the same standard of meaning.
• The development and testing of a widely acceptable foundation
ontology can be accomplished by any large enterprise with multiple
data sources whose integration can demonstrate the utility of this
approach.
47
END
• COSMO ontology:
– http://micra.com/COSMO/COSMO.owl
– http://micra.com/COSMO additional resources in
the COSMO directory
– Email: cassidy@micra.com
• Acknowledgement:
– Refinement and clarification of details of the
project suggested here have benefited from many
discussions with the members of the Ontolog
forum.
48
Additional discussion
49
How Is Semantic Interoperability Achieved by
a Common Foundation Ontology? (overview)
• The elements of domain ontologies or databases are represented
as First-Order-Logic (FOL) combinations of ontology elements
(types, relations, axioms, functions – for short, “ontelms”) already
present in the Foundation Ontology.
• When information is to be communicated between systems using
different domain ontologies, each system communicates, in
addition to the data, the logical descriptions (axioms) for ontelms
not already in the Foundation Ontology (or public extensions) that
are required to understand the meanings of the data.
• Each system, able to interpret both FOL and the ontelms used to
describe the meanings, will be able to produce the same inferences
from the same data, when both use the same or a functionally
equivalent FOL inferencing engine.
• If the reasoning used in each local system is restricted to logical
inference on the represented knowledge, interoperability will be
optimal. Local procedures may be created for efficiency purposes,
provided that they use the knowledge in ways compatible with the
logical meaning.
50
The Evolutionary Tactic
• The combination of a common standard of
communication among modules, and a modular design
for applications, allows incremental evolutionary
improvements in function. Improved modules can be
created by totally separate groups, and plugged in to
the whole system.
• The same standard, used to communicate information
among multiple applications, similarly allows
incremental improvement in the function of multiple
applications (or agents) to achieve different goals.
• Separate applications or modules can communicate to
form a “Society of Mind” (Minsky) approaching human
levels of information processing ability.
Communication among the modules uses the FO as a
common language to express meanings.
51
The Foundation Ontology . . .
. . . is not required to be used in toto in every application; individual
applications will only use as much as is needed to support the
reasoning for that application. Redundancy will not cause
computational inefficiency in the applications, as an application
only needs to use one of the alternative views. A utility should
be included with a common FO, to extract only the needed
parts. This tactic is used in the NIEM, where IEPDs may use only
a small number of the total elements.
. . . is required when separately created ontologies, applications, or
databases need to transfer information. The FO supports
translation of data from one local terminology into the other by
having a complete inventory of primitive elements into which
complex domain entity representations can be analyzed.
. . . will not break existing applications or databases if used only for
translating data transferred from one system to another.
52
The Foundation Ontology . . .
. . . is intended to be internally logically consistent.
Alternative theories can be represented using the
same set of primitive ontology elements.
If (A) is an assertion expressible using ontelms of
the FO, and an alternative assertion (not A) is
also desired for some application, then both (A)
and (not A) can be moved to an extension of the
FO as alternative theories; both (A) and (not A)
are describable (can be expressed) using only
elements in the FO.
Example: ontology has Theory1 and Theory2
ontology may assert: (holdsIn A Theory1) and
not(holdsIn A Theory2)
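One way to picture the holdsIn pattern is a store that keeps each theory's assertions separate, so that (A) and (not A) never meet inside one theory. The TheoryStore class and its methods are hypothetical and only sketch the idea; they are not an FO or COSMO API.

```python
class TheoryStore:
    """Keeps assertions grouped by theory, so alternative theories can disagree."""

    def __init__(self):
        # theory name -> {assertion: truth value}
        self._theories = {}

    def tell(self, theory, assertion, value=True):
        """Record that the assertion holds (or does not hold) in one theory."""
        known = self._theories.setdefault(theory, {})
        if assertion in known and known[assertion] != value:
            raise ValueError(
                f"{theory} would contain both {assertion} and its negation"
            )
        known[assertion] = value

    def holds_in(self, assertion, theory):
        """Return True, False, or None when the theory says nothing about it."""
        return self._theories.get(theory, {}).get(assertion)


store = TheoryStore()
store.tell("Theory1", "A", True)    # (holdsIn A Theory1)
store.tell("Theory2", "A", False)   # not(holdsIn A Theory2)
```

Each theory stays internally consistent, while an attempt to record the negation of A inside Theory1 is rejected.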
53
What Does it Mean to “Specify the meaning of a
term”? (simplified example)
“The birth mother of a person is a woman who has
given birth to that person”
{{?Mother isTheBirthMotherOf ?Child}
impliesThat
(ThereExists {((exactly one) ?Event) and
((exactly one) ?Date) and ((some) ?Location)} suchThat
{{?Event isa BirthEvent} and
{?Event occurredOn ?Date} and
{?Event occurredAt ?Location} and
{?Mother is (The Mother in ?Event)} and
{?Child is (The Baby in ?Event)} and
{(The BirthDate of ?Child) is ?Date} and
{(The BirthPlace of ?Child) is ?Location}})}
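As a rough illustration of how an application might test these necessary conditions against stored facts, the sketch below checks that exactly one birth event satisfies all of the conjuncts. The BirthEvent record, the two dictionaries, the sample data, and the function name are hypothetical and not part of COSMO.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BirthEvent:
    mother: str
    baby: str
    date: str
    location: str


def satisfies_birth_mother_axiom(mother, child, events, birth_date, birth_place):
    """True when exactly one event meets every condition in the axiom."""
    matching = [
        event for event in events
        if event.mother == mother
        and event.baby == child
        and birth_date.get(child) == event.date
        and birth_place.get(child) == event.location
    ]
    return len(matching) == 1


# Invented facts for illustration only.
events = [BirthEvent("Mary", "Ann", "1990-05-01", "Boston")]
birth_date = {"Ann": "1990-05-01"}
birth_place = {"Ann": "Boston"}
```

Because the axiom gives only necessary conditions, a False result shows that the isTheBirthMotherOf assertion conflicts with the stored facts; a True result shows only that no conflict was found.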
Are All Composite Ontelms Defined by
Necessary and Sufficient Conditions?
No. Most types in a domain ontology will only
be logically specified by necessary conditions.
This leaves some ambiguity in the logical
description, but the intended meanings can be
made as specific and unambiguous as is
required for proper use of the ontelms in
applications that use them.
55
Meanings for the Foundation Ontology
• Whether meanings are interpreted intensionally (as equivalent to their
ontological representations) or extensionally (by use of verification
procedures), the ontology itself serves to construct the meanings used by
the computer for reasoning and deciding.
• Evidence that database meanings have been properly interpreted will
require human evaluation of the correctness of inferences. The goal is for
every data element to be used in computer programs in a manner that is
consistent with the intended meaning. If a program uses procedural
code for reasoning, rather than FOL inference on ontology instances, then
the meaning must be clear to the programmer.
• Evidence that text meanings have been properly represented can be
obtained from (1) question-answering or (2) conversation (the Turing test).
• For robotic systems, recognizing objects and object types, performing
actions and recognizing when actions have been performed will be
additional tests (“procedural Semantics”, as described by Woods).
56
Meaning: Procedural Semantics
Meaning and Links: William A. Woods, AI Magazine
28(4) Winter 2007
– "In this theory the meaning of a noun is a
procedure for recognizing or generating
instances, the meaning of a proposition is a
procedure for determining if it is true or false, and
the meaning of an action is the ability to do the
action or to tell if it has been done."
57
Are Ontology Meanings Stable?
• For any given ontology, the sum of all inferences for any data
input will always be the same. If the FO is changed, however,
there is a potential for a program using the FO to change its
output for the same data.
• The pragmatic criterion for “meaning” to remain constant is:
the same data will give the same output for all programs
using the FO. If a change to the FO does not change the
behavior of existing programs using it, then the relevant
meanings of the ontology elements representing those data
elements are considered unchanged.
• Thus, the FO can change without changing the relevant
meanings of the ontology elements used in programs.
• Conservative additions of ontology elements to the FO may
not cause any changes to existing program behavior.
58
Primitive Concepts
• Primitives: the most basic units of thought (such as the part-of relation)
that are used in combination to create more complex units of thought (such
as an Automobile).
• A Primitive is a concept or ontelm that cannot have its meaning specified
solely by use of some FOL combination of other independently described
primitives
• No consensus on how many primitives there are
• The COSMO project aims to provide an estimate of the upper limit (if any)
on necessary primitives
• Focus on primitives is useful to provide a starting target for a foundation
ontology of minimum size; it is not a technical requirement for a functioning
foundation ontology
• Focus on primitives can provide a useful paradigm for those who expect a
limited number; it can be ignored by participants in an FO development
project who expect an unlimited number of primitives; the FO only needs to
include all the primitives used by the domain ontologies and applications
that want to communicate by means of the FO at any given time.
59
How Many Primitives?
• Wierzbicka’s “universal core” contained 60 primitives
common among multiple languages (see Cliff Goddard,
“Bad Arguments Against Semantic Primitives”, in
Theoretical Linguistics, Vol. 24 (1998), available at:
http://www.une.edu.au/bcss/linguistics/nsm/pdfs/bad-arguments5.pdf)
• The Longman Dictionary of Contemporary English
(LDOCE) uses 2148 words to define its over 64000
terms.
• Cheng-Ming Guo analyzed the Longman defining
vocabulary (Ph.D. Thesis, 1989) and determined that
there are 1433 actual “basic” words (representing 3200
word senses) that can be used, recursively, to define all
of the words in the Longman dictionary
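The recursive use of a defining vocabulary can be sketched as a fixed-point computation: a word becomes usable once every word in its definition is already usable. The word lists below are invented toy data, not taken from Longman or from Guo's thesis.

```python
def definable_words(basic_words, definitions):
    """Return every word reachable, recursively, from the basic vocabulary.

    `definitions` maps a word to the list of words used to define it.
    """
    known = set(basic_words)
    changed = True
    while changed:
        changed = False
        for word, defining_words in definitions.items():
            if word not in known and set(defining_words) <= known:
                known.add(word)
                changed = True
    return known


# Toy data for illustration only.
basic = {"move", "fast", "on", "foot", "person"}
definitions = {
    "run": ["move", "fast", "on", "foot"],
    "sprint": ["run", "fast"],           # defined through another defined word
    "runner": ["person", "run"],
    "gallop": ["horse", "run"],          # "horse" is never defined
}
reachable = definable_words(basic, definitions)
```

In this toy case "sprint" and "runner" become definable through "run", while "gallop" does not, which signals that the basic vocabulary is missing something needed for "horse".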
60
How Many Primitives? (continued)
• The Japanese Toyo Kanji contain 1850 characters – those required
to be learned by completion of secondary education. Some basic
words are represented phonetically, not as characters.
• In Chinese, knowing 3000 to 4000 characters qualifies one as
“literate” (able to read a newspaper). In modern Chinese text, the
first 3000 characters cover 98.94% of text and the first 4000 cover
99.68%.
• Sign language (AMESLAN) dictionaries contain from 2000 to 5000
signs.
• The first representation of the Longman defining vocabulary plus
associated basic concepts in COSMO will contain at least 8000 types
and 1000 relations, but probably fewer than 10000 total elements
(in progress). Many of these may not be primitive.
• Doug Lenat speculates, from experience with Cyc, that as many as
15,000 primitive concept representations may be needed to serve
as a “Conceptual Defining Vocabulary” (personal communication).
61
The Lesson From Linguistic Primitives
The experience from the Longman defining vocabulary,
the Chinese inventory of commonly used characters,
and the number of sign-language signs shows that:
• When there is a reason to minimize the number of
symbols used to communicate, it is possible to find
a small set of a few thousand that can be combined
to describe any other useful concept.
– Whether this principle will also hold when the
symbols have the precise meanings of ontology
elements needs to be tested.
62
What Makes a Concept Primitive?
• In the computer context, the most basic primitives are those that cannot be
specified solely by FOL combinations of pre-existing ontelms, but must include
some procedural code in order to be used properly by applications. Calculable
arithmetic functions are examples.
• If two ontelms can only be represented by mutual reference (direct or transitive)
to each other, they are considered as co-primitive (if either one is a primitive), or
co-specified (if inside an extension to the core FO).
• If the meaning of an ontelm can only be described by reference to example
instances, rather than by necessary conditions, it is considered as primitive.
Implied reference to experienced instances is characteristic of terms referring to
feelings or emotions.
• If the meaning of a new element is expressed solely by asserting that it is disjoint
with some primitive element(s), it is a primitive.
• If a new relation has no logical inferences derived from an assertion in which it
relates other ontelms, it is primitive. This means that its meaning is available only
in the documentation or comments interpretable by humans. Its meaning in
applications will be determined by its usage.
• Other criteria may also be identified, and can be adopted by the consortium
developing the common FO.
63
Types of Primitives
Procedural
• For logics such as FOL the primitives are the procedural implementation of the logical symbols.
• For semantic interoperability, the implementations must be identical, e.g. in whether they assume an open or a closed world.
• Although logicians can manipulate the symbols consistently, programs need procedural code to carry out
the transformations implied by the logical symbols and their traditional interpretation.
• A practical foundation ontology needs to include, as well as the implementation of the FOL symbols, all of
the procedural code that cannot be represented by the logical symbols of the FOL (or other logic) that is
agreed to as the basis for semantic interoperability among the community that intends to be
interoperable. Those procedures will be among the primitives agreed to, which then need to be included
in the foundation ontology.
Instance-based (Perceptual, Intuitive)
• *Other primitives* that are not procedurally encoded will depend on human interpretation – all
programmers need to understand the intended meaning and, when using those concepts in a program,
need to use them in a way that does not generate logical contradictions. Such primitives may include
representations of emotions or other mental objects that can only be fully understood by exposure to
instances, i.e. depend on individual perceptions as well as on the expectation that other people, especially
computer programmers, have closely analogous mental states induced by those perceptions. To the
extent that a programmer’s understanding of such primitives may differ from the understanding of other
programmers, there is a risk that programs using data in such categories will be inconsistent with each
other.
• At some point, computers may be able to test assertions about types by retrieving information from the
Web about well-known instances of those types. This provides some potential “grounding” of meanings
independent of human interpretation.
64
Procedural Primitives
• To assure that information is interpreted
consistently among applications, wherever
procedures are required to interpret
transmitted information, the procedures must
have the same computational effect (even if
coded in different languages).
• This may require the communication of
procedures composed of combinations of
more basic procedures. Some common
procedural language will be required.
65
How Many Primitives? (continued)
Are there an infinite number of “microsenses” (Wittgenstein) among the
primitive concepts?
• In any given “language Game” (Wittgenstein) each word has only one or a
few senses
• Creating texts with Definitions and Descriptions is one form of “language
game”. The need to convey precise information to a general audience
without opportunity for feedback (as in written text) forces the use of a small
number of widely understood senses.
• The words of the linguistic defining vocabulary (as in Longman’s) are
therefore likely to have only one or a few senses that are required for the
word-definition “language game”.
• Guo’s work provides evidence that this is true; fewer than 2 senses per word
are required for use of the 2148 Longman defining words in its definitions.
• Given a line and the Peano axioms, an infinite number of line segments of
different lengths can be defined. The use of a large number of composable
concepts does not at all imply that the number of primitives must be large.
66
How Many Primitives? (continued)
Will there be specialized primitives required for specialized fields?
Perhaps. One may visualize, for example, abstract mathematical objects that have a
certain relation between them, unlike any relation between objects in the physical
world. Such newly defined relations may also be primitives, and there may be a large
number of them. But math primitives differ from primitives intended to represent
real-world things.
But in general, these primitive relations will only be needed for communication
**within** that specialized field or sub-branch of some field, and can be maintained
in the specialized ontology extensions used for those specialized fields.
Since they are not needed or useful for communication with other fields, they do not
have to be added to the foundation ontology itself. The foundation ontology needed
for accurate communication among different fields will still be stable. The primitives
need to be in the FO only if they are needed to represent information that is
transmitted to agents not using the same extension ontology.
The main concern of the Foundation Ontology is that the number of primitives required
for communication among diverse applications be as stable as possible.
67
How many words are needed
to understand a text?
Estimates of minimal vocabulary sizes needed for academic
purposes start at a low of 5,000 words for reading authentic
texts (Laufer, 1997) and range up to 10,000 words for reading
university textbooks (Hazenberg & Hulstijn, 1996). Nation (2001)
argues that at least 97% of the vocabulary of a text need to be
known to gain adequate understanding of the text. To read
literary texts extensively with understanding and relative ease,
98% of the words of the texts need to be known (Hirsh & Nation,
1992; Hu & Nation, 2000). Native English speaking children
consider a vocabulary load of 2 unknown words per 100 words
(98% known) difficult reading (Carver, 1994). 2 words per 100
words translates into roughly 1 unknown for every 5 lines of text
read
68
The COSMO Project
• Motivated by an absence of a widely accepted Foundation
Ontology that can serve as a standard of meaning
• The COSMO ontology is intended to serve as a test ontology
to investigate the Foundation Ontology principle, to
demonstrate how an FO with all of the primitive ontelms
required to specify some set of domain ontologies will
support accurate semantic interoperability among those
domain ontologies. These efforts with the COSMO will also
test the size of the required inventory of primitive concept
representations.
• COSMO was initiated in 2005 [13] as a project of the
Ontology and Taxonomy Coordinating Working Group
(ONTACWG), a working group of the Federal Semantic
Interoperability Community of Practice.
• Since then it has been continued by Patrick Cassidy
69
The COSMO Project (continued)
• Since late 2007, the objective has been to create an
initial version that includes representations of all of the
words in the Longman Defining Vocabulary, as well as
other ontelms that are considered as basic for
describing everyday things.
• This version will be tested to determine if it contains all
of the primitives needed to represent terms in
specialized fields.
• The number of new primitives required for each
increment of new representations will indicate
whether there is an asymptotic limit to the number of
primitives required to represent all fields.
• This criterion of sufficiency is probabilistic.
70
The COSMO Project (continued)
Attaching Linguistic labels for NLU
WordNet Synset Assignments Cannot Be one-to-one
Example: verb ‘move’ sense 2 in WordNet:
2. (60) move, displace - (cause to move, both in a
concrete and in an abstract sense; 'Move those boxes
into the corner, please'; 'I'm moving my money to
another bank'; 'The director moved more
responsibilities onto his new assistant')
Includes:
(1) physical motion = ‘Translocation’
(2) Transfer of money = ‘MoneyTransfer’
(3) Transfer of Responsibility = ‘AssigningaResponsibility’
71
Principle for Labeling COSMO
Elements with English Labels:
• Use individual words in their most common senses
• Otherwise use word combinations to make the
meaning as clear as possible without a paragraph.
• The documentation elaborates on what the label
hints at
Steven Weinberg: “If words are to have any value to us,
we ought to respect the way that they have been
used historically, and we ought especially to preserve
distinctions that prevent the meanings of words from
merging with the meanings of other words.” – from
Dreams of a Final Theory
72
What’s New in the COSMO?
• About half of the ontelms in COSMO are not also
present in OpenCyc or SUMO
• BUT the goal is to make it as small as possible
while still having all of the semantic primitives
needed to describe entities in any domain
• Keeping it small will make it easier for multiple
developers to agree on the structure, and make it
easier to learn and to use
• “A theory should be as simple as possible, but no
simpler” -- Einstein
73
COSMO Phasing
• Phase 1 will develop an OWL ontology with basic
representations of all of the Longman defining
words. (completed second quarter of 2010)
• Phase 2 will elaborate the OWL representations
to represent more of the intended meaning
(current 2014)
• Phase 3 will convert the OWL version to a
Common-Logic compatible version
• Phase 4 will develop a Natural Language interface
to the ontology to make use easier
74
Open Source, Open Method
• To serve as a widely used standard, any ontology
needs input from many different developers and
users with differing views and preferences.
COSMO is fully open to input from any source,
provided that it is logically consistent with
existing content.
• If funding becomes available for a collaborative
development of a Common Foundation Ontology
by a similarly open method, that project will
supersede COSMO.
75
Multiple Viewpoints
• An important function of a Foundation Ontology is to serve
as a means to translate other, specialized knowledge
representations into each other. It may be, but does not
have to be, used as the only top-level ontology.
• Different ways to represent the same entity can be
accommodated, provided that they are logically consistent
and can be translated into each other. Everything that
anyone feels is necessary can be included – if not in the
core of semantic primitives, then in an extension
representing alternative theories.
• A given application may use only a small part of the
COSMO, extracted as needed for its own purposes;
therefore redundant alternative representations will not
reduce the computational efficiency of applications
76
Criterion for Evaluation
• The question to be determined is whether new primitives,
beyond the starting inventory representing the Longman
vocabulary, are required to represent knowledge in
specialized domains, and if so, how many?
• The rate of increase of the number of ontelms in the
COSMO for each increment (e.g. of 1000 term
representations) will provide evidence whether there is a
limit (an asymptote) in the number of terms required to
represent many other fields.
• If no asymptote is suggested, a small rate of increase may
still allow use of a common Foundation Ontology as a
means of semantic interoperability, but with more careful
attention to versioning.
• When mature, the need to add new primitives should
rarely occur
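The evaluation criterion can be sketched as a simple calculation: record the cumulative number of ontelms after each increment of new term representations, take the differences, and see whether the additions shrink. The counts below are invented for illustration and are not COSMO measurements; the function names are hypothetical.

```python
def increments(cumulative_counts):
    """New ontelms added in each step, from cumulative totals."""
    return [
        later - earlier
        for earlier, later in zip(cumulative_counts, cumulative_counts[1:])
    ]


def suggests_asymptote(cumulative_counts):
    """True when each step adds no more than the previous one and the
    last step adds strictly fewer than the first."""
    if len(cumulative_counts) < 3:
        raise ValueError("need at least three measurements")
    steps = increments(cumulative_counts)
    never_grows = all(b <= a for a, b in zip(steps, steps[1:]))
    return never_grows and steps[-1] < steps[0]


# Invented cumulative ontelm counts after successive increments of terms.
flattening = [8000, 8400, 8600, 8700, 8750]
steady = [8000, 8400, 8800, 9200]
```

A shrinking sequence of additions, as in the first list, is the kind of evidence that would point to an asymptote; a constant rate, as in the second, would not. A real assessment would be statistical, in keeping with the probabilistic criterion of sufficiency.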
77
Summary of COSMO Effort to Date
• Representation of the Longman defining
vocabulary in OWL will likely require fewer than
10,000 ontology elements.
• Some of those are not primitive elements, and
can be specified as combinations of other
elements. They are included because they
represent common concepts and will ease the
creation of a Natural Language Interface.
• Planned addition of rules in a CL-compliant
format will increase the number of elements, by
at least the number of relations
78
Why Hasn’t a common FO Been
Adopted Yet?
• Existing Upper Ontologies are complex theories developed
primarily by small closed teams and presented to the world
for adoption.
• They:
– Are complex and hard to understand, and therefore time-consuming to use properly
– Each use only one of several alternative ways to represent
concepts, unsatisfactory to many who have already
adopted other formalisms
– Have no publicly accessible open-source applications that
demonstrate their utility, therefore not providing the
incentive required to motivate the effort to use them
79
FO Utilities and Applications
• Utilities likely to be required for wide adoption:
– An extraction utility that can extract out only that part
of the FO that is needed for a particular application
• A similar principle is used in the NIEM information
exchange system: Information Exchange Packet
Descriptions (IEPDs) use only a part of the entire NIEM
vocabulary to form messages used for specific purposes.
A utility assists creation of such IEPDs.
• For the FO, the utility would start with the domain
ontology extension(s) required for a particular
application, and extract only the parts of the FO needed
for reasoning with those elements.
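A minimal sketch of such an extraction utility, assuming that each ontelm records which other ontelms its specification refers to. The depends_on table and the element names are hypothetical; a real utility would read these references from the ontology's axioms.

```python
def extract_subontology(seed_elements, depends_on):
    """Collect the seeds plus everything they refer to, directly or transitively."""
    needed = set()
    pending = list(seed_elements)
    while pending:
        element = pending.pop()
        if element in needed:
            continue
        needed.add(element)
        pending.extend(depends_on.get(element, ()))
    return needed


# Hypothetical references: element -> elements used in its specification.
depends_on = {
    "Invoice": ["Document", "MoneyTransfer"],
    "Document": ["InformationObject"],
    "MoneyTransfer": ["Event", "Money"],
    "Emotion": ["MentalState"],
}
subset = extract_subontology(["Invoice"], depends_on)
```

Starting from the domain element Invoice, the result contains the six elements it depends on (including itself) and leaves out Emotion and MentalState, which that application never uses.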
80
FO Utilities and Applications (contd.)
• A Natural Language Interface (NLI) is required that can
accept a NL description of a domain concept, using the
English defining vocabulary for the ontology, and:
• Tell whether that concept is already represented
• If not, create a logical representation of the needed
concept, and a text description of the logical
representation, for verification by the user.
• The NLI can be structured in modules (see next slide) to
allow incremental evolution of an increasingly effective
assistant for understanding and using the FO. The NLI
can be developed by the efforts of multiple groups, each
focusing on one or a few modules.
81
Architecture for the Natural Language Interface
to the FO
[Diagram: the Foundation Ontology provides defining concepts to specify conceptual message content. The Knowledge Base for the NLI includes specialized knowledge of the structure of the Foundation Ontology. Module labels: Parsing, Disambiguation, NLP Understanding and Generation, TC, Word Experts, Learning, Entity Extraction, Metaphoric Reasoning, Interfaces to Text Input and Output.]
82
FO Utilities and Applications (contd.)
• Possible Example Applications:
– Database Integration, demonstrating federated
search with reasoning using elements from more
than one RDB.
– Natural Language Understanding, limited to
conversational capability with a 6-year old native
English speaker, and including the capability of the
NL interface, allowing users to query the ontology.
– A biomedical application, possibly demonstrating
integration of patient medical information with
other attributes of the patients, outcomes, and
functions of the providers.
83
Semantic Integration: References
• IKRIS Scenarios Inter-Theory (ISIT)
– Jerry Hobbs with the IKRIS Scenarios Working Group
– http://nrrc.mitre.org/NRRC/Docs_Data/ikris/ISIT_spec.pdf
84
Popular Semantic Web domain ontologies
(Swoogle, July 2005)
Ontology prefix   Namespace URI                                           # of Docs. Populated
rdf               http://www.w3.org/1999/02/22-rdf-syntax-ns#             382K
rdfs              http://www.w3.org/2000/01/rdf-schema#                   82K
owl               http://www.w3.org/2002/07/owl#                          64K
daml              http://www.w3.org/2001/03/daml+oil#                     5K
dc                http://purl.org/dc/elements/1.1/                        250K
rss               http://purl.org/rss/1.0/                                165K
admin             http://webns.net/mvcb/                                  130K
sy                http://purl.org/rss/1.0/modules/syndication/            90K
foaf              http://xmlns.com/foaf/0.1/                              77K
cc                http://web.resource.org/cc/                             74K
content           http://purl.org/rss/1.0/modules/content/                60K
trackback         http://madskills.com/public/xml/rss/module/trackback    56K
iw                http://inferenceweb.stanford.edu/2004/05/iw.owl#        47K
bio               http://purl.org/vocab/bio/0.1/                          35K
geo               http://www.w3.org/2003/01/geo/wgs84_pos#                25K
vCard             http://www.w3.org/2001/vcard-rdf/3.0#                   20K
85
Skepticism
• “We cannot get everyone to agree on a single
foundation ontology”
– We don’t need everyone, just a self-sustaining community
• “We don’t need another foundation ontology”
– The fact that none has gained a critical mass of users
demonstrates that we do need another one, but one that
is constructed by a very wide community of users.
– The COSMO is not the common FO, but is being used to
demonstrate that a common Foundation Ontology is
technically feasible, if funding is available.
• “There is no limited ‘conceptual defining vocabulary’”
– Implies an unlimited number of primitive concepts; this is
susceptible to experimental refutation, and the COSMO
project is designed to test this question
86
Distracting Terminology Issues
• Concept: a unit of thought or of automated
information processing – not necessarily an abstract
mental object. Ontologies are composed of ontology
elements (“ontelms”) that represent such entities: see
next slide.
• Definition: a description of the meaning (in words or
logic), not a statement of necessary and sufficient
conditions. To define: to specify the meaning of.
• Meaning: an interpretation that approximates human-level
understanding (see later slide)
• Understanding: conversion to a logical
representation of the meaning
87
Consistent Data Acquisition Up-Front?
Lynn Vogel: It’s also the fact that, in many cases, if you structure data up
front, it will take you a bit longer to collect it. You could argue that once
you’ve collected it, you have this fabulous treasure trove of structured
data that can advance the science of what we do. But there’s an overhead
for the individual clinicians who are collecting the data. They’re already
under enormous pressure regarding the amount of time they spend with
their patients. If you say, “Oh, by the way, for every patient that you see
now, we’re adding 15 minutes so you can structure your data so that we
all will be smarter,” that’s a pretty hard sell.
• Lynn Vogel is vice president and CIO of The University of Texas M. D.
Anderson Cancer Center. In addition, he holds a faculty appointment at
The University of Texas in Bioinformatics and Computational Biology.
• (from PricewaterhouseCoopers report on Semantic Technology 2009)
88
Words, Concepts, Ontelms
• Words are not Concepts. The elements in an
ontology (types, relations, functions, axioms,
instances) are neither “concepts” nor words, but
language-independent logical structures. The
meanings of the ontology elements do not
change, but the words used to refer to them may
change rapidly and vary with user.
• To avoid distracting terminology discussion,
these are referred to as “ontelms” (ontology
elements) in this presentation.
89
90
91
<owl:Class rdf:ID="ConceptualWork">
<rdfs:subClassOf rdf:resource="#AbstractSymbolicObject"/>
<rdfs:comment>In COSMO a 'ConceptualWork' (a MentalObject) is classified as an
AbstractSymbolicObject, since such works are always created in symbols, though the symbols may
have information content – the 'meaning'. COSMO differs somewhat from the Cyc description in
that we consider Codes to be included, but have a different usage of the term 'Code'.
Cyc: OPENCYC 1: MAY 23, 2002
The collection of abstract works which are the deliberate creations of one or more individuals
working in concert, have instantiations [#$instantiationOfCW] which are
#$InformationBearingThings, and associated #$AbstractInformationStructures. This is a
specialization of #$DevisedPracticeOrWork [q.v.]. For works with propositional content, see the
more specific collection, #$PropositionalConceptualWork (PCW). Positive examples include:
#$MobyDickNovel (as opposed to any instances of #$BookCopy such that (#$instantiationOfCW
#$MobyDickNovel BOOK_COPY)), Beethoven's 9th Symphony (as opposed to any performance of
this symphony or any copy of its score).
Negative examples include: games (performances are not IBTs), awards (they do not have
associated #$AbstractInformationStructures), paintings (not abstract), customs (not deliberate
creations), natural languages (not a deliberate creation), and codes (their uses, not instantiations,
are IBTs).
</rdfs:comment>
</owl:Class>
92
Guo’s Longman Analysis
• Guo, Cheng-ming (1989) Constructing a machine-tractable
dictionary from "Longman Dictionary of
Contemporary English" (Ph.D. Thesis), New Mexico
State University.
• Guo, Cheng-ming (editor) Machine Tractable
Dictionaries: Design and Construction, Ablex
Publishing Co., Norwood NJ (1995)
• Yorick Wilks, Brian Slator, and Louise Guthrie, Electric
Words: Dictionaries, Computers, and Meanings, MIT
Press, Cambridge Mass (1996).
93
Words, Concepts, Representations
• Words are not Concepts
• Concept: a unit of thought or reasoning
– (from Random House Webster)
– 1. a general notion or idea; conception.
– 2. an idea of something formed by mentally combining all its
characteristics or particulars; a construct.
– 3. a directly conceived or intuited object of thought.
• In an ontology a “concept” is only that which is represented by the
elements of the ontology (types, relations, instances, rules,
functions). These are the things that are manipulated by a
reasoning system
• The “representandum”
• Words are not representanda.
94
Words Label Concepts
• Ambiguity: the same word labels multiple
concepts
• Synonymy: more than one word labels the
same concept
• Context-sensitive usage: the same word in
different contexts can label different concepts
• An ontology organizes representations of
concepts – mapping to words is a different
task.
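The distinction between concepts and the words that label them can be sketched as a pair of lookup tables. This is a minimal illustration only: the concept identifiers and the second sense assumed for "Siamese" are hypothetical and are not taken from any particular ontology.

```python
from collections import defaultdict

# Hypothetical (word, language, concept) labelings used only for illustration.
LABELS = [
    ("house", "en", "House"),
    ("Haus", "de", "House"),
    ("maison", "fr", "House"),
    ("Siamese", "en", "SiameseCat"),
    ("Siamese", "en", "SiameseLanguage"),  # assumed second sense
]


def build_indexes(labels):
    """Return (word -> concepts, concept -> words) lookup tables."""
    word_to_concepts = defaultdict(set)
    concept_to_words = defaultdict(set)
    for word, _lang, concept in labels:
        word_to_concepts[word].add(concept)
        concept_to_words[concept].add(word)
    return word_to_concepts, concept_to_words


word_to_concepts, concept_to_words = build_indexes(LABELS)

# Ambiguity: one word labels several concepts.
ambiguous = {w for w, cs in word_to_concepts.items() if len(cs) > 1}
# Synonymy: several words label one concept.
synonymous = {c for c, ws in concept_to_words.items() if len(ws) > 1}

print(ambiguous, synonymous)
```

Keeping the label tables separate from the concept identifiers means the words can change without touching the ontology itself.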
95
Concepts vs. Words
[Figure: a node-and-link graph with three labeled regions: Mathematical Theory, Ontological Theory (Meanings), and Terminology. Labeled nodes: Cat, Siamese, House.]
• Terms linked to House: “House”, “Residential House”, “Haus”, “maison”, “дом”
• Terms linked to Siamese: “Siamese”, “Siamese feline”, “Siamese Cat”, “chat siamois”, “Siamesische Katze”, “シャム猫”
• Axioms:
– (Every TypicalCat has ((exactly 4) Legs))
– (Every House has ((atLeast 1) Door))
96
Contexts
• Q: Isn’t context important?
• A: Very. Existing ontologies have modules,
contexts, or similar mechanisms
(“microtheories”). More elaborate contextual
reasoning may be necessary.
97
Some Primitives from Wierzbicka
• I, YOU, SOMEONE, SOMETHING, THIS, THE SAME,
THINK, WANT, KNOW, SAY, DO, HAPPEN, GOOD, BAD,
WHEN/TIME, WHERE/PLACE, BECAUSE, NOT, MAYBE,
LIKE, KIND OF, PART OF.
98
Meaning via Human Interpretation
• Nirenburg and Raskin: Ontological Semantics, MIT
Press, 2004
– “Meaning should be studied and represented”
– Meaning needs to be “anchored in extralinguistic reality”
but the “verificationist premise” of Procedural Semantics is
not shared
– In Ontological Semantics meaning is intensional. “…
meaning is a statement in the Text-Meaning
Representation (TMR) language”
– “The connection between the outside world … and
Ontological Semantics … is carried out through the
mediation of the human acquirer of the static knowledge
sources.”
99
Longman Definitions: “obligation”
• See: http://www.ldoceonline.com/
• Obligation: a moral or legal duty to do something
• Duty: something that you have to do because it is
morally or legally right
• Have to: if you have to do something, you must do it
because it is necessary or because someone makes you
do it.
• Must: to have to do something because it is necessary
or important, or because of a law or order
• Necessary: something that is necessary is what you
need to have or need to do
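The chain of definitions on this slide can be read as a small graph in which each headword points to the other listed headwords used to define it. The sketch below, which assumes that reading and uses a hypothetical helper find_cycle, looks for a definitional loop. Words that are defined elsewhere, such as "need", are left out.

```python
# Headword -> other listed headwords used in its definition, read off the
# Longman entries on this slide. Words defined elsewhere are omitted.
DEFINED_WITH = {
    "obligation": ["duty"],
    "duty": ["have to"],
    "have to": ["must", "necessary"],
    "must": ["have to", "necessary"],
    "necessary": [],
}


def find_cycle(graph, start):
    """Return the first definitional loop reachable from start, or None."""
    path = []

    def visit(word):
        if word in path:
            # The loop is the part of the current path from the repeat onward.
            return path[path.index(word):] + [word]
        path.append(word)
        for used in graph.get(word, []):
            cycle = visit(used)
            if cycle:
                return cycle
        path.pop()
        return None

    return visit(start)


cycle = find_cycle(DEFINED_WITH, "obligation")
print(cycle)
```

A loop such as "have to" defined with "must" and "must" defined with "have to" marks the point where a dictionary stops reducing meaning further, which is where a set of primitive elements would have to take over.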
100
COSMO: “obligation”
<owl:Class rdf:ID="Obligation">
<rdfs:comment>A MentalObject that refers to some FutureSituation that the Agent having the Obligation may cause to happen or may refrain from doing; if the Agent does not
perform an Action to cause the FutureSituation to occur, then some negative consequence is likely to be incurred for failure to perform the Obligation. The type of negative
consequence (legal punishment, social condemnation, eternal damnation, pangs of conscience, being grounded by one's parents) will be characteristic of different types of Obligation.
Each Obligation is assigned by some Authority, which could be a person's own conscience (reflecting learned social mores), or the mores of the community. In the case of a Debt, the
Authority may be the person owing the debt and the person to whom the debt is owed, if the debt arises from some agreement or transaction.
An Obligation may be created in an ObligationCreatingEvent (which see).
The notion of 'Obligation' is too primitive to be easily described by simple relations. In essence, an 'obligation' is a relation of an Agent to an Event that is derived from a belief
about what kind of behavior is best in a situation. The exact formalization of this notion is still incomplete as of 0.49. See also 'ResponsibilitySituation' for a closely related concept.
Linguistically an Obligation is expressed in several ways:
'Tom has an obligation to do X'
'Tom is obliged to do X'
'Tom has a duty to do X'
'Doing X is Tom's (obligation/duty).'
'Tom ought to do X'
'Tom must do X'
'Tom should do X'
'Tom is responsible for doing X'
Similar phrases may be used to express an action that is not an Obligation, but is a prerequisite for some desired situation: 'In order to get into college, Tom must get good grades.'
The linguistic analyzer must recognize the discourse relations that distinguish obligations from prerequisites. The type 'Obligation' in COSMO represents only true Obligations.
Each instance of Obligation will represent an Action that the agent with the Obligation is obliged to perform or refrain from. When expressed linguistically, that action will be
prefaced by the word 'to', e.g. 'to drive no faster than 60 miles per hour'.
Cyc: A collection of microtheories; a subcollection of #$SupposedToBeMicrotheory. Each instance of the collection #$Obligation is a microtheory which contains assertions
describing what some agent (the #$obligatedAgents) is obliged to do, or make true, for one or more other agents, possibly including society in general. An obligation is the most general
case of some agent owing something to another. Obligations may be undertaken in conjunction with various kinds of #$Agreements. Unlike an agreement, however, an obligation
need not have a second known party (though some do). An obligation can exist and be understood without identifying another particular agent as the 'holder' of the obligation - and
that may be true, even if the beneficiary (#$obligationOwedTo) can be identified. For example, assuming that parents have an obligation to care for their children, it is not clear with
whom a parent has 'agreed' to take care of his or her child. Some common ways to incur an obligation are through social transactions (e.g. family duties, friendship, favors) or through
financial transactions (e.g. a #$PaymentObligation). In addition, obligations may be imposed on those who are subject to one or more instances of
#$CodeOfConduct, e.g.
#$SportsRulesOf-BoxingSportsEvent or #$OfficeCodeOfConductMt.
Corresponds to senses 2 and 3 and part of sense 1 of 'obligation' and sense 2 of 'duty' in WordNet:
NOTE that sense 2 is a state, and linguistically would be expressed by a phrase like 'under an obligation', rather than the word 'obligation' itself. Sense 3 should be a subtype, but
is not yet represented.
1. (14) duty, responsibility, obligation - (the social force that binds you to the courses of action demanded by that force; 'we must instill a sense of duty in our children'; 'every right
implies a responsibility; every opportunity, an obligation; every possession, a duty'- John D. Rockefeller Jr)
2. obligation - (the state of being obligated to do or pay something; 'he is under an obligation to finish the job')
3. obligation, indebtedness - (a personal relation in which one is indebted for a service or favor) </rdfs:comment>
(continued . . . .)
101
COSMO: “obligation” (continued)
<guid>bd58bfd0-9c29-11b1-9dad-c379636f7270</guid>
<rdfs:subClassOf rdf:resource="#Responsibility"/>
<rdfs:subClassOf rdf:resource="#RulesForConduct"/>
<rdfs:subClassOf rdf:resource="#NeedMicrotheory"/>
<rdfs:subClassOf>
<owl:Restriction>
<owl:onProperty rdf:resource="#wasAssignedByAuthority"/>
<owl:someValuesFrom rdf:resource="#Authority"/>
</owl:Restriction>
</rdfs:subClassOf>
<rdfs:subClassOf>
<owl:Restriction>
<owl:onProperty rdf:resource="#refersToExternalEntity"/>
<owl:someValuesFrom rdf:resource="#FutureSituation"/>
</owl:Restriction>
</rdfs:subClassOf>
<wordnet>obligation</wordnet>
<wnsense>obligation1n</wnsense>
<wnsense>obligation2n</wnsense>
<wnsense>obligation3n</wnsense>
<wordnet>duty</wordnet>
<wnsense>duty2n</wnsense>
</owl:Class>
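As a rough sketch of how a program might read the logical part of this class, the Python fragment below parses an abridged copy of the definition with the standard library XML parser and lists the named superclasses and the existential restrictions. The rdf:RDF wrapper with namespace declarations and the helper name describe_class are assumptions added so that the fragment parses on its own.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
OWL = "http://www.w3.org/2002/07/owl#"

# Abridged copy of the Obligation class, wrapped in rdf:RDF so that the
# namespace prefixes are declared (the wrapper is an assumption).
SNIPPET = f"""
<rdf:RDF xmlns:rdf="{RDF}" xmlns:rdfs="{RDFS}" xmlns:owl="{OWL}">
  <owl:Class rdf:ID="Obligation">
    <rdfs:subClassOf rdf:resource="#Responsibility"/>
    <rdfs:subClassOf>
      <owl:Restriction>
        <owl:onProperty rdf:resource="#wasAssignedByAuthority"/>
        <owl:someValuesFrom rdf:resource="#Authority"/>
      </owl:Restriction>
    </rdfs:subClassOf>
  </owl:Class>
</rdf:RDF>
"""


def describe_class(xml_text):
    """Return (named superclasses, [(property, filler)]) for the first class."""
    root = ET.fromstring(xml_text)
    cls = root.find(f"{{{OWL}}}Class")
    supers, restrictions = [], []
    for sub in cls.findall(f"{{{RDFS}}}subClassOf"):
        resource = sub.get(f"{{{RDF}}}resource")
        if resource is not None:
            # Plain subclass link to a named class.
            supers.append(resource.lstrip("#"))
            continue
        # Otherwise the superclass is an anonymous someValuesFrom restriction.
        restriction = sub.find(f"{{{OWL}}}Restriction")
        prop = restriction.find(f"{{{OWL}}}onProperty").get(f"{{{RDF}}}resource")
        filler = restriction.find(f"{{{OWL}}}someValuesFrom").get(f"{{{RDF}}}resource")
        restrictions.append((prop.lstrip("#"), filler.lstrip("#")))
    return supers, restrictions


supers, restrictions = describe_class(SNIPPET)
print(supers, restrictions)
```

The restriction pair can be read as "every Obligation was assigned by some Authority", which is the machine-usable counterpart of the prose in the comment.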
102
Specialists Will Want to Use
Specialized Terms in Definitions
• The “Controlled Defining Vocabulary” is
infinitely expandable.
• Probably, at least three levels will emerge:
– the basic irreducible defining vocabulary
– the general defining vocabulary, having common
terms which are defined (linguistically) by use of
the basic vocabulary, and specified logically within
the FO or some extension as FOL combinations of
the ontelms in the FO.
– specialized defining vocabularies, containing
terms of interest to specific domains, definable by
use of the general defining vocabulary
103
Definition Acceptance Hierarchy
• Executable Specification: Methods, Sequence, States
   (is used in)
• Axiomatic Ontology: Quasi-2nd Order, Function Terms
   (OpenCyc, SUMO, DOLCE)
   (accepts)
• Restricted FOL: OWL
   (accepts)
• Taxonomy/Thesaurus/Terminology
104
Economic Benefits of Semantic Interoperability
The practical significance of semantic interoperability has been measured by
several studies that estimate the cost (in lost efficiency) due to lack of
semantic interoperability. One study[1], focusing on the lost efficiency in
the communication of healthcare information, estimated that US$77.8
billion per year could be saved by implementing an effective
interoperability standard in that area. Other studies, of the construction
industry[2] and of the automobile manufacturing supply chain[3], estimate
costs of over US$10 billion per year due to lack of semantic
interoperability in those industries. In total these numbers can be
extrapolated to indicate that well over US$100 billion per year is lost
because of the lack of a widely used semantic interoperability standard in
the US alone.
[1] Jan Walker, Eric Pan, Douglas Johnston, Julia Adler-Milstein, David W. Bates and
Blackford Middleton, The Value of Healthcare Information Exchange and
Interoperability, Health Affairs, 19 January 2005
[2] http://www.bfrl.nist.gov/oae/publications/gcrs/04867.pdf
[3] http://www.nist.gov/director/prog-ofc/report99-1.pdf
105
Zipfean Distribution of Characters?
Characters    % of Text
640           80
1569          94
3234          99.2
4562          99.84
5659          99.968
Covering the first 80% of the text requires fewer Chinese characters
than covering 80% of each successive remainder.
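The arithmetic behind this observation can be checked directly from the figures on this slide. The sketch below computes, for each row, how many additional characters were needed and what share of the previously uncovered text they account for. The function name marginal_steps is a hypothetical helper, and the share is rounded to three decimals for readability.

```python
# (characters, percent of text covered) as listed on this slide.
COVERAGE = [
    (640, 80.0),
    (1569, 94.0),
    (3234, 99.2),
    (4562, 99.84),
    (5659, 99.968),
]


def marginal_steps(rows):
    """Return [(added characters, share of the remaining text covered)]."""
    steps = []
    prev_chars, prev_pct = 0, 0.0
    for chars, pct in rows:
        added_chars = chars - prev_chars
        # Fraction of the text still uncovered before this step
        # that the newly added characters account for.
        share_of_remainder = (pct - prev_pct) / (100.0 - prev_pct)
        steps.append((added_chars, round(share_of_remainder, 3)))
        prev_chars, prev_pct = chars, pct
    return steps


for added, share in marginal_steps(COVERAGE):
    print(added, share)
```

Every later step adds more characters than the initial 640, which is the pattern the slide describes.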
106
Zipfean Distribution of Characters?
Characters    % of Text
1153          90
3050          99
4926          99.9
6223          99.99
107