OpenCyc (Lenat, 1995) is the open source
version of the Cyc technology, one of the most complete common-sense
knowledge bases. It is currently being considered by the IEEE as a
standard upper ontology.
The next version of OpenCyc (ver 1.0) is expected to include the entire
Cyc ontology (more than 300,000 terms), concepts in other languages
(translation capabilities), and new XML features to extend the Cyc
ontology with other schemas (such as the newest semantic web ontology
languages).
The main advantage that Cyc has over other knowledge bases is the
language in which its knowledge is written, CycL, whose syntax derives
from first-order predicate calculus and from Lisp. CycL is expressive
enough to write common-sense rules, whereas description logics are
less expressive (they are a subset of first-order logic). In addition,
CycL has a clear syntax that favors good performance in inference
tasks, in contrast to the XML syntax of the semantic web language
specifications. Thousands of terms and rules are written in CycL,
and together they build the different layers of the knowledge base.
OpenCyc is also available in a restricted version written in OWL,
which includes fewer terms and no rules at all. Because the taxonomy
of the OWL knowledge base is so large, the most popular reasoners
need too much time to carry out even simple inferences. The fourth
section explains the complications we found.
The following sections give a brief introduction to the knowledge
base structure and the storage techniques.
Knowledge Base Schema
A common-sense knowledge base is a vast
taxonomy of concepts and relations. OpenCyc has a pyramidal layer
structure that ranges from abstract, general concepts and the
relationships between them down to facts about particular individuals
(figure 5):
Figure 5 - OpenCyc Knowledge Base Layers
- Upper Ontology: represents very general relations between very
general concepts. The Upper Ontology doesn’t say much about
the world at all.
- Core Theories: represent general facts about space, time, and
causality. These are the theories that are essential to almost all
common-sense reasoning.
- Domain-Specific Theories: These theories apply to special areas
of interest like military movement, the propagation of diseases,
finance, chemistry, etc.
- Facts: These are statements about particular individuals in the
world.
OpenCyc Ontology Storage
As mentioned above, OpenCyc uses CycL to represent
knowledge. Although natural language is more expressive
than CycL, it does not lend itself to automated reasoning
and poses particular problems for storing knowledge. In contrast,
CycL is a formal logic language that allows automated inference
without losing the expressiveness of a first-order
logic model. Not everything can be described in first-order
logic, but that problem is beyond the scope of these technical
notes. For a detailed discussion, see (CYCL ref).
The OpenCyc knowledge base provides a large universe of discourse
made up of facts, predicates and rules written as well-formed formulas.
a) Terms:
- Constants can denote individuals, collections, or collections
of collections, e.g., the individuals #$GeorgeWBush and #$Sudan or the
collection #$WorldLeader.
- Functions take arguments and return results, e.g., (#$PresidentFn
#$France) returns #$JacquesChirac, and (#$GroupFn #$Person) returns
the collection of all groups of persons (Americans, smokers, athletes, etc.).
- Variables take values from the universe of discourse, e.g.,
?X or ?Y. A variable is a term used as an argument of predicates or
functions.
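Because CycL borrows its parenthesized syntax from Lisp, an expression can be read into a nested list with very little code. The following Python sketch is only an illustration, not part of OpenCyc: the names `parse` and `kind` are hypothetical, and the classification relies solely on the `#$` and `?` prefixes described above.

```python
import re

# A token is a single parenthesis or a run of non-space, non-parenthesis characters.
TOKEN = re.compile(r"[()]|[^\s()]+")


def parse(text):
    """Parse one CycL expression into nested Python lists of token strings."""
    tokens = TOKEN.findall(text)
    pos = 0

    def read():
        nonlocal pos
        if pos >= len(tokens):
            raise ValueError("unexpected end of input")
        tok = tokens[pos]
        pos += 1
        if tok == "(":
            items = []
            while pos < len(tokens) and tokens[pos] != ")":
                items.append(read())
            if pos >= len(tokens):
                raise ValueError("missing closing parenthesis")
            pos += 1  # consume the closing parenthesis
            return items
        if tok == ")":
            raise ValueError("unexpected closing parenthesis")
        return tok

    expr = read()
    if pos != len(tokens):
        raise ValueError("trailing tokens after expression")
    return expr


def kind(term):
    """Classify a parsed term by its prefix (hypothetical helper)."""
    if isinstance(term, list):
        return "expression"
    if term.startswith("#$"):
        return "constant"
    if term.startswith("?"):
        return "variable"
    return "literal"
```

For instance, `parse("(#$PresidentFn #$France)")` yields a two-element list whose items are both classified as constants by `kind`.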
b) Basic predicates:
- #$isa is the most basic predicate in CycL. It states
that something is an instance of a collection. Everything is an
instance of at least one collection.
(#$isa #$Golf #$Sport)
- The #$genls predicate states that one collection is a sub-collection
of another.
(#$genls #$Cat #$Mammal)
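The interplay between the two predicates can be sketched in Python: a term is an instance of a collection if it is asserted to be an instance of that collection or of any of its sub-collections. This is a minimal sketch under stated assumptions, not how OpenCyc stores or indexes its knowledge. The function names are hypothetical, and the `Activity` super-collection is invented purely for illustration.

```python
from collections import defaultdict

isa = defaultdict(set)    # term -> collections it is asserted to be an instance of
genls = defaultdict(set)  # collection -> its asserted direct super-collections

isa["Golf"].add("Sport")        # from the text: (#$isa #$Golf #$Sport)
genls["Sport"].add("Activity")  # hypothetical assertion, assumed for illustration


def all_genls(collection):
    """Return the collection plus every super-collection reachable via genls."""
    seen, stack = set(), [collection]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(genls[current])
    return seen


def is_instance(term, collection):
    """True if term is an instance of collection, directly or through genls."""
    return any(collection in all_genls(c) for c in isa[term])
```

With these assertions, `is_instance("Golf", "Activity")` holds even though it was never stated directly, because the instance relation is inherited upward along the sub-collection chain.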
CycL has predicates that describe the syntactic and semantic
conditions for writing well-formed sentences. These are #$arity
and the argument-type predicates #$arg1Isa, #$arg2Isa, and so on:
- #$arity denotes the number of arguments that a predicate must
have, for example:
(#$arity #$biologicalMother 2)
- #$arg1Isa, #$arg2Isa, and so on define the type required of the
corresponding argument of a predicate, for example:
(#$arg1Isa #$biologicalMother #$Animal)
(#$arg2Isa #$biologicalMother #$FemaleAnimal)
CycL has a simple syntax for asserting facts, which are built from the
predicates defined above. For example, given the argument constraints
above, in which the second argument is the mother, “Luisa is the
biological mother of Anne” is expressed in CycL as:
(#$biologicalMother #$Anne #$Luisa)
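The role of the arity and argument-type constraints can be illustrated with a small well-formedness check. The sketch below is written in Python under explicit assumptions: the constraint tables mirror the assertions shown above, while the `ISA` table of type facts and the name `check_sentence` are hypothetical and exist only for this example.

```python
# Constraints taken from the text.
ARITY = {"biologicalMother": 2}
ARG_ISA = {"biologicalMother": {1: "Animal", 2: "FemaleAnimal"}}

# Hypothetical type facts: each term maps to the collections it belongs to.
ISA = {
    "Luisa": {"Animal", "FemaleAnimal"},
    "Anne": {"Animal", "FemaleAnimal"},
    "Golf": {"Sport"},
}


def check_sentence(sentence):
    """Return a list of constraint violations; an empty list means well-formed."""
    predicate, *args = sentence
    if predicate not in ARITY:
        return [f"unknown predicate {predicate}"]
    if len(args) != ARITY[predicate]:
        return [f"{predicate} expects {ARITY[predicate]} arguments, got {len(args)}"]
    errors = []
    for position, required in ARG_ISA.get(predicate, {}).items():
        term = args[position - 1]
        if required not in ISA.get(term, set()):
            errors.append(f"argument {position} ({term}) is not known to be a {required}")
    return errors
```

A sentence with a missing argument, such as `("biologicalMother", "Luisa")`, is rejected on arity alone, and one that puts `Golf` in the first position is rejected because `Golf` is not known to be an `Animal`.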
c) Well formed formulas:
(#$forAll ?COUNTRY
  (#$forAll ?PERSON
    (#$implies
      (#$and
        (#$isa ?COUNTRY #$Superpower)
        (#$headsGovernment ?COUNTRY ?PERSON))
      (#$hasStatus ?PERSON #$WorldLeader))))
(#$forAll ?ANIMAL
  (#$implies
    (#$isa ?ANIMAL #$Vertebrate)
    (#$thereExists ?PART
      (#$and
        (#$isa ?PART #$Tongue)
        (#$anatomicalParts ?ANIMAL ?PART)))))
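To show how a universally quantified implication such as the first formula can drive inference, the following Python sketch applies one rule to a set of ground facts by binding its variables. It is a simplified illustration, not the Cyc inference engine: the names `match` and `apply_rule` are hypothetical, the individuals `CountryA`, `CountryB`, `PersonA` and `PersonB` are invented, and existential quantification (as in the second formula) is not handled.

```python
def match(pattern, fact, bindings):
    """Match a flat pattern against a ground fact; return new bindings or None."""
    if len(pattern) != len(fact):
        return None
    bindings = dict(bindings)
    for p, f in zip(pattern, fact):
        if p.startswith("?"):
            # Bind the variable, or check it against its earlier binding.
            if bindings.setdefault(p, f) != f:
                return None
        elif p != f:
            return None
    return bindings


def apply_rule(antecedents, consequent, facts):
    """Return the facts derived by applying one implication once."""
    candidates = [{}]
    for pattern in antecedents:
        extended_candidates = []
        for bindings in candidates:
            for fact in facts:
                extended = match(pattern, fact, bindings)
                if extended is not None:
                    extended_candidates.append(extended)
        candidates = extended_candidates
    return {tuple(b.get(term, term) for term in consequent) for b in candidates}


# Hypothetical ground facts, assumed only for this example.
facts = {
    ("isa", "CountryA", "Superpower"),
    ("headsGovernment", "CountryA", "PersonA"),
    ("headsGovernment", "CountryB", "PersonB"),
}

derived = apply_rule(
    [("isa", "?COUNTRY", "Superpower"), ("headsGovernment", "?COUNTRY", "?PERSON")],
    ("hasStatus", "?PERSON", "WorldLeader"),
    facts,
)
```

Here `derived` contains a single fact stating that `PersonA` has the status `WorldLeader`. `PersonB` is not included, because `CountryB` is not asserted to be a superpower.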