The IEEE LOM standard (LTSC, 2002) defines several elements that cover
the different kinds of metadata used to describe a learning object. LOM
stores all knowledge about a learning object in these fields as natural
language. Because the fields are only human-readable, they are
expressive but do not support automated reasoning over metadata
records.
As an example, the coverage field could link to ontology concepts
such as “Baroque” or “Renaissance period”,
inserting new facts that enable reasoning and make more powerful
search methods possible. The assertions inserted in a field must be
expressed in a logic language.
Marvin Minsky said: “People have silly reasons why computers
don't really think. The answer is we haven't programmed them right;
they just don't have much common sense.” Minsky's assertion leads to
the key question: how can common sense be built into a computer's
reasoning process? To answer this question, and to enable computers
to understand and reason about the world as intimately as people do,
Cyc Corporation has carried out a large research effort over recent
years.
Although the release candidate was scheduled for April 2006 and then
delayed to August, it has still not been launched. Because it is not
yet available, we have had to use version 0.9.
Our research explores the integration of common sense reasoning
into SLOR using the OpenCyc knowledge base. In particular, we have
studied two ways to integrate OpenCyc into SLOR:
• First approach: an OpenCyc server accessed through the OpenCyc
Java API.
• Second approach: the OpenCyc knowledge base stored in an RDBMS,
such as MySQL or Oracle, and managed by RDF frameworks using
semantic technologies.
OpenCyc Java API
When the OpenCyc server starts up, it creates a new instance that
opens the default port (3600) and then shows the CycL prompt
“CYC(1):” at the command line.
We can connect to an OpenCyc server instance through the OpenCyc
Java API. To open a new connection, we need an instance of the
CycAccess class. The class constructor receives the connection
parameters (server address, server port, connection protocol and
type of connection) used to set up the connection to the server.
We can call the traceOn method if we want to enable status messages
on the server console.
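import java.io.IOException;
import org.opencyc.api.CycAccess;
import org.opencyc.api.CycConnection;

public class OpencycController
{
    // Connection to the OpenCyc server
    CycAccess _cycaccess;

    /** Creates a new instance of OpencycController */
    public OpencycController() throws IOException
    {
        // Server address, server port, connection protocol, connection type
        _cycaccess = new CycAccess("192.168.0.31",
                                   3600,
                                   CycConnection.BINARY_MODE,
                                   CycAccess.PERSISTENT_CONNECTION);
        // Enable status messages on the server console
        _cycaccess.traceOn();
    }
    //…
}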
Connect to the OpenCyc Server
We can also send CycL commands, such as ‘cyc-query’ queries,
to retrieve inferred data from the knowledge base. The process has
four steps, illustrated in the example below:
1) Declare all variables by means of the CycVariable class.
2) Build the query using the CycAccess object.
3) Link the declared variables to the query through the
CycObjectFactory class.
4) Finally, send the query through the CycAccess object and collect
all results in a CycList.
import java.io.IOException;
import java.util.ArrayList;
import java.util.Iterator;
import org.opencyc.api.CycAccess;
import org.opencyc.api.CycObjectFactory;
import org.opencyc.cycobject.CycConstant;
import org.opencyc.cycobject.CycList;
import org.opencyc.cycobject.CycVariable;

public class OpencycController
{
    CycAccess _cycaccess;

    public ArrayList getLivingLanguages() throws IOException
    {
        // Query: every ?X that is an instance of #$LivingLanguage
        CycList query = _cycaccess.current().makeCycList("(#$isa ?X #$LivingLanguage)");
        // Variable whose bindings we want back
        CycVariable languageVariable = CycObjectFactory.makeCycVariable("?X");
        // Microtheory in which the query is asked
        CycConstant mt = this._cycaccess.getConstantByName("InferencePSC");
        // Send the query and collect the bindings of ?X
        CycList response =
            _cycaccess.current().askWithVariable(query, languageVariable, mt);
        ArrayList results = new ArrayList();
        Iterator iterator = response.iterator();
        while (iterator.hasNext())
        {
            // Each binding is a Cyc constant; keep only its name
            CycConstant item = (CycConstant) iterator.next();
            results.add(item.getName());
        }
        return results;
    }
    // …
}
Simple inference: an example of living languages retrieval from OpenCyc
We evaluated several queries and recorded their response times. The
average response time was satisfactorily low, which makes this
approach a strong candidate for the final SLOR design.
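A measurement of this kind can be sketched as follows. This is only a minimal illustration using the Java standard library: the `QueryTimer` class is a hypothetical helper, not part of SLOR or of the OpenCyc API, and the query is passed in as a generic `Supplier` so that any call (for instance one wrapping a living-languages lookup, with its checked exceptions handled inside) could be timed.

```java
import java.util.function.Supplier;

/** Hypothetical helper (not part of SLOR or OpenCyc): times repeated runs of a query. */
final class QueryTimer {
    private QueryTimer() {}

    /**
     * Runs the query the given number of times and returns the
     * mean elapsed time per run, in milliseconds.
     */
    static double averageMillis(Supplier<?> query, int runs) {
        if (runs <= 0) {
            throw new IllegalArgumentException("runs must be positive");
        }
        long totalNanos = 0L;
        for (int i = 0; i < runs; i++) {
            long start = System.nanoTime();
            query.get();                       // execute the query under test
            totalNanos += System.nanoTime() - start;
        }
        return totalNanos / 1_000_000.0 / runs; // nanoseconds to milliseconds, averaged
    }
}
```

Averaging over several runs smooths out one-off costs such as connection warm-up, although it does not replace a proper benchmark.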
Are all metadata records written in CycL language?
There is no easy answer. OpenCyc has a closed model that does not
bind to ontology concepts written in other languages. Although
several research projects are studying the interaction of OpenCyc
with other thesauri, corpora or ontology schemas to improve common
sense reasoning, no real implementation exists yet, in particular
for the interaction with Semantic Web schemas.
In order to provide a flexible schema that enables the reasoning
capabilities offered by both the OpenCyc knowledge base and the
Semantic Web technologies, we have studied two different approaches
to store the knowledge within learning object metadata records:
- Storing the metadata records in OpenCyc format: In this approach,
we have chosen to write the SLOR ontology in CycL as a microtheory
in the OpenCyc knowledge base, because of the high performance offered
by the OpenCyc engine. The only problems are linking to external
concepts and reasoning with them. The paper (Reed and Lenat) presents
a study of ontology mappings into OpenCyc; we chose the same way of
linking to concepts in other ontologies, through predicates of the
following form, despite the fact that it requires many assertions:
(synonymousExternalConcept TERM SOURCE STRING)
(overlappingExternalConcept TERM SOURCE STRING)
(extConceptOverlapsColAndReln COL RELN SOURCE STRING)
A future research direction is to extend OpenCyc inference so that it
works together with ontologies written in different languages.
- A blended model between OpenCyc and OWL ontologies: We postulated
the following hypothesis: if all the information is stored in the
same format (RDF), we can increase the reasoning capabilities, because
concepts from different ontologies can be linked and created
without a mapping process. Following this hypothesis, we have
serialized the OpenCyc OWL file into a relational database using a
semantic web framework (the next section explains this process in
depth). Although this approach solves the problems described for the
previous approach (since all knowledge is stored in the same format),
the reasoning capabilities are considerably lower. The OWL language
only provides a subset of first-order logic, called description logic
(Baader, et al. 2003), materialized in the OWL-DL specification.
However, this problem is less important now that new rule languages
(e.g. SWRL, which combines OWL and RuleML (Horrocks, et al. 2002))
improve the reasoning capabilities by providing new inference
features. To enable a machine-readable description within learning
object metadata records, the knowledge must be written in a logic
language. In addition, the techniques used must preserve the OWL
axioms. We therefore conclude that a sublanguage is required to
create OWL relationships among the different fields of a metadata
record.
We have developed a sublanguage to insert semantic expressions into
LOM metadata fields. For a detailed discussion, see (Sicilia, Sánchez-Alonso,
and Soto, 2005).
Figure 7 - SLOR Sublanguage: Adding semantic expressions within metadata records
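Returning to the first approach, the external-concept mapping predicates listed there (synonymousExternalConcept, overlappingExternalConcept and extConceptOverlapsColAndReln) could be generated programmatically. The sketch below only formats the assertions as CycL text; the `ExternalConceptMapping` class and the example term and source names are hypothetical, and we assume that the term and source arguments are Cyc constants written with the `#$` prefix, as in the query shown earlier.

```java
/** Hypothetical helper: formats external-concept mapping assertions as CycL text. */
final class ExternalConceptMapping {
    private ExternalConceptMapping() {}

    /** (synonymousExternalConcept TERM SOURCE STRING) */
    static String synonymous(String term, String source, String externalName) {
        return assertion("synonymousExternalConcept", externalName, term, source);
    }

    /** (overlappingExternalConcept TERM SOURCE STRING) */
    static String overlapping(String term, String source, String externalName) {
        return assertion("overlappingExternalConcept", externalName, term, source);
    }

    /** (extConceptOverlapsColAndReln COL RELN SOURCE STRING) */
    static String overlapsColAndReln(String col, String reln, String source, String externalName) {
        return assertion("extConceptOverlapsColAndReln", externalName, col, reln, source);
    }

    // Assumption: every argument except the final string is a Cyc constant (#$ prefix).
    private static String assertion(String predicate, String externalName, String... constants) {
        StringBuilder sb = new StringBuilder("(#$").append(predicate);
        for (String constant : constants) {
            sb.append(" #$").append(constant);
        }
        // Escape backslashes and quotes so the external name stays a valid string literal
        String escaped = externalName.replace("\\", "\\\\").replace("\"", "\\\"");
        return sb.append(" \"").append(escaped).append("\")").toString();
    }
}
```

For example, `ExternalConceptMapping.synonymous("LivingLanguage", "SomeExternalOntology", "living-language")` yields `(#$synonymousExternalConcept #$LivingLanguage #$SomeExternalOntology "living-language")`, where `SomeExternalOntology` is an invented source name. One such assertion is needed per mapped concept, which is why this approach requires so many assertions.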
OpenCyc integrated in RDF Frameworks
OpenCyc can be integrated into a semantic web framework by making
use of the OWL file published on the Web (more than 700MB). We have
worked with Jena and Sesame, two of the most widely used semantic
web frameworks.
When a new OpenCyc OWL model is created with Jena, all triples
are stored in the stmt table (Figure 8).
Figure 8 - Snapshot of Stmt Table View (OpenCyc stored in Jena)
Sesame presents a different view: when a new OpenCyc OWL model is
created, the triples are stored in its Triples table, which is
tightly coupled to the relational system (Figure 9).
Figure 9 - Snapshot of Triples Table View (OpenCyc stored in Sesame)
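To illustrate what storing a model as rows of triples amounts to, the following sketch keeps subject, predicate and object strings in a single in-memory list and answers pattern queries over it. It is only an analogy written with the Java standard library: the `TripleTable` class and the example identifiers are hypothetical and do not reproduce the actual Jena or Sesame table schemas.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical in-memory stand-in for a statements table (not the Jena or Sesame schema). */
final class TripleTable {

    /** One row of the table: subject, predicate, object. */
    static final class Triple {
        final String subject;
        final String predicate;
        final String object;

        Triple(String subject, String predicate, String object) {
            this.subject = subject;
            this.predicate = predicate;
            this.object = object;
        }

        @Override
        public String toString() {
            return subject + " " + predicate + " " + object;
        }
    }

    private final List<Triple> rows = new ArrayList<>();

    /** Appends one statement as a new row. */
    void add(String subject, String predicate, String object) {
        rows.add(new Triple(subject, predicate, object));
    }

    /**
     * Returns the rows matching the pattern; a null argument acts as a
     * wildcard, like an unconstrained column in a SELECT statement.
     */
    List<Triple> find(String subject, String predicate, String object) {
        List<Triple> matches = new ArrayList<>();
        for (Triple t : rows) {
            boolean ok = (subject == null || subject.equals(t.subject))
                      && (predicate == null || predicate.equals(t.predicate))
                      && (object == null || object.equals(t.object));
            if (ok) {
                matches.add(t);
            }
        }
        return matches;
    }
}
```

Every lookup here scans all rows; a relational back end replaces that scan with indexed queries, which is what makes a model with many triples manageable at all.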
To create the OpenCyc model, it is necessary to open a new connection
to a database system. Focusing on Jena, the code below shows how
the OpenCyc OWL file can be stored in a relational database
and how it can be connected to a Jena model.
import com.hp.hpl.jena.db.DBConnection;
import com.hp.hpl.jena.db.IDBConnection;
import com.hp.hpl.jena.db.ModelRDB;
import com.hp.hpl.jena.db.RDFRDBException;
import com.hp.hpl.jena.rdf.model.Model;
import com.hp.hpl.jena.rdf.model.ModelFactory;
import com.hp.hpl.jena.rdf.model.ModelMaker;

public class OntoSchemaFactory
{
    //…
    public static Model createOpenCycModel() throws PersistenceException
    {
        // Load the JDBC driver
        try
        {
            Class.forName("com.mysql.jdbc.Driver");
        }
        catch (ClassNotFoundException ex)
        {
            throw new SystemPersistenceException();
        }
        String DB_URL2 = "jdbc:mysql://192.168.0.31/opencyc"; // URL of database server
        String DB_USER = "root";    // database user id
        String DB_PASSWD = "senux"; // database password
        String DB = "MySQL";        // database type
        // Create database connection
        IDBConnection conn2 = new DBConnection(DB_URL2, DB_USER, DB_PASSWD, DB);
        try
        {
            ModelMaker c = ModelFactory.createModelRDBMaker(conn2);
            if (!conn2.containsModel("opencyc"))
            {
                // First run: create the model and load the OpenCyc OWL file into it
                Model created = c.createModel("opencyc");
                System.out.println("BUILDING THE MODEL");
                created.read("http://www.cyc.com/2004/06/04/cyc/#");
            }
            // Open the persistent model, whether just created or already stored
            return ModelRDB.open(conn2, "opencyc");
        }
        catch (RDFRDBException ex1)
        {
            throw new ConnectPersistenceException();
        }
    }
    // …
}
Jena Persistent Framework - Connecting to the underlying database system.
Creating a Jena Model
Once the persistent connection has been established, the model
parameters should be set to enable transactions both for changing
the schema and for appending new information. In addition, an OWL
reasoner can be enabled to discover knowledge hidden among
ontological relationships. Jena provides several options for setting
the reasoning level used by its built-in reasoner. In particular,
there are four reasoning levels, ranging from simple to complex:
Reasoning Level | Description
TRANS_INF | Uses a simple transitive reasoner.
RDFS_INF | Uses a reasoner that enables the RDFS entailments, including transitive reasoning.
MICRO_RULE_INF | Uses a rule-based reasoner that attempts to make a useful trade-off between semantic completeness and computational efficiency. (Recommended for OWL)
RULE_INF | Uses a rule-based reasoner that covers the semantic model most fully. This level requires an inference engine to perform more complex inferences. Rules written in SWRL can be triggered by the inference engine to provide the results requested by the reasoner.
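The following code example shows how a Jena model can be created
and its initial parameters, such as the reasoning level, set:
// Persistent base model backed by the relational database
Model mbase = OntoSchemaFactory.createOpenCycModel();
// m is the ontology model; the spec selects the micro-rule reasoning level
m = ModelFactory.createOntologyModel(OntModelSpec.OWL_MEM_MICRO_RULE_INF,
                                     mbase);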
Although this looks good in theory, and small datasets gave
acceptable response times for several inferences, we have
experienced a dramatic decrease in performance when an OpenCyc
model is associated with a reasoner. Why does this happen?
The reasoner carries out all inference processes needed to discover
new assertions. In big models such as OpenCyc, the large number of
relationships complicates the inference tasks, which increases
query response times. We verified in the first versions of the SLOR
prototype that this makes applications unusable (retrieving some
data takes up to 30 seconds). For this reason, section four
discusses several solutions to these problems.
Retrieving inferred data
To retrieve inferred data, we have used the generic classes and
interfaces provided by the Jena API. As shown in the example below,
we can retrieve all direct and indirect subclasses of a given class.
First, we retrieve the given ontology class from the model (through
the getOntClass method of the OntModel class), which is needed to
execute the inference. Second, if the reference is not null, we call
the listSubClasses method of the OntClass interface, which returns
all direct and indirect subclasses when its flag parameter is false,
or only the direct ones when it is true. The listSubClasses method
returns an “extended iterator” that is used to move forward through
the inference results. Finally, in each iteration we add the name of
the current class (returned by the next method of the
ExtendedIterator interface) to the result list.
With OpenCyc, even this simple inference takes a long time, due to
the large amount of data stored in the underlying persistent model.
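import java.util.LinkedList;
import com.hp.hpl.jena.ontology.OntClass;
import com.hp.hpl.jena.util.iterator.ExtendedIterator;

public void getOPENCYCDerivedClasses(String dclass, LinkedList listClasses)
{
    // Look up the ontology class in the inference-enabled model m
    OntClass c = m.getOntClass(Schema.NSCYC + dclass);
    if (c != null)
    {
        // false: include indirect subclasses as well as direct ones
        ExtendedIterator i = c.listSubClasses(false);
        while (i.hasNext())
        {
            OntClass subc = (OntClass) i.next();
            // Store the name of the subclass just retrieved
            listClasses.add(subc.getLocalName());
        }
    }
}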
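The difference between direct and indirect subclasses can be made concrete without a running Jena store. The sketch below is a standard-library illustration only: the `SubclassClosure` class and the class names in the usage note are hypothetical, and the `direct` flag is modelled on our understanding of the Jena parameter (true restricts the answer to direct subclasses, false follows the hierarchy transitively).

```java
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical stand-in for subclass retrieval over an explicit class hierarchy. */
final class SubclassClosure {
    private SubclassClosure() {}

    /**
     * Returns the subclasses of root. With direct == true only the
     * immediate subclasses are returned; otherwise the hierarchy is
     * traversed breadth-first to collect indirect subclasses as well.
     */
    static Set<String> listSubClasses(Map<String, List<String>> directSubs,
                                      String root, boolean direct) {
        Set<String> result = new LinkedHashSet<>();
        Deque<String> pending = new ArrayDeque<>(
                directSubs.getOrDefault(root, Collections.<String>emptyList()));
        while (!pending.isEmpty()) {
            String current = pending.removeFirst();
            // Skip the root itself (cycles) and classes already collected
            if (current.equals(root) || !result.add(current)) {
                continue;
            }
            if (!direct) {
                // Follow the hierarchy one level further down
                pending.addAll(directSubs.getOrDefault(current, Collections.<String>emptyList()));
            }
        }
        return result;
    }
}
```

With a hierarchy in which Language has the direct subclasses NaturalLanguage and ArtificialLanguage, and NaturalLanguage has the direct subclass LivingLanguage, the direct call on Language returns the first two classes only, while the transitive call also returns LivingLanguage. The transitive traversal has to visit every reachable class, which hints at why the same operation becomes expensive on a model as large as OpenCyc.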