Interview with the OpenCyc Guys

Eugenia Loli 2007-02-21 Original OSNews Interviews 8 Comments

Artificial Intelligence has been at the center of every geek’s dreams for years. One of the projects that comes closest to true AI is Cyc. The open source version of the commercial Cyc product is called OpenCyc, and it reached v1.0 status last year. Their mission is to grow both the Cyc and OpenCyc ontology and knowledge base, even if they are not directly affiliated with Cycorp (the original creators of the Cyc technology). The answers to our mini-interview are provided by project members Mark Baltzegarm, John De Oliveira and Brad Bouldin.

1. Could you describe the Cyc technology in a few simple terms? How does it all work?

Cyc is like an unabridged dictionary for computers. Software applications can use Cyc to look up what something means and then react intelligently.

A lot of what computers do today is simply push strings of text around. They react to the presence of certain strings by assembling other strings and then blindly displaying them. They do this, of course, without any real sense of the underlying meaning. This is all that happens, for example, in social tagging applications.

In the pursuit of intelligent behavior, software may use math, too. Google’s PageRank and Amazon’s product recommendations appear semi-intelligent in the “choices” they make through various statistical algorithms.

Unlike Google and Amazon, Cyc explicitly represents knowledge. It relies on concepts, facts and rules to do reasoning which is transparent and traceable. There are no “magic” algorithms that leave one wondering how something happened. One can trace and debug Cyc’s reasoning every step of the way. And, since this chain of thought is both transparent and manipulable, one can later layer in new tactics and strategies for solving problems as they become available.
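To make the idea of traceable reasoning concrete, here is a minimal sketch of forward chaining over facts and rules that records why each conclusion was reached. It is not Cyc's engine and does not use CycL; the tuple format, the rule names and the helper functions (`match`, `substitute`, `forward_chain`, `explain`) are all assumptions made up for illustration.

```python
# Toy knowledge base: NOT Cyc's engine or CycL, just an illustration.
facts = {("isa", "Fido", "Dog")}

# Each rule is (name, premise pattern, conclusion pattern); "?x" is a variable.
rules = [
    ("dogs-are-mammals", ("isa", "?x", "Dog"), ("isa", "?x", "Mammal")),
    ("mammals-are-animals", ("isa", "?x", "Mammal"), ("isa", "?x", "Animal")),
]


def match(pattern, fact):
    """Return variable bindings if the fact fits the pattern, else None."""
    if len(pattern) != len(fact):
        return None
    bindings = {}
    for p, f in zip(pattern, fact):
        if p.startswith("?"):
            # A variable must bind to the same value everywhere it appears.
            if bindings.setdefault(p, f) != f:
                return None
        elif p != f:
            return None
    return bindings


def substitute(pattern, bindings):
    """Replace variables in the pattern with their bound values."""
    return tuple(bindings.get(term, term) for term in pattern)


def forward_chain(facts, rules):
    """Derive new facts until nothing changes; record why each was derived."""
    known = set(facts)
    why = {}  # derived fact -> (rule name, supporting fact)
    changed = True
    while changed:
        changed = False
        for name, premise, conclusion in rules:
            for fact in list(known):
                bindings = match(premise, fact)
                if bindings is None:
                    continue
                derived = substitute(conclusion, bindings)
                if derived not in known:
                    known.add(derived)
                    why[derived] = (name, fact)
                    changed = True
    return known, why


def explain(fact, why):
    """Return the reasoning steps that lead to the fact, starting from a given fact."""
    steps = []
    while fact in why:
        rule, support = why[fact]
        steps.append(f"{fact} because {support} via rule {rule}")
        fact = support
    steps.append(f"{fact} was given")
    return list(reversed(steps))


known, why = forward_chain(facts, rules)
trace = explain(("isa", "Fido", "Animal"), why)
for line in trace:
    print(line)
```

Because every derived fact carries the rule and the supporting fact that produced it, the chain can be inspected or debugged step by step, which is the property the answer above describes.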

Cyc’s concepts are organized into one big “ontology”. An ontology is like a taxonomy, but with much richer interconnections between terms. This ontology piece is very useful in its own right, apart from the reasoning engine. We expect it to become an important component of the approaching Semantic Web.
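The difference between a taxonomy and an ontology can be sketched in a few lines. The example below is an assumption-laden toy, not OpenCyc's data model: the relation names (`is_a`, `part_of`, `eats`) and the helpers `ancestors` and `related` are hypothetical.

```python
from collections import defaultdict

# A taxonomy needs only one kind of link: child -> single parent.
taxonomy = {"Dog": "Mammal", "Mammal": "Animal"}

# An ontology stores many typed relations between terms (names are hypothetical).
triples = [
    ("Dog", "is_a", "Mammal"),
    ("Mammal", "is_a", "Animal"),
    ("Dog", "is_a", "Pet"),        # more than one parent is allowed
    ("Tail", "part_of", "Dog"),
    ("Dog", "eats", "Meat"),
]

# Index objects by (subject, relation) for quick lookup.
index = defaultdict(list)
for subject, relation, obj in triples:
    index[(subject, relation)].append(obj)


def ancestors(term):
    """All terms reachable by following is_a links upward from term."""
    seen = set()
    stack = [term]
    while stack:
        for parent in index.get((stack.pop(), "is_a"), []):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def related(term):
    """Every (relation, other term) pair that mentions term, in either direction."""
    out = [(r, o) for s, r, o in triples if s == term]
    out += [(f"inverse of {r}", s) for s, r, o in triples if o == term]
    return out


print(sorted(ancestors("Dog")))
print(related("Dog"))
```

The taxonomy can only answer "what is the parent of Dog?", while the triple store can also answer what a dog eats or what is part of one, which is the "richer interconnections" point made above.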

2. How difficult is it to build a complete general knowledge base and commonsense reasoning engine using the OpenCyc product compared to Cyc?

Complete? Well, that’s a huge task with either of the tools. But Cyc has a wealth of facts and rules that are not part of the OpenCyc ontology. It also has natural language capabilities that are not in OpenCyc. The most complete commonsense reasoning engine will come from a combination of the two. OpenCyc’s breadth (number of concepts) will outpace that of Cyc, but Cyc’s depth (complex rules) will outpace that of OpenCyc. In any case, all of Cyc is now available under a research license, so anyone building software for non-commercial purposes can have access to everything that is in Full Cyc today. One can incorporate OpenCyc into a commercial product for free.

3. Is it possible to integrate Cyc/OpenCyc with an existing operating system and use it as a basis for an OS that has full support for natural language? How far away are we from having advanced AI on our home computers and operating systems?

Many people have approached Cycorp about making an OS controlled by natural language. Usually, they were looking to layer Cyc on top of Linux, but didn’t have the resources to pursue the project. It may take a Microsoft or Google to get seriously interested in doing this. Most large companies, however, get caught in a kind of incrementalism — so we’ll have to see if they’ll take the leap.

4. How difficult/easy is it to integrate proper speech recognition into Cyc?

Cyc leverages existing speech-to-text technologies and contributes contextual understanding. So, text is the bridge between speech systems and Cyc. Speech systems produce text and Cyc translates the natural language into its own internal representation. When the speech-to-text processor has trouble choosing among ambiguous sounds, Cyc can help identify which possible meanings are more, well, meaningful.
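As a rough illustration of that last point, the sketch below ranks competing transcriptions by how many of their word pairs a small knowledge base considers meaningful. Everything here is an assumption for illustration: the `plausible` pairs, the scoring rule and the function names are hypothetical, and nothing in it reflects how Cyc or any real speech recognizer is implemented.

```python
# Hypothetical background knowledge: (action, object) pairs that make sense.
plausible = {
    ("write", "letter"),
    ("ride", "bicycle"),
    ("recognize", "speech"),
}


def score(candidate):
    """Count adjacent word pairs that the knowledge base considers meaningful."""
    words = candidate.lower().split()
    return sum((a, b) in plausible for a, b in zip(words, words[1:]))


def pick_most_meaningful(candidates):
    """Return the highest-scoring candidate; on a tie the earliest one wins."""
    return max(candidates, key=score)


# Two transcriptions a speech-to-text system might struggle to choose between.
candidates = ["wreck a nice beach", "recognize speech"]
best = pick_most_meaningful(candidates)
print(best)
```

A real system would weigh this kind of knowledge-based score against the recognizer's acoustic confidence rather than use it alone.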

5. What is the biggest challenge in the development of Cyc?

Bridging the gap between the experts currently using the system and others who can potentially help.

About The Author

Eugenia Loli

Ex-programmer, ex-editor in chief at OSNews.com, now a visual artist/filmmaker.

8 Comments

2007-02-21 6:11 pm
SReilly
I had no idea that Cyc had an open source version. Pretty cool.
What was really interesting was that Cyc has been approached by people hoping to make a natural language OS by layering Cyc on top of Linux. Sounds like a nice project, but it seems that in all these situations what was lacking was resources.
Shame really, I’d love to see a natural language Linux distro. IMO that would be the Linux desktop killer app a lot of us have been waiting for.
Keep up the good work guys!
2007-02-21 9:58 pm
ahwayakchih
Growing AI must be an interesting thing to do.
Off-topic: “Cyc” in Polish means more or less the same as “boob” does in English. So at first glance at the osnews.com main page I thought “nice, some jokers from Poland are having fun”.

2007-02-22 11:08 am
Ultimatebadass
Yeah, no kidding, I laughed out loud at the office when I saw that
2007-02-22 11:38 am
miscz
They should change the name to OpenTit, we shouldn’t be the only ones having fun

2007-02-21 10:37 pm
TommyD
User: [Clicks the Save icon]
‘Puter: Ain’t no saving your sorry butt. This report is so weak it makes my front-side bus cry.
User: Shut up and save!
‘Puter: Don’t we have a ‘tude. Well, BOOM, there goes your report, I’m going to sleep mode now.
[computer screen goes dark]
User: No, No! It’s due tomorrow. I take it back. Your processor is awesome. I am blown away by your expansion slots…
[Monitor comes back on]
‘Puter: Oh, take it easy, suck up. Your flimsy spreadsheet is safe. Just like to see you squirm, remind you who’s boss here.
User: Sigh.
(edit: spelling errors, naturally)
Edited 2007-02-21 22:41

2007-02-21 11:13 pm
Doc Pain
Attention: The following post is for entertainment only. 🙂
User: I want to create a new user account for my credit card.
‘Puter: Uhm… well, let’s see if we have enough time… yes, we have. Your next meeting is in 30 minutes. So, what name should we take?
User: John Smith.
‘Puter: Is this your name?
User: Yes.
‘Puter: Are you sure?
User: Yes!
‘Puter: Tell the name again.
User: John Smith.
‘Puter: The name is… John Smith. Is this correct?
User: Yes.
‘Puter: In a whole sentence, please.
User: Yes, it is.
‘Puter: What is it?
User: The name!
‘Puter: Which name?
User: John Smith.
‘Puter: This is your name.
User: Yes, of course!
‘Puter: Now, enter your PIN.
User (enters silently via keyboard): ####
‘Puter: Your PIN is (shouts) 1055.
User: Damn! Be silent! My wife is listening!
‘Puter (louder): His PIN is 1 0 5 5!!!
User pulls plug and takes wallet with coins.
Would be nice to have a natural language OS to teach people to use their native language correctly. Or even a foreign language. Ever heard a German 50+ man reading English error messages? Quite funny, guessing what he’s talking about… largooney, fittoores, permeeson, spaatse, lookartey, peeparleena, boorst, rott, bott carmb, brooser, carfine, gongver, kb3, tooster… 🙂
Edited 2007-02-21 23:27

2007-02-22 2:45 am
project_2501
The semantic web won’t happen.
In simple terms, ontologies don’t work unless they are agreed upon between the creators of content. And this very constraint is a show-stopper. The task is too laborious.
It’s too laborious in the same way it was too laborious to maintain correct directories and indices for web content.
There is a lot of hype around the semantic web, but in the many years the W3C has been sponsoring work on it, how much has actually convinced people?
The best you can hope for is better analysis of language using grammar background knowledge. And that can yield better search results … but the killer apps promoted for the Semantic Web require perfectly correct ontologies agreed between multiple sites, and that will not happen. And even if people say they have agreed, how much do human error and ambiguity creep in?
There is nothing wrong with AI, information retrieval or data mining methods … there is much work still to be done in these areas… but have a look at any serious research department and you’ll find plenty of work in these areas and not much on the semantic web. There was a time when the words “semantic web” were used to get funding….
In summary: there is a place and a need for better analysis of freely formed text, but let’s not get drawn to false prophets in a field which certainly raises people’s hopes and emotions!
2007-02-23 2:56 pm
lorahpj
Eugenia posted the link to the Cyc and OpenCyc website, but there’s also an excellent video at Google Video with an overview of the project, accomplishments to date, approach, etc. This presentation was done by Doug Lenat, the founder of the project.
http://video.google.com/videoplay?docid=-7704388615049492068
Taking an incremental approach, instead of writing an OS based on a natural language interface, I think the “common sense” logic in OpenCyc could really help on-again/off-again object-based filesystem projects like WinFS (formerly the Cairo file system) or Gnome Storage.
http://www.gnome.org/~seth/storage/
This might be a good place to start as it probably could be used to facilitate indexing and searching.