Synonyms and Homonyms

I’ve heard in conversation, and read on blogs and in newsgroups, that we need to solve the “synonym and homonym” problem before we can make real progress with Semantic Technology. As far as I’m concerned, it’s solved. Let’s look at each case (synonym and homonym) from the perspective of both concepts (classes or properties) and instances. I’ll start by summarizing the simplest approaches, which have been around since the beginning:

Synonyms for concepts:
When you determine that two concepts are synonyms (say, sofa and couch), you relate them with the class axiom owl:equivalentClass. The entailment here is that any instance that was a member of class sofa is now also a member of class couch and vice versa. One of the nice things about this approach is that the “context” of this equivalence is automatically scoped to the ontology in which you make the equivalence statement. If you had a very small mapping ontology between a furniture ontology and an interior decorating ontology, you could say in the map that these two are equivalent. In another situation, if you needed to retain the (subtle) difference between a couch and a sofa, you would do that by merely not including the mapping ontology that declared them equivalent.
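To make the scoping point concrete, here is a minimal sketch in plain Python rather than a real OWL reasoner. The identifiers (furn:Sofa, decor:Couch, inv:item1, inv:item2) and the members helper are made up for illustration; only owl:equivalentClass and rdf:type come from the standards.

```python
# Triples are (subject, predicate, object) tuples. All identifiers are illustrative.
RDF_TYPE = "rdf:type"
EQUIVALENT_CLASS = "owl:equivalentClass"

furniture = {("inv:item1", RDF_TYPE, "furn:Sofa")}
decorating = {("inv:item2", RDF_TYPE, "decor:Couch")}
# The small mapping ontology holds nothing but the equivalence statement.
mapping = {("furn:Sofa", EQUIVALENT_CLASS, "decor:Couch")}


def members(graph, cls):
    """Return the instances of cls, honouring owl:equivalentClass triples in graph."""
    classes = {cls}
    changed = True
    while changed:  # follow equivalences in both directions until nothing new appears
        changed = False
        for s, p, o in graph:
            if p == EQUIVALENT_CLASS and (s in classes) != (o in classes):
                classes |= {s, o}
                changed = True
    return {s for s, p, o in graph if p == RDF_TYPE and o in classes}


# With the mapping loaded, sofas and couches are treated as one class.
merged = members(furniture | decorating | mapping, "furn:Sofa")
# Without the mapping, the distinction is preserved.
separate = members(furniture | decorating, "furn:Sofa")
```

Leaving the mapping set out of the union is the whole "scoping" mechanism in this sketch: the equivalence only exists in graphs that include it.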

Properties work the same: if hasAuthor and hasCreator are equivalent, you can just state that with owl:equivalentProperty. The entailment is that any document that had an author would now have the same person as a creator and vice versa. Again, this would be scoped to the ontology where you declared the equivalence.
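The property case can be sketched the same way. This is a simplified, single-pass illustration, not a full reasoner: it does not chase chains of equivalences, and the prefixes lib: and pub: as well as the document and person identifiers are assumptions invented for the example.

```python
EQUIVALENT_PROPERTY = "owl:equivalentProperty"

graph = {
    ("doc:report", "lib:hasAuthor", "person:alice"),
    ("lib:hasAuthor", EQUIVALENT_PROPERTY, "pub:hasCreator"),
}


def entail_equivalent_properties(triples):
    """Return triples plus the statements implied by owl:equivalentProperty."""
    # Equivalence is symmetric, so record each declared pair in both directions.
    pairs = set()
    for s, p, o in triples:
        if p == EQUIVALENT_PROPERTY:
            pairs.add((s, o))
            pairs.add((o, s))
    inferred = set(triples)
    for s, p, o in triples:
        for a, b in pairs:
            if p == a:
                inferred.add((s, b, o))  # restate the fact with the equivalent property
    return inferred


closed = entail_equivalent_properties(graph)
```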

Synonyms for instances:
When we discover that we have two references to the same individual (same instance), we are saying that we have two URIs with the same referent. This happens all the time. Just use owl:sameAs to make the association. The entailment is that anything that was known about one is now known about the other. Again, this is scoped to where you make the owl:sameAs assertion.
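A rough sketch of that entailment, again in plain Python: the facts_about helper and the crm:, hr: and ex: identifiers are hypothetical, and the sketch only merges statements where the individual is the subject (a complete owl:sameAs treatment would also substitute in object position).

```python
SAME_AS = "owl:sameAs"

graph = {
    ("crm:cust42", "ex:name", "Jane Doe"),
    ("hr:emp7", "ex:email", "jane@example.com"),
    ("crm:cust42", SAME_AS, "hr:emp7"),
}


def facts_about(triples, uri):
    """Return (predicate, object) pairs known about uri or any owl:sameAs alias of it."""
    aliases = {uri}
    changed = True
    while changed:  # owl:sameAs is symmetric and transitive
        changed = False
        for s, p, o in triples:
            if p == SAME_AS and (s in aliases) != (o in aliases):
                aliases |= {s, o}
                changed = True
    return {(p, o) for s, p, o in triples if s in aliases and p != SAME_AS}


jane = facts_about(graph, "hr:emp7")
```

Asking about either URI returns both the name and the email, which is the "anything known about one is known about the other" behaviour described above.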

Homonyms for concepts:
As Led Zeppelin says, “and you know sometimes words have two meanings…” What happens when a “word” has two meanings is that we have what WordNet would call “word senses.” In a particular language, a set of characters may represent more than one concept. One example is the English word “mole,” for which WordNet has 6 word senses. The Semantic Web approach is to give each its own namespace; for instance, I might refer to the counterspy mole as cia:mole and the burrowing mammal as mammal:mole. (These are prefixed names, or qnames, that abbreviate what would be full namespace URIs.) The nice thing about this is that if the CIA ever needed to refer to the animal, they could unambiguously refer to mammal:mole.

I’m not suggesting that people will use these prefixes in free text, but when they are trying to be precise this is a good approach. Then the free text analyzers (entity extractors and the like) merely need to work out which of the word senses was intended from the context of the document.
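As a small illustration of why the prefixes remove the ambiguity, the sketch below expands each qname to a full URI. The namespace URIs are placeholders under example.org, not real vocabularies, and expand is a hypothetical helper.

```python
# Hypothetical prefix-to-namespace table; these URIs are placeholders.
PREFIXES = {
    "cia": "http://example.org/cia#",
    "mammal": "http://example.org/mammal#",
}


def expand(qname, prefixes=PREFIXES):
    """Expand a prefixed name such as 'cia:mole' into a full URI."""
    prefix, _, local = qname.partition(":")
    if prefix not in prefixes:
        raise KeyError(f"unknown prefix: {prefix}")
    return prefixes[prefix] + local


spy = expand("cia:mole")
animal = expand("mammal:mole")
```

The local name "mole" is identical in both, but the two expanded URIs differ, so nothing downstream can confuse the two senses.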

Homonyms for individuals:
There isn’t, as far as I know, a technical solution for this problem, because the problem is a logical one. To have homonyms at the individual level is to say essentially that a particular URI refers to two different things. It is the equivalent of saying that one Social Security Number refers to two different people (intentionally). If you have a situation where you have non-unique keys (say you used phone numbers for personal identity and two people share the same number), you need to recognize that the phone number is not a proxy for a URI for a person. It may be used as the basis for the URI of the device or, more accurately, of an account between the phone company and an agent (person or organization), which in turn could get you to the related individuals. But the phone number itself should not be the URI for a person.
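One way to picture the phone number case is to give the account its own URI and hang the people off it. This is only a sketch of that modeling choice; the acct:, person: and ex: names and the people_for_number helper are invented for the example.

```python
# The account gets a URI; the phone number is just a literal attached to it.
graph = {
    ("acct:1001", "ex:phoneNumber", "555-0100"),
    ("acct:1001", "ex:accountHolder", "person:ann"),
    ("acct:1001", "ex:accountHolder", "person:bob"),
}


def people_for_number(triples, number):
    """Go from a phone number to its account(s), then to the related people."""
    accounts = {s for s, p, o in triples if p == "ex:phoneNumber" and o == number}
    return {o for s, p, o in triples if p == "ex:accountHolder" and s in accounts}


sharers = people_for_number(graph, "555-0100")
```

Ann and Bob each keep their own URI, so sharing a number never forces one identifier to stand for two people.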

Preferred terms:
We can get all of this with bog-standard OWL. If you want some of the other features that typically go along with synonyms and homonyms, such as “preferred term” (of all the synonyms for a concept, which is the one I should use when providing information?), one recommendation would be to use SKOS and in particular skos:prefLabel.
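For instance, a concept can carry one skos:prefLabel alongside any number of skos:altLabel synonyms. The sketch below picks the preferred one; furn:Sofa, the label values and display_label are assumptions for illustration, and language tags are ignored for brevity.

```python
labels = {
    ("furn:Sofa", "skos:prefLabel", "sofa"),
    ("furn:Sofa", "skos:altLabel", "couch"),
    ("furn:Sofa", "skos:altLabel", "settee"),
}


def display_label(triples, concept):
    """Return the skos:prefLabel for concept, or None if it has none."""
    for s, p, o in triples:
        if s == concept and p == "skos:prefLabel":
            return o
    return None


shown = display_label(labels, "furn:Sofa")
```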

There are many shadings and nuances to the discovery and use of synonyms and homonyms in ontologies, but I think the basic plumbing of the Semantic Web standards has most of the common use cases covered out of the box.


very helpful to me!!! Thanq
