Executive Summary

Today, one of the biggest challenges in web technologies is information retrieval in digital resource repositories such as digital libraries and the Internet. To cope with the growth of information, existing search methods will need to be enhanced so that the results they return remain acceptably relevant and efficient. In this paper two search methods are compared: full-text search and search enriched with query reformulation based on semantic technologies. Both are implemented in a search module called SQE (Semantic Query Expansion).
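The difference between the two approaches compared above can be illustrated with a minimal sketch. This is not the SQE module itself: the related-term table, the function names, and the word-level matching rule are all assumptions made for illustration, with a small dictionary standing in for whatever semantic resource supplies the reformulation.

```python
# Hypothetical related-term table standing in for an ontology or thesaurus lookup.
RELATED_TERMS = {
    "car": ["automobile", "vehicle"],
    "library": ["repository", "archive"],
}


def expand_query(query, related=RELATED_TERMS):
    """Return the query terms followed by their related terms, without duplicates."""
    expanded = []
    for term in query.lower().split():
        for candidate in [term] + related.get(term, []):
            if candidate not in expanded:
                expanded.append(candidate)
    return expanded


def full_text_match(document, terms):
    """Plain full-text matching: True if any term occurs as a word in the document."""
    words = set(document.lower().split())
    return any(term in words for term in terms)


document = "Automobile hire in Prague"
plain_terms = "car rental".split()

# The plain query misses the document; the expanded query finds it.
plain_hit = full_text_match(document, plain_terms)
expanded_hit = full_text_match(document, expand_query("car rental"))
```

In this toy case the plain query fails because the document says "automobile" rather than "car", while the expanded query matches. Expansion trades some precision for recall, which is the kind of effect a comparison of the two methods has to measure.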

If ontology is to provide the next generation of intelligent information management, more attention must be given to creating high-fidelity, truly semantic models. Such models must be capable of representing not only the electronic media used to store the information, music, graphics, and other content that people care about, but also the propositional content and the linguistic characteristics of those media. The abstract propositional or musical content of a document or recording is often quite independent of the medium on which it is encoded, and even of the language used for the encoding.


Entity extraction is the process of automatically extracting document metadata from unstructured text documents. Extracting key entities such as person names, locations, dates, specialized terms, and product terminology from free-form text can help organizations not only improve keyword search but also enable semantic search, faceted search, and document repurposing. This article defines the field of entity extraction, describes some of the technical challenges involved, and shows how RDF can be used to store document annotations. It then explains how new tools such as Apache UIMA are poised to make entity extraction much more cost-effective for organizations.
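As a rough illustration of the idea, the sketch below pulls a few entities out of free-form text and serializes them as RDF statements in N-Triples syntax. It does not use Apache UIMA; the gazetteer, the ISO-only date pattern, the `http://example.org/` namespace, and the `mentions...` predicate names are all hypothetical placeholders rather than an established vocabulary.

```python
import re

# Hypothetical gazetteer; a real system would use much larger lists or a trained model.
GAZETTEER = {
    "Person": ["Ada Lovelace"],
    "Location": ["London"],
}
DATE_PATTERN = re.compile(r"\b\d{4}-\d{2}-\d{2}\b")  # matches ISO-style dates only

EX = "http://example.org/"  # placeholder namespace, assumed for illustration


def extract_entities(text):
    """Return (entity_type, surface_text, start_offset) tuples ordered by offset."""
    found = []
    for entity_type, names in GAZETTEER.items():
        for name in names:
            for match in re.finditer(re.escape(name), text):
                found.append((entity_type, name, match.start()))
    for match in DATE_PATTERN.finditer(text):
        found.append(("Date", match.group(), match.start()))
    return sorted(found, key=lambda entity: entity[2])


def to_ntriples(doc_id, entities):
    """Serialize each entity as one N-Triples line attached to the document."""
    lines = []
    for entity_type, surface, _offset in entities:
        # Escape backslashes and quotes so the literal stays well-formed.
        literal = surface.replace("\\", "\\\\").replace('"', '\\"')
        lines.append(f'<{EX}doc/{doc_id}> <{EX}mentions{entity_type}> "{literal}" .')
    return lines


text = "Ada Lovelace visited London on 1843-09-01."
entities = extract_entities(text)
triples = to_ntriples("d1", entities)
```

Each resulting line is a subject, predicate, object statement about the document, so the annotations can be loaded into any RDF store and queried alongside other metadata. Exact string matching against a gazetteer also hints at the technical challenges mentioned above: it misses name variants and cannot tell ambiguous mentions apart.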