The Semantics of the Dublin Core – Metadata for Knowledge Management

In my previous article, I proposed that the library catalogue could be used as a blueprint for the Semantic Web. Those arguments were largely theoretical and conceptual: they fleshed out the ideas, but not the practical applications. In this article, I will outline in greater detail how developments in library and information science are playing out, not only in the SemWeb, but also in knowledge management for business.

Dublin Core Metadata Initiative
The SemWeb has been discussed at length by computer scientists and software engineers from all corners of the web. What has gone largely unnoticed is the work of the Dublin Core Metadata Initiative, an organization engaged in the development of interoperable online metadata standards, whose contributors include not only computer scientists but also librarians and information scientists.

Since 1995, the Dublin Core Metadata Initiative (DCMI) has promoted widespread adoption of interoperable metadata standards and developed specialized metadata vocabularies for describing resources, with the aim of enabling more intelligent information discovery systems. The original objective of the Dublin Core was to define a set of elements that authors could use to describe their own Web resources. Naturally, this tied into finding ways to apply the Dublin Core in SemWeb applications.

Comprising fifteen elements (title, creator, subject, description, publisher, contributor, date, type, format, identifier, source, language, relation, coverage, and rights) used for resource description, the Dublin Core Metadata Element Set provides a simple and standardized set of conventions for describing things online in ways that make them easier to find. The Dublin Core is already used to describe digital materials such as video, sound, images, text, and composite media like web pages. These fifteen elements were deliberately kept simple so that people who are not professional library catalogers could provide basic information for resource discovery.
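To make the element set concrete, here is a minimal sketch of how a record built from those fifteen elements might be serialized as XML. The namespace URI is the one published for the Dublin Core elements; the function name build_record, the wrapper element name "metadata", and the sample values are assumptions made for illustration only.

```python
import xml.etree.ElementTree as ET

# Namespace URI of the fifteen-element Dublin Core Metadata Element Set.
DC_NS = "http://purl.org/dc/elements/1.1/"

# The fifteen elements listed in the article.
DC_ELEMENTS = (
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
)


def build_record(fields):
    """Serialize a dict of Dublin Core fields as a small XML document.

    The wrapper element name "metadata" is an arbitrary choice for this
    sketch, not something the Dublin Core prescribes.
    """
    ET.register_namespace("dc", DC_NS)
    root = ET.Element("metadata")
    for name, value in fields.items():
        if name not in DC_ELEMENTS:
            raise ValueError(f"not a Dublin Core element: {name}")
        child = ET.SubElement(root, f"{{{DC_NS}}}{name}")
        child.text = value
    return ET.tostring(root, encoding="unicode")


# Hypothetical sample values describing a web page.
record = build_record({
    "title": "Annual Report",
    "creator": "Example Author",
    "language": "en",
    "format": "text/html",
})
print(record)
```

Every element is optional and repeatable in the simple Dublin Core, which is why the sketch accepts any subset of the fifteen names and rejects only names outside the set.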

Because of its simplicity, the Dublin Core has been used with other types of materials, and for applications demanding increased complexity. Its design also allows it to serve as the minimum set of shareable metadata in the Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH), and there are already thousands of projects worldwide that use the Dublin Core for cataloging, or for collecting data from the Web.
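As a rough illustration of how harvesting works in practice, the sketch below builds the kind of request URL a harvester sends to a repository. The verb ListRecords and the metadata prefix oai_dc (simple Dublin Core) come from the harvesting protocol itself; the base URL and the function name list_records_url are hypothetical.

```python
from urllib.parse import urlencode


def list_records_url(base_url, set_spec=None):
    """Build a harvesting request that asks for records in simple Dublin Core.

    base_url is the repository's harvesting endpoint; the one used in the
    example below is a placeholder, not a real service.
    """
    params = {"verb": "ListRecords", "metadataPrefix": "oai_dc"}
    if set_spec is not None:
        # Optional: restrict the harvest to one named set of records.
        params["set"] = set_spec
    return f"{base_url}?{urlencode(params)}"


# Hypothetical repository endpoint.
print(list_records_url("https://repository.example.org/oai"))
```

The repository answers such a request with an XML document in which each record carries its Dublin Core description, so a harvester needs to understand only one shared format to collect metadata from many sources.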

Dublin Core and SemWeb
So, how does the Dublin Core Metadata Element Set fit into the SemWeb? Implementations of Dublin Core are not only expressed in XML, but are also based on the Resource Description Framework (RDF) standard. The Dublin Core is an all-encompassing project maintained by an international, cross-disciplinary group of professionals from librarianship, computer science, text encoding, the museum community, and other related fields of scholarship and practice. Its vocabularies define metadata terms such as title, creator, subject, access rights, and bibliographic citation, using RDF and RDF Schema. Thus, the Dublin Core’s role in representing knowledge management activity will be significant in the emergence of the SemWeb.
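In RDF, each Dublin Core statement becomes a triple: the resource being described, a Dublin Core property, and a value. The sketch below writes such triples in the N-Triples syntax using only the standard library. The resource URI, the sample values, and the function name to_ntriples are assumptions for illustration; a real project would more likely rely on a dedicated RDF library.

```python
# Namespace URI of the Dublin Core elements, used here as RDF properties.
DC = "http://purl.org/dc/elements/1.1/"


def to_ntriples(resource_uri, fields):
    """Render Dublin Core fields as N-Triples, one statement per line.

    Only plain string literals are handled; language tags and datatypes
    are left out to keep the sketch short.
    """
    lines = []
    for name, value in fields.items():
        # Escape the characters that N-Triples reserves inside literals.
        escaped = (
            value.replace("\\", "\\\\")
            .replace('"', '\\"')
            .replace("\n", "\\n")
        )
        lines.append(f'<{resource_uri}> <{DC}{name}> "{escaped}" .')
    return "\n".join(lines)


# Hypothetical resource and values.
print(to_ntriples(
    "https://example.org/reports/2008",
    {"title": "Annual Report", "creator": "Example Author"},
))
```

Because the property names are full URIs rather than local tag names, statements from different sources can be merged without the names colliding, which is the feature that makes the Dublin Core usable as SemWeb data.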

Opportunities: Business-to-Business (B2B) SemWeb
If the Dublin Core is to play a part in the SemWeb, it offers exciting opportunities in B2B commerce. Negotiation automation, for example, deals with business processes that frequently require human knowledge to achieve their goal, particularly to reach agreement in negotiations. XML is the de facto standard for developing B2B applications on the Web, but B2B vocabularies built on it are limited to data exchange and do not extend to the sharing and production of knowledge.

SemWeb technologies, which extend the XML standards, envision a future Web as a platform for the easy gathering and use of information, and for the submission of feedback about that information. Because SemWeb rules add a high level of automation to the processing of business documents across companies, the SemWeb will be significant in the future of B2B. This is all the more true given that metadata plays a critical role in investments in data warehousing, data mining, business intelligence, customer relationship management, enterprise application integration, and knowledge management.

Corporate Initiatives
This is where the Dublin Core Metadata Elements come in, and where work has already been under way for the past decade. In 2000, at the Dublin Core metadata workshop in Ottawa, Canada, a special interest group formed to explore metadata issues of particular interest to the business community. The provisional charter for the group included:

  • To investigate metadata schemas used in commercial business models for Business-to-Business, and Business-to-Consumer.
  • To promote the use of Dublin Core in internal and cross-company business environments.
  • To identify business sectors and commercial resources (e.g. information, services, catalogs, products) that could benefit from the use of the DC standard.
  • To highlight within the DC Community the commercial ramifications of DC developments.
  • To discuss the possible expansion of Dublin Core to accommodate information vital to commercial requirements and uses.

The workshop was followed by a Dublin Core annual conference in 2002 that marked the beginning of a new effort by the DCMI to involve members of the corporate world in the evolution and application of the Dublin Core standard. The conference workshops involved not only companies using Dublin Core for their intranets and extranets, but also information providers (publishers and aggregators). Companies that attended the conference workshops include DuPont, PricewaterhouseCoopers, Siemens, Rohm and Haas, GSK, and AstraZeneca.

Current Initiatives
Outside of the DCMI itself, one company working on applying Dublin Core metadata standards for the SemWeb is Taxonomy Strategies, which has consulted with global companies, government agencies, and NGOs in developing metadata frameworks and taxonomy strategies. Taxonomy Strategies, not surprisingly, is made up of librarians and information scientists, including a former president of the American Society for Information Science and the co-editor of the original Dublin Core Metadata Initiative.

Since the SemWeb, B2B, knowledge management, and the Dublin Core Metadata Initiative all involve similar technologies and concepts, it makes sense to pay greater attention to the nuances of controlled vocabularies, taxonomies, and ontologies, for a broader, more inclusive dialogue in the information technology community.

Conclusion: Librarians and Ontologies
But knowledge management is nothing new to librarians and information scientists. In 2002, two years before Tim O’Reilly popularized the term “Web 2.0,” librarian Katherine Adams had already argued that librarians will be an essential piece of the SemWeb equation. In her seminal piece, The Semantic Web: Differentiating between Taxonomies and Ontologies, Adams argues that ontologies and taxonomies are synonymous: computer scientists refer to hierarchies of structured vocabularies as “ontology” while librarians call them “taxonomy.” What the Dublin Core offers is an opportunity to bridge different topics and extend across disciplines to navigate the complexities of the SemWeb.