Semantic Web-based E-Commerce: Webmasters, Get Ready!

Executive Summary

Within the past few months, the Semantic Web has become something any Webmaster must take seriously, because the key components are ready for production use and there are tangible benefits: with GoodRelations, there is now a standard vocabulary for e-commerce; RDFa provides a stable syntax for embedding such data in existing Web pages; and Yahoo SearchMonkey creates a direct business incentive for companies of any size to care.


Quite clearly, the Web has transformed our access to suppliers globally. Instead of studying heavy volumes of yellow pages, we can simply search the Web for a supplier for any given need. However, most of us have experienced that the Web is not as convenient as we would hope when searching for a product or service. We often spend hours searching and still cannot find what we need. Most annoying is that our computers hardly help us make sense of the vast number of offers on the Web. We as humans have to filter and recombine a lot of information.

Obviously, things could be much better: our computers could gather, combine, and filter data on products and services and process it automatically – no matter where we stand in the value chain. End users could search precisely for their needs. Retailers could reuse product feature data directly from manufacturers’ Web pages. Recommender systems could combine product and offer data from multiple sources. All this would basically require only that we add machine-readable product data and the like to existing Web content.

Now, while researchers have long talked about the benefits of next-generation Web technology, the last quarter of 2008 has brought all the bits and pieces needed to make this a reality. With GoodRelations, there is now a standard vocabulary for e-commerce; RDFa provides a stable syntax for embedding such data in existing Web pages; and Yahoo SearchMonkey creates a direct business incentive for companies of any size to care.

Any Business Should Care

This novel approach will soon determine how visible a business is to potential customers on the latest generation of search engines and price or product comparison services. It thus affects any business that wants to be found on the Web. Accordingly, the topic is of core interest to everybody who offers any kind of goods or services anywhere in the world.

It is as relevant for a small hotel somewhere in the Alps as for a giant mail-order company, as relevant for someone cleaning cars as for someone offering nanotechnology services. Because of this fundamental impact, it is also crucial that vendors of Web shop software get involved: they can easily upgrade their solutions so that all deployed Web shops support this new feature out of the box.

Limitations of the Current Web for E-Commerce

Before we proceed, we have to understand in what sense the current Web is deficient for e-commerce. The key limitation is that the support we currently get from our computers is limited to displaying a page that someone else has encoded and sent over the wire.

In many e-commerce scenarios, however, we have to extract and combine data from multiple Web pages – for example, if we want to compare multiple product models, or if we want to import the product features and specifications published by a manufacturer into our own Web shop. Even though the Web shop does internally maintain its product data in a structured form, the only way for us to extract and reuse it is to copy-and-paste it, element by element.

The following figure illustrates the problem.


Enterprise 1 on the left-hand side maintains a database with data describing the products and services being offered. This data is already well structured, often in the form of multiple tables, and well suited for processing by computers. The Web shop software converts this data into small chunks of HTML code that describe how each page should be displayed in a Web browser. When you navigate your Web browser to that shop, your computer fetches the code representing this page and displays it on your screen. This is appropriate for you as a human sitting in front of your computer, because all your computer needs to know is what the content should look like.

Loss of Data Structure and Meaning

However, when an enterprise or a single user with a more sophisticated task at hand wants to extract and use the original data, a lot of human effort is necessary to restore the original structure of the data. Think of how annoying it is to cut and paste someone’s mail address from your browser to your address book, element by element.

This is even more annoying if we consider that the data was already in a structured format, ready for processing. So the loss of data structure over the Web is a major cause of unnecessary human information processing on the recipient’s side. For a single item, this may take only 20 or 30 seconds. But over time, we waste a significant amount of labor. This unnecessary step can also introduce errors and degrade data quality.
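The round trip described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the product record, the HTML layout, and the function names render_html and scrape are hypothetical and do not come from any real shop software. The sketch shows that the structured record survives only as display markup, and that getting it back requires patterns tied to one particular page layout.

```python
import re

# Hypothetical product record, as a shop database might hold it.
product = {"name": "Example Phone X1", "weight_g": 120, "price_eur": 199.0}


def render_html(record):
    """Turn the structured record into display-only markup."""
    return (
        f"<div><b>{record['name']}</b><br>"
        f"Weight: {record['weight_g']} g<br>"
        f"Price: {record['price_eur']:.2f} EUR</div>"
    )


def scrape(html):
    """Recover the fields by pattern matching on the visible text.

    This works only as long as the shop keeps exactly this layout
    and wording; any redesign breaks the patterns.
    """
    name = re.search(r"<b>(.*?)</b>", html).group(1)
    weight = int(re.search(r"Weight: (\d+) g", html).group(1))
    price = float(re.search(r"Price: ([\d.]+) EUR", html).group(1))
    return {"name": name, "weight_g": weight, "price_eur": price}


html = render_html(product)
recovered = scrape(html)
```

If the shop changed the label "Weight" to another word or language, scrape would fail, even though the underlying database record had not changed at all.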

Semantic Web for E-Commerce

Now, pretty much right from the beginning, the World Wide Web was envisioned to provide more than it does for most of us today. The hope that computers would aid us in managing and processing information on the Web is as old as the Web itself; its inventor, Sir Tim Berners-Lee, has clearly articulated this from early on: the vision of a “Web of Data”, in which computers and humans can share and process information smoothly.

The good news is that this next generation of the Web is NOW. Within the past years, a global research community has brought to maturity an impressive set of technical contributions that will lift the World Wide Web to new heights. We are not talking about early prototypes in some hidden laboratories. Major companies have joined the initiative, and commercial products and services are already entering mainstream markets.

But how does this Web of Data work? The basic idea is simple: just as current Web pages contain elements that augment the text with multimedia objects like images, video, or audio, we can now embed structured data into the Web page. This structured data can then be extracted by our computers and is suitable for further processing, without retyping and without copying and pasting textual elements. This simple approach is complemented by mechanisms for encoding meaningful connections between Web resources.

Meaningful here means that they tell us more than the mere fact that we can click on the link to proceed to a related page. This, in combination with some quite clever technology, paves the way for much more powerful computer support for using the World Wide Web.
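To make the idea of embedded structured data concrete, here is a minimal sketch in Python using only the standard library. It is a drastic simplification and not a conforming RDFa processor: it ignores prefixes, nesting, and subject resolution, and simply collects the RDFa-style "property" and "content" attributes. The page fragment and the "ex:" property names are hypothetical and serve only as an illustration.

```python
from html.parser import HTMLParser

# Hypothetical page fragment: visible text plus machine-readable attributes.
PAGE = """
<div about="#offer">
  <span property="ex:name">Example Phone X1</span>
  costs <span property="ex:price" content="199.00">199 euros</span>
  <span property="ex:currency" content="EUR"></span>
</div>
"""


class PropertyExtractor(HTMLParser):
    """Collect property/value pairs from elements carrying a 'property' attribute."""

    def __init__(self):
        super().__init__()
        self.data = {}
        self._pending = None  # property still waiting for its text content

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        prop = attrs.get("property")
        if prop is None:
            return
        if "content" in attrs:
            # An explicit machine-readable value overrides the visible text.
            self.data[prop] = attrs["content"]
        else:
            self._pending = prop

    def handle_data(self, text):
        if self._pending is not None and text.strip():
            self.data[self._pending] = text.strip()
            self._pending = None

    def handle_endtag(self, tag):
        self._pending = None


def extract_properties(html):
    parser = PropertyExtractor()
    parser.feed(html)
    parser.close()
    return parser.data


properties = extract_properties(PAGE)
```

The human reader still sees "Example Phone X1 costs 199 euros", while a program gets the name, the price, and the currency as separate values without any copying and pasting.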

What does this imply for E-Commerce on the Web?

Well, mainly two things. First, manufacturers can put all the details of product specifications on the Web: screen sizes of TV sets, weights of cell phone handsets, capacities of car batteries, and the like. Consumers can then use this data to search very precisely for product models matching their needs. And retailers, both traditional ones and Web shops, can easily import and reuse such data from the Web for advertising and pricing.

Second, Web shops can publish all the commercial properties of their offers in a way accessible to intelligent browser plug-ins, recommender systems, and next-generation search engines – for example price information, shipment options and charges, methods of payment, opening hours and shop locations, and the like. This basically holds for all industries and all types of business, be it electronics and engineering, tourism, entertainment, or professional services.
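Once product features and commercial properties are available as data, a precise search becomes a simple filter. The following Python sketch assumes that offers from several shops have already been collected into plain records; the shop names, field names, and numbers are invented for illustration only.

```python
# Hypothetical offers, as they might look after being collected from several shops.
offers = [
    {"shop": "Shop A", "model": "Example Phone X1", "weight_g": 120,
     "price_eur": 199.0, "shipping_eur": 4.9},
    {"shop": "Shop B", "model": "Example Phone X1", "weight_g": 120,
     "price_eur": 189.0, "shipping_eur": 19.9},
    {"shop": "Shop C", "model": "Example Phone Z9", "weight_g": 180,
     "price_eur": 149.0, "shipping_eur": 0.0},
]


def total_price(offer):
    """Price including shipment charges."""
    return offer["price_eur"] + offer["shipping_eur"]


def find_offers(offers, max_weight_g, max_total_eur):
    """Return offers matching the feature and price limits, cheapest first."""
    matches = [
        o for o in offers
        if o["weight_g"] <= max_weight_g and total_price(o) <= max_total_eur
    ]
    return sorted(matches, key=total_price)


result = find_offers(offers, max_weight_g=150, max_total_eur=210.0)
```

Note that Shop B has the lower list price but ranks behind Shop A once shipping is included, which is exactly the kind of comparison that is tedious to do by hand.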

GoodRelations: A Standard Vocabulary for Offers on the Web

A key component in this scenario is GoodRelations, a lightweight, generic Web vocabulary for the Semantic Web that allows expressing all typical aspects of offers for goods and services on the Web. For example, we often want to be able to state that a particular Web site describes an offer to sell cell phones of a certain make and model at a certain price, that a piano house offers maintenance for pianos that weigh less than 150 kg, or that a car rental company leases out cars of a certain make and model from a particular set of branches across the country.

The GoodRelations ontology allows vendors to add a machine-readable definition of their offers. This is in the interest of shop owners and manufacturers, because it makes sure that the particular features and strengths of their products or services are considered by Semantic Web search engines. It is also in the interest of buyers, because it allows them to find offers that exactly fit their requirements. In addition, GoodRelations makes it easy to exchange product model details and feature data between manufacturers and shop operators, so that such data can be reused more easily along the value chain.
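As a rough sketch of what such a machine-readable definition might look like, the following Python snippet writes a tiny offer as RDF statements in N-Triples form, using plain string handling rather than an RDF library. The shop URIs, the company name, and the price are hypothetical. The GoodRelations terms (BusinessEntity, Offering, UnitPriceSpecification, offers, hasBusinessFunction, Sell, hasPriceSpecification, hasCurrency, hasCurrencyValue, legalName) are used as the author of this sketch understands them; please check them against the GoodRelations specification before relying on them.

```python
GR = "http://purl.org/goodrelations/v1#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
XSD = "http://www.w3.org/2001/XMLSchema#"
SHOP = "http://shop.example.com/#"  # hypothetical base URI of the shop


def uri(value):
    """Format a URI reference in N-Triples syntax."""
    return f"<{value}>"


def literal(value, datatype=None):
    """Format a plain or typed literal in N-Triples syntax."""
    text = str(value).replace("\\", "\\\\").replace('"', '\\"')
    return f'"{text}"^^<{XSD}{datatype}>' if datatype else f'"{text}"'


# "Example Phones Ltd. offers to sell something for 199.00 EUR per unit."
triples = [
    (uri(SHOP + "company"), uri(RDF_TYPE), uri(GR + "BusinessEntity")),
    (uri(SHOP + "company"), uri(GR + "legalName"), literal("Example Phones Ltd.")),
    (uri(SHOP + "company"), uri(GR + "offers"), uri(SHOP + "offer")),
    (uri(SHOP + "offer"), uri(RDF_TYPE), uri(GR + "Offering")),
    (uri(SHOP + "offer"), uri(GR + "hasBusinessFunction"), uri(GR + "Sell")),
    (uri(SHOP + "offer"), uri(GR + "hasPriceSpecification"), uri(SHOP + "price")),
    (uri(SHOP + "price"), uri(RDF_TYPE), uri(GR + "UnitPriceSpecification")),
    (uri(SHOP + "price"), uri(GR + "hasCurrency"), literal("EUR", "string")),
    (uri(SHOP + "price"), uri(GR + "hasCurrencyValue"), literal("199.00", "float")),
]

# One statement per line: subject, predicate, object, terminated by " ."
ntriples = "\n".join(f"{s} {p} {o} ." for s, p, o in triples)
```

A real description would also state which product or service the offer includes; that part is left out here to keep the sketch short.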

GoodRelations has been under development at the University of Innsbruck and Bundeswehr University Munich since 2005. The stable version was officially released on July 28, 2008. Since then, it has been widely adopted as a standard vocabulary for the Web of Data. While lightweight and simple to use, it is based on several years of research and is compatible with all relevant W3C standards and recommendations. Among other applications, GoodRelations is officially supported by Yahoo SearchMonkey for e-commerce data.

No Strings Attached

The GoodRelations vocabulary is released under a very liberal Creative Commons license, which grants royalty-free access for commercial and non-commercial use. It is a serious initiative to bring next-generation Web technology from the lab to the market, and it has already started to succeed.

Why Should I Care, and Why Now?

Right now, the first major search engines are starting to collect and use GoodRelations data if it is included in your Web presence. Yahoo, for example, is now encouraging every operator of a Web page worldwide to provide such structured data. While this will not automatically improve your ranking in the search results, it allows you to communicate many more details of your offer to potential customers. In addition, the data will be used by numerous value-added services, such as comparison services.

In short, the following three key developments of the past months should put Semantic Web-based E-Commerce high on your agenda:

  • RDFa has become a W3C Recommendation: This means there is now a stable, standard syntax for embedding RDF metadata into XHTML Web content, which paves the way for adoption by mainstream Web developers. It is now straightforward to add such metadata directly to existing Web content.
  • GoodRelations ontology release and adoption: The GoodRelations ontology has been released and is receiving strong support from major vendors, from initiatives in the Semantic Web community, and from traditional corporations.
  • Yahoo SearchMonkey: Due to the official endorsement of GoodRelations by Yahoo SearchMonkey, there is now an immediate incentive for any business in the world to add such metadata.

All this has happened just in the past few months. What once was a vision from the research lab has quickly gained great relevance for mainstream markets.

What Should Webmasters and Businesses Do?

If you are a manufacturer of branded products, you can help your retailers sell more if they can simply fetch product feature details from your Web page and combine them with their own pricing and shipping data. It is very much in your interest that all businesses selling your products have easy access to the latest feature specifications and product details. Only if they succeed at the point of sale will you be successful in the market.

If you are developing Web shop software, ERP software, or anything similar, you should develop import and export interfaces for GoodRelations data. This will allow users of your software to create data dumps out of the box and to import product model data from Web resources.
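The import side of such an interface essentially joins two data sets on a shared product identifier. The following Python sketch assumes that manufacturer product model data and the shop's own offers have already been parsed into plain records, and that both carry an EAN-style product code. The field names, the codes, and the function name enrich are hypothetical.

```python
# Hypothetical product model data as published by a manufacturer, keyed by product code.
manufacturer_models = {
    "4000000000001": {"name": "Example Phone X1", "weight_g": 120, "screen_in": 3.2},
}

# Hypothetical offers from the shop's own database.
shop_offers = [
    {"ean": "4000000000001", "price_eur": 199.0, "shipping_eur": 4.9},
    {"ean": "4000000000002", "price_eur": 59.0, "shipping_eur": 4.9},
]


def enrich(offers, models):
    """Attach manufacturer feature data to every offer with a matching model.

    Offers without a matching model are kept unchanged, and the shop's own
    values win if both sides define the same field.
    """
    enriched = []
    for offer in offers:
        model = models.get(offer["ean"], {})
        enriched.append({**model, **offer})
    return enriched


catalog = enrich(shop_offers, manufacturer_models)
```

Letting the shop's own fields take precedence is a design choice of this sketch; a real import interface might instead report conflicts to the shop operator.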

If you operate a Web Shop, you should provide GoodRelations data dumps of your range of offers. This is rather simple, see e.g. for instructions.

For any other business, e.g. hotels, rental car companies, etc.: You should ask your Webmaster to create at least a basic description of your range of products using the GoodRelations vocabulary.

And for the creative entrepreneurs: Now is the time to invent new business models based on GoodRelations data.

In short: Now is the time to get ready for the next generation of e-commerce on the World Wide Web!

More Information:

Project page:
Information on product model data:
Yahoo Searchmonkey: