Chapter 1: Applied Semantic Web Technologies: Overview and Future Directions (1/4)


Chapter 1

Applied Semantic Web Technologies: Overview and Future Directions

Vijayan Sugumaran

Oakland University, Rochester, MI; Sogang University, Seoul, Korea

and

Jon Atle Gulla

Norwegian University of Science and Technology, Trondheim, Norway

Contents

1.1 Introduction
1.2 History
1.3 Semantic Web Layers
1.3.1 Data and Metadata Layer
1.3.2 Semantics Layer
1.3.3 Enabling Technology Layer
1.3.4 Environment Layer
1.4 Future Research Directions
1.5 Organization of Book
1.5.1 Part I: Introduction
1.5.2 Part II: Ontologies
1.5.3 Part III: Ontology Engineering and Evaluation
1.5.4 Part IV: Semantic Applications
Acknowledgment
References


1.1 Introduction

Since Tim Berners-Lee proposed a global system of interlinked hypertext documents in 1989, the World Wide Web has grown into the world’s biggest pool of human knowledge. Over the past few years, the Web has changed the way people communicate and exchange information. It has created new business opportunities and obliterated old business practices. As a borderless source of information, it has been instrumental in globalization and cooperation among people and nations. Importantly, it has also helped individuals join virtual communities and take part in social networks that cross physical, cultural, and organizational barriers. The rapid growth of information on the World Wide Web has, however, created a new set of challenges and problems.

Information overload: In 1998, the size of the Web was estimated to exceed 300 million pages, with a growth rate of about 20 million pages per month (Baeza-Yates and Ribeiro-Neto, 1999). The real size of the Web today is difficult to measure, although Web search indices give a lower-bound number of unique and meaningful Web pages. The Google search index was measured at around 500 million pages in 2000, 8 billion in 2004, and more than 27 billion today. This constitutes an enormous amount of information about almost any conceivable topic. While the early Web often suffered from a lack of high-quality relevant pages, the present Web contains far too many relevant pages for any user to review. As an example, at the time of this writing, Google returns about 18.6 million pages for the search phrase “World Wide Web.” If you fail to mark it as a phrase, an astonishing 113 million pages are found to be relevant and presented on the result page. In addition, the deep Web generates information dynamically based on users’ queries.

Poor retrieval and aggregation: The explosion of Web documents and services would not be so critical if users could easily retrieve and combine the information needed. Since Web documents are at best semi-structured in simple natural language text, they are vulnerable to obstacles that prevent efficient content retrieval and aggregation. An increasing problem is the number of languages used on the Web. Studies by Langer (2001) suggested that almost 65% of Web pages were in English in 1999; more recent data from Internet World Stats (www.internetworldstats.com) indicate a more balanced use of languages. The English-using population at the end of 2009 constituted only 27.7% of total online users. The plethora of languages now used on the Web prevents search applications from applying language-specific strategies, and they still depend on content-independent statistical models. In a similar vein, the many misspellings and general syntactic variations in documents hamper the reliability of statistical scores of document relevance.

Stovepipe systems: All components of a stovepipe system are hardwired to work only together (Daconta et al., 2003). Information flows only inside an application and cannot be exchanged with other applications or organizations without access to the stovepipe system. Many enterprises and business sectors suffer from stovepipe systems that use their own particular database schemas, terminologies, standards, etc., and prevent people and organizations from collaborating efficiently because one system cannot understand the data from another system.

1.2 History

The term Semantic Web was popularized by Tim Berners-Lee and later elaborated in 2001. The first part of his vision for the Semantic Web was to turn the Web into a truly collaborative medium: to help people share information and services and make it easier to aggregate data from different sources and in different formats. The second part of his vision was to create a Web that would be understandable and processable by machines. While humans can read and comprehend current Web pages, Berners-Lee envisioned new forms of Web pages that could be understood, combined, and analyzed by computers, with the ultimate goal of enabling humans and computers to cooperate in the same manner as humans do with each other. Berners-Lee did not think of the Semantic Web as a replacement for the current Web. It was intended as an extension that adds semantic descriptions of information and services. Central to the Semantic Web vision is the shift from applications to data. The key to machine-processable data is to make the data smarter. As Figure 1.1 shows, data progress along a continuum of increasing intelligence, as described below.

Text and databases: In this initial stage, most data is proprietary to an application. The application is responsible for interpreting the data and contains the intelligence of the system.

Figure 1.1 Smart data continuum. (Stages in order of increasing intelligence: text documents and database records; XML documents using single vocabularies; XML taxonomies and docs with mixed vocabularies; ontology and automated reasoning.)


XML documents for single domains: The second stage involves domain-specific XML schemas that achieve application independence within the domain. Data can flow between applications in a single domain but cannot be shared outside that domain.

Taxonomies: In this stage, data can be combined from different domains using hierarchical taxonomies of the relevant terminologies. Data is now smart enough to be easily discovered and combined with other data.

Ontologies and automated reasoning: In the final stage, new data can be inferred from existing data and shared across applications with no human involvement or interpretation. Data is now smart enough to understand its definitions and relationships to other data.
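The final stage, inferring new data from existing data, can be illustrated with a minimal Python sketch. It assumes a toy class hierarchy and two asserted facts; every class and individual name below is hypothetical, and the only rule applied is that subclass membership is transitive. Real ontology reasoners support far richer inference than this.

```python
from collections import defaultdict

# Hypothetical toy ontology: each pair says "child is a subclass of parent".
SUBCLASS_OF = [
    ("Sedan", "Car"),
    ("Car", "Vehicle"),
    ("Truck", "Vehicle"),
]

# Explicitly asserted facts: each pair says "individual is an instance of class".
ASSERTED_TYPES = [
    ("my_car", "Sedan"),
    ("delivery_7", "Truck"),
]


def superclasses(cls, subclass_of):
    """Return every class reachable from cls by following subclass links."""
    parents = defaultdict(set)
    for child, parent in subclass_of:
        parents[child].add(parent)
    found = set()
    stack = [cls]
    while stack:
        current = stack.pop()
        for parent in parents[current]:
            if parent not in found:  # also guards against cyclic hierarchies
                found.add(parent)
                stack.append(parent)
    return found


def infer_types(asserted, subclass_of):
    """Add the class memberships implied by the subclass hierarchy."""
    inferred = set(asserted)
    for individual, cls in asserted:
        for ancestor in superclasses(cls, subclass_of):
            inferred.add((individual, ancestor))
    return inferred


if __name__ == "__main__":
    for fact in sorted(infer_types(ASSERTED_TYPES, SUBCLASS_OF)):
        print(fact)
```

Nothing in the asserted facts states that `my_car` is a `Vehicle`; that membership is derived from the hierarchy alone, which is the sense in which data at this stage carries its own definitions.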

In the Semantic Web these smart data are assumed to be application-independent, composable, classified, and part of a larger terminological structure. Ontologies play a very important role in the Semantic Web community. According to Gruber (1995), an ontology is an explicit specification of a conceptualization. It represents a common understanding of a domain and its relevant terminology. Technically, ontologies describe concepts and their taxonomic and nontaxonomic relationships. For the Semantic Web, ontologies enable us to define the terminology used to represent and share data within a domain. As long as the applications define their data with reference to the same ontology, they can interpret and reason about each other’s data and collaborate without manually defining any mapping between the applications. Figure 1.2 illustrates the important language standards of the Semantic Web.

The W3C devised a number of standards for defining and using ontologies. The Resource Description Framework (RDF) was already available in 1999 as a part of W3C’s Semantic Web effort. It uses triples (subject, property, object) to describe resources and their simple relationships to other resources. It is used as a simple ontology language in many existing applications for content management, digital libraries, and e-commerce.
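The triple structure can be made concrete with a small sketch in plain Python. It does not use any RDF library, and the `http://example.org/` namespace and all resource and property names are hypothetical, chosen only for illustration. Each statement is a (subject, property, object) triple, and a simple lookup treats `None` as a wildcard.

```python
from typing import List, NamedTuple, Optional


class Triple(NamedTuple):
    subject: str
    prop: str   # the property linking subject and object
    obj: str


EX = "http://example.org/"  # hypothetical namespace, for illustration only

GRAPH = [
    Triple(EX + "chapter1", EX + "title", "Applied Semantic Web Technologies"),
    Triple(EX + "chapter1", EX + "author", EX + "sugumaran"),
    Triple(EX + "chapter1", EX + "author", EX + "gulla"),
    Triple(EX + "gulla", EX + "affiliation", EX + "ntnu"),
]


def match(graph: List[Triple],
          subject: Optional[str] = None,
          prop: Optional[str] = None,
          obj: Optional[str] = None) -> List[Triple]:
    """Return the triples that agree with every position that is not None."""
    return [
        t for t in graph
        if (subject is None or t.subject == subject)
        and (prop is None or t.prop == prop)
        and (obj is None or t.obj == obj)
    ]


if __name__ == "__main__":
    # All statements whose property is "author".
    for t in match(GRAPH, prop=EX + "author"):
        print(t.subject, "->", t.obj)
```

Because the object of one triple can be the subject of another (here the resource for `gulla`), a set of triples forms a graph rather than a flat table.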

DAML (DARPA Agent Markup Language) was proposed by DARPA, a U.S. government research organization, as part of a research program started in 2000. The definition of DAML is published at daml.org, which is run as part of the DAML program. DAML is a semantic language targeting the Semantic Web, although it can also be used as a general knowledge representation language. The OIL (Ontology Inference Layer) semantic markup language is a European initiative involving some of the continent’s best artificial intelligence researchers. OIL is not very different from DAML, and both languages provide powerful mechanisms for defining complex ontologies. The DAML+OIL standard from 2001 is a markup language for Web resources that tries to capture some of the best features of both DAML and OIL.

Figure 1.2 Important language standards of Semantic Web. (Timeline from 1999 to 2010 showing RDF, DAML+OIL, OWL 1.0, WSDL, SPARQL, SKOS, and OWL 2.0.)

In 2004, the Web Ontology Language (OWL), Version 1.0, was recommended by the W3C. It replaces DAML+OIL as a semantic language designed for applications that need to process the content of information instead of simply presenting information to humans. OWL facilitates greater machine interpretability of Web content than XML and RDF support by providing additional primitives along with improved formal semantics from description logic. The three increasingly expressive sublanguages of OWL are OWL Lite, OWL DL, and OWL Full. OWL DL is the most commonly used of the three today. Version 2.0 was introduced in 2009. OWL is now the dominant language for describing formal ontologies in enterprises and on the Web.

Several supporting standards have emerged recently. The Web Services Description Language (WSDL), Version 2.0, from 2007, is one of many languages for specifying Web services. SPARQL (2008) is a query language for RDF data, used to access information in ontologies. SKOS (Simple Knowledge Organization System; 2009) is a lightweight data model for sharing and linking knowledge organization systems via the Web. These and similar standards are intended to simplify the use of Semantic Web technologies in practical applications.
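The idea behind a query language over triples can be sketched in plain Python. This is not SPARQL itself and uses no SPARQL engine; it only imitates the core mechanism, in which a query is a list of triple patterns whose variables (written here with a leading `?`) must be bound consistently across all patterns. The `ex:` names are hypothetical, and the years are those given in the text above.

```python
# Toy triples (subject, property, object); the "ex:" names are hypothetical.
GRAPH = [
    ("ex:WSDL", "ex:kind", "service description language"),
    ("ex:WSDL", "ex:year", "2007"),
    ("ex:SPARQL", "ex:kind", "query language"),
    ("ex:SPARQL", "ex:year", "2008"),
    ("ex:SKOS", "ex:kind", "data model"),
    ("ex:SKOS", "ex:year", "2009"),
]


def extend(pattern, triple, bindings):
    """Match one pattern against one triple.

    Returns the bindings extended with any newly bound variables,
    or None if the triple conflicts with the pattern or the bindings.
    """
    new_bindings = dict(bindings)
    for term, value in zip(pattern, triple):
        if term.startswith("?"):
            if term in new_bindings and new_bindings[term] != value:
                return None
            new_bindings[term] = value
        elif term != value:
            return None
    return new_bindings


def query(graph, patterns):
    """Return every variable binding that satisfies all patterns."""
    solutions = [{}]
    for pattern in patterns:
        next_solutions = []
        for bindings in solutions:
            for triple in graph:
                extended = extend(pattern, triple, bindings)
                if extended is not None:
                    next_solutions.append(extended)
        solutions = next_solutions
    return solutions


if __name__ == "__main__":
    # "Which standard is a query language, and from which year?"
    patterns = [
        ("?s", "ex:year", "?y"),
        ("?s", "ex:kind", "query language"),
    ]
    print(query(GRAPH, patterns))
```

The two patterns share the variable `?s`, so only subjects that satisfy both survive. In SPARQL the same question would be written as a `SELECT` query whose `WHERE` clause contains these two triple patterns.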

1.3 Semantic Web Layers

As noted above, several standard specifications and technologies are contributing to the realization of the Semantic Web. It is evolving based on a layered approach, and each layer provides a set of functionalities (Breitman et al., 2007). The assortment of tools, technologies, and specifications that lay the foundation for the Semantic Web can be broadly organized into four major layers: (1) data and metadata, (2) semantics, (3) enabling technology, and (4) environment. Figure 1.3 illustrates some of the essential specifications and technologies that contribute to each layer. The four layers and key components are briefly described below.

1.3.1 Data and Metadata Layer

The data and metadata layer is the lowest; it provides standard representations for data and information and facilitates exchanges among various applications and systems. Unicode provides a standard representation for character sets in
