8 ◾  Vijayan Sugumaran and Jon Atle Gulla
dierent languages used by dierent computers and the URI provides a standard
way to uniformly indentify resources such as Web pages and other forms of con-
tent. e Unicode and URI together enable us to create content and make these
resources available for others to nd and use in a simple way. XML enables us to
structure data using user-dened tags that have well dened meanings that are
shared by applications. is helps improve data interoperability across systems.
Namespaces and schemas provide the mechanisms to express semantics in one
location for access and utilization by many applications. e next component in this
layer is the Resource Description Framework (RDF) that conceptually describes
the information contained in a Web resource. It can employ dierent formats for
representing triplets (subjects, predicates, objects), can be used to model disparate
abstract concepts, and is eective for knowledge management. RDF Schema is a
language for declaring basic classes and types for describing the terms used. It sup-
ports reasoning to infer dierent types of resources.
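The idea of triples and type inference can be illustrated with a minimal sketch. It assumes triples are stored as plain Python tuples and applies only one RDF Schema entailment pattern (an instance of a class is also an instance of its superclasses). The `ex:` names and the helper `infer_types` are hypothetical and chosen for illustration only.

```python
RDF_TYPE = "rdf:type"
SUBCLASS_OF = "rdfs:subClassOf"

# Hypothetical example data: each statement is a (subject, predicate, object) triple.
triples = {
    ("ex:page1", RDF_TYPE, "ex:BlogPost"),
    ("ex:BlogPost", SUBCLASS_OF, "ex:WebPage"),
    ("ex:WebPage", SUBCLASS_OF, "ex:Resource"),
}


def infer_types(graph):
    """Return the graph plus every rdf:type triple implied by rdfs:subClassOf."""
    inferred = set(graph)
    changed = True
    while changed:  # repeat until no new triple can be added
        changed = False
        for subject, predicate, cls in list(inferred):
            if predicate != RDF_TYPE:
                continue
            for sub, pred, sup in list(inferred):
                if pred == SUBCLASS_OF and sub == cls:
                    new_triple = (subject, RDF_TYPE, sup)
                    if new_triple not in inferred:
                        inferred.add(new_triple)
                        changed = True
    return inferred


closure = infer_types(triples)
types_of_page1 = sorted(o for s, p, o in closure if s == "ex:page1" and p == RDF_TYPE)
print(types_of_page1)
```

Only the type `ex:BlogPost` is stated explicitly for `ex:page1`; the other two types are inferred from the class hierarchy.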
1.3.2 Semantics Layer
The semantics layer incorporates specifications, tools, and techniques that help
add meaning or semantics to characterize the contents of resources. It facilitates
the representation of Web content that enables applications to access information
autonomously using common search terms. The important ingredients of
this layer are an ontology language, a rule language, a query language, logic, a
reasoning mechanism, and trust. As noted earlier, ontologies express basic concepts and the
[Figure: four stacked layers and their components.
Environment Layer: Security, Privacy, Trust, Cryptography, Integration, Standardization, Peer-to-Peer, Semantic Grid, Social Network.
Enabling Technology Layer: Agents, Search, Web Services, Composition, Visualization, Personalization, Repository Management, Natural Language Processing.
Semantics Layer: Ontologies (OWL), Rules (RIF/RuleML/SWRL), Queries (SPARQL), Logic (First Order, DL), Reasoning, Trust.
Data and Metadata Layer: RDF and RDF Schema, XML, Unicode and URI.]
Figure 1.3 Semantic Web layers.
Applied Semantic Web Technologies ◾  9
relationships between concepts that exist in a domain. They form the backbone of
the Semantic Web and are used to reason about entities in a particular domain and
manage knowledge sharing and reuse. Ontologies can be used to specify complex
constraints on the types of resources and their properties. OWL is the most popular
ontology language used by applications for processing content from a resource
without human intervention. Thus, it facilitates machine interoperability by
providing the necessary vocabulary along with formal semantics. OWL Lite, OWL
DL, and OWL Full are the three OWL sublanguages.
Rule languages help write inferencing rules in a standard way that can be used
for reasoning in a particular domain. A rule language provides a kernel specification
for rule structures that can be used for rule interchange and facilitates rule
reuse and extension. Among several standards, such as RIF (Rule Interchange
Format) and Datalog RuleML, SWRL (Semantic Web Rule Language) is gaining
popularity. It combines OWL DL, OWL Lite, and Datalog RuleML and
includes a high-level abstract syntax for Horn-like rules in both OWL DL and
OWL Lite.
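A Horn-like rule has a conjunction of atoms as its body and a single atom as its head. The following minimal sketch shows how such a rule can be applied by forward chaining over a set of facts. It is not SWRL syntax or any SWRL engine's API: the tuple encoding, the `?x` variable convention, and the functions `match`, `apply_rule`, and `forward_chain` are assumptions made for illustration, and the uncle rule is simply an example of the kind often used to explain such rules.

```python
def match(pattern, fact, bindings):
    """Extend bindings so that pattern equals fact; return None if impossible."""
    result = dict(bindings)
    for term, value in zip(pattern, fact):
        if term.startswith("?"):  # variable: bind it or check the earlier binding
            if result.setdefault(term, value) != value:
                return None
        elif term != value:  # constant: must be identical
            return None
    return result


def apply_rule(body, head, facts):
    """Return the head instantiated for every way the body matches the facts."""
    solutions = [{}]
    for atom in body:
        solutions = [
            extended
            for bindings in solutions
            for fact in facts
            if (extended := match(atom, fact, bindings)) is not None
        ]
    return {tuple(b.get(term, term) for term in head) for b in solutions}


def forward_chain(rules, facts):
    """Apply all rules repeatedly until no new fact is derived."""
    facts = set(facts)
    while True:
        new = set()
        for body, head in rules:
            new |= apply_rule(body, head, facts) - facts
        if not new:
            return facts
        facts |= new


# Hypothetical rule: hasParent(?x, ?y) and hasBrother(?y, ?z) imply hasUncle(?x, ?z).
uncle_rule = (
    [("?x", "hasParent", "?y"), ("?y", "hasBrother", "?z")],
    ("?x", "hasUncle", "?z"),
)
facts = {
    ("ex:mary", "hasParent", "ex:john"),
    ("ex:john", "hasBrother", "ex:bob"),
}
derived = forward_chain([uncle_rule], facts)
print(("ex:mary", "hasUncle", "ex:bob") in derived)
```

Because each rule has exactly one head atom and no negation, repeated application always terminates on a finite set of facts.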
The driving force behind the SPARQL Web query language is the need for
applications to query Web content and automatically retrieve relevant segments
of a resource. SPARQL provides both a protocol and a language for querying RDF graphs
via pattern matching. It supports basic conjunctive patterns, value filters, optional
patterns, and pattern disjunction. Logic and reasoning are also integral parts of
the semantics layer. A reasoning system can use one or more ontologies and make
new inferences based on the content of a particular resource. It also helps identify
appropriate resources that meet a particular requirement. Thus, the reasoning
system enables applications to extract appropriate information from various resources.
Logic provides the theoretical underpinning required for reasoning and deduction.
First order logic, description logic, and others are commonly used to support
reasoning. Trust is also an important ingredient in this layer; it is basic to the whole
reasoning process. All applications expect and demand that resource content be
trustworthy and of high quality.
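Querying by pattern matching, as described for SPARQL above, can be sketched in a few lines. The example below is not SPARQL and does not use any SPARQL library; it is a simplified stand-in that assumes an in-memory list of triples and hypothetical helpers `match_pattern` and `query`. It covers conjunctive patterns, a value filter, and an optional pattern, but not pattern disjunction.

```python
# Hypothetical example graph; all names and values are illustrative.
graph = [
    ("ex:alice", "ex:name", "Alice"),
    ("ex:alice", "ex:age", 34),
    ("ex:alice", "ex:email", "alice@example.org"),
    ("ex:bob", "ex:name", "Bob"),
    ("ex:bob", "ex:age", 17),
]


def match_pattern(pattern, bindings):
    """Yield each extension of bindings under which pattern matches a triple."""
    for triple in graph:
        candidate = dict(bindings)
        for term, value in zip(pattern, triple):
            if isinstance(term, str) and term.startswith("?"):
                if candidate.setdefault(term, value) != value:
                    break  # variable already bound to something else
            elif term != value:
                break  # constant does not match
        else:
            yield candidate


def query(required, optional=(), value_filter=lambda row: True):
    rows = [{}]
    for pattern in required:  # conjunction: every required pattern must match
        rows = [ext for row in rows for ext in match_pattern(pattern, row)]
    rows = [row for row in rows if value_filter(row)]  # value filter
    for pattern in optional:  # optional: keep the row even without a match
        rows = [
            ext
            for row in rows
            for ext in (list(match_pattern(pattern, row)) or [row])
        ]
    return rows


required = [("?person", "ex:name", "?name"), ("?person", "ex:age", "?age")]
optional = [("?person", "ex:email", "?email")]

adults = query(required, optional, value_filter=lambda row: row["?age"] >= 18)
everyone = query(required, optional)
print(adults)
print(everyone)
```

In the unfiltered result, the row for `ex:bob` has no `?email` binding, which is the behavior an optional pattern is meant to provide.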
1.3.3 Enabling Technology Layer
This layer consists of a variety of technologies that can be used to develop
applications on the Semantic Web and accomplish different types of tasks or
operationalize specific aspects of the Semantic Web. For example, intelligent agents
or multiagent systems can be used to access and process information automatically
on the Semantic Web. Some well-established technologies can be used synergistically
to create valuable Semantic Web applications. Some of the technologies relevant to
this layer are agents, search, Web services, composition (information and service
composition), visualization, personalization, repository management, and natural
language processing. Software agents and Web services are closely associated with the
Semantic Web and are used in a variety of applications. For example, Klapiscak and Bordini
(2009) describe the implementation of an environment that combines agent-oriented
programming and ontological reasoning. Similarly, Gibbins et al. (2004)
discuss agent-based Semantic Web services for situational awareness and information
triage in a simulated humanitarian aid scenario. Search (information and service)
and composition (information and service) are other important technologies
utilized in numerous applications on the Semantic Web.
Personalization on the Semantic Web is similar to creating individual views on
Web data according to the special interests, needs, requirements, goals, access
context, etc. of the user (Baldoni et al., 2005). The availability of a variety of reasoning
techniques, all fully integrated with the Web, opens the way for the design and
development of different modes of interaction and personalization. Information
visualization strives to make the information more accessible and less structured
to improve usability. In the context of the Semantic Web, visualization supports
the user in managing large amounts of data and performing interactive processes
such as searching (Albertoni et al., 2004). Visualization of semantic metadata helps
users gain insight into structure and relationships in the data that are hard
to see in text (Mutton and Golbeck, 2003). Natural Language Processing (NLP)
technologies play an important role in materializing the Semantic Web with specific
applications such as ontology-based information extraction, ontology learning
and population, and semantic metadata generation. NLP techniques are increasingly
used in ontology engineering to minimize human involvement. Much work
remains to be done in the use of controlled natural language, representation of
linguistic information in ontologies, and effective techniques for ontology learning
from unstructured text.
1.3.4 Environment Layer
The environment layer deals with the surroundings and infrastructure in which
Semantic Web applications execute, and with meeting the basic expectations of these
applications in terms of data quality and information assurance. It is also concerned
with the operating environment and the degrees of interoperability of various
domains. Some of the key aspects of this layer are security, privacy, trust,
cryptography, application integration, standards, and environments such as peer-to-peer,
semantic grid, and social networks.
Security and privacy are two important requirements that must be satisfied in
the Semantic Web environment. Any two applications can interact automatically,
and since the identities of the parties are not known in advance, a semantically
enriched process is needed to regulate access to sensitive information (Olmedilla,
2007). Thus, security and privacy protections must be implemented carefully for
a variety of Semantic Web scenarios. Cryptography, encoding, and secure transfer
protocols are some of the ways to ensure certain levels of security and privacy on
the Semantic Web. The critical issue of trust within the Semantic Web has been
gaining attention recently. Just as on the Web, where no attempts were made to
centrally control the quality of information, it is infeasible to do so on the Semantic
Web. By having each user explicitly specify a set of trusted users, the resulting web
of trust may be used recursively to determine a user’s trust in any other user or
resource (Richardson et al., 2003).
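The recursive use of a web of trust can be sketched as follows. This is an illustrative simplification and not the algorithm of Richardson et al.: it assumes that trust values lie between 0 and 1, that trust along a path is the product of the values on its links, and that the best path determines the result. The data and the function `trust` are hypothetical.

```python
# Hypothetical direct trust statements: each user lists the users they trust.
direct_trust = {
    "alice": {"bob": 0.9, "carol": 0.5},
    "bob": {"dave": 0.8},
    "carol": {"dave": 0.9},
    "dave": {},
}


def trust(source, target, visited=frozenset()):
    """Best-path trust: multiply along a path, take the maximum over all paths."""
    if source == target:
        return 1.0
    visited = visited | {source}
    best = 0.0
    for neighbour, weight in direct_trust.get(source, {}).items():
        if neighbour in visited:
            continue  # do not walk around cycles
        best = max(best, weight * trust(neighbour, target, visited))
    return best


alice_trusts_dave = trust("alice", "dave")
dave_trusts_alice = trust("dave", "alice")
print(round(alice_trusts_dave, 2), dave_trusts_alice)
```

Here alice has made no direct statement about dave, yet a trust value of about 0.72 is derived through bob. Other ways of combining and aggregating trust values are possible and would give different results.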
Application and information integration along with standardization are also
important for the success of the Semantic Web. Information integration is still
an outstanding issue, and the host of technologies of the Semantic Web brings
several relevant and useful tools and techniques that can exploit the interoperability
context. The environment layer also includes mechanisms to develop Semantic
Web applications that take advantage of some of the contemporary network
computing paradigms such as peer-to-peer, semantic grid, and social networks. Such
advances can facilitate larger scale permeation of Semantic Web technologies and
applications.
1.4 Future Research Directions
Semantic applications attempt to understand the meaning of data and connect data
in meaningful ways. After almost a decade of intense research, we now have a
number of practical semantic applications in use. A few semantic search
systems like Powerset have been launched commercially, and specialized semantic
applications are now used for travel planning and user profiling, among other
functions. Ontologies are to some extent used for integrating large-scale applications
and reporting data from heterogeneous systems. While domains like medicine and
petroleum production already have advanced users of Semantic Web technologies,
most industries have only limited experience with semantic applications. There are
probably many reasons for the slow adoption of new technologies, but the fundamental
challenges of the Semantic Web relate to the scale, vagueness, inconsistency, and
instability of data.
The massive amounts of data on the Web and in enterprises represent fundamental
challenges for logic-based approaches like the Semantic Web. Specifying
the semantic content of data is a tedious and error-prone process that requires
deep expertise in logic and modeling. When the specifications are at hand, serious
performance issues with ontology querying and reasoning arise. For some
ontologies, like those specified in OWL Full, complete reasoning is not possible
because the language is undecidable. Most
textual data are formulated in natural languages that suffer from vagueness and
uncertainty. Human beings can relate to terms like tall and heavy if they know
the context of their use. Computers, on the other hand, need precise and complete
definitions that allow them to apply the terms and interpret descriptions in
which they are used.
Textual data also tend to show inconsistencies that we can reconcile or accept
in our everyday life. Humans can deal with different and inconsistent definitions
of terms like tall and still use such terms in conversations and text. A computer
application can in principle deduce anything, right or wrong, from an inconsistent
ontology.
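The reason an inconsistent ontology permits any conclusion can be shown with a short derivation in classical logic. As an illustrative assumption, let A stand for a statement such as "John is tall" that the ontology both asserts and denies, and let B be any statement at all.

```latex
\begin{align*}
1.\;& A        && \text{premise} \\
2.\;& \neg A   && \text{premise} \\
3.\;& A \lor B && \text{from 1, disjunction introduction} \\
4.\;& B        && \text{from 2 and 3, disjunctive syllogism}
\end{align*}
```

Since B was arbitrary, every statement follows once a single contradiction is present.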
In recent years, the challenges of unstable and evolving terminologies have
become more apparent. Since semantic applications must understand the meanings
of text and other data, they must update their underlying terminologies as domains
change to interpret the data correctly. This is a continuous process that is costly to
implement and difficult to organize.
Current research on the Semantic Web is diverse and spans many scientific
disciplines. A number of unresolved theoretical questions are being addressed by
the research community. For the practical use of Semantic Web technologies, a few
areas carry particular significance.
Ontology learning or creation: These techniques allow semi-automatic or
fully automatic creation of (parts of) ontologies from representative domain texts.
Early work on ontology learning used text mining and computational linguistics to
extract prominent terms in text and suggest these as candidate classes and individuals.
Later research concentrated on extracting relationships and properties using
statistical methods like association rules and simple phrasal searches like the Hearst
patterns (Cimiano, 2006; Gulla et al., 2009). The quality of these techniques is,
however, not impressive, and we are still far from learning complete ontologies with
classes, individuals, properties, and rules.
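A phrasal search of the Hearst-pattern kind can be sketched with a regular expression. The example below handles only one pattern, "X such as A, B and C", and approximates noun phrases by single words, which is a strong simplification. The names `SEPARATOR`, `SUCH_AS`, and `extract_candidates` and the sample sentence are hypothetical.

```python
import re

# Separators between listed terms; longer alternatives come first so that
# ", and " is not split into ", " followed by the word "and".
SEPARATOR = r"(?:, and |, or |, | and | or )"

# "<general term> such as <term>, <term> and <term>"
SUCH_AS = re.compile(rf"(\w+) such as ((?:\w+{SEPARATOR})*\w+)")


def extract_candidates(text):
    """Return (specific term, general term) pairs as candidate is-a relations."""
    pairs = []
    for found in SUCH_AS.finditer(text):
        general = found.group(1)
        for specific in re.split(SEPARATOR, found.group(2)):
            pairs.append((specific, general))
    return pairs


sentence = "The field produces hydrocarbons such as oil, gas and condensate."
candidates = extract_candidates(sentence)
print(candidates)
```

Each pair is only a candidate for a subclass or instance relationship and would still need to be validated, which is consistent with the limited quality of such techniques noted above.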
Performance: With current technologies, retrieval, storing, and manipulation
of ontologies are computationally too demanding for many large-scale applications.
Initial work on more efficient methods for storing and reasoning over complex
ontologies has started, but more progress is needed for semantic applications to
scale up.
Ontology quality and selection: As the number of available ontologies
increases, their evaluation becomes more difficult. Existing approaches focus on the
syntactic aspects of ontologies and do not take into account the semantic aspects
and user contexts and familiarities. While many research efforts address the issues
of ontology searching and quality separately, none has considered ontology
evaluation and selection together. Research also has not considered task characteristics,
application semantics, and user contexts. Hence, we still have a great need for
developing a semi-automatic framework for selecting the best ontology appropriate
for a specific task within the Semantic Web.
Linked data: In 2006, Tim Berners-Lee introduced the notion of linked data
as a simplified approach to semantic applications. The approach is based on
concepts and technologies for combining and integrating data using RDF triples only.
Linked data allows more scalable applications to be built and has already been used
in a number of small Web applications and enterprise data architectures (Bizer et
al., 2009). Whether the approach can handle functionally demanding tasks, given
its limited expressiveness and lack of formality, remains unclear.
Trust, Security, and Privacy: Semantic Web applications assume and expect
that the information content of resources is of high quality and can be trusted.