Modular Architecture
We built an architecture that can be characterized as a multi-stage,
multi-strategy system comprising four modules, namely:
- Feature Generation
- Feature Selection and Processing
- Aggregator
- Evaluator
In this system, different features of the input data are generated
and selected to trigger different kinds of feature matchers. The
resulting similarity values are combined by multiple similarity
aggregators running in parallel or in sequence. The overall
similarity is then evaluated, and the outcome can initiate iterations
that backtrack to different stages.
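The flow through the four modules can be sketched as below. This is a minimal illustration rather than the CMS implementation: the function names, the exact-match matcher, the averaging aggregator, and the threshold-based evaluator are all assumptions made for the example (in the architecture described here, evaluation is performed by users or supervised learners).

```python
from typing import Dict, List, Tuple

Pair = Tuple[str, str]       # (source entity, target entity)
Scores = Dict[Pair, float]   # similarity per pair, in [0, 1]


def generate_features(entities: List[str]) -> Dict[str, Dict[str, str]]:
    """Feature Generation: derive several named features per entity."""
    return {
        "name": {e: e for e in entities},
        "lower_name": {e: e.lower() for e in entities},
    }


def exact_matcher(src: Dict[str, str], tgt: Dict[str, str]) -> Scores:
    """A trivial feature matcher: 1.0 on equal feature values, else 0.0."""
    return {(s, t): float(sv == tv)
            for s, sv in src.items() for t, tv in tgt.items()}


def average(results: List[Scores]) -> Scores:
    """Aggregator: mean of the matcher scores for each pair."""
    return {p: sum(r[p] for r in results) / len(results) for p in results[0]}


def run(source: List[str], target: List[str],
        feature_sets: List[List[str]], threshold: float) -> Scores:
    """Try each feature selection in turn until the evaluator accepts."""
    src_f, tgt_f = generate_features(source), generate_features(target)
    overall: Scores = {}
    for selected in feature_sets:  # Feature Selection; each pass is one backtrack
        results = [exact_matcher(src_f[f], tgt_f[f]) for f in selected]
        overall = average(results)
        # Evaluator (hypothetical stand-in): accept once the best pair
        # reaches the threshold, otherwise go back and reselect features.
        if max(overall.values()) >= threshold:
            break
    return overall


scores = run(["Car"], ["car", "Bike"],
             feature_sets=[["name"], ["name", "lower_name"]],
             threshold=0.5)
```

In this usage example the first pass (names only) finds no match, so the loop backtracks to feature selection and retries with an additional feature.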
Multi-component and multi-strategy approaches are demonstrated by
many systems, e.g. COMA [3], GLUE [1], and QOM [2]. Our approach,
as illustrated in Figure 3.1, is different in that it allows:
- multiple matchers: several heterogeneous matchers run
independently, each producing its own results, which differ from
but complement those of the others
- use of existing systems: these are treated as standard building
blocks, each of which is a plug-and-play component of the overall
hybrid mapping system
- multiple loops: the overall similarity is evaluated by users or
supervised learners to initiate iterations that backtrack to
different stages of the process

Figure 3.1: A modular architecture.
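One way to treat existing systems as plug-and-play building blocks is to put every matcher behind a common interface. The sketch below is illustrative only: the `Matcher` protocol, `NameMatcher`, `ExternalSystemAdapter`, and the callable standing in for an external system are hypothetical names, not part of any of the cited systems.

```python
from typing import Callable, Dict, List, Protocol, Tuple

Pair = Tuple[str, str]
Scores = Dict[Pair, float]


class Matcher(Protocol):
    """Common interface assumed for every component matcher."""

    def match(self, source: List[str], target: List[str]) -> Scores: ...


class NameMatcher:
    """A built-in matcher: case-insensitive name equality."""

    def match(self, source: List[str], target: List[str]) -> Scores:
        return {(s, t): 1.0 if s.lower() == t.lower() else 0.0
                for s in source for t in target}


class ExternalSystemAdapter:
    """Wraps a hypothetical existing system that returns plain
    (source, target) correspondences without confidence values."""

    def __init__(self, system: Callable[[List[str], List[str]], List[Pair]]):
        self._system = system

    def match(self, source: List[str], target: List[str]) -> Scores:
        found = set(self._system(source, target))
        return {(s, t): 1.0 if (s, t) in found else 0.0
                for s in source for t in target}


def run_all(matchers: List[Matcher],
            source: List[str], target: List[str]) -> List[Scores]:
    """Run each matcher independently; results are kept separate
    so that a later aggregation step can combine them."""
    return [m.match(source, target) for m in matchers]


# Stand-in for an external mapping system (assumption for the example).
def legacy(source: List[str], target: List[str]) -> List[Pair]:
    return [("Car", "Automobile")]


results = run_all([NameMatcher(), ExternalSystemAdapter(legacy)],
                  ["Car"], ["car", "Automobile"])
```

Adding a further system then only requires writing another adapter; the rest of the pipeline is unchanged.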
Challenges for deploying the architecture
There are a number of challenges that we need to consider when
building such a system. In an ideal situation, each independent
matcher would consider an identical set of characteristics of the
input ontologies and produce homogeneous output for further
processing. However, this is seldom true in practice. There is
currently no standard or common agreement on how an ontology mapping
system should behave, i.e. no formal specification of what the input
should be and what the system should output. If we consider some
recent OWL-based ontology alignment systems, we see intrinsic
diversity: some take only the names (URIs) of classes, while others
take the whole taxonomy as input; some generate abstract
relationships as output (e.g. more general than, more specific than,
etc.), while others produce pairwise correspondences with or without
confidence values; and some are stand-alone systems, while others
operate as Web services. Thus, the first and most pressing task is to
extract from the input ontologies features that suit not only the
systems that are to be included in the architecture but also future
ones. In other words, the extracted features should fully
characterize the input ontologies regardless of which representation
language is used.
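As an illustration of this requirement, extracted features can be held in a language-neutral record from which each matcher takes only the view it needs. The record fields and view functions below are assumptions chosen for the example, not a specification of the CROSI feature set.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, List, Tuple


@dataclass(frozen=True)
class EntityFeatures:
    """Hypothetical language-neutral description of one ontology entity."""
    uri: str
    label: str = ""
    parents: FrozenSet[str] = frozenset()  # URIs of direct super-classes


def names_view(ontology: List[EntityFeatures]) -> List[str]:
    """Input for matchers that take only class names (URIs)."""
    return [e.uri for e in ontology]


def taxonomy_view(ontology: List[EntityFeatures]) -> Dict[str, Tuple[str, ...]]:
    """Input for matchers that take the whole taxonomy."""
    return {e.uri: tuple(sorted(e.parents)) for e in ontology}


ontology = [
    EntityFeatures("ex:Car", "Car", frozenset({"ex:Vehicle"})),
    EntityFeatures("ex:Vehicle", "Vehicle"),
]
names = names_view(ontology)
taxonomy = taxonomy_view(ontology)
```

A matcher added later that needs a different characteristic would get a new view function over the same records, leaving existing matchers untouched.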
Equally difficult to build are methods to process and aggregate
results from different mapping systems (also referred to as
external matchers). An unbiased approach is to run the component
matchers in parallel, each producing its own results. The output,
which may be heterogeneous, is then normalized and unified to
facilitate accumulation and aggregation with numeric and non-numeric
methods.
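A minimal sketch of such normalization and aggregation follows. The scores assigned to relation labels, the choice of 1.0 for correspondences without a confidence value, and the two aggregation functions (a weighted average as a numeric method, a majority vote as a non-numeric one) are assumptions for illustration only.

```python
from typing import Dict, List, Tuple, Union

Pair = Tuple[str, str]
RawValue = Union[float, str, None]  # confidence, relation label, or none given

# Hypothetical scores for abstract relationships (an assumption).
RELATION_SCORE = {
    "equivalent": 1.0,
    "more general than": 0.5,
    "more specific than": 0.5,
}


def normalize(raw: Dict[Pair, RawValue]) -> Dict[Pair, float]:
    """Map heterogeneous matcher output onto similarities in [0, 1]."""
    out: Dict[Pair, float] = {}
    for pair, value in raw.items():
        if value is None:              # correspondence without confidence
            out[pair] = 1.0
        elif isinstance(value, str):   # abstract relationship
            out[pair] = RELATION_SCORE.get(value, 0.0)
        else:                          # numeric confidence, clamped
            out[pair] = min(1.0, max(0.0, float(value)))
    return out


def weighted_average(results: List[Dict[Pair, float]],
                     weights: List[float]) -> Dict[Pair, float]:
    """Numeric aggregation; pairs a matcher did not report count as 0."""
    pairs = set().union(*results)
    total = sum(weights)
    return {p: sum(w * r.get(p, 0.0) for r, w in zip(results, weights)) / total
            for p in pairs}


def majority_vote(results: List[Dict[Pair, float]],
                  threshold: float = 0.5) -> Dict[Pair, bool]:
    """Non-numeric aggregation: accept a pair if most matchers accept it."""
    pairs = set().union(*results)
    return {p: 2 * sum(r.get(p, 0.0) >= threshold for r in results) > len(results)
            for p in pairs}


a = normalize({("Car", "Auto"): 0.8})
b = normalize({("Car", "Auto"): None})
c = normalize({("Car", "Auto"): "more general than", ("Car", "Bike"): "unknown"})
combined = weighted_average([a, b, c], [1.0, 1.0, 2.0])
votes = majority_vote([a, b, c])
```

Normalizing before aggregating keeps the aggregators independent of any single matcher's output format.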
Bibliography
- [1] A. Doan, J. Madhavan, P. Domingos, and A. Halevy.
Learning to map between ontologies on the semantic web.
In Proceedings of the 11th International World Wide Web
Conference (WWW 2002), Hawaii, USA, May 2002.
- [2] M. Ehrig and S. Staab.
QOM - quick ontology mapping.
In Proceedings of the 3rd International Semantic Web Conference
(ISWC'04), LNCS 3298, Hiroshima, Japan, pages 683-697, Nov. 2004.
- [3] H.-H. Do and E. Rahm.
COMA: a system for flexible combination of schema matching
approaches.
In Proceedings of the 28th International Conference on Very
Large Databases (VLDB'02), Hong Kong, China, Aug. 2002.
This material was prepared under the CROSI project. Copyright remains with the authors. Parts or the whole of this text have been published in conferences, workshops and other knowledge dissemination events.
CROSI presents this information online solely for the sake of information dissemination.
This material should not be copy-pasted without acknowledging its origins.
Please contact the authors for information on how to use or reference this material.