NMARKUP fact-file
What's the Problem?
Knowledge engineers often build
ontologies based on the information given in an appropriate text.
That is, the knowledge engineer needs to extract pertinent concepts,
attributes, values and relations from the text and organize them
into a model (i.e. an ontology). We have noted two significant
problems with existing tools:
- A single text may not contain all the entities needed to produce
a complete model; hence it is vital to allow the modeller to
explore several texts, or to allow the knowledge engineer to add
entities from his/her own experience.
- Most systems only allow users to build a single
model/ontology, whereas a modeller often wishes to develop a family
of models and only later select the most plausible one.
Towards a Solution
- NMARKUP, using the GATE package, processes a text file and
highlights all the nouns in the passage.
- The knowledge engineer then uses NMARKUP's ontology-building
subsystem to select concepts, attributes, values and relationships,
and to indicate how they are related.
- The knowledge engineer can, if he/she wishes, add entities which
he/she thinks are necessary but which are not found in the
text.
- NMARKUP allows the knowledge engineer to create a family of
ontologies.
- These ontologies can be displayed graphically or as RDF;
furthermore, they can be saved for development in subsequent sessions.
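The idea of a family of ontologies built from selected entities can be sketched as follows. This is a minimal illustration only, not NMARKUP's actual implementation: the `Ontology` class, the `ex:` prefix, the example entities and the simplified Turtle-like output are all assumptions made for this sketch.

```python
import copy
from dataclasses import dataclass, field


@dataclass
class Ontology:
    """Hypothetical container: one candidate model held as a set of triples."""
    name: str
    # Each triple is (subject, predicate, object), all plain strings.
    triples: set = field(default_factory=set)

    def add_concept(self, concept):
        self.triples.add((concept, "rdf:type", "ex:Concept"))

    def add_attribute(self, concept, attribute, value):
        self.triples.add((concept, f"ex:{attribute}", f'"{value}"'))

    def add_relation(self, subject, relation, obj):
        self.triples.add((subject, f"ex:{relation}", obj))

    def to_triples_text(self):
        # Simplified Turtle-like rendering, one statement per line.
        return "\n".join(f"{s} {p} {o} ." for s, p, o in sorted(self.triples))


# A "family" is just several named candidate models kept side by side.
family = {}

base = Ontology("draft-a")
base.add_concept("ex:Pump")
base.add_attribute("ex:Pump", "colour", "red")
family[base.name] = base

# Branch a variant; the original draft is left untouched.
variant = copy.deepcopy(base)
variant.name = "draft-b"
variant.add_concept("ex:Valve")  # entity added by the engineer, not from the text
variant.add_relation("ex:Pump", "drives", "ex:Valve")
family[variant.name] = variant

print(family["draft-b"].to_triples_text())
```

Keeping each candidate as an independent copy means the engineer can compare drafts and only later select the most plausible one.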
The technology
Below we give a diagram of the NMARKUP system:

Figure: The system architecture of NMARKUP
Note:
- GATE produces the part-of-speech (POS) tags for the words in the
text.
- NMARKUP comprises two sub-modules, namely NounsMarkup and
OntoServlet.
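To illustrate how POS tags can drive noun highlighting, here is a minimal sketch. It does not call GATE and does not reflect the NounsMarkup code: the input format (word/tag pairs), the Penn Treebank-style noun tags and the bracket markers are assumptions made for this example.

```python
# Assumed noun tags (Penn Treebank style); the tag set NMARKUP receives
# from GATE is not stated in this document.
NOUN_TAGS = {"NN", "NNS", "NNP", "NNPS"}


def highlight_nouns(tagged_tokens, open_mark="[", close_mark="]"):
    """Return the text with every noun wrapped in marker characters."""
    out = []
    for word, tag in tagged_tokens:
        if tag in NOUN_TAGS:
            out.append(f"{open_mark}{word}{close_mark}")
        else:
            out.append(word)
    return " ".join(out)


# Hypothetical tagger output for one sentence.
tagged = [("The", "DT"), ("pump", "NN"), ("drives", "VBZ"),
          ("two", "CD"), ("valves", "NNS")]

print(highlight_nouns(tagged))  # The [pump] drives two [valves]
```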
We plan to develop NMARKUP to:
- Add a mode which allows the modeller to choose between handling
all nouns or only unusual nouns (the latter are more likely to be
domain-specific concepts);
- Once a word/concept has been extracted from a sentence, bring to
the modeller's attention other unusual words in that same line and
adjacent lines;
- Provide a variant of the above which allows the user, having found
an attribute in a line, to search that line and adjacent lines for
adjectives (i.e. potential values);
- Report verbs which might be functioning in the passage as
relations.
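The planned features above could be approached along the following lines. This is a speculative sketch, not the planned design: the common-word list, the function names, the Penn Treebank-style tag prefixes (`NN`, `JJ`, `VB`) and the sample lines are all assumptions.

```python
# Hypothetical list of everyday nouns that are unlikely to be domain concepts.
COMMON_WORDS = {"thing", "time", "way", "part"}


def unusual_nouns(tagged_tokens, common=COMMON_WORDS):
    """Keep only nouns that are not in the common-word list."""
    return [w for w, t in tagged_tokens
            if t.startswith("NN") and w.lower() not in common]


def nearby_words(tagged_lines, line_index, tag_prefix, window=1):
    """Collect words with a given tag prefix from a line and its neighbours."""
    start = max(0, line_index - window)
    end = min(len(tagged_lines), line_index + window + 1)
    return [w for line in tagged_lines[start:end]
            for w, t in line if t.startswith(tag_prefix)]


# Hypothetical POS-tagged passage, one list per line of text.
lines = [
    [("The", "DT"), ("pump", "NN"), ("is", "VBZ"), ("large", "JJ")],
    [("Its", "PRP$"), ("colour", "NN"), ("is", "VBZ"), ("red", "JJ")],
    [("Each", "DT"), ("part", "NN"), ("drives", "VBZ"), ("a", "DT"), ("valve", "NN")],
    [("The", "DT"), ("valve", "NN"), ("is", "VBZ"), ("small", "JJ")],
]

print(unusual_nouns(lines[2]))        # nouns left after dropping common ones
print(nearby_words(lines, 1, "JJ"))   # candidate values near the attribute "colour"
print(nearby_words(lines, 1, "VB"))   # candidate relations in the same window
```

The `window` parameter controls how many adjacent lines are searched; a value of 1 matches the "same line and adjacent lines" behaviour described above.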
Try a Demonstration
A video which demonstrates the use of NMARKUP is available (VIDEO)
Further Reading
Key document:
Gang Lei (2003) A Web-Based Tool for Developing Ontologies from
Texts. Master of Science (by Research), Computing Science,
University of Aberdeen. (This is available online on request to
Derek Sleeman <dsleeman@csd.abdn.ac.uk>.)
Other relevant documents:
H Cunningham, D Maynard, K Bontcheva, V Tablan, C Ursu and M
Dimitrov (2003) Developing Language Processing Components with
GATE (a User Guide) for GATE version 2.1 (February 2003)
http://www.gate.ac.uk/sale/tao/index.html
Semantic representation
View in the AKT Triplestore Browser or as RDF.
Also available in DOAP RDF (Description Of A Project).