What's the Problem?
Knowledge engineers often build
ontologies from the information given in an appropriate text.
That is, the knowledge engineer needs to extract pertinent concepts,
attributes, values and relations from the text and organize them
into a model (i.e. an ontology). We have noted two significant problems
with existing tools:
- A single text may not contain all the entities needed to produce
a complete model; hence it is vital to allow the modeler to
explore several texts or to add
entities from his or her own experience.
- Most systems only allow users to build a single
model/ontology, whereas a modeler often wishes to develop a family
of models and only later select the most plausible one.
Towards a Solution
- NMARKUP, using the GATE package, processes a text file and
highlights all the nouns in the passage.
- The knowledge engineer then uses NMARKUP's ontology-building
subsystem to select concepts, attributes, values and relationships,
and to indicate how they are related.
- The knowledge engineer can, if he or she wishes, add entities which
he or she thinks are necessary but which are not found in the text.
- NMARKUP allows the knowledge engineer to create a family of
ontologies.
- These ontologies can be displayed graphically or as RDF; further
they can be saved for development in subsequent sessions.
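To illustrate the last two steps above, the following Python sketch shows one way a family of ontologies might be held in memory and written out as RDF triples. It is not NMARKUP's implementation: the class names `Ontology` and `OntologyFamily`, the `example.org` namespace and the sample entities are all hypothetical, and every entity is written as an IRI for simplicity.

```python
from dataclasses import dataclass, field


@dataclass
class Ontology:
    """One candidate model: a named set of (subject, predicate, object) triples."""
    name: str
    triples: set = field(default_factory=set)

    def add(self, subject, predicate, obj):
        self.triples.add((subject, predicate, obj))

    def to_ntriples(self, base="http://example.org/nmarkup#"):
        # Hypothetical namespace; names are assumed to be IRI-safe (no spaces).
        lines = [
            f"<{base}{s}> <{base}{p}> <{base}{o}> ."
            for s, p, o in sorted(self.triples)
        ]
        return "\n".join(lines)


@dataclass
class OntologyFamily:
    """A family of alternative ontologies built from the same text."""
    members: dict = field(default_factory=dict)

    def new_variant(self, name, based_on=None):
        onto = Ontology(name)
        if based_on is not None:
            # Copy the triples so the variants can diverge independently.
            onto.triples = set(self.members[based_on].triples)
        self.members[name] = onto
        return onto


family = OntologyFamily()
model_a = family.new_variant("model_a")
model_a.add("Pump", "hasAttribute", "Pressure")
model_a.add("Pressure", "hasValue", "High")

# A second candidate model that extends the first without changing it.
model_b = family.new_variant("model_b", based_on="model_a")
model_b.add("Engine", "drives", "Pump")

print(model_b.to_ntriples())
```

Copying the triples when a variant is created keeps the alternatives independent, so the modeler can compare them and only later select the most plausible one.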
Below we give a diagram of the NMARKUP system:
The system architecture of NMarkup
- GATE produces part-of-speech (POS) tags for the words in the text
- NMARKUP comprises 2 sub-modules, namely, NounsMarkup and
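The noun-marking step can be sketched in a few lines of Python. This is a minimal illustration only: it assumes the tagger's output has already been converted to `(word, tag)` pairs with Penn Treebank-style noun tags, and the function name `mark_nouns` and the bracket markers are hypothetical rather than taken from NMARKUP or GATE.

```python
# Penn Treebank-style noun tags; an assumption about the tagset in use.
NOUN_TAGS = {"NN", "NNS", "NNP", "NNPS"}


def mark_nouns(tagged_tokens, open_mark="[", close_mark="]"):
    """Return the text with every noun token wrapped in the given markers."""
    out = []
    for word, tag in tagged_tokens:
        if tag in NOUN_TAGS:
            out.append(f"{open_mark}{word}{close_mark}")
        else:
            out.append(word)
    return " ".join(out)


# Hand-tagged sample sentence standing in for real tagger output.
tagged = [
    ("The", "DT"), ("engine", "NN"), ("drives", "VBZ"),
    ("two", "CD"), ("pumps", "NNS"),
]
highlighted = mark_nouns(tagged)
print(highlighted)  # The [engine] drives two [pumps]
```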
We have plans to develop NMARKUP to:
- Add a mode which allows the modeler to decide to handle all
nouns or only unusual nouns (the latter are more likely to be
- Once a word/concept has been extracted from a sentence, bring to
the modeler's attention other unusual words in that same line and
- A variant of the above which allows the user, having found an
attribute in a line, to search that line and adjacent lines for
adjectives (i.e. potential values)
- Report verbs which might be functioning in the passage as
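One of the planned features above, searching a line and its neighbours for adjectives as potential values, could work roughly as in the following Python sketch. It describes a possible approach, not existing NMARKUP behaviour: the function name `candidate_values`, the `window` parameter and the Penn Treebank-style adjective tags are assumptions made for illustration.

```python
# Penn Treebank-style adjective tags; an assumption about the tagset in use.
ADJECTIVE_TAGS = {"JJ", "JJR", "JJS"}


def candidate_values(tagged_lines, line_index, window=1):
    """Collect adjectives from the given line and `window` lines on either side.

    `tagged_lines` is a list of lines, each a list of (word, tag) pairs.
    Adjectives are returned in reading order, without duplicates.
    """
    start = max(0, line_index - window)
    end = min(len(tagged_lines), line_index + window + 1)
    found = []
    for i in range(start, end):
        for word, tag in tagged_lines[i]:
            if tag in ADJECTIVE_TAGS and word not in found:
                found.append(word)
    return found


# Hand-tagged sample lines standing in for real tagger output.
tagged_lines = [
    [("The", "DT"), ("old", "JJ"), ("pump", "NN")],
    [("Its", "PRP$"), ("pressure", "NN"), ("is", "VBZ"), ("high", "JJ")],
    [("A", "DT"), ("noisy", "JJ"), ("valve", "NN")],
    [("The", "DT"), ("red", "JJ"), ("pipe", "NN")],
]

# The attribute "pressure" was found in line 1; look around it for values.
values = candidate_values(tagged_lines, line_index=1)
print(values)  # ['old', 'high', 'noisy']
```

A small window keeps the suggestions close to the attribute; the modeler would still decide which of the candidates are genuine values.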
Try a Demonstration
A video which demonstrates the use of NMARKUP is available (VIDEO)
Gang Lei (2003) A Web-Based Tool for Developing Ontologies from
Texts. Master of Science (by Research), Computing Science,
University of Aberdeen. (This is available on-line on request to Derek
Other relevant documents:
H Cunningham, D Maynard, K Bontcheva, V Tablan, C Ursu and M
Dimitrov (2003) Developing Language Processing Components with
GATE (a User Guide), for GATE version 2.1 (February 2003)
Also available in DOAP RDF (Description Of A Project)