Armadillo from The University of Sheffield
What's the Problem?
The information overload we experience from the Internet is partly due to vast quantities of redundant information. Redundancy is apparent in the presence of multiple citations of the same facts in superficially different formats.

Towards a Solution
This redundancy can be exploited to bootstrap the annotation process needed for Information Extraction, thus enabling the production of machine-readable content for the Semantic Web. For example, once a system knows the name of one author, it can use resources present on the Internet to identify a number of other author names, instead of relying on rule-based or statistical applications or hand-built gazetteers. By combining a multiplicity of information sources, internal and external to the system, texts can be annotated with a high degree of accuracy with minimal or no manual intervention.

Armadillo draws on several kinds of evidence: similarity (see the SimMetric project), source reliability, and Information Extraction capture certainty. Using these multiple strategies, Armadillo connects findings across the corpus. In so doing, it models the relevant domain and builds an RDF ontology and a knowledge base.

Further reading
Fabio Ciravegna, Sam Chapman, Alexiei Dingli and Yorick Wilks, Learning to Harvest Information for the Semantic Web, in Proceedings of the 1st European Semantic Web Symposium, Heraklion, Greece, May 10-12, 2004.
Sam Chapman, Barry Norton and Fabio Ciravegna, Armadillo: Integrating Knowledge for the Semantic Web, Dagstuhl workshop on Learning for the Semantic Web, 3-18 February 2005, Dagstuhl, Germany.

Semantic representation
Also available in DOAP (Description Of A Project) RDF.
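
To make the evidence combination described under Towards a Solution more concrete, here is a minimal Python sketch. It is not Armadillo's actual implementation. The `Candidate` record, the `similarity` and `combined_score` helpers, the 0.8 threshold and the noisy-OR combination rule are all assumptions chosen for illustration, and `difflib.SequenceMatcher` merely stands in for a dedicated similarity measure. The sketch shows how redundant mentions of the same name from different sources might reinforce one another.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher
from typing import Iterable


@dataclass
class Candidate:
    """One extracted mention (hypothetical record, for illustration only)."""
    text: str                     # extracted string, e.g. an author name
    source_reliability: float     # assumed trust in the source, in [0, 1]
    extraction_confidence: float  # assumed certainty of the capture, in [0, 1]


def similarity(a: str, b: str) -> float:
    """Case-insensitive string similarity in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def combined_score(target: str, mentions: Iterable[Candidate],
                   threshold: float = 0.8) -> float:
    """Combine evidence from all mentions similar to `target` (noisy-OR).

    Each sufficiently similar mention contributes
    source_reliability * extraction_confidence as independent evidence,
    so repeated mentions across sources raise the overall score.
    """
    disbelief = 1.0
    for mention in mentions:
        if similarity(target, mention.text) >= threshold:
            evidence = mention.source_reliability * mention.extraction_confidence
            disbelief *= 1.0 - evidence
    return 1.0 - disbelief


if __name__ == "__main__":
    # Two sources mention the same author in different formats; one name
    # appears only once. All values are made up for the demonstration.
    mentions = [
        Candidate("Fabio Ciravegna", 0.9, 0.8),
        Candidate("fabio ciravegna", 0.5, 0.6),
        Candidate("Yorick Wilks", 0.5, 0.6),
    ]
    for name in ("Fabio Ciravegna", "Yorick Wilks"):
        print(name, round(combined_score(name, mentions), 3))
```

Under these assumptions, a name confirmed by two sources scores higher than a name seen once with the same per-mention evidence, which mirrors the idea of exploiting redundancy. A real system would need a more careful similarity measure and calibrated reliability estimates.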