CS AKTiveSpace from The University of Southampton
CS AKTiveSpace fact-file
Winner of the 2003 Semantic Web Challenge

CS AKTive Space (CAS) is an integrated Semantic Web application which provides a way to explore the UK Computer Science Research domain across multiple dimensions for multiple stakeholders, from funding agencies to individual researchers.

What's the Scenario?

The scenario of use that founded CAS is based on a real-world community request: a funding council's desire to get a fast overview of its domain from multiple perspectives. This requires bringing together data from heterogeneous sources and constructing methods to present the possible relations in the data to a user quickly and effectively. We have found that our original scenario applies equally to any stakeholder of the domain. While a funding body may wish to know, for instance, how funding in an area is distributed geographically, a researcher may also wish to know this. Similarly, graduate students may wish to understand who is in the community of practice for the top researchers in their area.

Towards a Solution

CAS provides multiple ways to look at and discover simple information or rich relations within the Computer Science domain of the UK. It facilitates querying, exploring and organizing information in ways that are meaningful to the users: where one user or group may be interested in seeing the relationship between funding, research area and geographical region, another may be interested in who the top researchers are in AI and what their phone numbers are. With CAS, people can formulate and see at a glance rich results like these, without having to string together large, complex queries. Because of its rich, easily manipulable representation of the CS domain, CS AKTive Space supports the exploration of patterns and implications inherent in the domain content. CAS allows all stakeholders in the CS domain, from funding bodies to researchers, to explore their space for associations and opportunities that were previously either unavailable or too cumbersome to attempt to discover.

Visualisation

CAS exploits a variety of visualizations and multidimensional representations that are designed to make content exploration, navigation and appreciation direct and intuitive (schraefel et al., 2003).
As mentioned, the knowledge services supported in the application include investigating communities of practice (Alani et al., 2003) and a researcher's prominence within their field, considered in terms of both their scholarly impact (Kampa, 2002) and their cumulative research grant income.

Inference and Integration

These services in turn rely on extensive inference across heterogeneous resources. For example, the notion of a community of practice is calculated from a given researcher's coauthors, the projects that they are involved with, the institutions with which they are affiliated and the topics in which they conduct research. We aim to provide a content space in which a user can rapidly get a Gestalt of who is doing what and where, which areas of effort are significant in terms of both topic and institutional location, which of this work is having an impact or influencing others, and where the gaps in research coverage are.

Heterogeneity

The application exploits a wide range of semantically heterogeneous and distributed content relating to Computer Science research in the UK. For example, the content covers almost 2,000 research-active Computer Science faculty, 24,000 research projects, many thousands of papers, and hundreds of distinct research groups. These entities are described by a number of existing sources, such as institutional information systems (university web sites, research council databases), bibliographic services and other third-party data sets (geographical gazetteers, UK Research Assessment Exercise submissions).

Live Sources

This content is gathered on a continuous basis using a variety of methods, including harvesting and scraping of publicly available data from institutional web sites (Leonard and Glaser, 2001), bulk translation from existing databases, and direct submissions by partner organizations, as well as other models for content acquisition.
In particular, we support both regularly scheduled harvesting to identify and deal with changes to existing data sources, and on-demand harvesting in response to changing user requirements (Ciravegna et al., 2003) or update notifications from component sources.

Scale

The content is mediated through an OWL ontology (http://www.aktors.org/publications/ontology/) which was constructed for the application domain and which incorporates components from other published ontologies (Niles and Pease, 2001). The content currently comprises around seven million RDF triples, and we have developed scalable storage and retrieval technologies and maintenance methods to support its management (Harris and Gibbins, 2003).

Challenges

This work illustrates a number of substantial challenges for the Semantic Web. There are issues to do with how best to sustain an acquisition and harvesting activity. There are also open questions: how best to model the harvested content; how to cope with the large numbers of duplicate items that are bound to arise and that need to be recognized as referring to the same objects or referents; how far our inferential services can cope as more content becomes available; how to present the content so that inherent patterns and trends can be directly discerned; how trustworthy the provenance and accuracy of the content are; and how all this information is to be maintained and sustained as a social and community exercise.

References

Alani, H., Dasmahapatra, S., Shadbolt, N. and O'Hara, K. (2003) ONTOCOPI: Using Ontology-Based Network Analysis to Identify Communities of Practice. IEEE Intelligent Systems, March/April 2003, 18-25.
Ciravegna, F., Dingli, A., Guthrie, D. and Wilks, Y. (2003) Integrating Information to Bootstrap Information Extraction from Web Sites. In IJCAI03 Workshop on Information Integration, held in conjunction with the International Joint Conference on Artificial Intelligence (IJCAI-03), Acapulco, August 2003.
Harris, S. and Gibbins, N. (2003) 3store: Efficient Bulk RDF Storage. In Proceedings of the First International Workshop on Practical and Scalable Semantic Web Systems (PSSS2003), Sanibel Island, Florida, USA.
Kampa, S. (2002) Who are the experts? E-Scholars in the Semantic Web. PhD Thesis, University of Southampton.
Leonard, T. and Glaser, H. (2001) Large scale acquisition and maintenance from the web without source access. In Handschuh, S., Dieng-Kuntz, R. and Staab, S. (Eds.), Proceedings of Workshop 4, Knowledge Markup and Semantic Annotation, K-CAP 2001, pages 97-101.
Niles, I. and Pease, A. (2001) Towards a Standard Upper Ontology. In Proceedings of the Second International Conference on Formal Ontology in Information Systems (FOIS-2001).
schraefel, m. c., Karam, M. and Zhao, S. (2003) mSpace: interaction design for user-determined, adaptable domain exploration in hypermedia. In Proceedings of AH2003 Workshop on Adaptive Hypermedia and Adaptive Web-Based Systems, Budapest, Hungary.

Semantic representation

View in the AKT Triplestore Browser or as RDF. Also available in DOAP RDF (Description Of A Project).
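
To make the triple-based representation a little more concrete, the following is a minimal sketch of how a community of practice might be estimated from RDF-style triples, using the four kinds of evidence named above (coauthorship, shared projects, shared institutions and shared topics). It is not the ONTOCOPI algorithm or the CAS implementation: the property names (`authorOf`, `worksOn`, `memberOf`, `researches`), the weights, and all of the data are hypothetical and chosen only for illustration.

```python
from collections import defaultdict

# Hypothetical relation weights (assumption): coauthorship counts most,
# shared institution and shared topic count least.
WEIGHTS = {"authorOf": 3.0, "worksOn": 2.0, "memberOf": 1.0, "researches": 1.0}

# Invented (subject, predicate, object) triples standing in for RDF content.
TRIPLES = [
    ("alice", "authorOf", "paper1"),
    ("bob", "authorOf", "paper1"),
    ("alice", "authorOf", "paper2"),
    ("carol", "authorOf", "paper2"),
    ("alice", "worksOn", "projectX"),
    ("bob", "worksOn", "projectX"),
    ("alice", "memberOf", "uniA"),
    ("dave", "memberOf", "uniA"),
    ("alice", "researches", "semantic-web"),
    ("bob", "researches", "semantic-web"),
]


def community_of_practice(person, triples, weights):
    """Rank other people by the weighted number of things they share with `person`."""
    # Index every (predicate, object) pair by the set of subjects that have it.
    linked = defaultdict(set)
    for subject, predicate, obj in triples:
        linked[(predicate, obj)].add(subject)

    # Each shared paper, project, institution or topic adds its weight.
    scores = defaultdict(float)
    for (predicate, _obj), subjects in linked.items():
        if person in subjects:
            for other in subjects - {person}:
                scores[other] += weights.get(predicate, 0.0)

    # Highest score first; ties broken alphabetically for a stable order.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


if __name__ == "__main__":
    for name, score in community_of_practice("alice", TRIPLES, WEIGHTS):
        print(name, score)
```

In this toy data, bob ranks first for alice because they share a paper, a project and a topic, while dave shares only an institution. A real deployment would traverse ontology relations in a triple store rather than an in-memory list.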