Digital library evaluation (WP7)

Cluster objectives

Digital libraries need to be evaluated both as systems and as services to determine how useful, usable, and economical they are, and whether they achieve reasonable cost-benefit ratios. The results of evaluation studies can provide strategic guidance for the design and deployment of future systems, and can help determine whether digital libraries address the appropriate social, cultural, and economic problems and are as maintainable as possible. Consistent evaluation methods will also enable comparison between systems and services. The evaluation cluster will work both on evaluation methodologies in general and on providing the infrastructure for specific evaluations. Thus, the following objectives are addressed:

- Development of a comprehensive theoretical framework for DL evaluation, which can serve as a reference point for evaluation studies in the DL area.
- Support for research on new methodologies, in order to overcome the lack of appropriate evaluation approaches and methods.
- Development of corresponding toolkits and test-beds, in order to enable new evaluations and to ease the application of standard evaluation methods.
Cluster activities

In order to reach these goals, the following activities will be carried out:

- Workshops on DL evaluation, for collecting existing evaluation approaches and methods.
- Evaluation support to the DL community, by creating an evaluation forum that enables communication between evaluation specialists and DL developers.
- Development of new evaluation approaches and methods, in order to overcome the weaknesses of current approaches and the lack of methods for new types of applications.
- Development of evaluation toolkits, e.g. for collecting and analysing experimental data.
- Creation of test-beds for new content and usage types in DLs, starting from the existing test-beds for XML and cross-lingual retrieval and extending them towards new media, applications, and usage types.
- Creation of test-beds for usage-oriented evaluation, by extending existing test-beds or by creating test-beds of user interactions.
- CLEF (Cross-Language Evaluation Forum): supports global digital library applications by (i) developing an infrastructure for the testing, tuning, and evaluation of information retrieval systems operating on European languages in both monolingual and cross-language contexts, and (ii) creating test-suites of reusable data which system developers can employ for benchmarking purposes.
- INEX (Initiative for the Evaluation of XML Retrieval): provides an opportunity for participants to evaluate their XML retrieval methods using uniform scoring procedures, and a forum for participating organisations to compare their results.
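To illustrate the kind of uniform scoring procedure such evaluation toolkits provide, the sketch below computes average precision and mean average precision, two standard measures for comparing ranked retrieval runs against relevance judgements. This is a minimal illustration only, assuming simple in-memory data structures; it is not the actual CLEF or INEX toolkit code, and the function and variable names are hypothetical.

```python
# Illustrative sketch of a standard IR scoring routine; assumed
# data structures, not the actual CLEF/INEX evaluation software.

def average_precision(ranked_ids, relevant_ids):
    """Average precision of one ranked result list against the set of
    relevant document identifiers (the relevance judgements for one topic)."""
    relevant = set(relevant_ids)
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, doc_id in enumerate(ranked_ids, start=1):
        if doc_id in relevant:
            hits += 1
            precision_sum += hits / rank  # precision at this rank
    return precision_sum / len(relevant)

def mean_average_precision(runs, qrels):
    """Mean of average precision over all judged topics; 'runs' and 'qrels'
    map topic ids to ranked lists and relevant-id sets, respectively."""
    return sum(average_precision(runs[t], qrels[t]) for t in qrels) / len(qrels)

# Example: documents d1 and d3 are relevant; the run ranks them 1st and 3rd.
ap = average_precision(["d1", "d2", "d3"], {"d1", "d3"})  # (1/1 + 2/3) / 2
```

Averaging such per-topic scores over a shared topic set is what makes results from different participating systems directly comparable.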
Expected results

- Survey of existing evaluation methodology
- A conceptual framework for DL evaluation
- Collection of evaluation approaches and methods
- Collection of evaluation toolkits and test-beds
- New test-beds for new content and usage types and for usage-oriented evaluation
- Cluster report in DELOS Newsletter, Issue 1, April 2004
- Cluster report in DELOS Newsletter, Issue 2, October 2004
- Cluster report in DELOS Newsletter, Issue 3, June 2005

Cluster coordinator

Norbert Fuhr
Universität Duisburg-Essen
Germany