
CS5063: EVALUATION OF AI SYSTEMS (2018-2019)

Last modified: 28 Aug 2018 10:55


Course Overview

How do we assess whether an AI system works and is effective? Indeed, what does it mean for an AI system to be effective? In this course, we will look at different ways of evaluating AI systems, including performance on benchmark data sets, usefulness in helping users achieve a task, and subjective opinions (i.e., do people like the system). Much of the course is devoted to statistics (including the R programming language), experimental design, and ethical issues. In practical and assessment work, students will evaluate deployed AI systems and critique evaluations in published AI research papers.

Course Details

Study Type: Postgraduate
Level: 5
Session: First Sub Session
Credit Points: 15 credits (7.5 ECTS credits)
Campus: Old Aberdeen
Sustained Study: No
Co-ordinators
  • Professor Ehud Reiter
  • Dr Nigel Beacham

What courses & programmes must have been taken before this course?

  • Either Any Postgraduate Programme (Studied) or Master of Engineering in Computing Science

What other courses must be taken with this course?

None.

What courses cannot be taken with this course?

None.

Are there a limited number of places available?

No

Course Description

The course will cover concepts, methods, techniques and tools/technologies for evaluating AI systems. Students will be equipped with knowledge of statistical analysis (e.g., variance, correlations and regression) and learn to use software/tools for statistical analysis. The course will introduce criteria for the evaluation of AI systems (e.g., usability, accessibility and learnability), and the theoretical evaluation of AI systems (e.g., guarantees regarding correctness, completeness, complexity, admissibility of heuristics, and so on). The course will provide a comprehensive exposition of issues pertaining to the empirical evaluation of AI systems, including the design of experiments (to address specific criteria/issues), human-driven experiments (including the design of forms and questionnaires, interviews, “talk-aloud” experiments, logging/filming, etc.), systems with optimal behaviours vs. (sub-optimal) human-like behaviour, crowd-sourcing of experiments (including Amazon’s “Mechanical Turk” and others), evaluation through gaming, and other related topics.

Degree Programmes for which this Course is Prescribed

  • Master Of Science In Artificial Intelligence

Contact Teaching Time

Sorry, we don't have that information available.


Assessment

Group report (50%); Individual report (50%).

Resit: where a student fails the course overall, they will be given the opportunity to resit the parts of the course that they failed (pass marks will be carried forward).

 

Formative Assessment

None.

Feedback

Formative feedback for in-course assessments will be provided in written form. Additionally, formative feedback on performance will be provided informally during practical sessions.
