CS5063: EVALUATION OF AI SYSTEMS (2018-2019)

Last modified: 22 May 2019 17:07

Course Overview

How do we assess whether an AI system works and is effective? Indeed, what does it mean for an AI system to be effective? In this course, we will look at different ways of evaluating AI systems, including performance on benchmark data sets, usefulness in helping users achieve a task, and subjective opinions (i.e., whether people like the system). Much of the course is devoted to statistics (including the R programming language), experimental design, and ethical issues. In practical and assessment work, students will evaluate deployed AI systems and critique evaluations in published AI research papers.

Course Details

Study Type	Postgraduate	Level	5
Term	First Term	Credit Points	15 credits (7.5 ECTS credits)
Campus	Old Aberdeen	Sustained Study	No
Co-ordinators	Professor Ehud Reiter, Dr Nigel Beacham

What courses & programmes must have been taken before this course?

Either Any Postgraduate Programme (Studied) or Master of Engineering in Computing Science

What other courses must be taken with this course?

None.

What courses cannot be taken with this course?

None.

Are there a limited number of places available?

Course Description

The course will cover concepts, methods, techniques and tools/technologies for evaluating AI systems. Students will be equipped with knowledge of statistical analysis (e.g., variance, correlations and regression) and learn to use software/tools for statistical analysis. The course will introduce criteria for the evaluation of AI systems (e.g., usability, accessibility and learnability), and the theoretical evaluation of AI systems (e.g., guarantees regarding correctness, completeness, complexity, admissibility of heuristics, and so on). The course will provide a comprehensive exposition of issues pertaining to the empirical evaluation of AI systems, including the design of experiments (to address specific criteria/issues), human-driven experiments (including the design of forms and questionnaires, interviews, “talk-aloud” experiments, logging/filming, etc.), systems with optimal behaviours vs. (sub-optimal) human-like behaviour, crowd-sourcing of experiments (including Amazon’s “Mechanical Turk” and others), evaluation through gaming, and other related topics.

Contact Teaching Time

Information on contact teaching time is available from the course guide.

Teaching Breakdown

Details, including assessments, may be subject to change until 31 August 2025 for 1st Term courses and 19 December 2025 for 2nd Term courses.

Summative Assessments

Group report (50%); Individual report (50%).

Resit: Where a student fails the course overall, they will be given the opportunity to resit those parts of the course that they failed (pass marks will be carried forward).

Formative Assessment

There are no assessments for this course.

Feedback

Formative feedback for in-course assessments will be provided in written form. Additionally, formative feedback on performance will be provided informally during practical sessions.

Course Learning Outcomes

None.
