
PX5508: INTRODUCTION TO DATA SCIENCE (2020-2021)

Last modified: 05 Aug 2021 13:04


Course Overview

In this course we study the typical workflow of a data analysis project. We will learn how to access and collect data, how to clean it, and how to organise it in databases so that it is ready for later analysis.

We will then perform descriptive and exploratory data analysis and finally visualise the results and create a report.

Course Details

Study Type: Postgraduate
Level: 5
Session: Second Sub Session
Credit Points: 15 credits (7.5 ECTS credits)
Campus: Aberdeen
Sustained Study: No
Co-ordinators
  • Dr Murilo da Silva Baptista

What courses & programmes must have been taken before this course?

  • Any Postgraduate Programme
  • Either Master Of Science In Health Data Science or Master Of Science In Data Science

What other courses must be taken with this course?

None.

What courses cannot be taken with this course?

Are there a limited number of places available?

No

Course Description

A typical data analysis project consists of several steps that make up a workflow.

In this course we will first discuss how to obtain data. There are many ways to do so, from online repositories, web scraping and API communication to interaction with databases such as MySQL and MongoDB. We will also describe how to take our own measurements and turn them into a form suitable for computation.

The next step is typically to clean the data and to get it into a format that is suitable for subsequent analysis. We will discuss how structured and unstructured data can be used and how we can move data up a hierarchy of data quality levels.
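As an illustration of the cleaning step, the following is a minimal sketch in Python using only the standard library. The sample data, the column names (`patient_id`, `age`, `city`) and the helper `clean_rows` are hypothetical and are not taken from the course material; they only show the kind of tidying meant here: trimming whitespace, normalising case, converting types, marking missing values, and dropping duplicates.

```python
import csv
import io

# Hypothetical raw export: inconsistent spacing and case,
# a missing age, and a duplicated row.
RAW_CSV = """patient_id, Age ,city
1, 34 ,Aberdeen
2,,  dundee
2,,  dundee
3, 51,ABERDEEN
"""


def clean_rows(text):
    """Return tidy dicts: trimmed headers, typed values, no duplicate rows."""
    reader = csv.reader(io.StringIO(text))
    # Normalise the header names once.
    header = [h.strip().lower() for h in next(reader)]
    seen = set()
    cleaned = []
    for raw in reader:
        if not raw:  # skip completely empty lines
            continue
        record = dict(zip(header, (v.strip() for v in raw)))
        row = {
            "patient_id": int(record["patient_id"]),
            # An empty field becomes None rather than a misleading 0.
            "age": int(record["age"]) if record["age"] else None,
            "city": record["city"].title(),
        }
        key = tuple(row.values())
        if key in seen:  # drop exact duplicates
            continue
        seen.add(key)
        cleaned.append(row)
    return cleaned


rows = clean_rows(RAW_CSV)
for row in rows:
    print(row)
```

In practice a library such as pandas would usually do this work; the standard-library version is used here only to keep the sketch self-contained.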

We will then learn how to build simple databases (MySQL and MongoDB) and interact with them.
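To give a flavour of what building and querying a simple database involves, here is a minimal sketch. It uses SQLite through Python's built-in `sqlite3` module as a stand-in, because it needs no server; it is not the MySQL or Mongo setup used in the course. The table name `measurements`, its columns and the sample values are hypothetical.

```python
import sqlite3

# An in-memory SQLite database: a stand-in for a MySQL server (assumption).
conn = sqlite3.connect(":memory:")

# Build a simple table (hypothetical schema).
conn.execute(
    "CREATE TABLE measurements ("
    "id INTEGER PRIMARY KEY, site TEXT NOT NULL, value REAL NOT NULL)"
)

# Insert a few rows with a parameterised query.
conn.executemany(
    "INSERT INTO measurements (site, value) VALUES (?, ?)",
    [("A", 1.5), ("A", 2.5), ("B", 4.0)],
)
conn.commit()

# Interact with the database: mean value per site.
site_means = conn.execute(
    "SELECT site, AVG(value) FROM measurements GROUP BY site ORDER BY site"
).fetchall()
conn.close()

print(site_means)
```

The same SQL statements carry over to MySQL with only minor changes, although the connection code and the placeholder syntax depend on the driver used.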


Contact Teaching Time

Information on contact teaching time is available from the course guide.

Teaching Breakdown



Details, including assessments, may be subject to change until 31 August 2023 for 1st half-session courses and 22 December 2023 for 2nd half-session courses.

Summative Assessments

One group project with a written component (1/3), an oral component (1/3), and a coding component (1/3). Students can opt to be marked individually within a group, in which case the mark combines collective and individual contributions in the proportions 1/2 (report, collective), 1/4 (oral, individual), and 1/4 (code, individual). Students can also opt to work as a group of a single member, in which case the components are weighted 1/3, 1/3, 1/3.

Resit (for students taking the course in AY20/21)

Resit of any failed element for the whole group or for an individual student

Formative Assessment

There are no assessments for this course.

Course Learning Outcomes

Knowledge Level | Thinking Skill | Outcome
Factual | Create | Learn to visualize and present the data together with its corresponding analysis.
Conceptual | Understand | Learn how to build simple databases (MySQL and MongoDB) and interact with them.
Procedural | Understand | Learn introductory concepts to pre-assess the data and identify its main features.
Procedural | Apply | To prepare and organize the data so that its format is appropriate for further analysis.
Reflection | Analyse | To learn the fundamentals of making sense of the data. To explore and analyse data using descriptive and inferential statistics, statistical models, and machine learning methods.
Factual | Understand | Exploring the available data, to understand how to obtain the data or to generate our own.
Conceptual | Analyse | To have a comprehensive overview of the whole data science cycle.
