
Besides resources developed in-house at CGEBM, our bioinformaticians make use of resources located at the University of Aberdeen and elsewhere online. Major resources available to the university community are categorised below.
University Services
- Hardware
Information Technology Services at the University support research computing and run a dedicated high performance compute cluster (HPCC), Maxwell. Maxwell is designed to meet both the bioinformatics needs of genomics within CGEBM and other high performance and high throughput compute needs in diverse disciplines across the University and with the University's external partners. CGEBM works closely with IT Services to ensure that the informatics needs of genomics projects are met.
The Maxwell HPCC is available for use by researchers at the University of Aberdeen or at external institutions or companies. The principal objectives of the research computing services are to encourage the effective and innovative use of IT as a research tool; to provide more proactive and in-depth IT support to researchers; to facilitate the use and re-use of IT resources; and to free up researcher time currently absorbed in IT administration to focus on research outputs.
For more information on Maxwell, see the Research HPC page or browse under "Digital Research" on the IT Services website. Please contact Digital Research if you have specific questions about the Maxwell HPCC or wish to register for an account.
- Software
Many bioinformatics software tools are freely available. Maxwell, the University of Aberdeen research HPCC, has many of the essential tools pre-installed and ready to use for common applications such as genome assembly and annotation, differential gene expression analysis, sequence alignment and variant calling.
In addition, the University provides a Galaxy server for analysis of data using a graphical interface, allowing users to perform bioinformatics workflows without requiring advanced command-line expertise.
If you require advice on selecting or using software, or would like to know what tools are already in use, please contact us or come along to our free bioinformatics advice clinic in the IMS atrium, Foresterhill, on the second Tuesday of each month. CGEBM is happy to provide guidance to help ensure appropriate and effective use of bioinformatics resources.
Data Resources and Management
- Online Biological and Medical Data Resources
Genome-Related
The following online resources provide tools including genome browsers, genome data, and more.
- European Bioinformatics Institute (EBI) - databases, tools, courses and more
- NCBI - more databases, tools and resources
- IGV - a genome browser which can be downloaded and used to view your own data
- Ensembl - a genome browser and annotation resource for human, mouse and other vertebrate genomes
- Ensembl Genomes - a landing site with links to Ensembl sites for plant, metazoan, protist, fungal, bacterial and SARS-CoV-2 genomes
- UCSC Genome Browser - warehouse of genomes for a large number of species
Functional Genomic and Genotype/Phenotype-Related
The following online resources aim to link variants in the respective genomes to expression or other measurable phenotypes.
- GTEx - a primarily human map of genetic variation linked to gene expression across tissues, defining expression quantitative trait loci (eQTLs). These data are useful for interpreting regulatory variants in health and disease. The resource is based on ~20k samples from ~1000 individuals and 54 non-disease tissues.
- eQTL Catalogue - a resource of human expression and splicing QTLs based on uniform reprocessing of relevant public datasets across tissues and cell types.
- GeneNetwork - a searchable, systems-genetics warehouse of QTLs and regulatory networks underpinning complex traits. Species represented include human, model and non-model animals and plants.
- Ensembl Variant Effect Predictor - predicts the effects of variants based on a number of scores. It is available for most genomes known to Ensembl.
- GWAS Central - a database of summary-level results from association studies.
- dbGaP - the Database of Genotype and Phenotype stores results of studies that seek to link genotype to phenotype in humans.
- ArrayExpress - stores data from functional genomics experiments with required metadata. Links to downloadable raw data are provided where relevant.
- ClinPGx - a comprehensive pharmacogenomics resource relating genetic variation in humans to drug responses.
Human Molecular Atlases
These resources represent efforts to understand human cell types in health and disease.
- HCA - the Human Cell Atlas, a program to map all human cell types in three dimensions.
- HuBMAP - Human BioMolecular Atlas Program, an open-science program mapping healthy cells in the human body.
- HPA - the Human Protein Atlas, a resource describing human proteins in cells and tissues.
- Allen Institute Brain Map - atlases and tools for studying human brains including expression, connectivity, behaviour and more. Resources for mouse are available as well.
- Developmental Atlases - warehousing data on chromatin accessibility and gene expression for humans. This resource addresses a number of model organisms as well.
- Tabula Sapiens - a human single-cell transcriptome reference database.
Clinical Variant Related
The following resources warehouse clinical variant data:
- ClinVar - reports on human genetic variants and their relationship to disease. Provided by NCBI.
- COSMIC - (Catalogue Of Somatic Mutations In Cancer). Provided by the Sanger Institute.
- TCGA - (The Cancer Genome Atlas) a cancer genomics resource that warehouses mutation, expression and other data for a large number of cancer types.
- gnomAD - (Genome Aggregation Database) warehouses population frequency data for human genetic variants. Focused on identifying rare disease-causing variants in diverse populations. Based on more than 800k samples, including data from the Exome Aggregation Consortium (ExAC), Human Genome Diversity Project (HGDP) and the 1000 Genomes Project (1kGP). Provided by the Broad Institute.
- DECIPHER - a database associated with the Deciphering Developmental Disorders consortium, which seeks to find genomic variants associated with altered childhood development.
- IGSR - the International Genome Sample Resource shares data made available by the 1000 Genomes Project and adds in new datasets. It informs population genetic and disease variant studies.
- OMIM - a literature compendium of human variants in Mendelian disorders.
- Managing Your Data
Are you storing tens of gigabytes of data on a portable hard disk?
Contact Digital Research about storing your data on the university's state-of-the-art High Performance Compute Cluster, Maxwell.
Do you have several Excel spreadsheets containing summary data? Do you need to collect results from other universities? Do you need help setting up a survey, or do you need data linked or cleaned?
Contact the Data Management Team. Located in Rooms 0:039a & 0:039b of the Polwarth Building, the Data Management Team provides a wide range of services, including database design (in Microsoft Access or SQL Server), web applications, patient matching on the NHS Grampian Community Health Index (CHI) or the Patient Monitoring System (PMS), random population surveys, case-control studies, record linkage, questionnaire design, data entry, data cleaning, data manipulation and data analysis. The team also has extensive experience in dealing with data in many different formats and can assist in cleaning it and transferring it into a more manageable format.
These services are available to all researchers, both NHS and University, with costs being met either through UKCRC funding for eligible projects or via research grants with allocated funding for data management. If a project is not eligible for UKCRC funding, it is essential that any research grant application includes provision for data management costs.
Do you have routinely-captured sensitive clinical data and need secure storage for it?
Contact the Safe Haven Team about storing your data in the Grampian Data Safe Haven.
Bioinformatics Training
- Bioinformatics Training
The following University and online resources, along with many others, offer training in bioinformatics:
- CGEBM runs a range of internal bioinformatics training workshops. Please see full details on our Training page.
- The Galaxy project provides analysis training tutorials for scientists using its web tool interface. CGEBM also provides training for various Galaxy workflows as part of our bioinformatics training portfolio.
- Genomic Data Science courses are available from Coursera. Six courses cover bioinformatics methods essential for the analysis and interpretation of next-generation sequencing data. Note that Coursera courses are mostly fee-based.