
Lecturer
- About
- Email Address
- thuan.chuah@abdn.ac.uk
- School/Department
- School of Natural and Computing Sciences
Biography
Dr. Edward Chuah is a Lecturer (equivalent to Assistant Professor in the US) in the Computing Science Department, University of Aberdeen. He received his PhD in Computer Science from the University of Warwick, where his supervisor was Dr. Arshad Jhumka. Before his PhD, he worked in Singapore as a research engineer, software engineer and lecturer (1.5 years in software engineering, 8 years in R&D and 2 years in teaching). After his PhD, he continued his research in failure diagnosis and developed new research in network security as a post-doc at Lancaster University, where his advisor was Prof. Neeraj Suri. He also taught at the University of Exeter. His current research focuses on failure diagnosis in High-Performance Computing (HPC) systems and on identifying attacks in large networks. He has been working on failure diagnosis since 2010. His work involves processing large volumes of real data to generate new insights into a distributed system and to improve its reliability and security.
Research topics: Large-scale systems dependability, Network security (attack identification), HPC reliability (failure diagnosis, failure prediction, error propagation and error detection), Data analysis.
Prospective research students: If you are interested in studying for a PhD in one of the aforementioned research topics, then email your CV, academic transcripts and proposal to Dr. Chuah at thuan.chuah@abdn.ac.uk for an informal discussion.
Service to the community:
2022: Reviewer for the Latin American Journal of Computing, ACM Computing Surveys.
2021: PC member for the 2nd International Conference on Information and Software Technologies (ICI2ST).
2020: Reviewer for the 2nd International Conference on Machine Learning and Intelligent Systems (MLIS).
2019: Reviewer for IEEE Access.
2018: Reviewer for ACM Computing Surveys, Software: Practice and Experience.
Qualifications
- PhD Computer Science, 2020 - The University of Warwick
- MSc Distributed Systems Engineering, 2004 - Lancaster University
- BSc (Hons) Computer Science, 2003 - The University of Leicester
Prizes and Awards
- J. Tinsley Oden Faculty Fellowship, The Oden Institute for Computational Engineering and Sciences, University of Texas at Austin, USA. September 2011. The Fellowship was to collaborate with the late Prof. Emeritus James C. Browne on research in HPC health monitoring and fault management.
- The Alan Turing Institute Doctoral Studentship, UK. September 2016 to March 2020.
- Research
Research Overview
Research interests:
Edward's main research interest is in large-scale systems dependability. His initial research focused on reliability [1-5], one of the attributes of dependability. Currently, his focus is on the security aspect of dependability, where he investigates security in large networks [6]. He also has a general interest in machine learning, anomaly detection, causal inference and software security.
Expertise:
Edward's expertise is in system failure diagnostics and data analysis. He is the main developer of FDiag, a system log-based failure diagnostics toolkit [5]. FDiag has been used by HPC systems administrators at the Texas Advanced Computing Center to uncover previously unknown causes of compute node soft lockups. He has also developed several other system log-based diagnostics tools. ANCOR is a novel anomaly-correlation approach that links system resource usage anomalies with system failures [4]. CORRMEXT is a new workflow that identifies patterns of error propagation on large supercomputer systems [2]. FDiag, CORRMEXT and the other tools are available on GitHub.
System log-based diagnostics tools
Selected publications:
[1] E. Chuah, A. Jhumka, S. Alt, R.T. Evans, N. Suri, Failure Diagnosis for Cluster Systems Using Partial Correlations, in Proceedings of IEEE International Symposium on Parallel and Distributed Processing with Applications (ISPA), 2021.
[2] E. Chuah, A. Jhumka, S. Alt, D.B.-Thomert, J.C. Browne, M. Parashar, Towards Comprehensive Dependability-Driven Resource Use and Message Log-Analysis for Cluster Systems Diagnosis, Journal of Parallel and Distributed Computing, 2019.
[3] E. Chuah, A. Jhumka, J.C. Browne, N. Gurumdimma, S. Narasimhamurthy, B. Barth, Using Message Logs and Resource Use Data for Cluster Failure Diagnosis, in Proceedings of IEEE International Conference on High-Performance Computing, Data and Analytics (HiPC), 2016.
[4] E. Chuah, A. Jhumka, S. Narasimhamurthy, J. Hammond, J.C. Browne, B. Barth, Linking Resource Usage Anomalies with System Failures from Cluster Log Data, in Proceedings of IEEE International Symposium on Reliable Distributed Systems (SRDS), 2013.
[5] E. Chuah, S. Kuo, P. Hiew, W.C. Tjhi, G. Lee, J. Hammond, M.T. Michalewicz, T. Hung, J.C. Browne, Diagnosing the Root-Causes of Failures from Cluster Log-Files, in Proceedings of IEEE International Conference on High-Performance Computing (HiPC), 2010.
[6] E. Chuah, N. Suri, A. Jhumka, S. Alt, Challenges in Identifying Network Attacks Using NetFlow Data, in Proceedings of IEEE International Symposium on Network Computing and Applications (NCA), 2021.
Research Areas
Accepting PhDs
I am currently accepting PhDs in Computing Science.
Please get in touch if you would like to discuss your research ideas further.
Computing Science
Research Specialisms
- Computer Science
Our research specialisms are based on the Higher Education Classification of Subjects (HECoS) which is HESA open data, published under the Creative Commons Attribution 4.0 International licence.
- Teaching
Teaching Responsibilities
Edward has taught a range of courses to undergraduate and postgraduate students, including Learning from Data, Software Engineering and Algorithm Analysis.
- Publications
An Empirical Study of Major Page Faults for Failure Diagnosis in Cluster Systems
Journal of Supercomputing
Contributions to Journals: Articles
- [ONLINE] DOI: https://doi.org/10.1007/s11227-023-05366-1
A Survey of Log-Correlation Tools for Failure Diagnosis and Prediction in Cluster Systems
IEEE Access, vol. 10, pp. 133487-133503
Contributions to Journals: Articles
Challenges in Identifying Network Attacks Using Netflow Data
Chapters in Books, Reports and Conference Proceedings: Conference Proceedings
Failure Diagnosis for Cluster Systems using Partial Correlations
Chapters in Books, Reports and Conference Proceedings: Conference Proceedings
Sentiment Analysis based Error Detection for Large-Scale Systems
Chapters in Books, Reports and Conference Proceedings: Conference Proceedings
Using Resource Use Data and System Logs for HPC System Error Propagation and Recovery Diagnosis
Chapters in Books, Reports and Conference Proceedings: Conference Proceedings
Towards comprehensive dependability-driven resource use and message log-analysis for HPC systems diagnosis
Journal of Parallel and Distributed Computing (JPDC), vol. 132, pp. 95-112
Contributions to Journals: Articles
- [ONLINE] DOI: https://doi.org/10.1016/j.jpdc.2019.05.013
Enabling Dependability-Driven Resource Use and Message Log-Analysis for Cluster System Diagnosis
Chapters in Books, Reports and Conference Proceedings: Conference Proceedings
- [ONLINE] DOI: https://doi.org/10.1109/hipc.2017.00044
Using Message Logs and Resource Use Data for Cluster Failure Diagnosis
Chapters in Books, Reports and Conference Proceedings: Conference Proceedings
- [ONLINE] DOI: https://doi.org/10.1109/hipc.2016.035
CRUDE: Combining Resource Usage Data and Error Logs for Accurate Error Detection in Large-Scale Distributed Systems
Chapters in Books, Reports and Conference Proceedings: Conference Proceedings
- [ONLINE] DOI: https://doi.org/10.1109/srds.2016.017