Reprothon: how reproducible is our code?

09 May 2022

by Dr Maria Kamouyiaros

Reproducibility has always been an important component of good research, and a growing number of journals now request that the full code used for data handling and analysis accompanies manuscripts. But the big question is: how reproducible is our code, really? How long does a script we’ve spent hours working on last? Does code have a "shelf life"?

To start to answer these questions, on the 9th of June 2021 Dr Thomas Cornulier, a research fellow in applied statistics, and the Aberdeen Study Group, a PGR-led group for skill-sharing and coding, hosted an open-access hackathon: Reprothon 2021. The aim was to collect data on the reproducibility of R code available on one of the biggest coding and programming forums, Stack Overflow.

Hundreds of thousands of posts from 2008 to March 2021 went through a filtering and structured sampling process, leaving a set of verified (i.e., working at the time) R code for everyone to access and test. The event featured 23 amazing contributors based in Aberdeen, Edinburgh, Dundee, Reading, Austria and France. We all worked through code used in anything from plotting bar charts to more complex time-series analyses. Together, in 1.5 hours, we tested 134 pieces of code.

Credit to everyone who contributed on the day (not all pictured): Alexandra Jebb, Annette Raffan, Auriel Sumner-Hempel, Aurore Ponchon, Camilla Negri, David Fisher, Eilidh Fummey, Heather Ritchie-Parker, Hongjie Zhao, Katherine August, Laura Mackenzie, Lucy Henshall, Marcela Espinaze, Maria Kamouyiaros, Max Tschol, Rosie Baillie, Sania Wadud, Susan Kenyon, Tamsin Woodman, Thomas Cornulier, Virginia Iorio, Yanlin Liu, Zhibin Wen

Luckily for all of us with a vault of "dusty" R scripts, the majority of the tested code passed, but almost 12% still failed despite having been verified, highlighting that code really doesn’t last forever (it’s not just my bad scripting skills). So, what can we do to help minimise this? Here’s a short list of some good habits to have that I put together from the discussions on the day:

Always note down what versions you’re using
Include reproducible example data (even if it’s just a subset) with example command line output
Call in your values, don’t write them out manually in your code – this makes it transferable between datasets and projects
Annotate your code (for real this time)
Use relative file paths
Refer to your packages for your functions using "::" ("package_name::function")
If you’re generating values, make sure you set a seed (a specified starting value) when you can!
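
Several of the habits above can be combined in one short script. The following is a minimal sketch in R, not code from the Reprothon itself: the data frame `example_data`, its column names and the output file name are all hypothetical and only there for illustration.

```r
# Note down the versions in use: R itself, plus attached packages via sessionInfo()
r_version <- R.version.string
session <- utils::sessionInfo()

# Small reproducible example data (hypothetical values)
example_data <- data.frame(
  group = c("a", "a", "b", "b"),
  value = c(1.5, 2.5, 3.0, 5.0)
)

# Set a seed before generating values so the same numbers come back on every run
set.seed(42)
noise_first <- stats::rnorm(nrow(example_data))

set.seed(42)
noise_second <- stats::rnorm(nrow(example_data))

# Call values in from the data instead of typing them out by hand
n_rows <- nrow(example_data)

# Refer to functions with package_name::function
group_means <- stats::aggregate(value ~ group, data = example_data, FUN = mean)

# Build a relative file path rather than hard-coding an absolute one
output_path <- file.path("output", "group_means.csv")
```

Printing `session` at the end of a script keeps a record of the package versions alongside the results, and `output_path` could then be passed to a function such as `utils::write.csv()` so that the file lands relative to the working directory.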

twitter.com/Kamouyiaraki/status/1402676954743123975

It is impressive how many posts were tested within such a small timeframe, and there’s still more that this data can offer (are there specific packages or analyses that “age” faster?). For me, however, this hackathon was a perfect demonstration of how locally organised, online events make for an open and accessible way to discuss, share ideas and collaborate with people across institutions; people you’d probably never get a chance to meet in person, all working towards a common goal.

If you missed the live event, the Reprothon is still ongoing! All information and data are available on the Aberdeen Study Group website. So, whether you are interested in joining the next live event on the 11th of May 2022 or would rather contribute in your own time and keep the data going with a group of friends, colleagues, relatives and/or pets, it’s all available for you!

twitter.com/abdnStudyGroup/status/1394253442353405955

Published by School of Biological Sciences, University of Aberdeen

Comments

#1
Adam Price said on 10 May 2022 at 08:45
Thanks Maria, that was a very interesting read and nice to hear about this kind of thing being done here
