Showing our working - Transparency for trustworthy science

Article copyright CC BY-SA 4.0 (Richard J. Acton, 2023)

Why show our working?

We all remember being told to show our working in school, usually in maths. Why were we asked to do this? So that someone else could follow our reasoning, step by step, and see for themselves (with a little effort) whether they found it sound, or at least partially so. By showing our working, anyone looking at our results can understand what we think is going on and why.

This is one of the fundamental motivations for open science: sharing our data and methods so that others can assess our conclusions for themselves with all of the same information. Openness is core to the corrective mechanism that drives scientific progress; you can't make progress if you can't spot mistakes, gaps, and misunderstandings. Importantly, we can't spot these if we can't all see each other's working. Ideally, researchers should be in a position where anyone can ask "How do you know what you think you know?" and we can provide a detailed and compelling answer that anyone can challenge and, with some effort, check for themselves.

Trust in our conclusions is, rightly, derived from the transparency and accountability of our processes. This applies both within the scientific community and to our relationship with the public.

As the depth of our understanding of the world has increased, the answer to this question has grown longer. It is harder, as a practical matter, to show all of our working from start to finish. Doing so, however, is no less important now than it has ever been; if anything, the complexity of modern science makes it more important than ever.

Explaining how we reached our conclusions now takes quite a long time. In many cases a great deal of context is needed to understand the questions we now tackle. This can present a significant communication challenge when interacting with the public, and even with specialists in other disciplines. Think about that time someone said "We all know that…" but you didn't, and so what followed didn't make sense. This happens to everyone, and sometimes we're too embarrassed or polite to ask for clarification. As specialisms have evolved, the knowledge and effort needed to assess the strength of conclusions, even in seemingly close fields, have increased; this makes explanatory clarity more important than ever.

One of the factors that makes completeness of description challenging is that it is rarely one person, or even one research group, that is responsible for the full chain of steps that produces a modern research paper, especially a 'prestigious' one. Thus, there is no single person with full insight into the granular details of every experiment and method used in many modern papers. Consequently, ensuring that every detail needed for reproducibility is captured can be a significant coordination challenge among co-authors.

So how are we, the scientific community, doing at this task of showing our working? Unfortunately, not as well as you might hope.

Are we any good at showing our working?

To start with, across disciplines our work is getting harder to read as it becomes more laden with 'science-ese', or general scientific jargon, or so conclude Plavén-Sigray and co-authors in a 2017 eLife paper, "The readability of scientific texts is decreasing over time". This does not help the general accessibility of our work to colleagues, students, science journalists, or the public. Nor does it appear to be driven by specific technical jargon, which is a useful communication shorthand; it's mostly the addition of superfluous polysyllabic obfuscationalisms, presumably intended to seem more erudite and authoritative. If it's unnecessarily hard to understand, it'll be unnecessarily hard to reproduce.

In 2021 a series of papers was published by the Reproducibility Project: Cancer Biology, summarising the results of an eight-year effort to reproduce the findings of 193 experiments from 53 prominent papers in the field of cancer biology. Their results were not particularly reassuring. None of the 193 experiments was described in sufficient detail for the project team to design protocols to repeat them; yes, not one of them could be repeated without additional information from the original researchers. The team was eventually able to repeat 50 of the 193 experiments, after obtaining additional information from the original researchers, 32% of whom did not respond helpfully, or at all, to their inquiries. Between 40% and 80% of these 50 experiments were successfully replicated, depending on how the criteria for successful replication were applied. This is not an isolated result; see the further reading section for some popular science books which expand on this point.

Reproducing biological work is genuinely difficult; lab work requires significant skill that is often hard to fully capture in protocols. Biology also has a lot of inherent variability: even when you go to great lengths to get the same starting conditions, some factors remain outside of our control.

What about analysing your data, though? Reproducing that should be easy, right? It's all on the computer; you can just run the same thing again, right? Alas, it is not as simple as you might think.

It can take several months to reproduce the bioinformatic analysis in a published paper, if it is possible at all, as Garijo and colleagues found when they attempted to measure this in 2013. Significant strides have been made in reproducible analysis since that paper was written, but I've not found a more recent attempt to quantify the problem. Why is this the case? What makes reproducibility hard, and what can we do to make it better?

How can we do better?

As the Data Outputs Manager for the Human Developmental Biology Initiative (HDBI) it's my role to make it easier for scientists to make their data available to others, and easier for them to make their data and experiments reproducible.

So what steps can we take to make it easier for others to replicate our work? A significant step is more precisely, completely, and systematically describing what we do. A word that is commonly associated with this is 'metadata', a slightly nebulous term meaning roughly data about data. Who generated it? When? With what machine? With what settings on that machine? Why? What properties did the samples measured have? Species, type(s) of cell, stage of development? What were the sample preparation steps?  The answers to all these questions and more can be considered metadata.
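
To make that concrete, here is a minimal sketch of what answers to those questions might look like once written down in a structured, machine-readable form. The field names and values are illustrative placeholders I've chosen for this example, not a community standard.

```python
# A minimal, illustrative metadata record for a hypothetical sequencing run.
# Field names and values are placeholders, not any community standard.
import json

metadata = {
    "generated_by": "A. Researcher",              # who generated it?
    "date_generated": "2023-05-04",               # when?
    "instrument": "example sequencer model X",    # with what machine?
    "instrument_settings": {"read_length": 150},  # with what settings?
    "purpose": "profile gene expression in early development",  # why?
    "sample": {
        "species": "Homo sapiens (NCBI:txid9606)",
        "cell_type": "example cell type",
        "developmental_stage": "example stage",
        "preparation": "example dissociation and library prep protocol",
    },
}

print(json.dumps(metadata, indent=2))
```

The precise fields will vary by experiment type; the point is that each question above becomes a field that someone else, or a machine, can read later.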

The challenge is usually in three parts:

  1. Getting everyone to agree on what pieces of information are necessary and desirable.
  2. Agreeing on a format in which to store them.
  3. Getting everyone to actually supply this information when publishing their work.

When coming up with metadata standards, my preference is to take a 'linked data' approach, as this lets you use a shared set of existing tools and draw on pre-existing resources, e.g. agreed sets of standard terms for referring to certain classes of things like cell types or sequencing technologies. A set of guiding principles that we can adopt when making our data available for others to use, so that they can both use it in their own work and check existing conclusions, is based around the acronym FAIR; a sketch of what such a record might look like follows the FAIR definitions below.

Be FAIR

Findable - It has a unique identifier that can be looked up in a database, plus some associated terms so you can find the ID with a search.

Accessible - If I've got the ID, I can download a copy, or figure out who to ask for permission to download a copy if there are, for example, privacy restrictions.

Interoperable - It's in a file format I can read (without expensive proprietary software), and it's described in standard terminology so I can easily search for it and connect it with data about the same or related things.

Re-usable - Where it came from, how it was generated, and its other attributes are well documented according to community standards, and it's licensed so that it can be re-used.
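
To make the linked-data approach and these principles a little more concrete, here is a minimal sketch of a dataset description. The field names, identifiers, and URLs are placeholders chosen for illustration (apart from the NCBI taxon ID for Homo sapiens, which also appears later in this article); this is not any particular repository's schema.

```python
# An illustrative, FAIR-flavoured dataset record using linked-data style term IDs.
# All identifiers, URLs, and field names here are placeholders, not a real schema.
import json

dataset_record = {
    # Findable: a persistent identifier plus searchable descriptive terms
    "identifier": "doi:10.0000/example-dataset",           # placeholder DOI
    "title": "Example single-cell RNA-seq of developing tissue",
    "keywords": ["transcriptomics", "human development"],
    # Accessible: where to get it, or who to ask if access is restricted
    "access_url": "https://example.org/datasets/example-dataset",
    "access_conditions": "managed access; contact the data access committee",
    # Interoperable: open formats and standard terminology
    "file_formats": ["CSV (count matrix)", "FASTQ (reads)"],
    "taxon": "NCBI:txid9606",                               # Homo sapiens
    "ontology_terms": ["placeholder cell-type term ID"],
    # Re-usable: provenance and a clear licence
    "provenance": "see linked protocol and analysis code",
    "license": "CC-BY-4.0",
}

print(json.dumps(dataset_record, indent=2))
```

Real data repositories ask for much the same kinds of information, just in their own required formats and with controlled vocabularies for many of the fields.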

Think like a chef

Reproducing lab work and computational work is similar to cooking. A recipe comprises a list of ingredients (materials & reagents, data), a set of steps to follow (protocols, code), a description of the environment in which the food is cooked, e.g. the type and temperature of the oven (lab environment, compute environment), and additional information that helps you find, contextualise and appropriately use the recipe (metadata).

Some recipes are overly vague and some highly specific; this might depend on the difficulty, and the stakes, of getting a tasty, or at least edible, result.

Science can be like high-stakes cooking. Think of a meal for diplomats from two countries with a tense relationship, who also happen to be restaurant critics with a bunch of food allergies and dietary restrictions, but the head chef was ill that day, so now you, a relative novice, have to cook. The recipes have to be good, really good!

Here's a table fleshing out the analogy with some small examples of the sorts of information that can fall into these categories as they apply to these different disciplines:

|             | Cooking                                                               | Lab work                                                  | Computational analysis              |
|-------------|-----------------------------------------------------------------------|-----------------------------------------------------------|-------------------------------------|
| Inputs      | The ingredients                                                       | Materials & reagents, e.g. cell lines, organisms, media   | The data                            |
| Process     | The cooking instructions                                              | The lab protocol                                          | The code and its documentation      |
| Environment | The conditions in the kitchen, e.g. the type and temperature of the oven | The lab environment                                       | The compute environment             |
| Context     | Metadata about the recipe                                             | Metadata about the experiment                             | Metadata about the data and analysis |

The instructions need to be precise about inputs, processes, environments, and contexts. The difference is between 'enough flour to form a dough' and '200g of plain flour, 00 grade'. In science you would not just say that you referenced the human genome, but would specify the Homo sapiens (NCBI:txid9606) genome (Ensembl 109, GRCh38.p13). Just as a meringue won't work in a wet or slightly greasy bowl, how you wash your glassware in the lab can make a difference!

When a written protocol can't quite capture your bench work well enough that someone else could do your experiment just by reading it, you can take the approach of JoVE, the Journal of Visualized Experiments (https://www.jove.com/). If that's a bit much, a less formal approach is available: anyone with a smartphone or action camera can film their experimental work, upload it to figshare to get a DOI, and reference that in a protocol on protocols.io (https://www.protocols.io/) or in a paper.

When doing an analysis you don't write the code for every step from scratch; you use existing tools to perform many of the calculations. These tools in turn use other tools, creating a 'tree' of 'dependencies'. The way these tools work can change as the software gets updated, so to re-run your analysis exactly I need to know not just the versions of the tools you were using, but the versions of the tools your tools were using, and so on. It's tools all the way down.
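
To give a flavour of what recording that information can look like at the simplest level, here is a sketch in Python that snapshots the interpreter version and the versions of every installed package into a JSON file. The file name and layout are arbitrary choices of mine, and this only captures the first layer of the tools-using-tools tree, not system libraries or deeper dependencies.

```python
# A minimal sketch: snapshot the Python interpreter version and installed
# package versions so an analysis can later be matched against the same versions.
# The output file name and JSON layout are illustrative choices, not a standard.
import json
import sys
from importlib import metadata

def snapshot_environment(outfile="environment-snapshot.json"):
    """Write interpreter and package versions to a JSON file and return them."""
    packages = {}
    for dist in metadata.distributions():
        name = dist.metadata["Name"]
        if name:  # skip entries with malformed metadata
            packages[name] = dist.version
    snapshot = {
        "python_version": sys.version,
        "packages": dict(sorted(packages.items())),
    }
    with open(outfile, "w") as fh:
        json.dump(snapshot, fh, indent=2)
    return snapshot

if __name__ == "__main__":
    snapshot_environment()
```

Even a simple record like this makes it much easier for someone else, or future you, to reconstruct a comparable environment; fuller solutions pin the whole dependency tree rather than just its top layer.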

How can we encourage the adoption of these practices?

So why are we not working more reproducibly already? In certain cases it's quite hard to do, often because tooling and automation have not caught up to make it easier. It's also not yet a norm to which we expect one another to conform in the scientific community, either when we review others' work or when we have our own reviewed.

In his article ‘Five selfish reasons to work reproducibly’ Florian Markowetz lays out some excellent reasons to get ahead of the curve on working reproducibly.

The carrot and the stick

The laboriousness of recording and providing this level of detail can be a major impediment to researchers. So what can be automated? What tools, practices, and standard procedures can scientists adopt to make providing sufficiently detailed, structured information a part of their workflow that does not get in their way and, if anything, makes their lives easier?

Depending on your position, you can either make good practice easier (provide the carrot) or push for it by imposing requirements (the stick). Here are some examples:

  • Software Developer
    • Work on open-source tools. Don't develop a proprietary platform or product to try and solve these issues, or people (like me) will tell others not to use it, because the incentives don't line up with the degree of interoperability and portability needed in science; go with an open-source business model compatible with the needs of open science.
  • Peer Reviewer
    • Ask questions about reproducibility and FAIR data. This is where the expectation of higher standards in this area can begin to be set. Asking these questions in your reviews also clues editors in that academics now expect this, which can shape their decisions about what to send out for review in the first place.
  • Journal Editor
    • If you are deciding between many good submissions, make reproducibility and FAIR data a criterion for what you choose.
  • Grant Reviewer
    • If you review grants, ask about applicants' plans for making their data FAIR and their analyses reproducible. This will get them thinking about it well in advance, rather than panicking at the end.
  • Press
    • If you are a member of the press or a populariser of science, report favourably on publications that show their working and sceptically on those that don't.
  • Public (also applies if you fall into any of the other categories)
    • Ask your elected representative or the relevant minister (at the time of writing, Michelle Donelan, Secretary of State for Science, Innovation and Technology) why the research councils aren't holding their grant awardees to a higher standard on reproducibility and FAIR data, so that public funds are put to best use.

Where can I learn more?

I've written a short ebook as a resource for HDBI members, 'Data: Inception to Publication & Beyond'. It is directed at a more technical audience but aims to be written in an accessible style, and it features many links to external resources, in various media, for learning more about a given topic. I'm looking for feedback, comments, and suggestions for improvement from any readers.

References & Further Reading

Acton, Richard J. Data: Inception to Publication & Beyond.

Garijo, Daniel, Sarah Kinnings, Li Xie, Lei Xie, Yinliang Zhang, Philip E. Bourne, and Yolanda Gil. 2013. “Quantifying Reproducibility in Computational Biology: The Case of the Tuberculosis Drugome.” Edited by Christos A. Ouzounis. PLoS ONE 8 (11): e80278. https://doi.org/10.1371/journal.pone.0080278.

Harris, Richard F. 2017. Rigor Mortis: How Sloppy Science Creates Worthless Cures, Crushes Hope, and Wastes Billions. New York: Basic Books.

Markowetz, Florian. 2015. “Five Selfish Reasons to Work Reproducibly.” Genome Biology 16 (1). https://doi.org/10.1186/s13059-015-0850-7.

Plavén-Sigray, Pontus, Granville James Matheson, Björn Christian Schiffler, and William Hedley Thompson. 2017. “The Readability of Scientific Texts Is Decreasing over Time.” eLife 6 (September). https://doi.org/10.7554/elife.27725.

Ritchie, Stuart. 2020. Science Fictions: Exposing Fraud, Bias, Negligence and Hype in Science. London: The Bodley Head.

 

Image elements: Icons - https://bioicons.com & Emoji - https://openmoji.org