Image credit: Greg Moss / Wellcome Sanger Institute

Categories: Sanger Science

8 July 2025

Weaving representation into genomics research

By Hayley Clissold, Policy and Advocacy Lead at the Wellcome Sanger Institute

Representation in research is key to ensuring that science can benefit all of us. The Wellcome Sanger Institute is one of the first research institutions in the UK to launch an organisational Representative Research Strategy. In this blog, we reflect on the Strategy, what it sets out to achieve, what it means and why it is important in today’s society.

Article 27 of the Universal Declaration of Human Rights, adopted by the United Nations General Assembly in 1948, states that everyone has the right to benefit from science. To achieve this, representation in scientific research is a crucial step. We need to ensure that the people and topics we study are representative of those who should be benefiting from the science. This will ensure that health interventions, like diagnostics and medicines, and the services that are driven by science can benefit all of us.

Estimates indicate that 80 to 90 per cent of global genomic datasets are derived from research participants of European descent, despite the fact that Europeans account for only about ten per cent of the global population. When I first read this fact, I was profoundly impacted by its meaning – that there is a stark lack of representation across genomic science. This fact has underpinned much of the drive behind our organisational Representative Research Strategy.

Humans share most of our DNA – around 99.9 per cent of our genomes are identical. However, tiny differences in the other 0.1 per cent not only make us look different from each other, but also affect how susceptible we are to diseases, how we respond to medicines, and how we are influenced by our environment.

Genetic commonalities can be found across populations, communities and those with similar ancestries. For example, the most common genetic cause of cystic fibrosis in Europeans is a small deletion in the CFTR gene, known as F508del. However, this variant is far less prevalent across other ancestries – in people with East Asian ancestries, for instance, a number of other disease-causing variants are more common. As we move towards new and improved diagnostic tools and treatments that rely on molecular tests, it is important that we make sure these are developed using diverse and representative data. This will ensure that everyone can benefit from these advancements, not just some.

While ancestral biases are widely recognised within research, we also see biases across many other facets, including sex, gender, and socioeconomic status. There are biases in non-human research too. For example, biodiversity genomics has historically focused on European plants and animals and, as such, many biodiverse hotspots around the world are not as well understood or studied.

At the Sanger Institute, we firmly believe that representative research will strengthen the impact of our science, making our research findings more relevant and applicable to communities, populations, species and ecosystems around the world. It will also allow scientific data to be reused by more researchers globally, helping them to answer their own research questions based on the needs and challenges across local, regional and global contexts.

Without appropriate representation in research, we risk missing important biological insights and exacerbating existing social and health inequities throughout society. Now is the time to address this, particularly as genomics becomes more integrated within real-world settings, and increasingly gets partnered with other technologies, like artificial intelligence, which are powered by vast amounts of data.

Representation for societal impact

Developing the Strategy was a highly collaborative endeavour, bringing together expertise from all six of the Institute’s scientific programmes as well as colleagues in EDI, research governance, translation, strategy, and Wellcome Connecting Science. We also knew we were not alone in attempting to address these challenges. To ensure our approach was aligned with, and contributed to, ongoing efforts across the sector, we formed an advisory group of external experts from the UK and Mexico, who lent their time and expertise to shape and refine the Strategy.

The Strategy outlines a ten-year vision for Sanger science to be designed and carried out in an equitable and representative way, with explicit consideration of how this will deliver societal benefit. As Sanger science focuses on both human and planetary health, we decided to take a broad approach to representation in the Strategy so that it included not only humans, but other life too, with a particular focus on biodiversity.

Through this process, we acknowledged how diverse our research already is, and how our researchers are already considering and embedding representation throughout the entire scientific process and at all levels. For example, Project Jaguar is a collaboration between scientists from Argentina, Brazil, Chile, Colombia, Mexico, Peru, Uruguay and the Sanger Institute in the UK. It is the first coordinated effort to understand the diverse genetic backgrounds of Latin American populations, alongside the influence that the regional environment, climate and culture have on gene prevalence and expression.

Another example is the Human Cell Atlas – a consortium with around 4,000 members across 104 countries who are creating comprehensive reference maps of all of the cell types in the human body to enhance our understanding of health, as well as disease diagnosis, monitoring and treatment. For this research to deliver its intended impact, it is crucial that the data generated are representative of diverse populations, so the consortium is committed to ensuring the inclusion of both scientists and data from underrepresented countries.

In terms of biodiversity, the Sanger Institute collaborates on multiple projects to improve representation from diverse hotspots. The Aquatic Symbiosis Genomics (ASG) project, for example, is reading the genomes of 1,000 freshwater and marine species that represent more than 500 symbiotic relationships across the globe, with the aim of understanding how species evolve and live together.

Therefore, when developing our Strategy, we sought to build on the exciting work already underway at Sanger and explore opportunities for further growth.

Sanger's research collaborations - More than 10,000 publications with collaborators in 178 countries since 1996

Enabling change

We believe that representation and equity should be woven into the entire scientific process, so we focused our Strategy on four priority areas:

1. Our science strategy

This priority area explores how we can weave representative research into the fabric of our organisation, including in our scientific strategy and culture, as well as how we can learn from and engage with other organisations to drive meaningful change. One example activity we will be implementing is to incorporate representative research and equity into our scientific guidance and formal training.

2. Samples and cohorts

This priority area looks at the samples and cohorts we use in our science. This is a challenging area for the Sanger Institute as we often use existing research cohorts and sample collections rather than building our own, or we collaborate with other researchers who build these. We aim to encourage researchers to consider other ways to address equity and representation, for example, by making use of diverse cohorts where possible and being transparent about where research that is less representative may have limited applicability. We also commit to advocating to funders and government, by putting forward recommendations on the need for more representative cohorts and biobanks, and by pushing for funding to build and maintain these resources.

3. Data capture and analysis

This priority area explores how we can consider representation and equity in the scientific tools, methodologies and analysis methods that we use.

4. Research outputs and impact

This priority area considers the stage beyond the research project: how research findings are shared and used, and how they bring benefits to others around the world. As an Institute committed to open science, we already openly share most of our data and resources, but it is important for us to consider how we can improve equitable access to, and reuse of, these resources. The Tree of Life programme at the Sanger Institute, for example, is teaming up with other international projects like the Earth BioGenome Project (EBP) to actively develop portals that are open and accessible to researchers from fields outside of genomics.

A sector-wide effort

As genomics becomes universal and is applied to an array of different real-world settings, we must ensure that the research is representative of populations, species and ecosystems around the globe so that everyone can benefit from science. Truly representative research will not happen overnight, but efforts like ours will begin to pave the way for much-needed change in science. Encouragingly, several other organisations across the sector, including research institutions, funders and publishers, are already beginning to reflect and act on this important issue. It is clear that we all have a role to play in tackling this, and together, we have the potential to make great strides in this space so that science can truly benefit everyone.

Download a PDF of the Sanger Institute's Representative Research Strategy