A Leap Towards Genetic Diversity: The New Human Pangenome

Until now, most work in human genetics has relied on a single reference genome. Scientists have now published a draft "pangenome" that incorporates the DNA of 47 individuals from every continent except Antarctica and Oceania. Because it captures genetic differences between individuals and populations that a single reference misses, the pangenome could improve disease diagnosis, drug discovery, and the interpretation of genetic variants.

The project, funded by the US National Human Genome Research Institute, is still in its draft stage, with researchers aiming to include 350 people by mid-2024. This scientific milestone has been detailed in papers published in Nature and its partner journals, marking a significant step forward in the field of genomics.

The human genome is made up of about 3.2 billion base pairs, and the new reference adds 119 million base pairs to that library. For context, the first draft of the human genome was released in 2001, and a complete sequence was not finished until 2022. The 47 anonymous individuals in the pangenome project had previously participated in the 1000 Genomes Project, which was completed in 2015. To broaden the representation of human genetic diversity, the team is recruiting new participants from Middle Eastern and African ancestry populations that were not included in the 1000 Genomes Project.

Ethical considerations, including "the principle of justice", are central to the effort. The first draft of the human pangenome reference, presented by the Human Pangenome Reference Consortium, contains 47 phased, diploid assemblies from genetically diverse individuals. These assemblies cover more than 99% of the expected sequence in each genome and are more than 99% accurate at the structural and base pair levels.

The draft pangenome captures known variants and haplotypes and reveals new alleles at structurally complex loci. Relative to the existing reference, GRCh38, it adds 119 million base pairs of euchromatic polymorphic sequence and 1,115 gene duplications. Roughly 90 million of the additional base pairs are derived from structural variation.
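To put these figures in proportion, the short calculation below is a rough sketch that uses only the numbers quoted in this article. The variable and function names are illustrative, and the 90 million figure is treated as approximate.

```python
# Figures quoted in the article, in base pairs.
# Names are illustrative; the structural-variation figure is approximate.
GENOME_BP = 3_200_000_000      # size of the human genome
ADDED_BP = 119_000_000         # sequence added by the draft pangenome
STRUCTURAL_BP = 90_000_000     # portion of the added sequence from structural variation


def share(part: int, whole: int) -> float:
    """Return part as a percentage of whole."""
    return 100 * part / whole


added_vs_genome = share(ADDED_BP, GENOME_BP)
structural_vs_added = share(STRUCTURAL_BP, ADDED_BP)

print(f"Added sequence relative to the genome: {added_vs_genome:.1f}%")
print(f"Added sequence from structural variation: {structural_vs_added:.1f}%")
```

On these numbers, the added sequence is equivalent to just under 4% of the genome's length, and about three quarters of it comes from structural variation.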

When the draft pangenome was used to analyze short-read data, it reduced small variant discovery errors by 34% and increased the number of structural variants detected per haplotype by 104% compared with GRCh38-based workflows. The assemblies themselves are highly contiguous and accurate. Across the full set, 1,115 protein-coding gene families within the reliable regions show a gain in copy number in at least one genome.

The new human pangenome marks a significant step towards a more inclusive and accurate understanding of human genetics. By incorporating DNA from diverse populations, it gives researchers a reference better suited to diagnosing diseases, discovering drugs, and interpreting genetic variants. As the project expands to include more individuals, the reference should become more representative of human diversity.