The human genome, the genetic code that makes us human, has been the subject of scientific study and fascination for decades. It contains approximately 3.055 billion base pairs, the molecular building blocks of DNA, in which adenine (A) pairs with thymine (T) and cytosine (C) pairs with guanine (G). This number reflects not only the complexity of human biology but also the interplay of genetic factors that influence our physiology, behavior, and predispositions. These sequences influence traits ranging from hair color to susceptibility to certain diseases. Understanding the entire human genome is therefore both important and challenging.
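The pairing rule described above (A with T, C with G) can be illustrated with a short Python sketch. The function names `complement` and `reverse_complement` are chosen here for illustration and do not come from any particular library; the sketch handles only the four standard bases and is not a full sequence-handling tool.

```python
# Watson-Crick pairing: A pairs with T, C pairs with G.
# The translation table below is an illustrative helper, not a library API.
_PAIRING = str.maketrans("ACGTacgt", "TGCAtgca")


def complement(seq: str) -> str:
    """Return the base-by-base complement of a DNA sequence."""
    return seq.translate(_PAIRING)


def reverse_complement(seq: str) -> str:
    """Return the opposite strand, read in its own 5'-to-3' direction."""
    return complement(seq)[::-1]


if __name__ == "__main__":
    print(complement("ATGC"))          # TACG
    print(reverse_complement("ATGC"))  # GCAT
```

Because the two strands of DNA are complementary, knowing one strand is enough to reconstruct the other, which is why genome sizes are quoted in base pairs rather than in individual bases.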
The first essentially complete human genome sequence was released in 2003 as part of the Human Genome Project, which pushed the DNA sequencing technology of its day to its limits.
A subsequent refinement took place in 2019, yielding GRCh38.p13. Even so, the GRCh38 reference still contained 151 million base pairs (Mbp) of unknown sequence distributed throughout the genome. These gaps included pericentromeric and subtelomeric regions, recent segmental duplications, ampliconic gene arrays, and ribosomal DNA (rDNA) arrays, all of which underpin fundamental cellular processes. Several of the largest reference gaps, including human satellite (HSat) repeat arrays and the short arms (p-arms) of all five acrocentric chromosomes, were represented in GRCh38 as long stretches of unknown bases (N). GRCh38 also contained some artificial or otherwise incorrect sequences, along with a genome-wide deletion bias indicative of incomplete assembly. In short, about 92% of the human genome had been mapped, while the repetitive DNA in the remaining ~8% stayed out of reach because of technological constraints.
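Unresolved regions of this kind usually appear in a reference sequence as runs of the placeholder base N. As a minimal sketch, the Python below locates such runs in a sequence string and reports what fraction of the sequence they cover. The function names and the toy sequence are hypothetical and purely illustrative; a real analysis would read chromosome-scale FASTA records rather than a short string.

```python
import re
from typing import List, Tuple


def find_gaps(seq: str, min_len: int = 1) -> List[Tuple[int, int]]:
    """Return half-open (start, end) coordinates of runs of N at least min_len long."""
    return [
        (m.start(), m.end())
        for m in re.finditer(r"[Nn]+", seq)
        if m.end() - m.start() >= min_len
    ]


def gap_fraction(seq: str) -> float:
    """Fraction of the sequence made up of unknown (N) bases."""
    if not seq:
        return 0.0
    gap_bases = sum(end - start for start, end in find_gaps(seq))
    return gap_bases / len(seq)


if __name__ == "__main__":
    # Toy sequence (made up for illustration): two gaps, of 5 and 2 bases.
    toy = "ACGTNNNNNACGTNNACGT"
    print(find_gaps(toy))             # [(4, 9), (13, 15)]
    print(find_gaps(toy, min_len=3))  # [(4, 9)]
    print(gap_fraction(toy))          # 7 gap bases out of 19
```

Applied to each chromosome of an assembly, the same idea gives a quick measure of how much of a reference is still unresolved; a gapless assembly would report no N runs at all.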
The arrival of long-read DNA sequencing platforms, combined with sophisticated computational methods, made complete sequencing of the human genome feasible. In particular, Oxford Nanopore technology can sequence fragments exceeding 4 million bases, long enough to span the most repetitive and complex stretches of the human genome, which makes it a valuable tool for studying human health and disease.
On March 31, 2022, the Telomere-to-Telomere (T2T) Consortium, using Oxford Nanopore sequencing, reported in Science a new reference genome named T2T-CHM13. The assembly covers all centromeric regions and spans the entire p-arms of the five acrocentric chromosomes without gaps.
At more than 3 billion base pairs across 23 chromosomes (22 autosomes plus chromosome X), this gapless, continuous sequence was used as a reference and revealed more than 2 million previously undetected genomic variants. Such a complete view is essential for comprehensive studies of genomic variation. Nanopore sequencing can also detect methylation changes, which are often associated with disease, directly and without a chemical conversion step such as bisulfite treatment. Together, these advances open new avenues for studying the mechanisms of human health and disease.
Summary of the complete T2T-CHM13 human genome assembly. (Nurk S et al., 2022)
Despite the success of the T2T-CHM13 assembly, work on the human genome is far from finished. One notable gap is the absence of a Y chromosome in CHM13. Because complete hydatidiform moles (CHMs) carrying only Y chromosomes are not viable, a different sample source is needed to sequence this chromosome. Moreover, although CHM13 represents a complete human haplotype, it does not capture the full range of human genetic variation. Current results offer only a first look at the complex structural variation within the newly resolved, highly repetitive regions. Long-read resequencing studies will be needed to characterize this variation thoroughly and to determine its phenotypic consequences.