An international consortium of more than 50 institutions has announced an ambitious project to compile high-quality genome sequences for all 66,000 vertebrate species on Earth, including all mammals, birds, reptiles, amphibians and fish. With an estimated total cost of $US600 million ($835 million), it is a project of biblical proportions.
It is called the Vertebrate Genomes Project (VGP) and it is organized by a consortium called Genome 10K or G10K.
As the name implies, this group initially planned to sequence the genomes of at least 10,000 vertebrate species, but now, thanks to the enormous advances and cost savings in gene sequencing technologies, G10K has decided to up the ante, with the aim of sequencing both a male and a female individual of each of the approximately 66,000 vertebrate species on Earth.
Co-founders of the project announced the new goal yesterday at a press conference during the opening session of the 2018 Genome 10K conference, which is currently being held at Rockefeller University in New York City. The project includes more than 150 experts from 50 institutions in 12 countries.
The announcement coincides with the release of 14 new high-quality genomes for species representing all five vertebrate classes, including genomes from the greater horseshoe bat, Canada lynx, platypus, Anna's hummingbird, the kakapo parrot (of which only 150 individuals remain), Goode's desert tortoise, the two-lined caecilian (a strange limbless amphibian that resembles a snake) and the climbing perch.
These 14 genomes, and those produced during the course of the project, will be made available to scientists for research purposes.
Indeed, there is more to the VGP than sequencing animal genomes. Like the Human Genome Project, this effort will undoubtedly lead to breakthroughs in high-resolution sequencing and genome assembly, resulting in lower costs and fewer errors.
The project will also address important questions in biology and disease, and have immediate consequences for evolutionary biology, genomics and conservation biology. On that last point, a complete catalog of Earth's vertebrate species could serve as a safeguard against extinction, both in terms of preventing extinctions and possibly reviving extinct species in the future.
At yesterday's press conference, Oliver Ryder, a co-founder of G10K and director of the San Diego Zoo Institute for Conservation Research, said the VGP has the potential to transform all realms of biology. He said it will allow scientists to understand the reasons for extinction, including the presence of harmful mutations, inbreeding and genetic bottlenecks.
For example, Ryder described the discovery of a harmful recessive gene among California condors, saying "we can now identify birds that are carriers of this deadly trait." In the end, he believes the project will make us "better stewards of life on earth" and enable us to "preserve our biological heritage."
When G10K was launched 10 years ago, its members had no idea how long it would take before genomes could be sequenced at sufficient quality to do good science, and affordably.
"I am incredibly excited that we are now in a position to do it well," said David Haussler, a co-founder of G10K and director of the UC Santa Cruz Genomics Institute, at yesterday's meeting. "Now it's really the time to start," he said, adding that "we have no excuse not to do this."
To generate high-quality genome assemblies, the VGP team is prioritizing "long reads" over "short reads," meaning that sequencing technologies that produce longer chunks of contiguous genetic data are preferred over those that produce shorter ones. This makes it considerably easier to assemble the DNA sequences into whole chromosomes.
So instead of having to work with a jigsaw puzzle with millions of pieces, the long-read pieces will result in a puzzle consisting of thousands of pieces.
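The jigsaw comparison can be illustrated with a toy assembler. The sketch below is a minimal greedy overlap merge in Python, not the VGP's actual pipeline; the function names `overlap` and `greedy_assemble`, the `min_len` threshold and the example reads are all hypothetical. It repeatedly joins the two pieces that share the longest end-to-start overlap, which is one simple way to see why fewer, longer pieces are easier to put together.

```python
def overlap(a: str, b: str, min_len: int = 3) -> int:
    """Length of the longest suffix of `a` that equals a prefix of `b`.

    Returns 0 if no overlap of at least `min_len` bases exists.
    """
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(reads: list[str], min_len: int = 3) -> list[str]:
    """Merge the best-overlapping pair of pieces until none overlap.

    Illustrative only: ignores sequencing errors, repeats and
    reverse-complement strands.
    """
    seqs = list(reads)
    while True:
        best_len, best_i, best_j = 0, -1, -1
        # Find the ordered pair (i, j) with the longest suffix/prefix overlap.
        for i, a in enumerate(seqs):
            for j, b in enumerate(seqs):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_len:
                        best_len, best_i, best_j = k, i, j
        if best_len == 0:
            return seqs  # nothing left to join
        merged = seqs[best_i] + seqs[best_j][best_len:]
        seqs = [s for idx, s in enumerate(seqs) if idx not in (best_i, best_j)]
        seqs.append(merged)


if __name__ == "__main__":
    # Three made-up reads that overlap end to end.
    print(greedy_assemble(["ATGGCGT", "GCGTACC", "TACCTTA"]))
```

Pieces that share no overlap are simply returned unjoined, which is the toy equivalent of a gap in an assembly. Real assemblers also have to cope with sequencing errors and repeated stretches of DNA, which this sketch ignores.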
The researchers will also refrain from merging maternal and paternal chromosomes into a single genome, a common practice that has resulted in far too many errors. Instead, the team will separate the DNA each individual inherited from its father and its mother in a process called phasing.
As Gene Myers, a member of the VGP team and a principal investigator at the Max Planck Institute of Molecular Cell Biology and Genetics, said yesterday, every species will be a "one and done" deal, meaning the quality of the sequences will be so good that the work will not have to be repeated in the future. That way "we can continue with science," he said.
As far as the process is concerned, the researchers will use long-read sequences to build a first assembly of chromosome pieces called "contigs". These chunks are joined together to create even larger pieces, called scaffolds, which in turn are linked to others to make even larger assemblies, all the way up to the size of whole chromosomes. Optical DNA maps and computer algorithms help with the process, ensuring the correct order and flagging any structural errors.
"The progress in long-read sequencing and long-range scaffolding technologies is a revolution for de novo [starting from scratch] DNA sequencing," said Myers.
"After a break of 10 years, this trend inspired me to return to genome assembly, because I think we can eventually produce near-perfect, telomere-to-telomere genome reconstructions and, if current cost trends continue, do so for less than $US1000 [$1393] on average per vertebrate species, which drastically changes the landscape of genomics."
Indeed, it was not so long ago that it took millions of dollars and years of effort to complete the genome of a single animal. New sequencing technologies could soon make it possible to produce a complete genome in one week, according to Adam Phillippy, G10K assembly chair and head of the Genome Informatics Section at the NIH's National Human Genome Research Institute. It now costs about $US30,000 ($41,785) to sequence the DNA of a new species for the first time.
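To put the figures quoted in this article side by side, the short Python sketch below multiplies the per-species costs by the 66,000-species target. It is a naive back-of-the-envelope calculation, not the consortium's budget model: the function and constant names are hypothetical, and it assumes one genome per species at a flat price, ignoring the plan to sequence two individuals per species as well as sample collection, storage and staffing.

```python
# All dollar figures are US dollars as quoted in the article.
SPECIES = 66_000                 # vertebrate species targeted by the VGP
COST_PER_GENOME_NOW = 30_000     # current first-time cost per species
COST_PER_GENOME_TARGET = 1_000   # average cost Myers hopes to reach
PROJECT_ESTIMATE = 600_000_000   # G10K's estimate for all phases


def total_cost(species: int, cost_per_genome: int) -> int:
    """Flat-rate total: one genome per species at a fixed price."""
    return species * cost_per_genome


def implied_cost_per_species(budget: int, species: int) -> float:
    """Average spend per species implied by a total budget."""
    return budget / species


if __name__ == "__main__":
    print(f"At today's price:  ${total_cost(SPECIES, COST_PER_GENOME_NOW):,}")
    print(f"At the target:     ${total_cost(SPECIES, COST_PER_GENOME_TARGET):,}")
    print(f"Implied by budget: ${implied_cost_per_species(PROJECT_ESTIMATE, SPECIES):,.0f} per species")
```

On these assumptions, sequencing every species at today's price would come to about $US1.98 billion, more than three times the project estimate, while the $US1000 target would bring it down to $US66 million. The $US600 million estimate itself works out to roughly $US9,000 per species.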
The new sequences will be stored and made publicly available in the Genome Ark database, a digital open-access library of genomes. Corporate sponsors DNAnexus and Amazon Web Services have "played an important role in getting this project off the ground," Phillippy said.
"This project is bizarre and excessive, but it is feasible and inevitable," said Harris Lewin, a VGP team member from UC Davis, at the press conference.
Around $US600 million ($835 million) will be needed to complete all phases of the VGP, according to a G10K press release. To finance the project, G10K is seeking money from private institutions and corporate sponsors.
But the consortium is also doing some crowdsourcing, and has already collected $US2.5 million ($3.5 million) of the $US6 million ($8 million) needed for the first phase of the project (which involves sequencing at least one individual from each of the 260 living vertebrate orders).
All hyperbole aside, this is one of the most ambitious projects we have seen in a while, rivaling the Human Genome Project (HGP), the Human Connectome Project (an ongoing effort to map all the connections of the human brain) and the VGP's sister project, the Earth BioGenome Project (EGP), which was announced earlier this year. The aim of the EGP is to sequence all eukaryotes (there are about 8.7 million species on the planet), at an estimated cost of $US4.7 billion ($6.5 billion).
In an e-mail to Gizmodo, a G10K spokesperson said that the EGP will function as the coordinating body, and that the VGP's vertebrate genomes will be contributed to the overall effort to avoid duplication of work.
No timeline was given for the VGP, but as the HGP demonstrated, a slow start is not necessarily a reflection of the overall pace of a project. As time goes on, and as technologies and techniques improve, the VGP researchers should see gains in efficiency, both in terms of speed and cost. Once it is completed, we will have a remarkable repository at our disposal, one that even Noah would be proud of.