This month marked the completion of the first phase of the 1,000 Genomes Project, an extensive five-year study that sought to determine the genome sequences of 1,092 individuals. The project’s research team comprised about 400 researchers from around the world, including Aravinda Chakravarti, a member of the Institute of Genetic Medicine at the Hopkins School of Medicine.
Launched in 2008, the Project seeks to examine human genetic variation using a multinational analysis of four major ancestry groups: American, European, African and East Asian. By studying variations in the genomes of people with different migratory histories, the researchers hope to identify specific genes that cause disease.
Genetic diseases are caused by errors in gene sequences, which are composed of DNA. These gene sequences are so intricate that even a minor error can lead to dramatic physical consequences.
For example, sickle cell anemia is a severe genetic disease that distorts the shape of red blood cells and causes them to carry oxygen inefficiently. This disease is caused by a single error in the beta-globin gene, known as a point mutation, meaning that one DNA base pair in the whole gene sequence is wrong.
Let’s keep this error in perspective: There are approximately 3 billion DNA base pairs in the human genome. It takes the substitution of a single base pair, the smallest of accidents, to cause the devastating physiological problems of sickle cell disease.
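To make the scale of a point mutation concrete, here is a minimal Python sketch. It relies on the widely cited textbook description of the sickle cell change (the codon GAG, which codes for glutamate, becoming GTG, which codes for valine); that detail, the two-entry codon table and the helper name point_mutation are illustrative assumptions, not something reported by the project.

```python
# Minimal two-entry codon table: only the codons needed for this illustration.
CODON_TO_AMINO_ACID = {"GAG": "glutamate", "GTG": "valine"}

def point_mutation(sequence, position, new_base):
    """Return a copy of `sequence` with one base replaced (0-based position)."""
    return sequence[:position] + new_base + sequence[position + 1:]

normal_codon = "GAG"                                  # textbook beta-globin codon (assumption)
sickle_codon = point_mutation(normal_codon, 1, "T")   # the middle base A becomes T

# Count how many positions differ between the two versions.
differences = sum(a != b for a, b in zip(normal_codon, sickle_codon))

GENOME_BASE_PAIRS = 3_000_000_000   # approximate figure quoted in the article
fraction_changed = differences / GENOME_BASE_PAIRS

print(normal_codon, "->", sickle_codon)
print(CODON_TO_AMINO_ACID[normal_codon], "->", CODON_TO_AMINO_ACID[sickle_codon])
print("bases changed:", differences, "fraction of genome:", fraction_changed)
```

One changed letter out of roughly 3 billion is enough to swap one amino acid for another in the resulting protein.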
Given the precise nature of the DNA sequence needed for normal function, it makes sense that it contains little variation throughout the human population (humans are thought to be 99.5% genetically identical to one another). The variations that do exist among healthy people are considered within the “normal range,” since they do not reflect the effect of disease on gene variation.
The 1,000 Genomes Project aims to study individuals in the normal range, meaning subjects who do not suffer from genetic diseases. The research team believes that studying genetic variability among healthy individuals will give them the baseline needed to understand what happens when errors arise in the genome. Essentially, the genomes of the healthy subjects will serve as a standard for future scientists studying human genetics.
The study found that some genetic variations appeared more frequently than others. If a gene variation was seen in more than five percent of the samples, it was considered a common variant. If it appeared in between 0.5 and five percent of the samples, it was termed a low frequency variant. Finally, rare variants were found in less than 0.5 percent.
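The three frequency classes can be sketched as a small Python function. This is only an illustration: it reads the low frequency band as running from 0.5 to five percent, and the function name classify_variant, the handling of values that fall exactly on a cutoff and the carrier counts are all assumptions rather than details from the study.

```python
def classify_variant(frequency):
    """Classify a variant by the fraction of samples (0.0 to 1.0) that carry it."""
    if not 0.0 <= frequency <= 1.0:
        raise ValueError("frequency must be between 0 and 1")
    if frequency > 0.05:        # seen in more than five percent of samples
        return "common"
    if frequency >= 0.005:      # assumed band: 0.5 to five percent
        return "low frequency"
    return "rare"               # seen in less than 0.5 percent

TOTAL_SAMPLES = 1092            # number of individuals sequenced in the first phase

# Hypothetical carrier counts, chosen only to land in each class.
for carriers in (3, 30, 100):
    frequency = carriers / TOTAL_SAMPLES
    print(carriers, "of", TOTAL_SAMPLES, "->", classify_variant(frequency))
```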
The frequencies of common variants among the four ancestral groups were fairly consistent and had mostly been known to scientists from previous studies. The real area of interest centered on the other two categories. The study described variants that had been previously unidentified and attributed them to specific populations.
Another name for gene variants is alleles. In sexual reproduction, different alleles get shuffled around when gametes combine and form a new individual. Because each generation inherits only a random sample of its parents’ alleles, allele frequencies can shift by chance over time, a process known as genetic drift. This change is partly responsible for human genetic variation.
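A change in allele frequency from one generation to the next can be illustrated with a short simulation. The sketch below uses a simplified textbook model (Wright-Fisher-style random sampling, with no selection, mutation or migration); it is not a method used by the project, and the function name simulate_drift and all parameter values are hypothetical.

```python
import random

def simulate_drift(initial_frequency, population_size, generations, seed=0):
    """Track one allele's frequency when only chance sampling acts on it."""
    rng = random.Random(seed)                    # fixed seed so runs are repeatable
    copies_per_generation = 2 * population_size  # each person carries two copies
    frequency = initial_frequency
    history = [frequency]
    for _ in range(generations):
        # Every copy in the next generation is drawn at random from the current pool.
        carriers = sum(
            rng.random() < frequency for _ in range(copies_per_generation)
        )
        frequency = carriers / copies_per_generation
        history.append(frequency)
    return history

history = simulate_drift(initial_frequency=0.05, population_size=100, generations=20)
print([round(f, 3) for f in history])
```

In a small population the frequency wanders noticeably from generation to generation, and a variant can disappear entirely or become much more common purely by chance.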
The Human Genome Project set this type of genetic work in motion. In 2003, scientists working on that project unveiled the completed human genome sequence for the first time. They also sequenced the genomes of other organisms, including E. coli, a fruit fly and a mouse.
While the Human Genome Project took years to sequence the genome of one human, the 1,000 Genomes Project has taken advantage of advances in technology and data quality and is now capable of sequencing 10 billion bases in 24 hours. That’s almost two human genomes per day, counting both copies of the roughly 3 billion base pairs that each person carries!
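The arithmetic behind that comparison is easy to check. The sketch below is a rough back-of-the-envelope calculation: it assumes the roughly 3 billion base pair genome size quoted earlier, treats the 10 billion bases as raw output and ignores the fact that real sequencing reads each position many times over. The constant names are illustrative.

```python
BASES_PER_DAY = 10_000_000_000            # throughput quoted in the article
SINGLE_COPY_GENOME = 3_000_000_000        # approximate base pairs in one genome copy
BOTH_COPIES_GENOME = 2 * SINGLE_COPY_GENOME   # each person carries two copies

genomes_single_copy = BASES_PER_DAY / SINGLE_COPY_GENOME
genomes_both_copies = BASES_PER_DAY / BOTH_COPIES_GENOME

print("counting one copy per genome:", round(genomes_single_copy, 2))
print("counting both copies per genome:", round(genomes_both_copies, 2))
```

Counting a single copy gives a bit more than three genomes' worth of bases per day; counting both copies gives a bit under two.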
In the final phase of the project, scientists plan to sequence an additional 1,500 genomes. With this comprehensive library to guide them, the hope is that the 1,000 Genomes Project will shed new light on the study of genetic disease.