Published by the Students of Johns Hopkins since 1896
January 22, 2025

Scientists store binary data in DNA strands

By ZACH BUONO | April 6, 2017

DNA could be used to store data. (Image: Nogas1974/CC BY-SA 4.0)

How can the complex compilation of Peter Jackson’s Lord of the Rings trilogy be broken down into unintelligible bits of binary data, stored on complex electronic computer systems and then finally transmitted with the tap of a finger?

While the current electronic capabilities for storing data are improving every day, they are intrinsically flawed. Data stored on these systems can fade away, get lost or be corrupted over time.

Engineers and scientists are continually developing new data storage methods to replace obsolete ones. The question that remains is how to store data in the future so that none of the important information gathered throughout history is lost.

A pair of researchers at Columbia University and the New York Genome Center may have the answer.

These researchers have developed a new, efficient way of storing and retrieving information by harnessing a data storage system that lies within each person: DNA.

By using a new coding technique and a revolutionary DNA strand construction process, the researchers were able to accurately translate binary data from electronic strings of ones and zeros into sequences of the four nucleotide bases (A, C, T and G) and then write those sequences onto actual DNA material.
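To make the idea of translating ones and zeros into bases concrete, here is a minimal Python sketch. It assumes the simplest possible mapping, two bits per base, which is an illustration only and not the researchers' actual coding technique; the names `BITS_TO_BASE`, `encode` and `decode` are hypothetical.

```python
# Illustrative mapping of two bits to one base. This is an assumption made
# for the sketch, not the coding scheme the researchers used.
BITS_TO_BASE = {"00": "A", "01": "C", "10": "G", "11": "T"}
BASE_TO_BITS = {base: bits for bits, base in BITS_TO_BASE.items()}


def encode(data: bytes) -> str:
    """Turn bytes into a string of base letters, two bits per base."""
    bits = "".join(f"{byte:08b}" for byte in data)
    return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))


def decode(strand: str) -> bytes:
    """Turn a string of base letters back into the original bytes."""
    bits = "".join(BASE_TO_BITS[base] for base in strand)
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


message = b"Frodo"
strand = encode(message)
print(strand)
print(decode(strand) == message)  # the round trip recovers the input
```

Each byte becomes four bases under this mapping, so a five-letter word turns into a 20-base strand. A practical scheme would also need to add redundancy and avoid sequences that are hard to synthesize or read, which this sketch ignores.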

After two weeks, the researchers used modern sequencing technology to read back the genetic strands and found zero data transfer errors in the 72,000 strands of DNA they sequenced.

Overall, the research team's method corresponds to a storage density of 215 petabytes of information per gram of DNA material. Recovering the data without a single error at that density shows the promise of DNA as an accurate data storage system.

In addition, DNA is incredibly compact and can last hundreds of thousands of years if kept in the right environment, giving the medium even more credence as a possible solution for long-term data storage.

Even if the information stored on this DNA does begin to degrade over time, the strands themselves can always be copied through polymerase chain reaction (PCR), a standard laboratory technique for duplicating DNA. The researchers reported no errors after conducting multiple rounds of PCR duplication, so a virtually unlimited number of copies of the stored information could be created if desired.

While this process works well in a laboratory setting, the sequencing technology used to read back the binary information stored on the DNA is too slow to directly transfer information to a screen in real time.

This means that in order to actually enjoy an encoded movie, the viewer would have to convert the information back into binary and watch it using ordinary digital equipment. In addition, the machinery used for this process costs far more than a DVD player.

Even if one wanted to install cutting-edge laboratory equipment in lieu of a more practical piece of technology, the cost of synthesizing and reading the DNA is about $9,000, putting this process out of the realm of most people’s budgets.

While this new technology, with further advancement, holds great potential for long-term, reliable, efficient and reproducible data storage, for the time being a basic Netflix account might still be the way to go.

