Nearly 60 years ago, Francis Crick and James Watson worked out the double-helix structure of DNA and caused a revolution in the field of genetics. Nearly 10 years ago, the human chromosomes were sequenced, opening the entire genetic code to scrutiny. Significant portions of DNA were thought to be "junk" - leftovers from evolutionary developments no longer needed, superfluous coding we did not use. Yet, with every advance made in the study of genes, we discover new levels of complexity in DNA, and we realize how much more we have to learn. It turns out that "junk" DNA is not junk after all, but provides timing and control mechanisms vital for the body's health.
Less than two percent of DNA codes for the production of proteins, which are used for building everything from bones to skin, organ tissue to red blood cells. For a long time, much of the rest of DNA was considered excess baggage. Beginning in 1990, the Human Genome Project (HGP) worked to map the three billion chemical nucleotide bases of the human genetic code. With great fanfare, the HGP gave the world the human genome in April 2003, two years ahead of schedule. When the HGP began, some scientists wanted to map only the sections of the genome that coded for protein. Mapping all the "junk" DNA was considered a waste of time because it served "no known biological role."
Twenty years ago, geneticists expected humans to have far more genes than the lower animals, perhaps as many as 100,000. It was a shock when they discovered that humans had only 20,000-25,000 protein-coding genes, about the same number as a mouse. It didn't make sense. Humankind felt insulted that we should have comparatively so few genes. In fact, it turned out that rice, the food staple that goes well under stir-fry, possesses 28,000 genes and the pufferfish has 27,000 genes - both notably more than the human genome. Yersinia pestis, the bug that causes plague, boasts 4,052 genes on a single chromosome, yet humans have only about five to six times as many on 46 chromosomes. The sea urchin, a hard ball filled with gonads and little else, boasts 23,000 genes. There was at least one conclusion to draw: there had to be more to human DNA than just blueprints for protein production.
It was recognized that large portions of the genome appeared to be waste material. Scientists had suggested several reasons for the existence of junk DNA. According to one hypothesis, these chromosomal regions were trash heaps of defunct genes, often called pseudogenes, which had been cast aside and fragmented during the process of human evolution. A similar suggestion was that the junk DNA serves as a gene reservoir from which potentially advantageous new genes might emerge. There was some idea that the non-coding DNA provided timing commands for the coding DNA, telling it when to start and stop replicating.
Nobody appreciated the extent of the regulation.
During the past 10 years, the Encyclopedia of DNA Elements Project, or ENCODE, has worked to decipher the meaning of the 3 billion "letters" in human DNA and determine their secret functions. This consortium of scientists from dozens of labs across the globe spent a decade analyzing genetic sequences from 140 types of cells, and the results of their studies show that the patterns found in non-coding DNA are not random after all. Instead of being a trash heap for failed evolutionary attempts, non-coding DNA serves distinct and vital purposes.
A 2004 study conducted by David Haussler of the University of California at Santa Cruz compared the genome sequences of a man, a mouse and a rat. Haussler's team was surprised to discover that several large portions of DNA were identical across the three species. To be certain that the patterns were not simply a coincidence, the researchers looked for sequences that were at least 200 base pairs in length. Statistically, an identical sequence of this length is extremely unlikely to appear in all three by random chance. Amazingly, the researchers found no fewer than 481 distinct sequences, each consisting of at least 200 base pairs, that were common to humans, mice and rats. Chicken and dog genomes were also found to have a majority of these sequences, and even fish shared a large number.
"What really surprised us was that the regions of conservation stretched over so many bases. We found regions of up to nearly 800 bases where there were absolutely no changes among human, mouse and rat," said Haussler. Some of the sequences were found to overlap with genes that code for proteins, but the majority of them did not. Of the 481 sequences, 256 showed no overlap with protein-coding genes, and for another 114 the evidence of overlap was inconclusive.
Haussler and other researchers started to refer to these sequences of non-coding DNA as "conserved elements" or "ultra-conserved" DNA. Looking through the lens of evolutionary theory, they believed that there had been about 400 million years since humans, rodents, chickens and fish shared a common ancestor. Yet, despite 400 million years of evolution, they were amazed to find these sequences unchanged, suggesting that the conserved DNA was important for ensuring survival.
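The search described above, for stretches of at least 200 base pairs that are identical in every species compared, can be illustrated with a small sketch. It assumes the sequences have already been aligned to one another and padded with "-" where one species has a gap; that alignment step is the hard part in practice and is not shown. The function name `conserved_runs` and the toy sequences are made up for illustration and are not taken from the study.

```python
from typing import List, Tuple


def conserved_runs(aligned: List[str], min_len: int = 200) -> List[Tuple[int, int]]:
    """Return half-open (start, end) intervals where all aligned sequences agree.

    Assumes the sequences are pre-aligned, equal in length, and use '-' for gaps.
    Only runs of at least `min_len` identical, gap-free columns are reported.
    """
    if not aligned:
        return []
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("aligned sequences must all have the same length")

    runs = []
    run_start = None  # index where the current identical run began, if any
    for i in range(length):
        column = {seq[i] for seq in aligned}
        identical = len(column) == 1 and "-" not in column
        if identical:
            if run_start is None:
                run_start = i
        else:
            # The run (if any) ends here; keep it only if it is long enough.
            if run_start is not None and i - run_start >= min_len:
                runs.append((run_start, i))
            run_start = None
    # Close a run that extends to the end of the alignment.
    if run_start is not None and length - run_start >= min_len:
        runs.append((run_start, length))
    return runs


# Toy alignment (hypothetical data): the sequences differ only at index 7.
human = "ACGTACGTTTGACCA"
mouse = "ACGTACGATTGACCA"
rat = "ACGTACGTTTGACCA"

# A small threshold is used here so the toy example produces output.
toy_runs = conserved_runs([human, mouse, rat], min_len=5)
print(toy_runs)  # [(0, 7), (8, 15)]
```

With the default threshold of 200, the same toy alignment yields no runs at all, which is the point of the cutoff: short matches are common and uninformative, while very long exact matches are not.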
The ENCODE researchers have now informed us that DNA has some 4 million regulatory sites, which act like switches to turn genes on and off. In other words, individual genes don't necessarily code for just one type of characteristic or function; they are like 25,000 instruments being directed to perform a complicated genetic symphony. Just as the 26 letters of the English alphabet can be used to spell a vast variety of words, genes can be used at different times, in varying combinations with other genes, in different types of cells. The 4 million regulatory sites, to which proteins called transcription factors bind, help determine when and for how long certain genes are expressed, offering an astronomically large number of possible combinations. These regulatory maestros direct the genes when to play their parts and give the biochemical directions necessary to build and maintain our bodies day by day.
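To get a feel for how quickly the number of combinations grows, here is a rough back-of-the-envelope sketch. It makes a deliberately crude assumption that is not from the ENCODE work: each regulatory site is treated as an independent switch that is simply on or off, so n sites give 2**n possible settings. Real regulation is neither binary nor independent, so this shows only the scale of the number, not an actual count. The function name is made up for illustration.

```python
import math


def decimal_digits_in_power_of_two(n_switches: int) -> int:
    """Return how many decimal digits 2**n_switches has.

    Uses logarithms rather than building the huge integer itself.
    Assumes each switch is an independent on/off toggle (a simplification).
    """
    if n_switches < 0:
        raise ValueError("n_switches must be non-negative")
    return math.floor(n_switches * math.log10(2)) + 1


REGULATORY_SITES = 4_000_000  # figure quoted in the text

digits = decimal_digits_in_power_of_two(REGULATORY_SITES)
print(f"2**{REGULATORY_SITES:,} has {digits:,} decimal digits")
```

Under that simplification, the count of on/off settings is a number more than a million digits long. Even if only a tiny fraction of those settings were biologically meaningful, the handful of genes could still be used in an enormous variety of ways.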
"There is a modest number of genes and an immense number of elements that choreograph how those genes are used," said Eric D. Green, director of the National Human Genome Research Institute.
It has also been demonstrated that genetic defects cause problems not only in the genes that code for proteins but also in the regulation of those genes. In fact, the researchers now believe that genetic diseases are caused far more often by combinations of defects in the non-coding DNA than by defects in single protein-coding genes. The genes might be there to make the proteins, but they are not being turned on and off at the right times. It may be that certain diseases like lupus and diabetes can be caused by a few genetic mutations in the regulatory regions of DNA.
"Humans are 99.9 percent identical to each other, and you only have one difference in every 300 to 1,000 nucleotides," said Manolis Kellis, associate professor of computer science at MIT and a major contributor to an article published in the Sept. 5 issue of Nature. "What ENCODE allows you to do is provide an annotation of what each nucleotide of the genome does, so that when it's mutated, we can make some predictions about the consequences of the mutation."
DNA research has revealed a galaxy of new information over the past several decades. Through a better understanding of our genes, scientists have been able to diagnose the causes of many illnesses faster and more accurately. Doctors can now develop customized treatment plans for patients based on their specific genetic codes. Yet, while scientists have learned a great deal about the human genome, the majority of our DNA remains a mystery.
Secular scientists may see these new discoveries as additional evidence that humans and animals came from a common ancestor. However, there is another possibility: these sequences are simply some of the necessary DNA "ingredients" for making vertebrates work the way they do, whether they are mice or dogs or humans. They provide the fine details of the genetic recipe for these creatures - instructions written in code by a brilliant Creator whose blueprints include mysteries that we simply have yet to unravel.