
“It suddenly struck me that that tiny pea, pretty and blue, was the Earth. I put up my thumb and shut one eye, and my thumb blotted out the planet Earth. I didn't feel like a giant. I felt very, very small.” – Neil Armstrong (1930-2012)


Sunday, July 08, 2007

What Doesn't Kill You Lets You Live

Sometimes I sit in the lab and think to myself:

"What am I doing here? I'm panning for specks of gold in an abandoned gold mine! There is nothing important left to discover; I'm wasting my time on picky details that nobody will care about!"

Then suddenly, like a huge rubber anvil, a bizarre new scientific discovery falls out of the sky and thwacks me on the head.



Last month, the Encyclopedia of DNA Elements (ENCODE) Project Consortium published their findings on 1% of the human genome (about 30 million base pairs).

They focused on the regions of the genome that do not code for proteins, and used a barrage of high-throughput genomics technologies to identify and catalogue functional DNA sequences.

They expected to discover new transcription start sites and non-protein-coding transcripts. They also expected that the presence and activity of transcription start sites would be related to chromatin accessibility and histone modifications.

And they were right.

In the evolutionary aspect of their project, they compared the DNA sequences of 14 mammal species with 14 non-mammal vertebrates. They found that 5% of the human genome was highly similar in all mammals (evolutionary constraint). Within these constrained regions, only about 60% of the sequence had evidence of biological function.

Which is rather odd, since highly conserved sequences are expected to be functional. Why would a good 40% of the constrained sequence be conserved if it had no function?
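To put rough numbers on those percentages, here is a back-of-the-envelope sketch. It assumes (my assumption, not a figure from the paper) that the genome-wide 5% constraint figure also holds inside the roughly 30 million base pairs that ENCODE studied. The function name is just for illustration.

```python
# Rough arithmetic on the percentages quoted above.
# ASSUMPTION: the 5% constraint figure applies within the ENCODE regions too.

ENCODE_REGION_BP = 30_000_000   # "about 30 million base pairs" (from the post)
CONSTRAINED_PERCENT = 5         # share of sequence under evolutionary constraint
FUNCTIONAL_PERCENT = 60         # share of constrained sequence with evidence of function


def constrained_breakdown(region_bp, constrained_percent, functional_percent):
    """Return (constrained, with_evidence, without_evidence) in base pairs.

    Integer arithmetic is used so the rough figures come out as whole base pairs.
    """
    constrained = region_bp * constrained_percent // 100
    with_evidence = constrained * functional_percent // 100
    without_evidence = constrained - with_evidence
    return constrained, with_evidence, without_evidence


constrained, with_evidence, without_evidence = constrained_breakdown(
    ENCODE_REGION_BP, CONSTRAINED_PERCENT, FUNCTIONAL_PERCENT
)
print(f"constrained:               {constrained:>10,} bp")
print(f"  with evidence of function: {with_evidence:>10,} bp")
print(f"  without such evidence:     {without_evidence:>10,} bp")
```

Under that assumption, the puzzling 40% works out to several hundred thousand conserved base pairs with no known job, in the studied regions alone.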

Stranger still, they then discovered that approximately 50% of functionally important non-coding DNA sequences were NOT constrained.

Mammals are using vastly different sequences for essentially the same functions.

In other words: functional conservation without sequence conservation.

Now that is weird.

The researchers noted that (italicized by me):

Surprisingly, many functional elements are seemingly unconstrained across mammalian evolution. This suggests the possibility of a large pool of neutral elements that are biochemically active but provide no specific benefit to the organism. This pool may serve as a ‘warehouse’ for natural selection, potentially acting as the source of lineage-specific elements and functionally conserved but nonorthologous elements between species.

I knew it!


Mainstream evolutionary theory would have you believe that many (if not all) features of a living organism were "fine-tuned" by natural selection to be well-adapted to its environment.

You get the feeling that the organism is under such massive selection pressure that natural selection would efficiently prune away useless DNA sequences and preserve only the sequences that helped the organism to survive. This is true for protein-coding sequences, which tend to be highly conserved from fish to human beings.

We are all built from roughly the same building blocks.

So it was generally assumed that all conserved DNA sequences ought to be functionally important. However, the wide variation in total genome size is a clue that non-coding sequences do not play by the same rules.

To take two extreme examples: the grasshopper Podisma pedestris has about six times (18 Gbp) the total genome size of a human, while the tiny shrimp-like Ampelisca macrocephala has nearly 20 times (63 Gbp) our genome size!

In fact, there is a 3,300-fold difference between the largest and smallest animal genomes.

There is no correlation between genome size and the number of protein-coding genes (which is much less variable, around 10-fold) in animals.
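The fold differences quoted above are easy to check. A minimal sketch, assuming a human genome size of about 3.2 Gbp (a commonly cited round figure, not one given in this post); the two animal genome sizes are the ones quoted above.

```python
# Fold difference in genome size relative to human.
# ASSUMPTION: human genome ~3.2 Gbp (round figure, not stated in the post).
HUMAN_GBP = 3.2

# Genome sizes in Gbp, as quoted in the post.
GENOME_SIZES_GBP = {
    "Podisma pedestris (grasshopper)": 18.0,
    "Ampelisca macrocephala (amphipod)": 63.0,
}


def fold_vs_human(size_gbp, human_gbp=HUMAN_GBP):
    """Return how many times larger a genome is than the human genome."""
    return size_gbp / human_gbp


for species, size in GENOME_SIZES_GBP.items():
    print(f"{species}: {size:g} Gbp, about {fold_vs_human(size):.1f}x human")
```

With that assumed human figure, the ratios come out near 6 and just under 20, in line with the numbers quoted above.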

Our building blocks are arranged and used in different ways.

And because animals are complex systems with redundant genes and many organizational levels, it's hard to pin biological function all the way down to the DNA sequence level.

So I don't find it surprising that there are many neutral elements in the human genome. Considering the robustness of even a single cell, the reality is that Nature tolerates many variations at the sequence level, and natural selection won't remove even huge chunks of DNA if they don't kill the organism fast enough to stop it from reproducing.

To oversimplify:

What doesn't kill you, lets you live.

It usually doesn't make any difference and it rarely makes you stronger.

As for functional conservation without sequence conservation - structural biologists are already aware of this at the protein level. Some amino acid sequences are highly conserved, but changes in a few key positions can drastically affect the fold of the protein, so sequence conservation is not enough to guarantee the same function.

The converse is also true - drastically different amino acid sequences can end up having a similar 3D fold and serve similar functions. The EMBL Dali tool is an online resource that allows biologists to submit a protein crystal structure and search for other proteins with a similar 3D shape, no matter what the amino acid sequence is.

With the discovery that so much functionally important DNA has no evolutionary constraint, just knowing the DNA sequence itself is no longer sufficient.

Data from ENCODE and other related research projects may help create a Dali-like tool for non-coding DNA sequences one day.

This could prove quite difficult since the function of non-coding DNA is not just a matter of 3D shape - confounding variables such as the chromatin configuration and the expression timing of specific transcription factors will have to be accounted for.

So there are still some big mysteries left to be solved!

Oh well.

*goes back to panning for gold*

Would you like to know more?

- More about ENCODE


Elia Diodati said...

It's perhaps worth mentioning that two of the co-authoring institutions on the Nature paper are the Bioinformatics Institute and Genomics Institute of Singapore.

Larry Moran at Sandwalk has a contrary opinion on the usefulness of junk DNA.

One assumption that seems to underlie the premises of the ENCODE project is that evolution can and does exert selection pressure to prune truly useless sequences in the DNA. I am not a biologist, but AFAIK classical evolutionary theory applies only to expressed sequences. If there is a selective advantage hidden in junk DNA, it would never see the light of day as a feature of the organism, and therefore would never undergo natural selection.

Yes there must be some cost in keeping copies (I think of it as backups) of non-expressed information in the junk DNA segments but even if there were some selection pressure for "data storage" it would not be the same evolutionary mechanism at work. I guess that's really the latent issue that we can hope to scratch the surface of here, not so much "hey, junk DNA has a function!".

Hawks said...

You have been tagged. Don't blame me, I'm just following orders.

Lim Leng Hiong said...

To Elia:

Yes there must be some cost in keeping copies (I think of it as backups) of non-expressed information in the junk DNA segments but even if there were some selection pressure for "data storage" it would not be the same evolutionary mechanism at work. I guess that's really the latent issue that we can hope to scratch the surface of here, not so much "hey, junk DNA has a function!".

Considering the wide variation in the size of non-coding sequences in animal genomes, natural selection doesn't seem to care much about pruning away non-functional regions. The cost of keeping so many extra copies seems minimal, hence so many neutral sequences.

As for the elements that are functionally conserved but not sequence-conserved, I would first ask, for example: is there a lot of redundancy in the putative transcription factors that bind there?

Lim Leng Hiong said...

To Hawks: