[personal profile] tanarill
I have been reading, for my next class. See, the professor, ten days before the class, sent out an email that goes, "I have uploaded the papers we will be using instead of a textbook." There were more than thirty of them. As, in my experience, Science! papers tend to be so dense that a four-page paper takes half an hour to read, I got to work, and it was a good thing; the average length of these papers is around 8 pages, with 2-ish of those being pictures. Still, nice of him to put them up early.

So, I was reading these papers, and . . .


I have a problem with Junk DNA. It is a pet peeve, you might say. And to my utter delight, so do all the other gene scientists, because they treat those words like naughty words and are working damn hard to make the very idea a thing of the past.

See, it's like this. About twenty years ago now, technology got good enough that it began to be possible to sequence fragments of DNA. It was not very good technology, you understand - we're talking the equivalent of an 8-bit processor trying to work out the three billion, give or take, letters in the human genome. It all had to be done by hand and was thus tedious and expensive. But it could be done, and it had to be done, if we wanted to understand things like genetic disease. Which is why the human genome project.

One of the things the human genome project found, which they were absolutely not expecting, was how small the protein-coding part of the genome is. To explain: by that point, they had sequenced a number of proteins and had a good idea of how long the gene for a typical protein is, and how long the human genome was. Simple division gave them a number of proteins. So what they were expecting was to find a bit of DNA corresponding to each expected protein, plus DNA that makes tRNAs and rRNAs.

What they found was DNA to make RNAs, and some DNA to make protein, and lots and lots of other . . . stuff. In fact, from what they could find, humans use a tiny fraction of the genespace for DNA that codes proteins, including proteins that we are not currently using but have left over from a few million years ago, when our ancestors did. The scientists, being scientists, scratched their heads, went, "Huh. That's weird," and set about finding out if all that other stuff has a purpose, and if so, what it is.

The media, being media, promptly named it 'junk' DNA. After all, if it doesn't make a protein, what use is it?

As it turns out, tons. For example, to get this stuff to condense down into chromosomes during cell division, it has to wrap around these proteins called 'histones,' and then the histones coil back on themselves (think how a telephone cord winds up), and then the whole thing attaches to other scaffold proteins. Only, as it turns out, it's not the histones interacting with the scaffolding, it's the DNA itself, through regions now called 'structural domains.' They don't code, but they are absolutely necessary for life, because without them, cells don't divide. Also, part of the junk DNA.

Another example. The scientists, looking at how small the protein-coding sections of DNA are, took a second look at how many unique proteins they could find. There turn out to be more proteins than we apparently have genes for, which makes no sense. So they began creating cDNA, which is an artificial construct. To make it, you start with an mRNA that you know makes a protein (how you know it makes a protein is somewhat more complicated, but roll with it) and use certain sciencey tricks to build a DNA that would make that mRNA. In theory, they could compare cDNA to the genome and find the bit of DNA that matched.

Only . . . it didn't work like that. cDNAs from multiple proteins would match the same fragment of DNA, but in an odd way. A cDNA would match for, say, two hundred DNA bases, then there would be a gap of a hundred bases in the DNA, and then it would match again, and so on. Some cDNAs would completely miss a fragment of DNA in the middle of two gaps that another cDNA would match; the first cDNA would, in this case, just have a much bigger gap in the corresponding DNA. And so on.
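For the programmers in the audience, here is a toy sketch of that "match, gap, match" pattern. Everything in it is made up for illustration: the sequences, the names match_blocks and gaps, and the greedy longest-match strategy, which is a stand-in and not how real alignment software works. Lowercase letters stand in for the bits that never show up in a cDNA.

```python
# Toy sequences (invented): three coding blocks separated by lowercase filler.
EXON_A, EXON_B, EXON_C = "ACGTACGG", "GGCATTCA", "CCGATAGG"
GENOME = "tttt" + EXON_A + "tttttt" + EXON_B + "ttt" + EXON_C + "ttt"


def match_blocks(cdna, genome, min_len=4):
    """Greedily match cdna against genome as a series of ordered blocks.

    Returns a list of (genome_start, length) pairs, one per matched block.
    Hypothetical helper: a naive sketch, not a real aligner.
    """
    blocks = []
    c = 0  # how much of the cDNA has been matched so far
    g = 0  # earliest genome position the next block may start at
    while c < len(cdna):
        # Try the longest remaining prefix first, then shorter ones.
        for length in range(len(cdna) - c, min_len - 1, -1):
            hit = genome.find(cdna[c:c + length], g)
            if hit != -1:
                blocks.append((hit, length))
                c += length
                g = hit + length
                break
        else:
            raise ValueError("no block of at least min_len bases matches")
    return blocks


def gaps(blocks):
    """Length of the unmatched genome stretch between consecutive blocks."""
    return [nxt - (start + length)
            for (start, length), (nxt, _) in zip(blocks, blocks[1:])]


full = match_blocks(EXON_A + EXON_B + EXON_C, GENOME)
skipping = match_blocks(EXON_A + EXON_C, GENOME)  # misses the middle block
print(full, gaps(full))
print(skipping, gaps(skipping))
```

The second cDNA skips the middle block, so in the genome it just shows one bigger gap where the first cDNA shows two small gaps with a match in between.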

Eventually, they figured out that we save DNA by having genes that are, essentially, interchangeable parts. First thing, an mRNA is transcribed, start to finish. Then, after transcription but before translation, a protein comes in and chops out the noncoding bits in the middle, called introns. But, since the proteins to do it are already right there, they can also cut out some but not all of the coding fragments, which are called exons*. The simple diagram below explains. Numbers are for exons, while minus signs are for introns.

DNA: ----1111--2222----333344445555---6666------7777-----8888
mRNA1: 11113333444455556666
mRNA2: 111122223333444455557777
mRNA3: 111122223333444455558888
mRNA4: 2222333344445555

The result is that from one section of gene, you can make a lot of possible mRNAs, and thus, proteins. The proteins are similar - the exons they share allow them to do the same thing - but one has an exon that serves as a signal for the cell to secrete it, one has an exon that serves to stick it to the cell membrane, and one has an exon that serves to keep it floating randomly in the cell. These similar proteins, which all came from the same parent DNA, are called 'splice variants.' Does the noncoding DNA do anything? Why yes, it allows the cells to make the variants in the first place, and it's beginning to look like it also helps to decide how often a specific variant is made. It is also what the media would call junk DNA.
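If you think better in code, the diagram can be played with as a string. This is only a sketch on the toy gene from the diagram: the names find_exons and splice are mine, and the real thing happens on the RNA transcript with actual splice signals, not on digits and minus signs.

```python
from itertools import groupby

# The toy gene from the diagram: digits mark exons, '-' marks introns.
GENE = "----1111--2222----333344445555---6666------7777-----8888"


def find_exons(gene):
    """Map each exon label to its sequence, in genomic order.

    Runs of '-' are introns and are dropped. Hypothetical helper that
    relies on each exon being a run of one repeated digit, as in the diagram.
    """
    return {label: "".join(run) for label, run in groupby(gene) if label != "-"}


def splice(gene, keep):
    """Join the chosen exons in genomic order, dropping everything else."""
    exons = find_exons(gene)
    return "".join(seq for label, seq in exons.items() if label in keep)


# Which exons each splice variant keeps.
variants = {
    "mRNA1": "13456",
    "mRNA2": "123457",
    "mRNA3": "123458",
    "mRNA4": "2345",
}

for name, keep in variants.items():
    print(f"{name}: {splice(GENE, keep)}")
```

One stretch of gene, four different messages, and the only thing that changed was the list of exons to keep.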

We still don't know what the vast majority of the DNA is doing. We can be pretty sure that there's some that doesn't code but is important, and we know because of circumstantial evidence: the code is almost identical in humans and, say, yeast cells**; if you take that bit of DNA out of the yeast, the cell does something weird and usually fatal; if you put the human version into the yeast, it is fine again. People also did studies where they removed these things in mouse egg cells, and not only don't the fetuses survive, they have all kinds of abnormalities to boot. We know it definitely isn't coding, because we can look for the protein it would make and there isn't any. Certain parties would have us believe that this DNA is junk. I beg to differ.

Of course, this is not to say that there isn't DNA which really is garbage. I'm just saying that throwing away every bit of DNA that doesn't directly make protein is wrong, wrong, so very wrong. Calling it 'junk' is wrong. Let's all agree to call it "the DNA that we don't know what it does yet," instead. Sure, it won't sound as miraculous every time the news gets to announce that, gasp, someone figured out what a bit of the DNA that we didn't know what it did actually does. But on the other hand, it's more accurate all along and doesn't leave people wondering why it's called junk if it is actually being used.

I am looking at you, the media.

*Exons can be expressed. Introns are intervening sequences. Do not confuse the two, or you will be confused.
** A general rule of evolution is: things only stay exactly the same if they must be that way to work. If simple yeast cells and humans both have a certain DNA code somewhere, it's probably pretty damn important.

Date: 2011-12-31 05:51 am (UTC)
From: [personal profile] rtydmartel
This was very interesting - if a little beyond my understanding to read.

I agree with the "if we still have it, it's probably important" reasoning.

Date: 2011-12-31 11:18 pm (UTC)
From: [personal profile] everbright
The example with the numbers helped a lot, for me (the prose paragraph was a bit confusing). But, cool! This really changes how I understand DNA. I honestly thought that if you could 'clean' your DNA of all the junk-bits, you'd end up with basically the same thing, just without the random viral-inserted noise.

So, yeah. Not something I'll be writing into a story then!
Edited Date: 2011-12-31 11:23 pm (UTC)
