Monday, August 27, 2012

RNA, the True Secret of Life?

In THE DOUBLE HELIX: A Personal Account of the Discovery of the Structure of DNA, James Watson narrates the famous story of Francis Crick bursting into the Eagle pub to tell everyone within earshot that they had found the secret of life.

Certainly, DNA is one secret of life and (leaving aside a few viruses – which aren’t really alive) is a common (and rather vital) denominator of all terrestrial life. But life has many secrets. Since Crick and Watson (building on the often neglected contributions of Rosalind Franklin) elucidated the structure of DNA, life has revealed more and more of those secrets. But one important secret remains almost as well kept as ever: the secret of just how life emerged in the first place.

There are several rival theories as to how life may have originated on earth (or some other space-rock) but the question of just how we got from a non-living “Primordial Soup” to DNA-based life presents a particular puzzle.

What struck Crick and Watson immediately as they surveyed the model double helix they had constructed were the implications of that structure for DNA replication. Each strand of the DNA double helix comprises a series of nucleotides and (as was later found) these nucleotides constitute the individual “letters” of the DNA code. The two strands of the DNA helix are complementary and the sequence of one strand can be inferred from the sequence of the other. If two strands are separated and furnished with a supply of fresh nucleotides, each strand can serve as the template for the assembly of a new copy of the original double helix.
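For readers who like to see an idea run, here is a minimal sketch of that copying scheme. It assumes the standard pairing rules (A with T, C with G), treats each strand as a plain string of letters, and ignores the fact that the two real strands run in opposite directions. The names `complement` and `replicate` are made up for this illustration.

```python
# Pairing rules: A pairs with T, C pairs with G.
PAIRS = {"A": "T", "T": "A", "C": "G", "G": "C"}


def complement(strand: str) -> str:
    """Return the partner strand implied by the pairing rules."""
    return "".join(PAIRS[base] for base in strand)


def replicate(helix: tuple[str, str]) -> list[tuple[str, str]]:
    """Separate the two strands and build a new partner on each one."""
    top, bottom = helix
    # Each old strand acts as the template for a freshly assembled strand.
    return [(top, complement(top)), (complement(bottom), bottom)]


original = ("GATTACA", complement("GATTACA"))
copies = replicate(original)
print(copies)  # two double helices, each identical to the original
```

Each of the two copies contains one old strand and one new one, which is all the "template" idea amounts to.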

But if all DNA could do were to serve as a template for making more copies of itself, it would be pretty boring stuff. What makes DNA interesting is that it also serves as a template for making proteins.

It is important to realize here that while proteins are often structural molecules – like muscle proteins – they may also be enzymes – like the enzymes often included in modern detergents. Enzymes are tightly folded proteins that “catalyse” (i.e. speed up, without being used up themselves) other reactions. Enzymes have specific shapes which allow them to bind to other molecules and thereby encourage those other molecules to react with one another. Enzymes are crucially important for regulating what goes on inside living things.

The way in which proteins (including enzymes) are produced from DNA is quite complex. The first stage involves the creation of multiple RNA copies of the “master” DNA template. RNA is a very similar molecule to DNA – consisting of long chains of nucleotides – but it is normally single-stranded. The second stage involves the creation (from the RNA templates) of long chains of amino acids (which is what proteins are).

But here’s the thing ....

In order to make proteins from DNA, and even to replicate DNA, you need to have proteins (enzymes) that make everything work properly. And where do these enzymes come from? They are coded for by the DNA.

While this system works perfectly well once it is all in place, it is difficult to imagine how the system ever got going in the first place. The solution to this conundrum is that the DNA/protein system is probably not what originally got going. It is almost certainly a refinement of a far simpler system.

It is a feature of the aforementioned single stranded RNA molecule that (as well as serving as a DNA analogue – at least for one strand of DNA) it can also adopt tightly folded configurations which double-stranded DNA could never imitate. This means that, in certain respects, RNA molecules are rather like protein molecules and, it was discovered, can act like primitive enzymes that catalyse other reactions.

Once you have RNA molecules (formed entirely randomly as the primordial soup dribbled down hot rocks) that just happen to be able to catalyse (however poorly) reactions that result in RNA copying, you’re away! The “chicken and egg” problem presented by the DNA/protein system is circumvented. A self-catalysing RNA system will, given a supply of nucleotides, keep on replicating until the cows come home – which, given that replicating systems constantly mutate and the best mutations are chosen by natural selection (and later on in the process by human breeders) is exactly what happened in the end.

Nobody knows for certain how terrestrial life began, but the RNA hypothesis is a strong contender. We may never know exactly what did happen but there is a plethora of exciting research going on with RNA (and similar nucleotide polymers) that is confirming the plausibility of some theories as to what might have happened.

So next time you are enjoying a pint in a university town, watch out for someone bursting in and announcing that the real secret of life has just been found.

Sunday, April 8, 2012

What is a gene?

Alongside their usual “UNICORNS CAUSE CANCER” style headlines, the tabloid press are also quite fond of “BOFFINS DISCOVER THE GENE FOR BELIEVING IN UNICORNS” style headlines. I think it kind of goes without saying that most of those who write such headlines have only the vaguest idea of what a gene is. To be fair, the more we discover, the vaguer the scientific notion of what a gene is has become, but the basics are very well established.

So what is a gene?

There are all sorts of useful analogies, similes, and metaphors we can use here. I think my favourite is the story of the pilgrim who asked for an audience with the Dalai Lama.

He was told he must first spend five years in contemplation. After the five years, he was ushered into the Dalai Lama's presence, who said, 'Well, my son, what do you wish to know?' So the pilgrim said, 'I wish to know the meaning of life, father.' And the Dalai Lama smiled and said, 'Well my son, life is like a beanstalk, isn't it?' (From “In Held 'twas In I” by Procol Harum.)

But I’m going to try here to describe what a gene (the real secret of life) really is instead of what it is a bit like.

Now we’ve all heard of the “chromosome”. Say this word to most people (and indeed Google images) and it probably conjures up an image like this:

Now there’s a good reason why the word “chromosome” conjures up an image like this. Basically, it’s when chromosomes look like this that we can see them under normal microscopes. But chromosomes only look like this (all bunched up and double) when they are getting ready to divide. Most of the time, and in most organisms, chromosomes look nothing like this.

Most people (even journalists) who’ve heard of chromosomes have also heard of “DNA” and are aware that it comes in the form of a double helix:

This is basically what you are looking at (ignoring all sorts of caveats that we can sweep under the lab bench for now) when you look at a length of chromosome (or at one of the strands of the doubled-up chromosome in the chromosome picture).

So there you have it, chromosomes are (caveats aside) basically long strands of DNA.

But we haven’t mentioned “genes” yet I hear you cry.

Well a gene is a short(ish) bit of chromosome (or DNA strand if you prefer). Now (returning to analogies) “genes” are often compared here to beads on a string. But, since there isn’t really any “string” (just molecules and links between them) popper beads maybe provide a better analogy …. except that there aren’t really any beads either.

Let’s look at the DNA molecule in more detail:

DNA is made from nucleotides (the “N” in “DNA” stands for “nucleic”, as in deoxyribonucleic acid). There are just four different nucleotides involved, each built around one of four bases: adenine, cytosine, guanine and thymine, which are often denoted by their initial letters: A, C, G and T.

If we un-twist the DNA and look at a short bit of it, it looks a bit like this:

But that’s already a bit complicated, so let’s simplify things still further:

(For any pedants reading, each box here represents a base together with its deoxyribose and phosphate, i.e. a complete nucleotide; but let's keep things simple.)

Now the more astute among you will have noticed that these two strands are complementary – the sequence of Gs, Cs, As and Ts in the strand at the bottom can be inferred from the sequence of Gs, Cs, As and Ts in the strand at the top (and vice versa).

As this implies, we only really need one strand and, indeed, we are only really interested in one strand today: the “sense” strand. The complementary strand is “anti-sense” and we can ignore it until we come to DNA duplication – which we’re not going to come to in this post.

Going back to analogies again for a second, it’s a bit like every time Guardian journalist Ben Goldacre (@BenGoldacre) writes a sensible sentence in his blog, Daily Mail journalist Melanie Phillips (@MelanieLatest) writes a completely irrational and nonsensical sentence in her blog, and the two kind of cancel each other out.

Anyway, this leaves us with:

These are a bit like popper beads I suppose, but they are nucleotides not genes. There may be, not billions and billions and squillions (said in a Lancashire accent), but certainly hundreds or thousands of these in one gene.

So what use is that?

Well these four nucleotides form a kind of code – a code comprising only four “letters”, but a very powerful code for all that.

But if a chromosome is just a long series of nucleotides and a gene is simply a part of that series, how do we know where one gene ends and the next one begins?

Well I suppose (and here I’m going to resort to a serious(ish) analogy) it’s a bit like the old style telegrams where you were restricted to twenty-six capital letters and that was it. You had to write stuff like ….
…. in order to avoid misreading (try it without the STOPs).

It’s like that with the genetic code. There’s no punctuation, it’s all in the sequence of “letters”, but, as has been noted, we don’t even have twenty-six, we only have four. These make up three-letter “words” called “DNA triplets” and each triplet codes for one amino acid.
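To see why the lack of punctuation matters, here is a small sketch that chops a run of letters into three-letter words. The function name `triplets`, its `offset` parameter and the example sequence are all invented for illustration; the point is only that the same letters give different "words" if you start reading one place along.

```python
def triplets(sequence: str, offset: int = 0) -> list[str]:
    """Split a run of letters into three-letter words, starting at `offset`."""
    usable = sequence[offset:]
    # Drop any leftover letters that do not make up a full triplet.
    end = len(usable) - len(usable) % 3
    return [usable[i:i + 3] for i in range(0, end, 3)]


message = "CTAAAAATC"            # an arbitrary example sequence
print(triplets(message))         # read from the first letter
print(triplets(message, 1))      # same letters, read one place along
```

Read from the first letter, the example splits into CTA, AAA and ATC; start one letter later and you get TAA and AAA instead, which is a different message altogether.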

Just as a DNA strand is a string of nucleotides, a protein is a sequence of amino acids and each gene codes for the string of amino acids that make up a particular protein. Like this:

So the sequence of nucleotides CTA codes for the amino acid “aspartic acid”, AAA codes for the amino acid “phenylalanine” and ATC codes for “stop making protein”. (These are the DNA triplets; once copied into RNA they become the codons GAU, UUU and UAG.)

Since this “protein” only has two amino acids in it, I’m not sure you can really call it a “protein”. It would more usually be called a “dipeptide”. But you’ve almost certainly eaten some of this (give or take a methyl group); it is the artificial sweetener called “aspartame” or “Nutrasweet”. I doubt that there are actually any real genes out in the wild for making aspartame, but I suppose there could be, and it’s a nice simple example of what a very short gene could do.
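If you want to play with this, the decoding can be sketched in a few lines. This is only a rough illustration under some stated assumptions: the DNA triplets are taken to be on the strand that gets copied into RNA, strand direction is ignored, and `CODON_TABLE` is a tiny excerpt of the standard genetic code (in which the stop signals are the RNA codons UAA, UAG and UGA, so the DNA triplet ATC is used as the stop here). The function names are made up, and real genes also begin with a "start" codon, which is left out to match the two-amino-acid example.

```python
# DNA base -> RNA base made against it (RNA uses U where DNA uses T).
DNA_TO_RNA = {"A": "U", "T": "A", "C": "G", "G": "C"}

# A small excerpt of the standard genetic code, written as RNA codons.
CODON_TABLE = {
    "GAU": "Asp", "GAC": "Asp",              # aspartic acid
    "UUU": "Phe", "UUC": "Phe",              # phenylalanine
    "UAA": None, "UAG": None, "UGA": None,   # stop signals
}


def transcribe(dna: str) -> str:
    """Stage one: build the RNA copy of a DNA strand."""
    return "".join(DNA_TO_RNA[base] for base in dna)


def translate(rna: str) -> list[str]:
    """Stage two: read the RNA three letters at a time until a stop."""
    chain = []
    for i in range(0, len(rna) - 2, 3):
        amino_acid = CODON_TABLE[rna[i:i + 3]]  # KeyError if not in the excerpt
        if amino_acid is None:
            break
        chain.append(amino_acid)
    return chain


rna = transcribe("CTAAAAATC")
print(rna)             # GAUUUUUAG
print(translate(rna))  # ['Asp', 'Phe']
```

The output is the aspartic acid and phenylalanine pair that (give or take that methyl group) makes up aspartame.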


So now you understand what a gene is. It’s a sequence of nucleotides that codes for a protein (or at least part of a protein – some proteins are made from more than one amino acid chain).

I suppose, armed only with the understanding presented above, you could (naively) begin to imagine that if you have lots of genes for (say) muscle protein (or genes that produce extra good quality muscle protein) you might be more likely to make it as an athlete, but how does it all get so complicated and how can you have a gene for believing in unicorns?

Well part of the answer (the full answers really are complicated) is that proteins, as well as being structural like muscle proteins, can be regulatory, like enzymes – which control all sorts of things that go on in our bodies.

Once you consider that the products of some genes can control what other genes do (in all sorts of complicated direct and indirect ways that we don’t need to go into here) you begin to realize that genetics is very sophisticated and subtle and complex.

Your computer is not really built from the kind of transistors you used to get in transistor radios any more (and still less from valves) but the principle is the same. A transistor is a switch that turns another switch on and off. Once you start putting a few transistors together, you rapidly start to get quite complex behaviour. Put shedloads together and you get something that can do stuff like decide to stall my Ford Galaxy just before I want to set off from a junction (while producing a fault-code which my garage insists doesn’t exist).

Anyway I digress. My point is that even simple feedback mechanisms (and the feedback mechanisms in genetics are far from simple) can produce really really complex behaviour.
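As a toy illustration of that point (and nothing more), imagine three made-up genes, A, B and C, each of which is simply on or off. Assume A switches B off, B switches C off, and C switches A off, with everything updating at once on each tick. None of this is a model of any real genetic circuit; the rules and names are invented for the sketch.

```python
def step(state: tuple[bool, bool, bool]) -> tuple[bool, bool, bool]:
    """One tick: each gene is on only if the gene that represses it was off."""
    a, b, c = state
    # C represses A, A represses B, B represses C.
    return (not c, not a, not b)


state = (True, False, False)
history = [state]
for _ in range(6):
    state = step(state)
    history.append(state)

# Print one row per tick: '#' for on, '.' for off, in the order A, B, C.
for a, b, c in history:
    print("".join("#" if on else "." for on in (a, b, c)))
```

Three switches with one rule each already give a pattern that cycles through six different states before repeating. Real regulatory networks have thousands of genes and far subtler rules.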

Some species of bird are genetically programmed to build very sophisticated nests to lie in. My cats are genetically programmed to catch birds (fortunately for the birds they’re both rather crap at it) but are not genetically programmed (and not bright enough) to even move a twig out of the way before lying down on an otherwise perfectly comfortable and sunny patch of grass in the garden.

These complex behaviours require lots of genes (and maybe lots of so-called “junk” DNA) working in harmony. On the other hand, the colours of my cats (one is black and the other is tortoiseshell) arise from the actions of just one or two genes (though even here – especially in the case of the tortoiseshell – things are a bit more complicated than you might imagine).

So while you probably can’t really have a gene for believing in unicorns, you probably can (for example) have a genetic makeup that makes you more susceptible to superstition and irrational views.

At heart, however, a gene is simply a code for making a protein.