Myriad mechanisms of gene regulation


The 2007 ENCODE pilot study report on the human genome showed astonishing complexity in the structure of the information stored on, in and around the DNA molecule.1 Now come two new studies that show astonishing complexity in the function of the information copying and usage systems in cells.

Transcription (copying) of information from a DNA molecule onto a messenger RNA molecule is carried out by a molecular machine called RNA polymerase (RNAP). Initiation of the transcription process (schematic A) is followed by the engagement of the transcription machinery resulting in the elongation of the RNA strand (schematic B). A pause during this process helps to regulate the rate of copying. A molecular model of the real system is shown (C). The DNA is shown as the twin coils of rounded blue bead-like nucleotides protruding at top and bottom, the RNAP is shown as the long purple spaghetti-like molecular machinery, and the RNA transcript is shown in green emerging from the centre of the RNAP. (Images from www.wikipedia.org)

Ingenious transcripts

The first step in using the complex information stored on the DNA molecule is to transcribe (copy) it onto a messenger RNA molecule (mRNA). Transcription is carried out by a molecular machine called RNA polymerase (RNAP), which attaches to the DNA strand at the START end of a gene and works its way, nucleotide by nucleotide, to the STOP end, producing an exact complementary copy of each nucleotide at each step in the chain. More than one RNAP can work on a particular gene at any one time, and a recent study by an international team working on the mechanics of transcription found that in a culture of human cells there were, on average, two RNAPs per gene.2
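The nucleotide-by-nucleotide complementary copying described above can be illustrated with a minimal Python sketch. The function name `transcribe`, the `PAIRING` table and the example sequence are illustrative assumptions, not taken from the study; the sketch also ignores strand direction, START and STOP signals, pausing and abortive transcripts.

```python
# Standard base pairing from a DNA template base to the RNA base
# that RNAP adds opposite it (RNA uses U where DNA would use T).
PAIRING = {"A": "U", "T": "A", "G": "C", "C": "G"}


def transcribe(template: str) -> str:
    """Return the RNA complementary to a DNA template strand, base by base.

    Simplification: strand orientation is ignored, so the RNA is written
    in the same left-to-right order as the template it was copied from.
    """
    return "".join(PAIRING[base] for base in template.upper())


# Hypothetical nine-base template, chosen only for illustration.
print(transcribe("TACGGCATT"))  # AUGCCGUAA
```

Each output letter depends only on the template letter opposite it, which is why the copy is "exact" in the sense used above: the RNA carries the same information as the DNA, written in the complementary alphabet.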

The rate of transcription often needs to vary, for example in response to environmental stress or a fight-or-flight threat, and one might think that the best way to increase the rate would be to increase either the number of copying machines working on the gene or the speed at which the machines progress along the DNA. Surprisingly, cells use neither of these options.

In a normal metabolic state, RNAP copying seems stunningly inefficient. Only about 1 in 90 transcripts produces mature messenger RNA; the majority are aborted. Furthermore, the measured step-by-step transcription rate is about twice as fast as the rate previously measured for whole-transcript production, because the whole-transcript figure includes quite long pauses along the way.

There are three main phases in the transcription process. First, a region upstream of the transcription site called the promoter is activated. Second, the promoter acts upon an adjacent region to initiate the formation of the transcription machinery. Third, when the transcription machinery goes into action, it is said to be engaged in the copying process.

About a third of the transcripts can be found in each of these three stages at any one time.3 The average residence time in each stage was 6 seconds for the promoter, 54 seconds for initiation, 517 seconds for engagement, and pause times ranged from 204 to 307 seconds. At any one time, about a quarter of all transcripts were paused. A single gene produced a mature RNA transcript every 31 to 63 seconds.

Because of the long pauses in the transcription process, there is a ‘traffic pile-up’ in the transcription queue. The authors likened it to a Sunday driver going slowly along a country road, with cars lined up for miles behind. This may seem to be an awkward and inefficient way to proceed, but the authors suggest that there may be method in this apparent madness.

The speed at which the RNAP can copy is limited by its inherent enzymatic properties, which leaves only the rate of initiation and the length of the pause as points at which the rate of mRNA production can be controlled. Because the rate of initiation is very high and mostly abortive, pause control is left as the single determinant of copying rate. With just one control parameter, the length of the pause, the rate of transcript production can be varied almost instantaneously if needed. The authors end the report by saying:

‘We therefore expect that future results with endogenous genes [i.e. in living organisms rather than in cell culture], as more sensitive microscopy methods are introduced, will reveal the myriad of controls by which genes are expressed [emphasis added].’2

It is not hard to imagine what at least some of these myriad controls might involve. For example, cells normally function at only a fraction of their potential rate and range of operation. This is often referred to as redundancy—having more structure and functional capacity than strictly needed. The so-called ‘inefficiency’ of RNA transcription (only 1 transcript in 90 reaching maturity) may actually be a method of both repression and ready activation. Since the excess capacity in a redundant system is not normally used, it spends most of its time in a repressed state.

One of many methods of gene regulation involves small fragments of RNA that bind to the RNA transcript and thus interfere with and prevent its translation into protein. There are many different ways in which this can occur,4 and it is quite possible that the large proportion of aborted RNA strands may act as repressors. On the other hand, when accelerated transcription is required, the rapid rate of transcription initiation can quickly be turned to full use in being carried through to mature RNA production. There is more than enough capacity for acceleration in such a mechanism when compared with the pause-time control rate. The average production rate of mature RNA was 1 every 31 to 63 seconds per gene, while the pause time ranged from 204 to 307 seconds. By turning the pause time down to zero, RNA production could thus be accelerated by 3 to 10 times over the normal rate, well within the 90 to 1 value for aborted initiations.
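The '3 to 10 times' figure can be checked with a short calculation. The article does not spell out the arithmetic, so the sketch below assumes the simplest reading: the potential speed-up is the pause time divided by the interval between mature transcripts, taking the least and most favourable combinations of the quoted ranges. The function and constant names are illustrative only.

```python
# Figures quoted in the article, all in seconds.
PRODUCTION_INTERVAL_S = (31, 63)  # one mature transcript per gene every 31 to 63 s
PAUSE_TIME_S = (204, 307)         # observed range of pause durations
ABORTIVE_RATIO = 90               # about 1 initiation in 90 reaches maturity


def speedup_range(pause, interval):
    """Assumed reading: speed-up = pause time / production interval."""
    lowest = min(pause) / max(interval)    # shortest pause, slowest production
    highest = max(pause) / min(interval)   # longest pause, fastest production
    return lowest, highest


low, high = speedup_range(PAUSE_TIME_S, PRODUCTION_INTERVAL_S)
print(f"speed-up between {low:.1f}x and {high:.1f}x")
print("within the 90-to-1 headroom:", high < ABORTIVE_RATIO)
```

Under this assumption the ratios come out at roughly 3.2 and 9.9, which matches the quoted range and sits well below the 90-fold excess of initiations over mature transcripts.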

So not only does DNA contain a myriad of information structures, it is also consulted by the cell for that information in myriad ways. It makes reading a book, or an article like this, pale into insignificance by comparison.

Smart thinking

Once the information on the DNA molecule has been transcribed onto an RNA molecule, a number of post-transcription processes occur, and then the transcript is translated into protein. Sounds easy? Read on!

The human brain is the most complex organ in our body. It is made up of about 100 billion nerve cells (neurons), each of which has numerous tree-like branches (dendrites). When we learn or remember something new, a new pathway for thinking is created by a unique pattern of dendrites joining up into a memory circuit. The problem is how to prevent a highly branched dendritic network from joining up with itself and short-circuiting the memory or thought pattern.

Researchers approached this problem by studying a simpler system—the development of dendrites in the fruit fly Drosophila melanogaster, which has only about 200,000 neurons in its brain!5 What they discovered was beautifully elegant, surprisingly simple yet mind-bogglingly complex in the execution.6 A particular cell surface protein on the dendrites (called Dscam) is made subtly different in each dendrite so that each one can sense whether the nearby branch it is about to join up with is ‘self’ or ‘non-self’. It is similar in concept to the complexes of proteins in flakes of human skin that allow a dog to track the scent of a particular individual human sometimes a day or more after the person has passed by. Except in the dendrite case, it is thought that variations in just the one protein, Dscam, solve the problem.

How do you make one protein in a large number of different varieties? The answer is alternative splicing—the RNA transcript is cut and pasted together in slightly different ways to produce proteins that are almost exactly the same, but not quite—just a few amino acid differences. The Dscam gene can potentially generate more than 38,000 closely related trans-membrane proteins that are different enough to be reliably identifiable, but similar enough to function in exactly the same way.
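The 'more than 38,000' figure is a product of independent choices. The article does not give the breakdown; the counts below are the ones usually reported in the Dscam literature (four exon positions, each filled by exactly one of a set of alternatives) and should be read as an assumption brought in for illustration.

```python
from math import prod

# Number of mutually exclusive alternatives at each variable exon position.
# These counts are an assumption taken from the wider Dscam literature,
# not from the article itself.
ALTERNATIVE_EXONS = {
    "exon 4": 12,
    "exon 6": 48,
    "exon 9": 33,
    "exon 17": 2,
}

# One alternative is chosen per position, so the choices multiply.
total_isoforms = prod(ALTERNATIVE_EXONS.values())
print(total_isoforms)  # 38016
```

Because one alternative is picked at each position, a handful of modest-sized menus is enough to produce tens of thousands of distinct proteins from a single gene.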

Trans-membrane proteins are folded up and down through the membrane, joining up both the outside and the inside of the cell several times. Dscam is thus able to be sensed by other dendrites from the outside, but can also be used as a signalling molecule to tell the internal workings of the cell whether to go ahead with the connection if it is ‘non-self’, or to stop the connection if the other is part of itself. It doesn’t matter which one of the 38,000 versions a particular dendrite has, as long as it is different from its near neighbours. Easy, once you know how!

But just how do you cut and paste a single RNA transcript into 38,000 different but functionally identical proteins? Well, the mechanics are complex and dynamically multi-functional,7 but not yet fully known. We do know, however, that the spliceosome, the machine that does the alternative splicing, is the largest machine in the cell. It consists of about 300 different proteins and several nucleic acids.8 It clearly takes a big machine to do a big job!

Vary or perish

According to a new theory of how life works at the molecular level, called facilitated variation,9 all the mechanisms of variability—both within an individual organism and between parent and offspring—must be in place before life can function and persist in the face of environmental challenge and change. A purely mechanical kind of life—such as William Paley’s watch found upon a heath—would become extinct the first time a malfunction occurred. But life as we now see it in its vast molecular detail is astonishingly variable. If the new theory is correct, and life without such ingenious built-in mechanisms of variation is not possible, then life itself becomes the greatest testament to creation that the world has ever seen.

Posted on homepage: 11 December 2009


  1. Williams, A., Astonishing DNA complexity demolishes neo-Darwinism, Journal of Creation 21(3):111–117, 2007.
  2. Darzacq, X. et al., In vivo dynamics of RNA polymerase II transcription, Nature Structural & Molecular Biology 14(9):796–806, 2007.
  3. The statistics quoted here are based on a special ‘gene cassette’ containing 200 copies of the gene being studied that had been stably engineered into the DNA of the cultured cell line.
  4. Nilsen, T.W., Mechanisms of microRNA-mediated gene regulation in animal cells, Trends in Genetics 23(5):243–249, 2007.
  5. Posey, K.L. et al., Survey of transcripts in the adult Drosophila brain, Genome Biology 2:1–8, 2001; <genomebiology.com/2001/2/3/research/0008>.
  6. Hattori, D. et al., Dscam diversity is essential for neuronal wiring and self-recognition, Nature 449:223–227, 2007.
  7. House, A.E. and Lynch, K.W., An exonic splicing silencer represses spliceosome assembly after ATP-dependent exon recognition, Nature Structural & Molecular Biology 13:937–944, 2006.
  8. Nilsen, T., The spliceosome: the most complex macromolecular machine in the cell? Bioessays 25(12):1147–1149, 2003.
  9. Kirschner, M.W. and Gerhart, J.C., The Plausibility of Life: Resolving Darwin’s Dilemma, Yale University Press, New Haven, CT, 2005.