The human genome is amazingly complex:
Massive new GTEx study counters Darwinism
The human genome is a stunning example of God’s brilliance. Humans have struggled to understand how it works for two reasons. First, the engineering is beyond us. It is not anything a human could have accomplished, and it has taken a massive effort by thousands of scientists who spent billions of dollars just to crack open a few of its secrets. Second, Darwinists need the genome to be simple, so they have consistently underestimated its complexity. This low-balling of expectations slowed progress as powerful elements within the scientific establishment dragged their feet. This stymied the work of the real pioneers, who pushed the others, almost literally kicking and screaming, into the light. Once more, evolutionary dogma has been shown to be a science-stopper. And what the light reveals is nothing that anyone expected.
Nearly two decades after the initial sequencing of the human genome, a multi-million-dollar, multi-institutional program has just finished its final reporting. This was the Genotype-Tissue Expression Project (GTEx). The goal of this 10-year study was to look at variations in the genome and see how they affect RNA production, phenotype, and disease. They were able to separate the effects by sex, race, tissue type, and cell type. What they discovered was a treasure trove for Bible believers. The genome is nothing like anyone expected. It is so complex that it was obviously designed by a higher intelligence. Let me explain.
In the 1990s, scientists petitioned the US government to spend three billion dollars sequencing the human genome. They claimed it would lead to a cure for disease. This failed spectacularly. They also promised we would understand how the genome worked if we could only obtain the DNA sequence of our chromosomes. After the genome sequencing was completed in 2003, we learned this was the understatement of the century, and the century was still young. The human genome is much more complicated than essentially any evolutionist imagined. That first sequence was just a peek into the amazingly complex, four-dimensional information system that God so brilliantly engineered.
In days gone by, scientists held to a “one gene, one enzyme” hypothesis. That is, one gene would produce one protein. This came from studying bacterial genomes, which are fairly straightforward. But in more complex organisms, we have discovered a multi-faceted information processing computer in the nucleus, where any particular letter can be incorporated into many different RNAs and proteins, depending on context. Likewise, the fact that we only have about 23,000 protein-coding ‘genes’ yet produce several hundred thousand unique proteins was a massive surprise, as I explained in my 2010 article Splicing and Dicing the Human Genome.
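To get a feel for how a modest number of genes can yield a much larger number of products, here is a minimal Python sketch. It assumes a deliberately simple toy model that is not taken from any of the papers discussed: the first and last exons of a gene are always kept, and every internal exon can independently be included or skipped. Real splicing is far more tightly regulated, and the exon names are hypothetical.

```python
from itertools import combinations

def exon_skipping_isoforms(exons):
    """Enumerate isoforms under a toy model: the first and last exons are
    always retained, and each internal exon is independently kept or
    skipped. The original exon order is preserved."""
    if len(exons) < 2:
        return [tuple(exons)]
    first, internal, last = exons[0], exons[1:-1], exons[-1]
    isoforms = []
    for k in range(len(internal) + 1):
        # combinations() keeps the chosen exons in their original order
        for kept in combinations(internal, k):
            isoforms.append((first, *kept, last))
    return isoforms

gene = ["E1", "E2", "E3", "E4", "E5"]  # hypothetical five-exon gene
isoforms = exon_skipping_isoforms(gene)
print(len(isoforms))  # three internal exons give 2 ** 3 = 8 isoforms
```

Under this toy model the count doubles with every additional internal exon, which is why even a small number of exons per gene allows for a large number of distinct transcripts.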
The first attempt to understand the complexity of the genome was pioneered by Ewan Birney and a long list of scientists from many universities. In what was called the ENCODE Project (the Encyclopedia of DNA Elements), they looked at genetic expression in a mere 1% of the genome. They discovered that any given letter is incorporated into an average of six different RNA transcripts and that most of the genome is functional, at least to the point where it is copied into RNA transcripts. They were the ones to first uncover the massive “splicing and dicing” system in the genome, where smaller subsections of genes (called exons) could be used in a wide variety of proteins, in different cell types, under specific conditions, at different stages of life. All of this, they found, was programmed into the DNA sequence on top of the protein-coding areas. In other words, the genome codes for multiple things simultaneously.
This was hugely controversial. Why? One reason is that it directly attacked the “98% of the genome is junk” mantra that evolutionists had been spouting since the 1970s when they discovered that only a small part of the genome codes for proteins. But the ENCODE project could not be lightly dismissed.
Before we go any further, we have to establish some basic definitions. The first is the word gene. In this context, a gene is a section of DNA that is transcribed into RNA. This could be a coding region that is translated into protein or a non-coding region like a long intergenic non-coding RNA (lincRNA). Smaller regulatory RNA elements are not generally considered ‘genes’ even though they were part of the study.
After this, we need to know that an allele is simply a variant of a gene. You have two copies of each chromosome and so you can carry two alleles at any site. For example, you might have the blood type AB, meaning one chromosome carries the A allele and the other carries the B allele of the blood type gene.
As opposed to the genotype, which is the sequence of DNA letters, the phenotype is the way an organism looks or behaves. On some level, this is controlled by the genotype, but there is a huge difference. Just because a person carries a trait in their DNA does not mean they will show that trait on the outside. The classic case is with recessive genes. One can carry a gene for, say, blue eyes or type O blood, but not have these traits because the other alleles (in this case for brown eyes or type A or B blood) overwhelm them. But the environment also plays a role in the development of the phenotype. Exercise, nutrition, disease exposure, and many other factors affect which genes are turned on and off at any given time. Thus, the environment often controls the DNA.
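The blood type example can be written out as a small Python sketch. It assumes the simplified textbook model of the ABO system with just three alleles, where A and B are codominant and O is recessive to both; the function name is my own, and real blood typing involves more variants than this.

```python
def abo_phenotype(allele_1, allele_2):
    """Return the ABO blood type for a pair of alleles under the
    simplified three-allele model: A and B are codominant with each
    other, and O is recessive to both."""
    alleles = {allele_1, allele_2}
    if not alleles <= {"A", "B", "O"}:
        raise ValueError("alleles must be 'A', 'B', or 'O'")
    if alleles == {"A", "B"}:
        return "AB"
    if "A" in alleles:
        return "A"
    if "B" in alleles:
        return "B"
    return "O"

# A carrier of the O allele does not necessarily show type O blood
print(abo_phenotype("A", "O"))
print(abo_phenotype("O", "O"))
```

Two people with different genotypes (A/A and A/O) end up with the same phenotype, which is exactly why the genotype cannot be read directly from outward traits.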
We also have to define what geneticists mean when they say cis and trans. A cis-acting element is something that affects something nearby on the same chromosome. A mutation in a gene promoter, for example, might affect the gene that sits immediately downstream, but it would not be expected to affect the same gene on the complementary chromosome. However, trans-acting elements affect expression of both copies, and maybe other genes in other places on other chromosomes, depending on what is being discussed.
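In practice, expression studies usually approximate the cis/trans distinction by distance: a variant that lies close to a gene on the same chromosome is tested as cis, and everything else is treated as trans. The Python sketch below is only an illustration of that rule of thumb. The 1 Mb window is an assumed, commonly used cutoff rather than a figure from this article, and the function name and coordinates are hypothetical.

```python
def classify_variant_gene_pair(variant_chrom, variant_pos,
                               gene_chrom, gene_start,
                               window=1_000_000):
    """Label a variant-gene pair 'cis' if the variant sits on the same
    chromosome within `window` base pairs of the gene start, else 'trans'.
    The default window is an assumption made for illustration."""
    same_chromosome = variant_chrom == gene_chrom
    if same_chromosome and abs(variant_pos - gene_start) <= window:
        return "cis"
    return "trans"

# Hypothetical coordinates, chosen only to exercise each branch
print(classify_variant_gene_pair("chr1", 1_500_000, "chr1", 1_000_000))
print(classify_variant_gene_pair("chr2", 1_500_000, "chr1", 1_000_000))
```

Note that a distance rule like this cannot tell which of the two chromosome copies is affected; that takes allele-level measurements.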
Clearly, we also have to define variation. The evolutionist assumes that all genetic diversity is due to mutation. But this is not part of the creation model, where God could have engineered any amount of (good, non-deleterious) diversity straight into the genome of Adam and Eve. It is true that many mutations have accumulated in the genome over the past 6,000+ years. But there is a difference between God-created diversity (most of which is extremely common) and mutation (most of which is rare and geographically restricted). In short, most of the variation in RNA production is due to the presence of factors that God made. Apparently, He loves diversity.
In the end, we have to keep track of several things. Any variation in the genome could affect the amount of RNA being produced (expression variation). Alternatively, it could affect one version of the gene over another (allelic variation). Or it could affect how sections of the coding region are recombined (splicing variation).
The GTEx Project
We had the human genome sequence, we knew that complex things were happening inside it, and we knew that a lot of variation exists between people. Do any of those variations affect how the genome operates? How many are important and how big are the effects? This is what GTEx wanted to learn.
They obtained samples from 52 tissue types (including all major organs and organ regions) from 838 organ donors. They measured the amount and sequenced the RNA produced in all these samples and made sure they had complete genomes for each donor (at ≥ 32-fold coverage). One limitation of the study was that the majority of donors were of European descent, but African Americans and Asians were also included for comparative purposes. They also compared their results to those from living samples (e.g. blood) and cell culture to make sure the amount of RNA found in recently deceased people was similar to that being produced in living cells.
Science magazine published seven papers from the project and an accompanying editorial on 11 September 2020. I will attempt to summarize the results of these papers in everyday language. The writing is highly technical, but this is not something we want to miss.
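For readers unfamiliar with the term, “32-fold coverage” means that each letter of the genome was read about 32 times on average. A rough Python sketch of the arithmetic follows. The genome size of three billion letters is a round-number assumption, and the 150-letter read length is a hypothetical value, not one reported in this article.

```python
import math

def mean_coverage(n_reads, read_length, genome_size):
    """Average number of times each base is read."""
    return n_reads * read_length / genome_size

def reads_needed(target_coverage, read_length, genome_size):
    """Smallest number of reads that reaches the target mean coverage."""
    return math.ceil(target_coverage * genome_size / read_length)

GENOME_SIZE = 3_000_000_000  # assumed round figure for the human genome
READ_LENGTH = 150            # hypothetical read length in base pairs

n = reads_needed(32, READ_LENGTH, GENOME_SIZE)
print(n)
print(mean_coverage(n, READ_LENGTH, GENOME_SIZE))
```

The point is simply that deep coverage of a single donor requires hundreds of millions of short reads, and the project did this for every donor.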
The main summary paper from the GTEx Consortium laid out the problem by stating, “ … genetic risks for complex traits and diseases … are mainly driven by non-coding loci with largely uncharacterized regulatory function.”1 In other words, we see lots of variation in the genome, but we don’t know what most of it does. Also note the word non-coding. Yes, a lot of what was once considered “junk” is now known to affect the lives of living things, humans included.
I am going to condense the most important information in each of these papers into a few short paragraphs.
First, the GTEx Consortium gave us a summary of their ‘atlas of genetic regulatory effects across human tissues’.1 Perhaps unsurprisingly, they discovered that variations in RNA expression and splicing are more common in the coding areas. But only ⅓ of these are affected by cis-acting (in other words, nearby) variants. Thus, long-distance control of genes is quite common, and a lot of variation exists in this system. However, what is more surprising is the fact that the average gene has more than one expressed form. In other words, the variations found in our genomes cause us to produce different RNA versions of nearly all our genes. Since most of this variation is, I believe, created by God, He clearly programmed a huge amount of diversity into the human genome.
But there are also tissue-specific differences in RNA production (e.g. brain cells have different RNA expression profiles than other tissues). There are even differences within specific tissues (e.g. different brain cell types produce different RNAs). We have also learned that more trans and cis effects were found in testes. In other words, that one tissue expresses genes differently from any other tissue in the human body.
More than that, we discovered that alternate alleles are not always expressed equally. Even though they are found in the same gene, and even though they might have the same upstream control sequences in place, the amount of RNA produced for different allelic variants can be quite different.
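One simple way to picture unequal allele expression is to count how many RNA reads carry each allele at a site where a person is heterozygous. The Python sketch below is a toy illustration, not the method used by GTEx: it computes the fraction of reads from one allele and an exact two-sided binomial test against equal expression. Real analyses also correct for mapping bias and extra read-count variability, and the read counts here are invented.

```python
from math import comb

def allelic_ratio(ref_reads, alt_reads):
    """Fraction of reads that carry the reference allele."""
    total = ref_reads + alt_reads
    if total == 0:
        raise ValueError("no reads cover this site")
    return ref_reads / total

def binomial_two_sided_p(ref_reads, alt_reads):
    """Exact two-sided binomial test against equal expression (p = 0.5).
    With p = 0.5 every outcome k has probability comb(n, k) / 2**n, so
    outcomes can be ranked by comb(n, k) using exact integers."""
    n = ref_reads + alt_reads
    observed = comb(n, ref_reads)
    # Add up every outcome that is at least as unlikely as the observed one
    extreme = sum(comb(n, k) for k in range(n + 1) if comb(n, k) <= observed)
    return extreme / 2 ** n

# Invented read counts for one heterozygous site
print(allelic_ratio(30, 10))
print(binomial_two_sided_p(30, 10))
```

A ratio near 0.5 with a large p-value is what equal expression looks like; a strongly skewed ratio with a small p-value suggests that one allele is being favored.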
They also uncovered race-specific and sex-specific differences. Fully 369 RNA transcripts were significantly different between the sexes. Among the (phenotypically defined) races, SLC44A5 is a sugar and amino acid transport gene that is expressed in all tissue types. It is one of the main drivers of the skin-color differences among the so-called races. African Americans produce RNAs containing equal amounts of both alleles while European Americans produce RNAs with predominantly one allele only. But the derived ‘light-skin’ allele also reduces expression of the gene in cells lining the esophagus. Like many other examples, this gene has a pleiotropic effect: gene variants cause multiple phenotypes in unrelated parts of the body. This is caused by the hierarchical and multiplicative nature of the information within the genome. We are complex creations!
Melissa Wilson wrote a short article titled Searching for sex differences.2 GTEx found that more than ⅓ of all genes show sex-biased expression in at least one tissue. But they also discovered that individual variation produces overlapping effects (in other words, only some males might produce more of one RNA transcript than most females, and vice versa). Thus, the difference between male and female is the result of the sum total of the effects of many different genes. Unsurprisingly, there were differences in hormone expression genes, but also in autoimmune (female) and cancer (male) associated transcripts. In the end, thousands of genes, in all tissue types, were expressed differently between males and females, but the expression difference is small (median ratio = 1.04).
Oliva et al. wrote the main article about sex differences.3 They found 13,294 genes associated with sex differences, across all tissues, but only 369 of these had truly significant differences between the sexes. They found a 10-fold difference from one tissue to the next in the number of differentially transcribed genes (from 473 to 4558, depending on tissue type). They claimed that ⅓ of the transcriptome is differentially expressed in at least one tissue. Only 4% of these were X-linked, but these had greater differences than autosomal genes. Only 18% were different in only one tissue. These are particularly interesting to me and they do not apply only to the obvious tissue differences between males and females. For example, there was actually more difference in expression profiles of skin and arterial tissue than in breast tissue. They also detected genomic regions with clusters of sex-linked genes, such as the pseudoautosomal region 1 on the X chromosome (for females) and the q arm of chromosome 20 (for males).
Kim-Hellmuth et al. examined seven specific cell types within different tissues.4 In any given tissue, different cell types exist, e.g. neurons, myocytes, and/or epithelial cells. They discovered 3,347 coding and lincRNA genes with different expression profiles among the cell types within single tissues, and 987 genes with different splicing patterns. They were, however, stymied by the low power of their statistics (too many variables and too few samples). They suggest that many more of these relationships are yet to be discovered, but larger studies with many more people would be required.
Demanelis et al. examined the relationship between telomere length and RNA transcripts.5 Telomeres are the repetitive DNA that exists at the tips of most chromosomes. They are anchored to the inside of the nuclear membrane when the cell is not dividing and have been associated with longevity (longer telomeres correlate with longer lifespans). They also get shorter with each cell division, conferring upon the cell lineage a certain maximum lifespan. It turns out that relative telomere length varies across tissues and between the sexes. The greatest difference is between blood (short telomeres) and testes (long telomeres). With the exception of the thyroid, telomere length shortens with age in all tissues (the statistics for testes were not reported, but these should lengthen with age). Telomere length also varies among individuals and is longest for people with African ancestry, but age is the single greatest contributor. Thus, telomere length has an inherited component but also depends on telomerase activity.
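The figures quoted in the last two paragraphs are easier to compare once they are put on a common scale. The short Python sketch below uses only numbers already given above (the median ratio of 1.04 and the range of 473 to 4558 genes per tissue); the conversion to a log2 fold change is a standard convention I am adding for illustration, not something reported in this article.

```python
import math

def log2_fold_change(ratio):
    """Express an expression ratio on the log2 scale (0 means no change)."""
    return math.log2(ratio)

median_sex_ratio = 1.04               # median male/female expression ratio
tissue_min, tissue_max = 473, 4558    # sex-biased genes per tissue

percent_difference = (median_sex_ratio - 1) * 100
tissue_fold_range = tissue_max / tissue_min

print(round(percent_difference, 1))
print(round(log2_fold_change(median_sex_ratio), 3))
print(round(tissue_fold_range, 1))
```

So the typical sex difference in expression is only about 4%, while the number of affected genes differs between tissues by a factor of just under ten.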
Some of these studies depend on gene expression for detection. Telomerase is not expressed in differentiated tissue. Age affects genetic expression, so does telomere length, and the two interfere with one another. Sadly, a chronic disease burden was associated with shorter telomere length, even after excluding the effects of cancer.
Ferraro et al. examined rare genetic variations and how they affected transcription.6 These are of particular interest for the creation model because these are more likely to be due to post-creation mutation. Rare variants are ubiquitous in the human genome. Any time you add new people to a genetic database, even a world-wide database with many thousands of individuals, you will be adding new rare variants.7 This is partly due to the high rate of de novo mutations per generation. Part of this is due to the rapid increase in human population size of the past few thousand years. Any individual in an expanding population is more likely to pass on the unique variants they were born with than an individual in a static or, worse, shrinking population. Discovering functional rare variants is difficult because of the low statistical power inherent in genome-wide association studies and a general lack of funds for studying extremely rare factors that affect only a few people. But they discovered that transcriptome-based assays were far superior to genome-based estimates for determining the effects of rare functional variants. Rare variants were found that affect the expression of genes, the expression of genes with specific variant alleles, and the alternative splicing of exons.
Using statistical techniques designed to look for outliers, they determined that each individual carries a median of four gene expression outliers, four allele expression outliers, and five splicing outliers. They also determined that these outliers were usually associated with a rare variant within 10 kb. Strangely, no outliers were detected in genes associated with the detection of chemical stimuli or sensory perception.
Copy number variations had a disproportionate effect, as did variations within splice sites, frameshifts, and inversions. In other words, once copy number was factored out, rare functional variants were highly likely to be found at splice sites or places within the coding region that affected a significant number of amino acid sequences within a protein. They also found rare variants that affected multiple genes. These often dealt with genes in the same region and were associated with nearby duplications or deletions. Rare variants in promoter regions more often led to gene under-expression, although the difference between over- and under-expression varied by promoter class.
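To give a sense of what “looking for outliers” means, here is a generic Python sketch based on z-scores: for one gene, each donor's expression is compared with the average across all donors, and anyone many standard deviations away is flagged. This is not the pipeline used in the paper, which worked with corrected expression data and its own thresholds. The cutoff of 3, the donor names, and the expression values are all hypothetical.

```python
from statistics import mean, stdev

def expression_outliers(expression_by_donor, threshold=3.0):
    """Return {donor: z-score} for donors whose expression of one gene
    lies at least `threshold` standard deviations from the mean.
    The threshold is an assumed value chosen for illustration."""
    values = list(expression_by_donor.values())
    mu, sigma = mean(values), stdev(values)
    if sigma == 0:
        return {}  # no variation at all, so nothing can be an outlier
    z_scores = {d: (x - mu) / sigma for d, x in expression_by_donor.items()}
    return {d: z for d, z in z_scores.items() if abs(z) >= threshold}

# Hypothetical data: 19 donors near 10 units, one donor far above the rest
expression = {f"donor_{i + 1:02d}": 10.0 + 0.1 * (i % 5 - 2) for i in range(19)}
expression["donor_20"] = 14.0

outliers = expression_outliers(expression)
print(sorted(outliers))
```

Once a donor is flagged for a gene, the next step described above is to look for a rare variant near that gene in that donor's genome.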
This was not necessarily a powerful study, because not all variants are expected to affect the transcriptome.
The results of this massive and expensive research effort are well worth studying. Genomic complexity has always argued against Darwinism, which is perhaps why Darwinists have consistently downplayed it. We can also see how their assumptions led to false conclusions (e.g. ‘Once we sequence the genome, we will be able to cure disease and we will understand how the genome works.’). We are only now discovering the true complexity of genomic regulation and these papers give us but a glimpse into that world. God made an amazingly complex and functional machine when He fashioned Adam out of the dust. It is nothing short of amazing that He could take something as lowly as dirt and make it into something as complicated as the human body.
References and notes
1. The GTEx Consortium, The GTEx Consortium atlas of genetic regulatory effects across human tissues, Science 369(6509):1318–1330, 2020.
2. Wilson, M.A., Searching for sex differences, Science 369(6509):1298–1299, 2020.
3. Oliva, M. et al., The impact of sex on gene expression across human tissues, Science 369(6509):1331, 2020.
4. Kim-Hellmuth, S. et al., Cell type-specific genetic regulation of gene expression across human tissues, Science 369(6509):1332, 2020.
5. Demanelis, K. et al., Determinants of telomere length across human tissues, Science 369(6509):1333, 2020.
6. Ferraro, N.M. et al., Transcriptomic signatures across human tissues identify functional rare genetic variation, Science 369(6509):1334, 2020.
7. For example, see Svensson, D. et al., A whole-genome sequenced control population in northern Sweden reveals subregional genetic differences, PLoS One 15(9):e0237721, 2020.