Humans have only a fifth of the genetic material of an onion, and slightly more DNA than a mouse. Why would the Creator of life use five times more genetic information for an onion than for a human? And why would He create humans to be only slightly more genetically complex than mice? Aren’t long and aimless evolutionary processes a better explanation for these observations?
The enigma of C-value
It has long been known that no meaningful correlation can be found between the size of an organism’s genome and its apparent complexity. If genome size tracked complexity, we would expect humans, with 3.2 billion nucleotides in their genome, to stand out from other species on the planet.
In reality, however, protozoa, a group of microscopic organisms with unspecialised cells, can have very large genomes. For instance, Amoeba dubia is reported to contain 670 billion nucleotides, over 200 times more than the human genome. Amoeba proteus has 290 billion nucleotides, about 90 times more than humans.[1] The genome of an amphibian, such as a frog, is often at least twice the size of the human genome, and can be up to ten times larger. Plants also have genomes that vary enormously in size. Paris japonica, a simple but rare flower from Japan, has a genome 50 times larger than that of humans and has been reported as the record holder among plants.
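To make these comparisons concrete, the ratios can be recomputed from the figures quoted above. The following is only a minimal sketch: the names `HUMAN_GENOME_NT`, `GENOME_SIZES_NT`, and `fold_vs_human` are illustrative, and the sizes are the approximate values given in the text, not independent measurements.

```python
# Approximate genome sizes in nucleotides, as quoted in the article.
HUMAN_GENOME_NT = 3.2e9
GENOME_SIZES_NT = {
    "Amoeba dubia": 670e9,
    "Amoeba proteus": 290e9,
}


def fold_vs_human(size_nt, human_nt=HUMAN_GENOME_NT):
    """Return how many times larger a genome is than the human genome."""
    return size_nt / human_nt


# Ratio of each quoted genome size to the human genome size.
ratios = {name: fold_vs_human(size) for name, size in GENOME_SIZES_NT.items()}

for name, ratio in ratios.items():
    print(f"{name}: about {ratio:.0f} times the human genome")
```

With the quoted values, the two amoebae come out at roughly 209 and 91 times the human genome, which matches the rounded figures used in the text.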
These examples highlight what is known in molecular biology as the “C-value enigma”. The “C-value” refers to genome size, and the C-value enigma concerns the difficulty in explaining the significant discrepancies between the genome sizes of different species and their apparent complexity.
Nevertheless, there is some correlation between genome size and species complexity. When organisms are divided into families and categories, it becomes apparent that variations in genome size are generally much smaller within the same group.
Excluding plants[2], amphibians, and a few other categories, the remaining groups of organisms show smaller variations in genome size and can be ordered broadly from simple to complex according to genome size. Interestingly, mammals, the class to which we belong, have a remarkably narrow range of genome sizes. A correlation can therefore be observed, but it is weak and riddled with exceptions. The C-value enigma thus concerns both the weakness of the correlation between genome size and species complexity and the blatant exceptions to it.
The functions of DNA
In a previous article entitled “Non-functional DNA: the playground of evolution?”, we showed how the human genome can be visualised as a string comprising over three billion letters (A, C, G, and T). Only a small proportion of this string comprises genes. Genes are substrings that can vary in length from a few hundred to tens of thousands of letters, seemingly arranged at random throughout the genome without overlapping.[3]
DNA sequences in genes are used as templates to produce proteins through an intermediate molecule called RNA. Proteins are considered the primary functional products of genes and perform most of the essential functions in the cell. This flow of genetic information—from DNA to RNA to protein—is known as the central dogma of molecular biology.
DNA in the genome can be divided into two broad categories, functional and non-functional, based on whether researchers believe certain sequences serve a purpose in the cell or not. DNA sequences that define proteins within genes are called “coding sequences” and, together with sequences that control the expression of these genes, are considered functional (in that the primary function of a cell is to produce proteins). The remaining sequences are called non-functional by exclusion. In complex multicellular organisms, over 95% of the genome is generally considered non-functional. In humans, over 90% of the genome is considered non-functional, and the proportion of non-coding DNA rises to 98%.
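As a rough illustration of what these percentages mean in absolute terms, the sketch below applies them to the 3.2 billion nucleotides of the human genome quoted earlier. The function names are hypothetical, and the percentages are treated as exact values for the sake of the arithmetic, although the text gives them only as bounds (“over 90%”, “over 95%”).

```python
HUMAN_GENOME_NT = 3.2e9  # human genome size quoted in the article


def nucleotides_in_fraction(genome_nt, fraction):
    """Number of nucleotides covered by a given fraction of the genome."""
    return genome_nt * fraction


def fold_over_functional(nonfunctional_fraction):
    """How many times larger the whole genome is than its functional part."""
    if not 0 <= nonfunctional_fraction < 1:
        raise ValueError("fraction must be in the range [0, 1)")
    return 1 / (1 - nonfunctional_fraction)


# 98% non-coding leaves 2% coding; 90% non-functional leaves 10% functional.
coding_nt = nucleotides_in_fraction(HUMAN_GENOME_NT, 0.02)
functional_nt = nucleotides_in_fraction(HUMAN_GENOME_NT, 0.10)

print(f"coding: about {coding_nt / 1e6:.0f} million nucleotides")
print(f"functional (upper bound): about {functional_nt / 1e6:.0f} million nucleotides")
print(f"90% non-functional: genome is {fold_over_functional(0.90):.0f}x its functional part")
print(f"95% non-functional: genome is {fold_over_functional(0.95):.0f}x its functional part")
```

On these assumptions, a genome that is 95% non-functional is about twenty times the size of its functional part, and one that is 90% non-functional is about ten times that size.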
Interestingly, neither the overall size of the genome nor the number of genes it contains correlates well with the complexity of an organism. In fact, the number of genes has almost no correlation with complexity. One might be tempted to believe that if humans have approximately 22,000 genes, then other organisms must have far fewer. However, rice has almost twice as many (41,000), and the roundworm C. elegans, which is commonly used in genetic studies, has almost as many genes as a human: 20,000. Since the functional part of the genome (including genes) is considered so small, the C-value enigma essentially concerns the large areas of the genome that are considered non-functional.
God and the onion test
According to evolutionary theory, the C-value enigma has already been solved: there is no strong correlation between genome size and organism complexity because genomes are largely composed of non-functional DNA, a kind of evolutionary “junk” that has accumulated over hundreds of millions of years of random copying and mutation. This junk DNA can be present in any quantity, regardless of the organism’s complexity. The idea that non-functional DNA exists in such quantities is contested not only by those who argue for the necessity of an intelligent Creator of life, but also by many evolutionary researchers.
To stimulate debate, evolutionary biologist T. Ryan Gregory devised a simple test for anyone who believes they have discovered a universal function for non-functional DNA. The test is this: regardless of the proposed function of non-functional DNA, its proponent must justify why an onion requires five times more of it than a human.[4] There is nothing special about onions; they are just a random example, familiar to everyone, that illustrates the importance of the C-value enigma in the context of the controversies surrounding non-functional DNA. The so-called “onion test” quickly gained exposure and popularity within the evolutionary scientific community, becoming a reference point for theories about the “functions” of non-functional DNA. The onion test is one of the strongest and most frequently invoked genetic arguments in favour of evolution.
The evolutionary perspective
A very important concept that evolutionary theory uses to identify potentially functional DNA sequences is conservation. For example, suppose that a certain DNA sequence is found identically or similarly in both the human and mouse genomes. Evolutionary thinking assumes that this sequence was present in a common ancestor long ago, and that both humans and mice inherited it from this ancestor. If this sequence has been retained in the genomes of humans and mice for so long, despite continuous degradation caused by random mutations, it must serve a useful function. This is called negative selective pressure: a sequence with a useful function resists degradation caused by random mutations over generations because the loss of that function would reduce the organism’s ability to survive and reproduce.
In other words, when analysing two similar DNA sequences from two different species, evolutionary theory interprets them as indicating a common ancestor, a long time since that common ancestor, negative selective pressure, and, most importantly, functionality. Although this way of interpreting similarity may seem cumbersome and counterintuitive, the role of this concept in evolutionary theory and the evolutionary interpretation of the genome cannot be overstated. It is one of the most “sacred” concepts in evolutionary genetics.

An interesting general observation related to non-functional DNA is that it is “conserved” between species to a much lesser extent than DNA that encodes proteins and regulates gene expression. This provides another significant argument in favour of the theory of evolution, alongside the onion test: if “conserved” means functional, then non-functional DNA is rightly so named because it is largely unconserved—it represents a remnant of evolution across inactive areas of the genome.
Recently, the journal Nature published a study by a team of 29 international researchers on Utricularia gibba, a carnivorous plant similar to orchids.[5] Remarkably, this plant has an extremely small amount of non-functional DNA in its genome. This discovery was so unexpected that it was immediately published in one of the most prestigious scientific journals. Although it contains more genes than the human genome, the U. gibba genome is over 40 times smaller than the human genome, and it is estimated that only 3% of its DNA is non-functional.
Furthermore, comparative analysis of the DNA of U. gibba with that of other related plants, including tomatoes, orchids, and snapdragons, shows that U. gibba has somehow managed to remove important sections of its genome without losing its viability. In other words, this is a plant that survives despite lacking much of the non-functional genome present in its relatives. While the list is not exhaustive, these are some of the most significant arguments in support of the theory of evolution concerning the non-functionality of much of an organism’s genome.
A paradigm problem
The relationship between similarity and conservation. As we have seen, evolutionary theory ascribes great significance to similarity between two DNA sequences, viewing it as evidence of descent from a common ancestor over a long period of time under negative selective pressure. Clearly, however, two sequences can be similar simply because a Creator used similar components to perform similar functions in different organisms. Which explanation is more plausible in terms of simplicity and everyday experience?
The computer science perspective. When we talk about the genome, we are talking about information in an active, executable form. A legitimate analogy can therefore be made with computer science, a field in which our knowledge is much more extensive. Firstly, when observing the functioning of a computer program, it becomes apparent that less than 10% of the code typically accounts for over 90% of the program’s running time. This rule of thumb has held for a long time, but even if 90% of the code is rarely active, this does not mean that it is unimportant or non-functional. In fact, it can be critical to the integrity of the process. Counterparts of specific structures in non-functional DNA, such as repetitive sequences and transposons, can also be found in the executable code of a computer program, where they have a well-defined functional role. A program’s code may also contain rarely used sections (e.g. for compatibility with old or rarely used systems) but, again, this does not render them non-functional. Executable code may appear to be a random collection of letters, and an analysis of its operation may reveal large sections that are not “used” at all. This situation is not at all unusual for a computer program, which is in fact as functional and non-random as can be. Yet the same type of behaviour, when observed in the genome of organisms, is interpreted radically differently.
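The point about rarely executed code can be illustrated with a toy program. This is a hypothetical sketch, not a model of any real software or of the genome: the function names and the "n=42" legacy format are invented for the example.

```python
from collections import Counter

# Counts how often each function is entered during a run.
calls = Counter()


def parse_record(line):
    """Hot path: attempted for every input line."""
    calls["parse_record"] += 1
    return int(line)  # raises ValueError for anything but a plain integer


def recover_legacy_record(line):
    """Cold path: only reached for a rare legacy format such as 'n=42'."""
    calls["recover_legacy_record"] += 1
    return int(line.split("=", 1)[1])


def load(lines):
    """Parse every line, falling back to the legacy handler when needed."""
    values = []
    for line in lines:
        try:
            values.append(parse_record(line))
        except ValueError:
            values.append(recover_legacy_record(line))
    return values


# 999 ordinary records followed by a single legacy record.
lines = [str(i) for i in range(999)] + ["n=42"]
values = load(lines)

print(calls["parse_record"], calls["recover_legacy_record"], values[-1])
```

In this run the fallback handler is executed once in a thousand records, yet deleting it would make the whole load fail on the one legacy line. An execution count alone would therefore say little about whether that code is needed.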
Limited knowledge. Contemporary molecular biology suggests that the developmental and organisational processes of multicellular organisms are orchestrated by a highly complex network of proteins and interactions that challenge full comprehension due to their intricacy. In this model, cascades of reactions and mutual conditioning between local protein concentrations give rise to global intercellular organisation and coordination. A computer scientist would argue that this approach attempts to understand the purpose and functioning of complex software, such as an operating system, by statistically analysing the patterns of electrical signals transmitted between the computer’s various electronic components. While it is true that computers work with electrical signals and that a faithful model of an operating system could theoretically be built from this approach, it is not practically feasible, intelligible, or useful. There are levels of abstraction that are useful for humans to understand complex things, such as how a computer works: the level of basic physical principles, the level of basic electronic components, the level of large components, the level of system architecture, and so on, up to the level of software, where we finally find the behaviour and function of the system. Is the attempt to describe the functionality of an organism in terms of molecular interactions between proteins produced by cells not ultimately as absurd as describing a Windows operating system in terms of electrical impulses through transistors?
The Creator’s “whims”. We would expect a Creator God to act in a way that generally pursues a goal with maximum efficiency. However, He also has the privilege of creative “whims” that are not subject to the imperative of efficiency or functionality, and it is wonderful that this is so. Is there any reason to doubt that this privilege of the Creator manifests itself at all levels, including the genetic level? Therefore, from the perspective of a Creator God, although we would expect to find as much functionality as possible in organisms’ genomes, it would be unreasonable to claim that we can find and understand complete functionality. God has His prerogatives.
The RNA universe
In the aforementioned article, we discussed the ENCODE project,[6] a substantial research endeavour focusing on non-functional human DNA. Much to the surprise of many biologists, the project revealed frenetic biochemical activity in virtually the entire previously considered non-functional genome. Of course, intense activity does not necessarily equate to useful functionality, just as ocean waves, although very active, mean nothing. However, the ENCODE researchers achieved more than merely documenting the activity of non-functional human DNA. They found clear signs that this activity is far from random and that much of it may be linked to well-established cellular processes. The idea that non-functional DNA may ultimately be functional has irritated the evolutionary scientific community, which has responded with numerous critical articles. However, one of the evolutionary biologists to defend the ENCODE results is John Mattick, a respected scientist who specialises in non-coding RNA.
As mentioned, according to the central dogma of molecular biology, the RNA molecule is the intermediate molecule between DNA and protein. However, DNA is transcribed into RNA within the cell in many more scenarios than just protein production. Basically, ENCODE found that almost the entire non-functional human genome is covered by regions that, under various conditions, are transcribed into RNA.
Mattick first notes that the number and types of genes that encode proteins are relatively constant for most animals. C. elegans, a worm with only 1,000 cells, has almost the same number of genes as humans. Most amazing of all, however, is that the two species have a high degree of similarity in the genes that produce proteins, including almost all of the key proteins that regulate embryonic development and growth. Furthermore, most living organisms share a large number of genes, while unique genes are extremely rare, with current estimates putting their number at around 10,000. Therefore, at least for animals, the set of basic components, i.e. genes, is largely common.
However, it is important to note that the number of genes that encode proteins (functional DNA in general) does not correlate with the complexity of an organism. As we have seen, the only part of the genome whose size correlates somewhat with the complexity of organisms, albeit weakly and with numerous exceptions, is the part considered non-functional. This part of the genome is not translated into proteins, but is nevertheless intensely transcribed into RNA molecules. The production of this non-coding RNA is differentiated with precision according to environmental factors, tissue specificity, or the developmental stage of the organism. This activity could represent either random background noise from a non-functional genome or real functionality. Nevertheless, all experiments to date reveal that the production of non-coding RNA is highly precise in terms of the location of cells in the body, cell type, and the organism’s stage of development. Furthermore, the zoning and transport of these molecules within the cell are also extremely precise. These characteristics provide important evidence that this is not background noise.
Mattick wonders whether current theory fundamentally misunderstands the structure of genetic programming in multicellular organisms due to the overly simplistic model by which gene expression is thought to be regulated. There is clear evidence of a vast, hidden level of gene control networks based on non-coding RNA that are involved in directing embryonic development and growth, and this RNA derives from DNA that is considered non-functional. Moreover, almost all human non-functional DNA is transcribed by cells into RNA molecules. It is already known that a widespread class of non-coding RNA called microRNA is deeply involved in directing multicellular developmental processes in both plants and animals. Additionally, the complex roles of long and very long non-coding RNA sequences are only just being discovered.
Evolution or creation?
We are now in a position to formulate answers to, and criticisms of, the evolutionary perspective on non-functional DNA and the C-value enigma.
From an evolutionary theory perspective, the weak correlation between genome size and organism complexity (known as the “C-value enigma”) is caused by vast and arbitrary quantities of non-functional DNA in the genome. However, if this were true, the size of the functional part of the genome would correlate well with organism complexity, and this is not what we observe. The number of genes an organism has bears no relation to its complexity, and many of these genes are found in a wide variety of other organisms. For example, most mammals have roughly the same number of genes, whereas amphibians generally have many more. If a fly has 15,000 genes, how can we account for the difference in complexity compared to humans, who have only 22,000? How can a fertilised egg develop into an adult human through a process guided by only 22,000 genes, when over 70% of these are identical or very similar in sequence to a mouse’s homologous genes? Our view of what is functional and what is non-functional in the genome seems profoundly wrong.
Even the much more pronounced conservation of genes compared to non-functional DNA does not require an evolutionary explanation if we view genes as components of a mechanism. From the same set of parts, you can build radically different mechanisms. These mechanisms will be very similar in terms of the parts used, but very different in terms of how they are combined.
Another problem for evolutionary theory is the dynamics of non-functional DNA. The theory is that it can accumulate freely because it has no effect on the organism or, rather, does not harm it. However, isn’t all the energy spent on replication, maintenance, and the multitude of reactions involving a genome of which 95% is supposedly unnecessary a major handicap in itself? And if we insist that this is not a disadvantage and that non-functional DNA is indeed the product of millions of years of copying errors, mutations, and selfish DNA sequences, what limits the infinite accumulation of such junk? Why do some organisms, such as U. gibba, seem to have accumulated almost no “junk” in their DNA?
However, if non-functional DNA is not actually non-functional, what can be said about U. gibba, the plant that survives very well with almost no non-functional DNA? Firstly, we should note that it is a very simple plant, so a small genome is not surprising. We might also ask ourselves: if we removed the areas of the wild cabbage genome that are “active” only for kale, broccoli, and cauliflower, would the resulting plant still be wild cabbage, perfectly viable but with a smaller genome? Perhaps this remarkable U. gibba gives us a clue as to the extent to which such a “slimming” process can be taken, but it tells us nothing about how functional or non-functional the removed parts are.
So why does an onion have five times more genetic material than a human? The safest answer, given our current state of knowledge, is that we don’t know yet. The theory of evolution does not provide a satisfactory answer to this question. However, from the perspective of an intelligent creator of life, there are some intriguing clues: what variations in form and taste can we achieve through artificial selection, patience, and other means, starting from the humble onion? After all, we have obtained so much from wild cabbage, which has a much smaller genome.
Conclusions
In conclusion, comparing the complexity of organisms is not a simple matter that can be reduced to counting letters or genes in the genome, from either an evolutionary or a creationist perspective. The size of an organism’s genome is not an indicator of its apparent complexity, nor is a large genome an indicator of non-functionality. As far as our studies of the human genome and a few other model organisms have revealed, the genome is a bustling place. However, given that very large genomes exist, it is reasonable not to rule out categorically the possibility that significant portions of DNA have no functional role. God has His prerogatives as Creator; His actions do not always have to be clear to us. We must also remember that we do not live in the world He originally created. While God’s precise design choices may be unclear to us, based on the evidence, it is clear that we are talking about design, not chance, in the genetic makeup of organisms.