Comparison of morphology-based and genomics-based baraminology methods
New genomics-based statistical approaches have advanced baraminology research, and a large amount of genomic data suitable for baraminology studies is now available in public databases. This paper discusses the strengths and weaknesses of both morphology-based and genomics-based methods. It is hoped that the two types of methods will complement one another in future baraminology research. With more than one line of evidence, baramin membership can be determined with more confidence. Using both also allows us to classify a greater number of species: if one type of data (e.g. morphological) is unavailable, another type (e.g. genomic) may still be available for analysis.
Baraminology is the study of created kinds, as presented in the book of Genesis. Genesis 1, verses 11, 12, 21, 24, and 25 describe how organisms, plants, and animals were created to multiply according to their kind. Baraminology aims to group species into created kinds, which were each created separately from one another during Creation Week. Created kinds are also known as baramins, which comes from the Hebrew words for ‘create’ and ‘kind’. Species within one kind may be capable of breeding with one another, but between created kinds no interbreeding is possible. A holobaramin is equal to the complete species membership of a created kind.
Methodological baraminology has been in existence for several decades, but it has recently reached a new stage of development. Until now, the morphology-based baraminic distance (BDIST) method has been widely used, and it has been generally successful in the prediction of holobaramins.1,2 Recently, a new genomics-based algorithm has been developed to predict holobaramins, which may be used to complement the existing morphology-based algorithm.3,4 Thus, multiple lines of evidence can now be used to help determine baramin membership. In this paper, we assess the strengths and weaknesses of these two methods in defining baramin membership. Using one or the other, according to data availability, also helps increase the number of species that can be subjected to baraminology studies. Table 1 shows a summary of the advantages and disadvantages of both kinds of algorithms.
Morphology-based methodology: BDIST
The BDIST method has been widely used in the creation science community over the last 15 years. BDIST is a quantitative method for comparing both living and fossil specimens based on phenetics (i.e. observable traits). Taking a character data set as input, the software calculates the pairwise baraminic distance and correlation for the species under examination and outputs these values as a matrix. It then plots how these creatures relate to one another as a baraminic distance correlation matrix. Ideally, species within a given baramin are highly similar to one another (higher correlation, lower distance) and dissimilar to species from another baramin (lower correlation, higher distance). Species relationships can also be depicted using three-dimensional coordinates.
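The calculation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the BDIST software itself: it assumes that baraminic distance is the fraction of mismatched character states among characters scored in both taxa, and that the correlation between two taxa is the Pearson correlation of their rows in the distance matrix. The taxon names and character states are hypothetical.

```python
from math import sqrt

def baraminic_distance(a, b):
    """Fraction of characters scored in both taxa whose states differ (assumed definition)."""
    scored = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if not scored:
        raise ValueError("no characters are scored in both taxa")
    return sum(x != y for x, y in scored) / len(scored)

def pearson(u, v):
    """Pearson correlation coefficient of two equal-length numeric lists."""
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    cov = sum((x - mean_u) * (y - mean_v) for x, y in zip(u, v))
    norm_u = sqrt(sum((x - mean_u) ** 2 for x in u))
    norm_v = sqrt(sum((y - mean_v) ** 2 for y in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("correlation is undefined for a constant vector")
    return cov / (norm_u * norm_v)

def distance_matrix(characters):
    """Pairwise distances for a dict mapping taxon name -> list of character states."""
    taxa = list(characters)
    return {s: {t: baraminic_distance(characters[s], characters[t]) for t in taxa}
            for s in taxa}

def distance_correlation(dist, s, t):
    """Correlate the distance profiles (matrix rows) of taxa s and t."""
    taxa = list(dist)
    return pearson([dist[s][k] for k in taxa], [dist[t][k] for k in taxa])

# Hypothetical character matrix; None marks an unknown character state.
characters = {
    "A": [0, 0, 1, 1, 0, 1],
    "B": [0, 0, 1, 1, 1, 1],
    "C": [1, 1, 0, 0, 1, 0],
    "D": [1, 1, 0, None, 1, 0],
}
dist = distance_matrix(characters)
```

In this toy matrix, taxa A and B have a small distance and strongly correlated distance profiles, while A and C have a large distance and negatively correlated profiles, which is the pattern expected within and between baramins, respectively.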
The method has generally been useful, but it has received its fair share of criticisms.5 In the creation science community, we are aware of the differences between those who would lump species together into a single holobaramin, thereby reducing the number of holobaramins at Creation Week, versus those who split up holobaramins, thereby increasing their number.6
BDIST tends to lump species into fewer baramins than there actually seem to be. This was seen in a study of cephalopods, where the BDIST method predicted only three baramins out of 104 species, whereas mitochondrial genome alignments predicted up to seven.7 A recent analysis of dinosaur species using BDIST brought down the number of predicted dinosaur kinds from 50 to just eight.5 Furthermore, the BDIST method incorrectly classified both Homo habilis and Australopithecus sediba as members of the human holobaramin,8 despite the fact that these two species are merely commixtures of human and apelike extinct primate bones.9 The method also misclassifies small-sized humans into a holobaramin different from the human holobaramin.10 Therefore, a caveat should be added to the method: it should be used only on healthy, adult members of a given species. Juveniles, deformed individuals, and skeletal remains that represent assemblages of several individuals should be avoided.
One of the strengths of multivariate analytical approaches such as BDIST is that they allow us to compare and quantify hundreds of phenetic measurements. But the statistical output depends heavily on the input data. For example, in a recent paper using a multivariate approach, Doran et al. concluded that “some Jurassic and Cretaceous avians grouped with dinosaurs”, grouping Archaeopteryx and Wellnhoferia within Deinonychosauria.11 In their ‘Feathered dinosaurs reconsidered’ article, McLain et al.12 likewise used a multivariate analysis method and concluded that there are “multiple holobaramins of feathered dinosaurs” and that the “old dichotomy of bird versus dinosaur is unhelpful and incorrect”. They go on to say: “Birds could rightly be viewed as a specialized type of dinosaur without implying birds evolved from dinosaurs.”12 The first part of the McLain et al. paper surveyed the literature on feathered dinosaurs and concluded that many dinosaurs did indeed have feathers. The second part involved an extended multivariate analysis. The conclusion was that “many species of dinosaurs were indisputably feathered. The available fossils have moved us permanently beyond questioning whether some dinosaurs were feathered and onward to interpreting the implications of feathered dinosaurs.”12
While space does not allow for a comprehensive rebuttal of every example listed in McLain et al.’s 42-page paper, we are convinced that most of the examples listed in the first half of their paper as ‘feathered dinosaurs’ are either cases where dermal collagen was misidentified as true feathers or cases of true feathered birds that have been misclassified as dinosaurs.13,14 Most of the examples presented as true ‘feathered dinosaurs’ have already been adequately refuted in the scientific literature.15-17 The error of conflating dermal collagen with true feathers is especially significant, since the multivariate analysis in the second half of the paper builds on the first part. Since the first part of the paper mistakenly conflates bird feathers with dinosaur dermal collagen, it is not surprising that the second half ends up treating birds as a type of dinosaur. Wrong data in, wrong data out.
As stated earlier, one of the limitations of BDIST is that it tends to lump too many unrelated species into a single holobaramin. Part of the reason why this is the case has to do with the method itself. In the article ‘How to think (not what to think)’, Carter illustrates a helpful paradigm for interpreting evidence.18 Using a Venn diagram with two overlapping circles (see figure 1 in Carter18), he illustrates how evidence often falls in what he calls Zone II. This area contains evidence that is consistent with two contrary views. Evidence that falls in Zone II cannot be used as evidence for or against two contrary positions.
How does this paradigm apply to our use of BDIST and other multivariate methods? BDIST indiscriminately evaluates a wide range of phenetic measurements. In this sense, BDIST is, methodologically speaking, a form of ‘hyper-phenetics’. Applied to the study of baraminology, Zone I and Zone III represent diagnostic traits that are unique to each of two different creatures (i.e. autapomorphies), and Zone II refers to traits that are shared by both creatures but are not unique to either one. Traits that are unique to specific holobaramins are necessarily going to be rarer than traits that are non-diagnostic and shared between organisms. In other words, if we compare a whole series of traits between two organisms without regard to whether those traits are unique, we are likely to end up with a situation where two different organisms are grouped close together on a continuum. The more shared characteristics we include in the analysis, the closer the two creatures will cluster. In this sense, the more shared traits (from Zone II) we include in the analysis, the less accurate the analysis becomes, resulting in the tendency to over-lump different organisms into one single holobaramin. But if current BDIST approaches almost always include Zone II characteristics in their analysis, how reliable are they? As we have already mentioned, when compared with the more precise method of genomics-based algorithms, BDIST has consistently failed because it has a tendency to over-cluster different holobaramins as one.
Let us take for example the question of how dinosaurs and birds relate to one another. If in the Venn diagram in figure 1, we place dinosaurs in the left circle, and birds in the right circle, Zone I would represent traits that are unique to dinosaurs, while Zone III would represent traits unique to birds. Zone II would represent non-unique traits that are shared by both organisms.
If we compare birds and dinosaurs, as the number of shared traits is increased, birds and dinosaurs cluster closer and closer together. This problem can be avoided if we deliberately exclude Zone II characteristics from our analysis, and only compare Zone I and Zone III characteristics. So, for example (in figure 2), we list several unique characteristics of either dinosaurs or birds (Zone I and Zone III).17
By limiting BDIST to only unique Zone I and Zone III characteristics, we can avoid artificially conflating two potentially different holobaramins as one. And if in the scenario above, birds are indeed dinosaurs, excluding Zone II characteristics would actually allow us to identify this even more readily than if we include Zone II characteristics in our analysis. For example, if birds are indeed dinosaurs, we should be able to find consistent examples where a creature exhibits a mixture of both dinosaur-only and bird-only traits. So, we should be able to find examples of a bird that has flight feathers (Zone III) as well as a completely perforated acetabulum (Zone I). The completely perforated acetabulum is a diagnostic trait unique to dinosaurs, and flight feathers are, as far as we can tell, unique to birds. While some have argued otherwise, the evidence does not support the conclusion that there are dinosaurs with pennaceous feathers. Future BDIST analyses can be improved by deliberately identifying Zone II characteristics and by excluding these characteristics from our study. In other words, the exclusion of Zone II characteristics actually allows us to both identify instances where traditional BDIST has over-clustered creatures together and identify instances where we have artificially separated one holobaramin into two different clusters.
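The Zone I/II/III partition can be illustrated with simple set operations. The sketch below assumes each taxon is represented as a set of observed traits. The taxon names and the two shared traits are hypothetical placeholders; only the perforated acetabulum and flight feathers are taken from the discussion above.

```python
def partition_traits(group_one, group_two):
    """Split traits into Zone I (only group one), Zone II (shared), Zone III (only group two)."""
    traits_one = set().union(*group_one.values())
    traits_two = set().union(*group_two.values())
    return {
        "I": traits_one - traits_two,
        "II": traits_one & traits_two,
        "III": traits_two - traits_one,
    }

def zone_profile(specimen_traits, zones):
    """Count how many of a specimen's traits fall into each zone."""
    return {zone: len(set(specimen_traits) & traits) for zone, traits in zones.items()}

# Hypothetical taxa; "shared trait 1/2" stand in for any non-diagnostic traits.
dinosaurs = {
    "dinosaur_1": {"perforated acetabulum", "shared trait 1"},
    "dinosaur_2": {"perforated acetabulum", "shared trait 2"},
}
birds = {
    "bird_1": {"flight feathers", "shared trait 1"},
    "bird_2": {"flight feathers", "shared trait 1", "shared trait 2"},
}
zones = partition_traits(dinosaurs, birds)

# Only Zone I and Zone III traits would be kept as diagnostic characters.
diagnostic = zones["I"] | zones["III"]
```

A specimen whose `zone_profile` shows non-zero counts in both Zone I and Zone III would be a candidate for closer inspection, since it mixes traits that are otherwise unique to each group.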
For example, a new character matrix filtering method, which removes low-entropy characters, was used to filter out Zone II characters from the data matrices used in the baraminology analysis of cephalopods,7 using data from studies by Lindgren et al.19 and Sutton et al.20 A low entropy value for a character means that its values are nearly uniform across all species in the data set. After re-running the BDIST method on the filtered data set, and selecting highly correlated species pairs with a 95% bootstrap value, the Decapod group was split up into three groups: Oegopsina, Myopsina, and Sepiida+Sepiolida+Spirulida (see figure 3).
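An entropy-based character filter of this kind might look like the following sketch. It assumes Shannon entropy (in bits) is computed per character over the known states, and that characters at or below a cutoff are dropped. The cutoff of 0.5 and the character matrix are hypothetical, since the text does not state the values used in the cephalopod study.

```python
from collections import Counter
from math import log2

def shannon_entropy(states):
    """Shannon entropy (bits) of one character's states, ignoring unknown (None) states."""
    known = [s for s in states if s is not None]
    if not known:
        return 0.0
    n = len(known)
    return sum((c / n) * log2(n / c) for c in Counter(known).values())

def filter_low_entropy(characters, threshold):
    """Keep only characters (columns) whose entropy exceeds the threshold.

    characters: dict mapping taxon name -> list of character states.
    Returns the kept column indices and the filtered matrix.
    """
    taxa = list(characters)
    n_chars = len(characters[taxa[0]])
    keep = [j for j in range(n_chars)
            if shannon_entropy([characters[t][j] for t in taxa]) > threshold]
    return keep, {t: [characters[t][j] for j in keep] for t in taxa}

# Hypothetical matrix: character 0 is identical in all taxa (entropy 0).
characters = {
    "sp1": [1, 0, 1, 0],
    "sp2": [1, 0, 1, 1],
    "sp3": [1, 1, 1, 2],
    "sp4": [1, 1, 0, None],
}
kept, filtered = filter_low_entropy(characters, threshold=0.5)
```

Here the uniform character 0 is removed and the three variable characters are retained; the filtered matrix can then be passed to a distance calculation as before.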
The precision of BDIST also suffers if there are only a few species available for comparison, or if only an incomplete data set is available.21 For example, the method has been used on many data sets for which we only have craniodental characters, but this does not provide a holistic view of the baraminic relationships between species based on their entire morphology.
In a first analysis of hominin craniodental characters, Wood concluded that A. sediba was part of the human holobaramin.8 His position changed somewhat after using the BDIST method to analyze not just craniodental characters, but also post-cranial characters,22 as well as hand characters.23 Wood found that with added post-cranial characters (present for six species), significant discontinuity was only demonstrated between Australopithecus afarensis and Homo sapiens/erectus.24 While A. sediba no longer showed any significant continuity with the human holobaramin, it also did not show discontinuity with it. These are exactly the results which one would expect, since A. sediba has been classified as a basket taxon,9 containing characters of both human and apelike extinct primates, causing it to cluster away from both holobaramins.
As Wood correctly noted, “samples of very few taxa are not likely to exhibit significant correlation even if clusters are present”.24 Wood had to reduce the number of species in the analysis which included post-cranial data. He noted that, because post-cranial characters are available for only a few hominins, “This trade-off between character sample size and taxon sample size may inhibit rather than enhance the detection of taxon clusters.” The ideal case would be to have craniodental and post-cranial characters for all desired species. We may have to accept that such characters may never become available unless they are found in fossils by further paleontological excavations. Here data quality and availability are key issues to be balanced.
Furthermore, the BDIST method also has the potential of misclassifying many species into the same baramin, just because they resemble each other phenotypically. In contrast with the more accurate approach of using genomics-based algorithms, BDIST does not account for homoplasy. By eliminating Zone II traits, we also eliminate homoplastic traits, allowing the method to be more accurate. However, in an analysis of cephalopod species by O’Micks, the method does correctly classify Argonauta nodosa as an octopod rather than a species of the nautiloid kind.7
Baraminology researchers using this method should be wary of the underlying assumptions that evolutionists make when assembling their data sets (e.g. when H. habilis and A. sediba were misclassified as real taxa). To return to the illustration used earlier: had creationists excluded Zone II traits from the analysis and performed BDIST only on traits unique to humans and to australopithecines, it would have been apparent to the researcher that H. habilis, which contains unique traits of both, is a basket taxon, since the Bible makes it plain that humans belong to their own holobaramin, separate from all animal species. This means that humans were created separately from the apes. Thus, a mix of Zone I and Zone III traits in H. habilis would indicate that it is a basket taxon and not a real individual. Furthermore, when selecting characters for analysis, only non-fragmentary and unambiguous characters should be considered. In other words, characters should be chosen only if they are clear diagnostic traits that are unique to a holobaramin.
The genomics revolution has caused quite a bit of controversy, invalidating and qualifying multiple older morphometric studies. Yet, genomics-based algorithms should always be given preference over morphology-based algorithms, for several reasons. First, the genotype determines the phenotype, meaning that genetic factors are ultimately responsible for determining the morphology of a given species. Second, morphologically similar organisms may be genetically different, and vice versa: genetically similar organisms may be morphologically different. For example, the kingsnake and the coral snake look very similar externally, but are quite dissimilar internally. As a further illustration, if morphology-based techniques alone had been used in the analysis of cephalopods, the octopod species Argonauta nodosa could have been classified as a nautiloid on (rough) morphological grounds, yet genetically it is an octopod (and was correctly classified as such by genetics-based algorithms).7 Third, with the break-up of the archebaranome (that is, the genome of the archebaramin which was created during Creation Week), new species arise via vertical descent. In other words, the genomes of all species within a created kind can be derived from the original archebaranome. Because these methods study the genome, any kind of organism may be studied, irrespective of its morphology. Even individuals with hardly any morphological remains can be included (e.g. Denisovans).
Early on, a number of mitochondrial studies were used to aid in morphology-based baraminology studies, for example to help determine the number of turtle baramins 25 or to measure the diversity of the cat, dog, and horse baramins.26 Mitochondrial comparison is useful because gene order is the same across a great number of species, and gene paralogy and diploidy do not complicate the picture, as in the case of the nuclear genome. Furthermore, it is much easier to sequence the mitochondrial genome, and it is usually available for species for which the nuclear genome is not yet available.
Genomics-based methods, such as the Gene Content Method (GCM), could potentially harness the vast quantities of genomic data in public databases, such as NCBI, the UCSC Genome Browser, UniProt, and others. For example, with the latest technologies, bacterial genomes can be sequenced in a matter of hours. Whole genome sequences had been produced for an estimated 50 bacterial and 11 archaeal phyla, amounting to more than 14,000 total species, by 2014.27 Since it deals with gene content, not the specific nucleotide sequence, GCM depends on the availability of annotated genomic data. Even if the whole proteome of a species is unknown, dozens of gene/protein prediction algorithms exist, such as Augustus, GeneMark, and others, which can predict its protein sequences. Furthermore, databases such as the Pathosystems Resource Integration Center (PATRIC) already contain data for orthologous gene content in dozens of bacterial species.28 In addition, databases such as MetaRef contain data on the core- and pan-genomes of numerous bacterial species.29
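To make the idea of comparing gene content concrete, the following sketch scores pairs of species by the overlap of their annotated gene sets. It is an assumption-laden illustration, not the published GCM: the Jaccard index is used here as a stand-in similarity measure, and the species names and orthologous-group identifiers (OG1, OG2, ...) are hypothetical.

```python
def jaccard(genes_a, genes_b):
    """Jaccard index: shared genes divided by all genes found in either species."""
    a, b = set(genes_a), set(genes_b)
    union = a | b
    if not union:
        raise ValueError("both gene sets are empty")
    return len(a & b) / len(union)

def gene_content_matrix(gene_sets):
    """Pairwise gene content similarity for a dict mapping species -> set of gene IDs."""
    species = list(gene_sets)
    return {s: {t: jaccard(gene_sets[s], gene_sets[t]) for t in species}
            for s in species}

# Hypothetical orthologous-group content for three species.
gene_sets = {
    "species_1": {"OG1", "OG2", "OG3", "OG4", "OG5"},
    "species_2": {"OG1", "OG2", "OG3", "OG4", "OG6"},
    "species_3": {"OG1", "OG7", "OG8"},
}
similarity = gene_content_matrix(gene_sets)
```

In this toy example, species_1 and species_2 share four of six groups, while species_3 shares only one group with species_1. A similarity matrix of this kind could then be clustered to propose candidate groupings, provided the gene annotations are complete enough to be comparable.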
It would also be highly interesting to analyze the genome of archebaramins. The whole genome sequences of Neanderthal and Denisovan have already been determined.30 Such analyses could shed light on intrabaraminic relationships and could possibly resolve certain issues regarding the baraminic status of certain species, such as the recently discovered Homo naledi, which is held in creationist circles to be either an ape, human, or a mixture of the two.9,31,32 Fossils may be interpreted in many different ways, but genome sequences are less subjective and more easily quantifiable.
Despite the apparent utility of genomics-based algorithms, they do suffer from some drawbacks. First, the mitochondrial genome represents only a tiny fraction of the entire human genome (about 16.6 kb out of roughly 3 billion base pairs, well under 0.01%). Thus, conclusions about the human genome as a whole cannot be drawn by analyzing mtDNA alone. Second, genomics-based methods are sensitive to the type of data used, whether it be an incomplete proteome or a genome sequence with low-coverage (i.e. lower-quality) data. A related problem is where to draw the boundary of protein sequence homology. This boundary is needed to determine protein homologues between two species, which serve as input for the Gene Content Similarity Method, a recently developed genomics-based baraminology method.4 The twilight zone is a protein sequence similarity limit above which common functionality can be inferred between two protein sequences. This is because sequence determines structure, which in turn determines function.33 Finally, whether or not non-genic elements such as pseudogenes or non-coding RNAs should be used is still an open question. These methods are also sensitive to the number and kinds of species selected for study. For example, if species are selected from a wide range of taxonomic categories (such as species from different phyla), the algorithm will discover clusters of species, no matter what. Therefore, it is advisable to select many species for study which appear to belong to the same lower taxonomic category (family or order).
Summary and outlook
In conclusion, morphology-based methods have their drawbacks, so it may be time to rethink and redevelop such algorithms. However, they are still useful. Overall, the BDIST method uses an intuitive concept and a descriptive mode of visualization, as does the genomics-based GCM method. BDIST is complementary to genomics-based algorithms, and while it is useful under the right circumstances, it should only be relied upon if genomic data is not available. Moving forward, special care must be given to separating possibly ambiguous (Zone II) characters from diagnostic (Zone I and III) characters in future baraminology studies. This should be done as part of our creationist presuppositions, in order to avoid over-lumping species into a small number of clusters with large species membership as opposed to many clusters with small membership.
With the vast amount of genomic data that is currently available, genomics-based baraminology methods appear to be a useful and widely applicable tool. Much of the available genomic data has yet to be tapped for use in baraminology. As an idea for future genomics-based baraminology methods, it would also be useful to measure similarity by the correlation between the k-mer/motif content, instead of the protein content, of the genomes of different species. Such a method is currently being developed.
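A k-mer content comparison could be prototyped along the following lines. This is a minimal sketch of the general idea only and does not represent the method under development: it assumes similarity is the Pearson correlation between k-mer count vectors over the DNA alphabet, and the short sequences and k = 2 are hypothetical toy inputs.

```python
from collections import Counter
from itertools import product
from math import sqrt

def kmer_vector(sequence, k):
    """Counts of every possible DNA k-mer, in a fixed order; k-mers with other symbols are skipped."""
    seq = sequence.upper()
    windows = (seq[i:i + k] for i in range(len(seq) - k + 1))
    counts = Counter(w for w in windows if set(w) <= set("ACGT"))
    return [counts["".join(p)] for p in product("ACGT", repeat=k)]

def pearson(u, v):
    """Pearson correlation coefficient of two equal-length numeric lists."""
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    cov = sum((x - mean_u) * (y - mean_v) for x, y in zip(u, v))
    norm_u = sqrt(sum((x - mean_u) ** 2 for x in u))
    norm_v = sqrt(sum((y - mean_v) ** 2 for y in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("correlation is undefined for a constant vector")
    return cov / (norm_u * norm_v)

def kmer_similarity(seq_a, seq_b, k=2):
    """Correlation between the k-mer count vectors of two sequences."""
    return pearson(kmer_vector(seq_a, k), kmer_vector(seq_b, k))

# Hypothetical toy sequences; real genomes would need larger k and much longer input.
seq_a = "ACGTACGTACGT"
seq_b = "ACGTACGTACGA"
seq_c = "GGGGCCCCGGGG"
```

With these inputs, `kmer_similarity(seq_a, seq_b)` is high because the two sequences differ by a single base, whereas `kmer_similarity(seq_a, seq_c)` is negative because the two sequences share almost no k-mers.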
In sum, current baraminology algorithms and methods of any kind are finite and imperfect, reminding us that God’s perfect truth never changes. We must strive to improve our methods so as to be able to “think God’s thoughts after Him” (Johannes Kepler, 1571–1630).
Materials and Methods
Figure 3 was created using Cytoscape version 3.6.1.
References and notes
1. Robinson, D.A. and Cavanaugh, D.P., A Quantitative approach to baraminology with examples from the catarrhine primates, CRSQ 34(4):196–208, 1998.
2. Wood, T.C. and Murray, M.J., Understanding the Pattern of Life: Origins and organization of the species, Broadman & Holman, Nashville, TN, 2003.
3. Yaugh, A., Baraminological analysis of a set of archaea species based on genomic data, CRSQ 53(2):140–154, 2017.
4. O’Micks, J., Baraminology classification based on gene content similarity measurement, CRSQ 54(1):27–37, 2017.
5. Senter, P., Using creation science to demonstrate evolution 2: morphological continuity within Dinosauria, J. Evol. Biol. 24:2197–2216, 2011.
6. Tomkins, J.P. and Bergman, J., Developmental gene regulatory networks—an insurmountable impediment to evolution, J. Creation 32(2):96–102, 2018.
7. O’Micks, J., A preliminary cephalopod baraminology study based on the analysis of mitochondrial genomes and morphological characteristics, ARJ 11:193–204, 2018.
8. Wood, T.C., Baraminological analysis places Homo habilis, Homo rudolfensis, and Australopithecus sediba, in the human holobaramin, ARJ 3:71–90, 2010.
9. Rupe, C. and Sanford, J., Contested Bones, FMS Publications, Waterloo, NY, 2017.
10. O’Micks, J., Further evidence that Homo naledi is not a member of the human holobaramin based on measurements of vertebrae and ribs, ARJ 10:103–113, 2017.
11. Doran, N., McLain, M.A., Young, N., and Sanderson, A., The Dinosauria: baraminological and multivariate patterns; in: Whitmore, J.H. (Ed.), Proceedings of the Eighth International Conference on Creationism, Creation Science Fellowship, Pittsburgh, PA, pp. 404–457, 2018.
12. McLain, M.A., Petrone, M., and Speights, M., Feathered dinosaurs reconsidered: new insights from baraminology and ethnotaxonomy; in: Whitmore, J.H. (Ed.), Proceedings of the Eighth International Conference on Creationism, Creation Science Fellowship, Pittsburgh, PA, p. 508, 2018.
13. Lingham-Soliar, T., The evolution of the feather: Sinosauropteryx, life, death, and preservation of an alleged feathered dinosaur, J. Ornithology 153(3):699–711, 2012.
14. Sarfati, J., ‘Feathered’ dinos: no feathers after all! J. Creation 26(3):8–10, 2012.
15. Feduccia, A., Riddle of the Feathered Dragons: Hidden birds of China, Yale University Press, New Haven, CT, 2012.
16. Saitta, E. and Fletcher, I. et al., Preservation of feather fibers from the Late Cretaceous dinosaur Shuvuuia deserti raises concern about immunohistochemical analyses on fossils, Organic Geochemistry 125:142–151, 2018.
17. Thomas, B. and Sarfati, J., Researchers remain divided over ‘feathered dinosaurs’, J. Creation 32(1):121–127, 2018.
18. Carter, R., How to think (not what to think), creation.com/how-to-think, 1 November 2016.
19. Lindgren, A.R., Giribet, G., and Nishiguchi, M.K., A combined approach to the phylogeny of Cephalopoda (Mollusca), Cladistics 20:454–486, 2004.
20. Sutton, M., Perales-Raya, C., and Gilbert, I., A phylogeny of fossil and living neocoleoid cephalopods, Cladistics 32(3):297–307, 2016 | doi:10.1111/cla.12131.
21. Wood, T.C., Taxon sample size in hominin baraminology: a response to O’Micks, ARJ 9:369–372, 2016.
22. Berger, L.R., de Ruiter, D.J., Churchill, S.E. et al., Australopithecus sediba: a new species of Homo-like australopith from South Africa, Science 328:195–204, 2010.
23. Kivell, T.L., Kibii, J.M., Churchill, S.E., Schmid, P., and Berger, L.R., Australopithecus sediba hand demonstrates mosaic evolution of locomotor and manipulative abilities, Science 333:1411–1417, 2011.
24. Wood, T.C., Australopithecus sediba, statistical baraminology, and challenges to identifying the human holobaramin; in: Horstemeyer, M. (Ed.), Proceedings of the Seventh International Conference on Creationism, Creation Science Fellowship, Pittsburgh, PA, 2013.
25. Robinson, D.A., A mitochondrial DNA analysis of the Testudine apobaramin, CRSQ 33:262–272, 1997.
26. Wood, T.C., Mitochondrial DNA analysis of three terrestrial mammal baramins (Equidae, Felidae, and Canidae) implies an accelerated mutation rate near the time of the Flood; in: Horstemeyer, M. (Ed.), Proceedings of the Seventh International Conference on Creationism, Creation Science Fellowship, Pittsburgh, PA, 2013.
27. Land, M., Hauser, L., Jun, S.-R. et al., Insights from 20 years of bacterial genome sequencing, Functional & Integrative Genomics 15(2):141–161, 2015.
28. Antonopoulos, D.A., Assaf, R., Aziz, R.K. et al., PATRIC as a unique resource for studying antimicrobial resistance, Brief Bioinform., 2017 | doi:10.1093/bib/bbx083.
29. Huang, K., Brady, A., Mahurkar, A. et al., MetaRef: a pan-genomic database for comparative and community microbial genomics, Nucleic Acids Res. 42(Database issue):D617–24, 2014.
30. Cserhati, M.F., Mooter, M.E., Peterson, L. et al., Motifome comparison between modern human, Neanderthal and Denisovan, BMC Genomics 19(1):472–491, 2018.
31. O’Micks, J., Homo naledi probably not part of the human holobaramin based on baraminic re-analysis including postcranial evidence, ARJ 9:263–272, 2016.
32. Wood, T.C., Identifying humans in the fossil record: a further response to O’Micks, ARJ 10:57–62, 2017.
33. Ponting, C.P., Biological function in the twilight zone of sequence conservation, BMC Biol. 15(1):71, 2017.