Journal of The Royal Society Interface
Review article

Progress, pitfalls and parallel universes: a history of insect phylogenetics

Karl M. Kjer
Department of Entomology and Nematology, University of California-Davis, 1282 Academic Surge, Davis, CA 95616, USA

Chris Simon
Department of Ecology and Evolutionary Biology, University of Connecticut, 75 North Eagleville Road, Storrs, CT 06269-3043, USA

Margarita Yavorskaya
Institut für Spezielle Zoologie und Evolutionsbiologie, FSU Jena, 07743 Jena, Germany

Rolf G. Beutel
Institut für Spezielle Zoologie und Evolutionsbiologie, FSU Jena, 07743 Jena, Germany

Published: https://doi.org/10.1098/rsif.2016.0363

    Abstract

    The phylogeny of insects has been both extensively studied and vigorously debated for over a century. A relatively accurate deep phylogeny had been produced by 1904. It was not substantially improved in topology until recently when phylogenomics settled many long-standing controversies. Intervening advances came instead through methodological improvement. Early molecular phylogenetic studies (1985–2005), dominated by a few genes, provided datasets that were too small to resolve controversial phylogenetic problems. Adding to the lack of consensus, this period was characterized by a polarization of philosophies, with individuals belonging to either parsimony or maximum-likelihood camps; each largely ignoring the insights of the other. The result was an unfortunate detour in which the few perceived phylogenetic revolutions published by both sides of the philosophical divide were probably erroneous. The size of datasets has been growing exponentially since the mid-1980s accompanied by a wave of confidence that all relationships will soon be known. However, large datasets create new challenges, and a large number of genes does not guarantee reliable results. If history is a guide, then the quality of conclusions will be determined by an improved understanding of both molecular and morphological evolution, and not simply the number of genes analysed.

    1. Introduction

    We like to think of scientific research as insulated from human bias and personality. Like other fields of science, phylogenetics follows trends as ideas are rejected or accepted, influenced by new information. However, collective consensus comes not just from a series of technological advances and discoveries, but also from human interactions. New ideas are often rejected for years, even if they are supported by strong evidence. These are exciting times for evolutionary biologists as new technologies give us hope that the resolution of the tree of life is within sight. However, times have been exciting for decades and this optimistic sentiment has arisen with every new technology. It was only 25 years ago that phylogenetic trees (box 1) generated with a few hundred nucleotides were considered revolutionary, just as the application of cladistic (box 1) principles with a defined methodology was revolutionary a decade before that. With the large datasets we have today, some previously intractable questions now appear solved. The authors of this work have witnessed many of these changes, and we present our insights on this history, recognizing that others may remember things differently. We focus this review on the relationships among insect orders, missing many fine works on arthropod phylogeny, and intra-ordinal studies. We attempt to maintain a rough chronological order, considering three main periods: morphological phylogenetics, when morphology was the only source of data (roughly before 1990); the Sanger (box 2) sequencing period, where a few genes dominated most studies (roughly 1990–2005); and the current state of the art with datasets so large that traditional ways of analysing them are no longer feasible. New challenges will doubtless arise in the age of big data, but at least we can look back at previous trends with hindsight in order to learn from history.

    Box 1. Phylogenetic terms: cladistics.

    Phylogenetic trees. Graphical representations of evolutionary relationships. Synonyms: evolutionary trees, phylogenies, genealogies.

    Monophyletic group. A group of organisms (taxon) that is defined by a most recent common ancestor, and all of its descendants. Also known as a clade.

    Cladistics. An approach in systematics that bases all classification on ‘clades’ (i.e. monophyletic groups). Cladistics was developed by Hennig and insists that all named groups (taxa) be monophyletic, as evidenced by shared derived characters (‘synapomorphies’). After Hennig's death, a group of cladists started using the term to refer to a set of numerical analytical procedures which aim to reconstruct a phylogeny based on character state matrices and parsimony. ‘Cladistics’ can therefore mean two very different things: Hennig's approach, focusing on monophyly and synapomorphy, or the cladists' approach, based on parsimony methods.

    Parsimony. A broad scientific principle that prefers simple over complex explanations. In a phylogenetic context, parsimony refers to preferring a tree with the fewest possible character state transformations. Whenever possible, transformations are assumed to be shared among taxa and are thus placed on internodes as synapomorphies, rather than being interpreted as homoplasies.

    Sister group. The most closely related taxon to a group of interest.

    Ingroup. A taxon under investigation. For this review, the ingroup is Hexapoda.

    Outgroup. A taxon outside the group under study. For this review, the outgroups could be any non-hexapod, but the best would be other arthropods.

    Character polarization. Determination of the evolutionary direction of a character, which determines whether a character state is ancestral (plesiomorphic) or newly derived (apomorphic).
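The parsimony criterion defined in this box can be illustrated with a minimal sketch. It assumes a rooted, strictly binary tree written as nested tuples and a single discrete character, and it counts changes with Fitch's algorithm, one common way of scoring a tree under parsimony. The taxon names, the character and both trees are hypothetical and are not taken from any study discussed in this review.

```python
def fitch_score(tree, states):
    """Return (state_set, changes) for one character on a rooted binary tree.

    tree   -- a leaf name (str) or a 2-tuple (left_subtree, right_subtree)
    states -- dict mapping each leaf name to its observed character state
    """
    if isinstance(tree, str):
        # A leaf contributes its observed state and no changes.
        return {states[tree]}, 0
    left_set, left_changes = fitch_score(tree[0], states)
    right_set, right_changes = fitch_score(tree[1], states)
    shared = left_set & right_set
    if shared:
        # The two subtrees can agree on a state: no extra change is needed.
        return shared, left_changes + right_changes
    # No state in common: one transformation must be assumed at this node.
    return left_set | right_set, left_changes + right_changes + 1


# Hypothetical character: state 0 (ancestral) or state 1 (derived).
states = {"outgroup": 0, "A": 0, "B": 1, "C": 1}

# Two candidate trees for the same four taxa.
tree_1 = ("outgroup", ("A", ("B", "C")))
tree_2 = ("outgroup", (("A", "B"), "C"))

score_1 = fitch_score(tree_1, states)[1]
score_2 = fitch_score(tree_2, states)[1]
print(score_1, score_2)
```

On the first tree, the derived state can be explained by a single transformation on the internode leading to B + C (a synapomorphy). The second tree needs two transformations for the same character (homoplasy), so parsimony would prefer the first. A real analysis sums such scores over all characters and searches over many trees.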

    Box 2. Phylogenetic terms: analytical.

    Synapomorphy. A shared, derived character (feature) that can be used as an argument for a group being monophyletic (box 1).

    Homoplasy. Character state evolving more than once on a tree or changing back to its original state (redundant evolution). Parsimony (box 1) attempts to minimize homoplasy. Homoplasy creates phylogenetic noise (misleading signals).

    Distance analysis. Methods that reduce all character differences between pairs of taxa to a single value, their pair-wise distance. Trees are then constructed by grouping the most similar taxa. Distance methods are criticized by cladists as being phenetic.

    Phenetics. Organisms are grouped or classified based on overall similarity in their phenotype or appearance, rather than on derived character states only.

    Likelihood analysis. A statistical method of selecting among possible trees based on the probability of the data under a model of evolution.

    Long branch attraction. A phenomenon that misleads phylogenetic reconstruction. On long branches, shared phylogenetic noise (homoplasy) accumulates and overrides the true phylogenetic signal on short internal branches of a phylogenetic tree.

    Bootstraps. A resampling of phylogenetic data which creates a number of pseudo-replicate datasets by drawing columns of characters at random with replacement. These pseudo-replicates are then analysed individually, and their results are summarized on a consensus tree in order to estimate conflicting signal and provide an assessment of support for individual clades. (Jack-knifing is similar, but the new pseudo-replicate datasets are generated by random deletions of columns of characters.)

    Branch support. Quantitative measures to assess confidence for particular clades in a phylogeny. Examples include bootstraps, jack-knifing, posterior probabilities and Bremer support. Congruence among independent datasets could also be considered as branch support, but is seldom quantitatively expressed.

    Root. A hypothetical taxon assigned as the most recent common ancestor of all the taxa in a phylogeny. A root is used to assess the polarity of a phylogeny. Outgroups (box 1) can be used to help estimate the position of the root.

    Node. The point at which an ancestral lineage splits into two lineages in a phylogeny.

    Internode. The lines in a branching diagram between nodes (internal branches on a tree). In a phylogeny, an internode represents an ancestral lineage. Synapomorphies occur on internodes. The longer an internode exists, the more chance for synapomorphies (either molecular or morphological) to accumulate. Short internodes are generally the source of controversy, because they have a lower probability of accumulating informative substitutions.

    Substitution. An observed change in a character. For molecular data, substitutions are related to mutations, but because lethal mutations are seldom observable, substitutions are mutations that have survived the filter of selection.

    Sanger sequencing. The dominant method of DNA sequencing from the 1980s until about 2005.

    Restriction sites. Short unique motifs scattered throughout the genome which can be cut by certain restriction enzymes, yielding fragments that can be visualized on a gel providing snippets of the DNA sequence information.
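Two of the terms in this box, distance analysis and bootstraps, can be made concrete with a short sketch. It assumes a toy alignment of four hypothetical taxa, uses the simple proportion of differing sites (the uncorrected p-distance) as the pairwise distance, and stands in for tree building with the much cruder step of finding the single most similar pair. The sequences, the function names and the number of pseudo-replicates are illustrative assumptions only.

```python
import random


def p_distance(seq_a, seq_b):
    """Proportion of aligned sites at which two sequences differ."""
    differences = sum(1 for a, b in zip(seq_a, seq_b) if a != b)
    return differences / len(seq_a)


def closest_pair(alignment):
    """Return the pair of taxa with the smallest pairwise distance."""
    taxa = sorted(alignment)
    pairs = [
        (p_distance(alignment[a], alignment[b]), a, b)
        for i, a in enumerate(taxa)
        for b in taxa[i + 1:]
    ]
    return min(pairs)[1:]


def bootstrap_alignment(alignment, rng):
    """Build one pseudo-replicate by drawing columns with replacement."""
    n_sites = len(next(iter(alignment.values())))
    columns = [rng.randrange(n_sites) for _ in range(n_sites)]
    return {
        taxon: "".join(seq[i] for i in columns)
        for taxon, seq in alignment.items()
    }


# Hypothetical aligned sequences, 10 sites each.
alignment = {
    "A": "ACGTACGTAC",
    "B": "ACGTACGTAT",
    "C": "ACGAACTTGT",
    "D": "TCGAACTTGC",
}

rng = random.Random(1)  # fixed seed so the sketch is repeatable
n_replicates = 200
hits = sum(
    closest_pair(bootstrap_alignment(alignment, rng)) == ("A", "B")
    for _ in range(n_replicates)
)
support = hits / n_replicates
print(closest_pair(alignment), support)
```

The value of `support` is the fraction of pseudo-replicates in which A and B remain the most similar pair, which mirrors how bootstrap percentages are attached to clades. Grouping by raw similarity in this way is exactly the phenetic shortcut that cladists criticize, so the sketch illustrates the resampling procedure rather than a recommended method of analysis.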

    The numerous names of orders and other higher-level taxa for a group as diverse as insects pose a significant challenge to the non-entomologist reader. Common names like ‘angel insect’ or ‘gladiators’ are often as obscure as the scientific ones in Latin or Greek. For this review, we direct the reader to the figures for the common names of the orders and to appendix A for a translation of super-ordinal names. We focus especially on four controversial deep-branching taxa: Entognatha, Palaeoptera, Polyneoptera, and Holometabola. The controversy arises from persistent conflicting evidence that suggests contradictory groups. The entognathous hexapods with internalized mouthparts include mostly tiny, wingless, litter-dwelling species that appear very early in the fossil record. The palaeopteran insects comprise mayflies, dragonflies and damselflies, characterized by wings that cannot be folded. Polyneoptera is the name given to a diverse group of insects, such as grasshoppers and close relatives, walking sticks, roaches, mantids, earwigs, stoneflies and some other groups, usually but not always characterized by leathery forewings. The holometabolous orders exhibit complete metamorphosis where the larva undergoes an amazing reorganization of the body during the pupal stage before it changes into the winged adult form. A confusing convention for the non-entomologist is the inconsistent use of the names Hexapoda and Insecta. Hexapoda (insects in the widest sense) comprise all six-legged arthropods, including the three entognathous orders, whereas Insecta excludes the entognathous orders (appendix A). Phylogenetic terms that many readers might not be familiar with are defined in numbered boxes, which are referenced at the first usages of the term.

    2. Pre-Hennigian concepts in insect taxonomy and phylogeny

    The roots of insect systematics go back to the sixteenth, seventeenth and eighteenth centuries. Important pioneers of entomology were the Italian naturalist Ulisse Aldrovandi (1522–1605), the Dutch doctor and microscopist Jan Swammerdam (1637–1680) and the German naturalist August Johann Rösel von Rosenhof (1705–1759) [1,2]. In the middle of the eighteenth century, the Swedish botanist Carolus Linnaeus (1707–1778) described more than 10 000 species in his Systema Naturae [3], including over 2000 insects. His ordinal names refer to the characteristics of the wings, e.g. Heteroptera (heterogeneous forewing), Hymenoptera (membranous wings) and Coleoptera (sheath-like forewing). Although his views evolved, Linnaeus was an essentialist in his early works, embracing the—at that time—commonly held belief that organisms were given an ‘essence’ by the Creator, which could be slightly modified but never fundamentally changed. Linnaeus' system remains useful to this day because it was based on characters that, unknown to him, are heritable and hierarchically organized through evolution. The Danish entomologist Johann Christian Fabricius (1745–1808) described 9776 insect species. Unlike his mentor Linnaeus, he emphasized the importance of mouthparts and the potential usefulness of genitalia [4]. Another prominent entomologist of the era was Pierre André Latreille (1762–1833). In his major work [5] he outlined insect families for the first time and used a broad spectrum of characters [2]. Together with explicit criteria for homology, this was an important step towards an evolutionary concept of classification.

    The evolutionary theory developed by Charles Darwin and Alfred Russel Wallace [6,7] laid a new foundation for classifying organisms, but had limited immediate impact on insect systematics [1]. Ernst Haeckel (1834–1919), an energetic promoter of Darwin's ideas in Germany, dealt with insects among many other groups. His classification included five ‘legions’ based on how insects feed [8]; we see today that it only partly reflected phylogenetic relationships. However, Haeckel presented the first explicit phylogenetic tree of insects [8, p. 710]. In 1904 [9], a remarkable study covering the entire Hexapoda was published by Carl Börner (1880–1953). Börner was a specialist on grape phylloxera (Daktulosphaira vitifoliae), an almost microscopic aphid-like insect that is a major pest of grapes. He was also a collector of springtails (Collembola), small hexapods that are common in leaf litter. As a young scientific assistant, he discussed cephalic structures in great detail. He focused on the hypopharynx, a central element of insect mouthparts and one of the most difficult character systems to explore. Even though his approach lacked a repeatable methodology, his phylogenetic tree (figure 1) comes close to concepts developed decades later. Naturally, since our current cladistic (box 1) concept of reserving names for monophyletic (box 1) groups [10,11] was not developed until the 1950s and 1960s, Börner's classification is partly inconsistent with the branching pattern shown in the tree. For instance, he placed the phenotypically similar Archaeognatha and Zygentoma in the Order Thysanura. (Figure 1; see appendix A here, and throughout, for a definition of taxon names.)

    Figure 1. Phylogeny modified from Börner 1904. Taxa are named by modern convention.

    A highly productive North American entomologist of the early twentieth century was G.C. Crampton [12,13], whose phylogenetic tree from 1938 [14] was another hypothesis that came remarkably close to modern concepts (see fig. 1 in Engel & Kristensen [2]). Important works were published by Imms [15], Snodgrass [16], Weber [17,18] and also by Handlirsch, who was frequently cited in Hennig's later work [11] (see [19]). Handlirsch attempted a classification reflecting phylogeny, but believed that a purely phylogenetic system was not possible [11]. Even in studies published posthumously in 1937 [20] and 1939 [21], Handlirsch considered the extinct winged Palaeodictyoptera as the ancestors not only of Pterygota but of all other insects including the wingless (apterygote) orders [11].

    3. Hennig's breakthrough

    Willi Hennig (1913–1976) revolutionized systematics and classification [22] in the last century with his theoretical work, offering clear and repeatable methodology. Works published prior to Hennig are often referred to as ‘intuitive’. This is perhaps unfairly pejorative when you consider that their remarkably accurate phylogenetic insights were often based on expertise gained through meticulous observation, rather than intuitive hunches. However, before Hennig's methods were widely adopted, systematists would postulate relationships based on shared characters that they deemed particularly important. In this respect, phylogenies could be considered as imparted wisdom, rather than science. Hennig's method involved distinguishing ancestral (plesiomorphic) and derived (apomorphic) features. He also developed a more precise concept of monophyly (box 1), under which no descendants of the most recent common ancestor could be excluded from a named group (clade). Hennig reconstructed phylogenies with an iterative, stepwise approach. Using putative shared-derived character states (synapomorphies), he successively established sistergroup (box 1) relationships. Distinguishing ancestral from derived character states required the definition of a taxon outside the group of interest for comparison (an outgroup (box 1)). The outgroup method was introduced as a formal procedure in the early 1980s [23,24] even though it had already implicitly been used by Hennig [22]. Hennig's phylogeny [11], published in 1969, is widely considered to be the starting point of modern insect phylogenetics (figure 2). Despite his precise methodology, his hypotheses were quite similar to earlier trees. They changed with time, as can be seen by comparing the phylogenetic concept presented in Hennig's 1969 phylogeny (figure 2) [11] with his earlier work [10].

    Figure 2. Hennig's 1969 phylogeny [11], combined and modified from the original figures. Numerals indicate fossils as Hennig listed in his figures: 1. Rhyniella; 2. Eopterum (no longer considered an insect); 3. Rhyniognatha; 4. Monura; 5. Triassomachilis; 6. Triplosoba pulchella; 7. Permoplecoptera; 8. alleged subgroups of Ephemeroptera; 9. Erasipteron; 10. Protodonata (Meganisoptera); 11. Protanisoptera; 12. Protozygoptera; 13. stemgroup of Anisozygoptera+Anisoptera; 14. Sheimia sojanensis; 15. Protoelytroptera; 16. Mesoforficula and others; 17. Puknoblattina; 18. Palaeozoic ‘Problattoidea’ and Blattodea; 19. Oedischia; 20. Glosselytrodea; 21. Sthenaropodidae; 22. Oedischiidae, Elcanidae; 23. Tettavus; 24. Triassolocusta; 25. Tcholmanvissia; 26. ‘Paraplecoptera’ sensu Sharov (now Eoblattida Handlirsch 1906); 27. Protoperlaria (now Protorthoptera Handlirsch 1906); 28. Perlopsis and other definitive Plecoptera; 29. Permopsocodea; 30. Procicadellopsis; 31. Archipsyllidae; 32. Permothrips longipennis; 33. Permaphidopsis; 34. Mesococcus asiaticus; 35. Archescytinidae; 36. Cicadopsyllidae; 37. Permaleurodes rotundatus; 38. Auchenorrhyncha; 39. Paraknightia; 40. Boreocixius; 41. Permosialis; 42. Palaeohemerobiidae and Permithonidae, sensu Carpenter; 43. Tshekardocoleus and other branches; 44. Archexyela; 45. Mecoptera from Australia; 46, 47. Paratrichoptera; 48. Microptysma and 49. Microptysmodes.

    Hennig's ‘Phylogenetische Systematik’ [22] was not a completely new concept when it was published in 1950. The botanist Zimmermann developed similar ideas in the 1930s, and Sturtevant used a very similar approach in his taxonomic studies of fruit flies (Drosophilidae; [25]). Moreover, it is apparent that ideas similar to Hennig's were implicitly used before his methods were formalized. It is impossible to consider Börner's phylogeny without recognizing an approach that went beyond intuition. Aside from the primacy of synapomorphies, a major point of Hennig's concept was that classification should be strictly linked to phylogeny. The requirement that named taxa be monophyletic originated with Hennig, but the unique value of synapomorphies was loosely recognized by systematists earlier. Herbert Ross (1908–1978) [26], for instance, was polarizing (box 1) characters relative to a hypothetical ancestor in 1937, and he indicated derived states with marks on the internodes (box 2) of his insect phylogeny in 1955 [27]. The phylogeny in his 1965 textbook [28] is almost as close to current concepts as morphology has ever been. However, as advocated in general by Ernst Mayr [29] (see also Nelson's reply [30]), Ross gave names to paraphyletic groups (groups that do not include all descendants of the deepest ancestor). If your concept of ‘dinosaur’ does not include birds, then you accept paraphyletic taxa too. Systematists today consider, for example, birds to be a subgroup of Sauropsida, a clade that also contains dinosaurs and extant reptiles such as turtles, lizards and crocodiles. Mayr and followers understood that birds had been derived from a paraphyletic assemblage of reptiles, but still found ‘Reptilia’ to be a useful term representing a different evolutionary level, just as we sometimes use ‘apterygotes’ as a name for the ancestrally wingless hexapods, even though we understand that they are not a monophyletic group. Generally, when systematists put a name in quotes, it is to indicate that they understand it to be a paraphyletic group, and are waiting for the term to fade into disuse.

    A remarkable study was published by the Argentinian entomologist Alvaro Wille [31] in 1960. Although he distinguished ‘primitive’ from ‘specialized’ or ‘unusual’ features, he also characterized groups by a mixture of plesiomorphies and apomorphies. The major clades on his tree, however, were characterized by evolutionary innovations (figure 3). Another important work of the time was Hinton's 1958 review [32]. Hinton made some bold statements that appear untenable today, such as ‘the polyphyletic nature of the old groups Myriapoda and Hexapoda’, but his evaluation of morphological characters, taken largely from the head, including a detailed scrutiny of larval muscles, helped elucidate the evolution of Holometabola.

    Figure 3. Modified from Wille 1960 [31]. Taxa are named by modern convention.

    Gerhard Mickoleit, who graduated under the insect morphologist Hermann Weber at the University of Tübingen and attended seminars given by Hennig in the early 1970s, investigated several groups of insects, with a focus on genital structures, especially the ovipositor. This included thrips [33], Neuropterida (lacewings and close relatives), beetles [34], and fleas, flies and scorpion flies [35–38]. In 1973 [34], he provided specific evidence for the first time for a close relationship between neuropteroids and beetles.

    4. Geographical isolation and ‘parallel universes’

    The importance of the contributions made by Russian entomologists and insect palaeontologists is reflected by numerous citations by Hennig [11]. Formal names for important higher ranking taxa such as Palaeoptera, Neoptera and Polyneoptera were introduced by Russian scientists [39,40]. Moreover, Russian palaeontologists, notably A. V. Martynov, B. B. Rohdendorf, V. V. Zherikin and A. G. Ponomarenko [41,42] (reviewed in 2002 [43] and 2009 [1]) made immense contributions to the knowledge of fossil insects that provided a critical window for observing past morphology.

    Through the entire twentieth century, most Russian entomologists maintained a conservative approach, with traditional descriptions based on morphology, without formal cladistic phylogenetic character evaluations. International collaboration was partly impeded by linguistic barriers, but also by the isolation of the Soviet Union and the pseudo-scientific Lysenkoism, an antigenetic view that was politically favoured [44]. The limited exchange and cooperation is still reflected by the strikingly different nomenclature for high-ranking taxa, such as Scarabaeona for Pterygota, Scarabaeiformes for Holometabola and Scarabaeoidea for Coleoptera [4,43].

    A prominent and highly efficient Russian entomologist of the nineteenth century was Victor I. Motschulsky, who published numerous works on biogeographic, faunistic, or systematic aspects of entomology, most of them on beetles [45]. Georgij G. Jacobson had a crucial impact on the development of Russian systematics in the early twentieth century. He is best known as the author of the 1905 magisterial ‘Beetles of Russia, Western Europe and neighbouring countries’ [46], including an impressive catalogue with keys for the identification of all known Eurasian genera. More recently, the palaeoentomologist Alexandr P. Rasnitsyn described approximately 250 new genera and over 800 new species of fossil insects. He suggested a sistergroup relationship between Hymenoptera (sawflies, bees, wasps and ants) and the remaining Holometabola [43] before this was established with formal analyses of extensive morphological or molecular datasets [47–50] (see Ronquist's and others' reanalysis [51]). Rasnitsyn [43] suggested that insect flight originated from gliding [52]. Phylistics, his alternative approach to cladistics, as discussed by Brothers [53], explicitly accepts paraphyletic groups (e.g. †Caloneurida [43]).

    Before the Internet, geography and language also played a role in isolating phylogenetic communities from Europe, America and East Asia. For example, a profound treatment of insect morphology was presented by René Jeannel in 1949 [54], but has rarely been used outside of the French community. Hennig's work was unknown to most Americans until it was translated into English in 1966 [55]. Systematists were aware of work in other countries, but the Meetings on Insect Phylogeny in Dresden played a major role in fostering collaborations between workers from different parts of the world, even though they have not seen significant Russian or Latin American participation. These meetings were organized for the first time in 2003 by Klaus-Dieter Klass and Niels Peder Kristensen [56]. Most members of the ‘1000 insect transcriptome evolution’ (www.1KITE.org) initiative first became acquainted at these meetings. The 1KITE team created our current best estimate of insect ordinal phylogeny (figure 5) with the largest dataset assembled to date. Europe was an ideal meeting place, because the particular brand of cladistic fervour in America that was characterized by name-calling and personal insults was less pronounced there. The Europeans absorbed new ideas quickly, and went about their business in developing new centres of insect phylogenetics based on emerging techniques in morphology, and the refinement of model-based molecular phylogenetics (reviewed in [57]).

    5. Post-Hennigian approaches

    The classical tradition of insect morphology and phylogeny was upheld on a high level by Niels P. Kristensen (1943–2014) of the Zoologisk Museum in Copenhagen. He published outstanding morphological treatments of lepidopteran key taxa [58–61], profound reviews of insect phylogeny [62–64] and landmark volumes on systematics and morphology of Lepidoptera in the Handbook of Zoology series [65–68]. Even though he never performed computer-assisted analyses, his critical contributions helped refine character interpretations and pointed out problematic phylogenetic issues. A characteristic feature of Kristensen's approach was a deep-rooted scepticism, reflected by largely or completely unresolved parts of his phylogenetic trees. His display of polyneopteran relationships [63] became known as ‘Kristensen's comb’, and if polytomy is preferable to error, then Kristensen's phylogeny was not bested until genomic resources were brought to bear. However, Kristensen remained sceptical even after the publication of large transcriptomic (box 4) works [49,50] (NP Kristensen 2015, personal communication to R.G.B.).

    Controversy was common in morphology-based insect phylogenetics, and results were strongly affected by the selection of characters, before very large and well documented datasets emerged in the twenty-first century. Boudreaux's 1979 book [69] on arthropod phylogeny, though criticized by some [43,63,70,71], was cited frequently by others [72–77]. Even though Boudreaux adopted the methods of phylogenetic systematics, according to Kristensen [78] his interpretations often differed from those of Hennig [10,11]. As in the case of the controversial Zoraptera [79], phylogenetic conclusions were often based on characters that were ancestral, ill-defined or homoplasious. Jarmila Kukalová-Peck published a summarizing account of insect palaeontology [80], numerous specific studies on extant and extinct insects [81–83] and comprehensive analyses of characters of the wing base and wing venation [84,85]. She advocated the origin of wings from gill-like appendages [86,87] and proposed a clade Cercophora (Diplura + Insecta) for the first time [87]. Her groundplan approach challenged standard cladistic procedures [88] and was criticized by some authors [89]. Her phylogenetic hypotheses, usually based on wing characters, yielded some results inconsistent with earlier [11] and most recent concepts [50].

    As in earlier attempts to classify insects (like Haeckel 1896), studies based entirely on wing venation [84] show the weakness of limited character systems, especially when strong functional constraints drive convergent evolution. Nevertheless, in-depth studies of specific body parts, organs or developmental stages can yield important insights. Examples are the circulatory system investigated by Günther Pass [90,91], the female genitalia of polyneopteran groups studied by Klaus Klass [92,93], and embryology, with important contributions made by Ryuichiro Machida and others [94,95]. Throughout the first decade of this century, it was more common in presentations to see these characters mapped onto molecular phylogenies than to have explicit, data-matrix-based phylogenies constructed from these systems. This is understandable, given the recognition that subsets of characters were only part of the whole picture, the general lack of coordination in taxon sampling, and the enormous effort involved in constructing a unified combined data matrix.

    Classical Hennigian studies relied on detailed anatomical information obtained for few selected taxa with informal character discussions without data matrices [35–37,62,63,96–98]. These studies treated all taxa within larger groups as a single hypothetical ancestor, reducing characters to reconstructed groundplan states. Modern computer-based analysis is better suited to entering characters of individual representative species into data matrices. Even so, earlier computer-based studies extracted data from the literature [62] and coded entire orders with identical groundplan states [72,74,77]. This was almost inevitable, because thorough anatomical studies using microtome sectioning of a single species [99] could take years. In the early 2000s, new technologies such as micro-computed tomography (µCT) and computer-based three-dimensional reconstructions greatly accelerated the acquisition of high-quality anatomical data [19]. The coordinated efforts of international research teams, using both new and traditional methods [100,101], have yielded matrices of hundreds of characters from different body parts and life stages. For example, a study of Holometabola [48] contained 365 well-documented characters that corroborated current molecular phylogenies.

    6. Insect morphology and cladistics

    In the late 1970s and 1980s, cladistics ‘evolved’ as a transformed version of Hennigian phylogenetic systematics [102,103], arguably linked with the development of suitable computers and software programs. The Hennigian method of searching for sister taxa required great care in polarizing each character. Polarity in this context refers to the assignment of character states as either ancestral or derived. With computer-based analysis, however, character polarity is determined automatically based on outgroups (box 1; i.e. rooting (box 2)) [104]. The first computer program capable of estimating phylogenies was Felsenstein's PHYLIP in 1980. Mickevich and Farris were developing their program ‘PHYSYS’ around the same time and released it in 1982. It saw limited use, perhaps because of the $5000 price tag. Farris' updated program Hennig86 became available in 1989 [105], and Swofford's PAUP [106] was released free of charge the same year. Along with new molecular data, Whiting et al. [77] presented a morphological matrix-based analysis of most major insect groups in 1997, which was extended to include all hexapod orders by Wheeler et al. in 2001 [74]. That same year, Beutel & Gorb [72] presented a matrix-based morphological analysis of the entire Hexapoda. These morphological phylogenies were largely consistent with earlier hypotheses [10,11,28,31,62,63,64,78]. Wheeler et al.'s insect ordinal phylogeny [74] emphasized molecular data, but, without the morphological data, their results were largely unresolved and implausible [107].

    7. The dawn of molecular systematics in the early 1990s—molecular work in the Sanger days

    A number of studies in the late 1980s explored animal phylogeny, including insects, using direct RNA sequencing of the nuclear small subunit ribosomal RNA gene (18S rRNA) [108,109]. Turbeville et al.'s 1991 work [110] used parsimony (box 1), distance (box 2), and other methods [109]. Their distance analysis grouped the annelids with the molluscs as opposed to the previous assumption that annelids should group with the arthropods based on segmentation. They also recovered Pancrustacea, a group that unites the traditional crustaceans with hexapods. At the time, the Tracheata hypothesis (Myriapoda + Hexapoda) was heavily entrenched, and they suggested that the position of the crustaceans may have been the result of bias introduced by long-branch attraction, and the limited number of characters. Earlier work [108] also recovered Pancrustacea, and suggested that the annelids were distant from the arthropods. These works remind us to be careful of what we dismiss as ‘wrong’, because we now understand Pancrustacea to be strongly supported. Turbeville et al. [110] were aware of branch-length artefacts and alignment (box 3) ambiguity, and made careful, if arbitrary, decisions about data exclusion. Unlike some who followed them, they considered suboptimal trees to be worth discussing. However, given that few were impressed with confirming arthropod monophyly, and still fewer believed that crustaceans should group with Hexapoda, this study, as insightful as it was, did not become a model for future analyses.

    Box 3. Phylogenetic terms from the Sanger days.

    Nucleotide compositional bias. When nucleotide frequencies stray significantly from 25% of each DNA base (A, C, T and G). This phenomenon is particularly problematic when bias differs among taxa.

    Among-site rate variation (ASRV). When different sites along a sequence vary in their substitution rates. For example, when the substitution rates are higher for third codon positions than for second codon positions, this difference in rates is important to capture in model-based analyses, and argues against equally weighted parsimony. ASRV is also extremely problematic when it varies across lineages in a tree.

    Multiple sequence alignment. The process of lining up DNA or amino acid data into columns of presumed homologous positions.

    Consensus tree. A graph summarizing a set of trees by showing only clades which are shared among multiple equally favoured solutions or even multiple analyses. A strict consensus tree shows only those relationships found in all trees, whereas a majority-rule consensus depicts the most common resolution.

    Sensitivity analysis. A means of exploring the robustness of a conclusion by altering the analytical details that influence it. For example, if one were interested in exploring how alignment parameters (like the penalty for inserting a gap in an alignment) influenced a phylogeny, one could change the input values to create new phylogenies from the new alignments and explore how the resulting trees differ.

    Input parameters. Many complex analyses, such as alignment of DNA sequences or phylogeny reconstructions, require a priori specification of a number of parameters. Common input parameters include values for costs or ratios, or parameters of a specific evolutionary model. Input parameters are often derived from empirical data and can drastically alter phylogenetic results.
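
    The difference between the strict and majority-rule consensus trees defined in box 3 can be illustrated with a minimal sketch. It assumes a simplified representation in which each tree is a set of clades and each clade is a frozenset of taxon names; the three input trees and the taxon labels A to E are hypothetical, and the function names are chosen here for illustration only.

```python
from collections import Counter

def strict_consensus(trees):
    """Keep only the clades that occur in every input tree."""
    return set.intersection(*(set(tree) for tree in trees))

def majority_rule_consensus(trees):
    """Keep the clades that occur in more than half of the input trees."""
    counts = Counter(clade for tree in trees for clade in tree)
    return {clade for clade, n in counts.items() if n / len(trees) > 0.5}

# Three hypothetical, equally favoured trees for taxa A-E, each written as
# its set of non-trivial clades (an assumption made for this sketch).
AB, ABC, DE, CD = (frozenset(s) for s in ("AB", "ABC", "DE", "CD"))
trees = [{AB, ABC, DE}, {AB, ABC, DE}, {AB, CD}]

strict = strict_consensus(trees)           # only the clade A+B survives
majority = majority_rule_consensus(trees)  # A+B, A+B+C and D+E survive
```

    In this toy case, the strict consensus is far less resolved than the majority-rule consensus, because a single conflicting tree is enough to collapse a clade under the strict rule.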

    A 1984 review of insect molecular systematics by Berlocher [111] focused on allozyme gel electrophoresis studies, with discussions of methods used at the time. Before the invention of the polymerase chain reaction (PCR) in 1985, direct sequencing of rRNA was possible, but rare, and only one study of the molecular structure of rRNA, presenting 3 insect 5.8S sequences [112] was mentioned in the review. A 1988 paper by Simon [113] included a table of 30 molecular phylogenetics projects underway at the time, but all of them were as yet unpublished. Thus, the first molecular study that we are aware of that specifically addressed insect phylogeny was published in 1989, when Wheeler [114] discussed separate analyses of insect 18S sequences and restriction sites (box 2). The restriction site analysis (his figure 6) included more taxa than the DNA sequence tree, and supported Metapterygota (damselfly + Neoptera), and Neuropteroidea (beetle + lacewing). It seemed from this work that the 18S rRNA gene was a promising source of characters, especially given that restriction sites alone could result in a reasonable tree. In 1992, Carmean et al. [115] used 18S rRNA to explore relationships among holometabolous insect orders and noticed that flies had an elevated substitution rate, and long regions that had to be excluded from the analysis because they could not be aligned (box 3) with confidence. They surmised that the flies (Diptera) were being drawn to the root (box 2), and, thus, excluded them in most of their analyses. Pashley et al. in 1993 [116] published distance and parsimony analyses of a fragment of 18S rRNA from nine orders of Holometabola. They were able to recover Mecopterida and Amphiesmenoptera, but bootstrap support (box 2) for most groups was very low. Pashley et al. concluded that using different outgroups (box 1) yielded different topologies for poorly supported ingroup taxa. 
The failure of these analyses to converge on strongly supported results from a few taxa, and fragments of 18S is not surprising. This gene alone has never resolved relationships among all orders of Holometabola, even with many more taxa, although von Reumont's work [117] came very close with combined complete 18S and 28S but without the confounding Strepsiptera (see below).

    Mitochondrial data were also explored in the early days of Sanger sequencing (box 2). Liu and Beckenbach [118] explored the mitochondrial cytochrome oxidase II (COII) gene in 10 orders of insects, using a genetic-distance-based analysis [119] and parsimony. Trees from various analyses grouped the cockroach and the termite, and the three species of Hymenoptera (ichneumonid wasp, bee, and ant), and not much else. In a study of arthropod phylogeny, a small fragment of mitochondrial 12S rRNA gene was analysed, and it was proposed that onychophorans (velvet worms) are in fact modified arthropods [120]. As onychophorans are generally considered to be arthropod outgroups, this study was published with fanfare in the journal Science. The statement that ‘These data demonstrate that 12S … can resolve arthropod relationships…’ is strongly contradicted by the highly unusual (and since rejected) phylogeny they recovered.

    Through the Sanger sequencing period, molecular phylogenetics focused largely on 18S, 28S and a few mitochondrial genes, mostly 12S rRNA, 16S rRNA and COI. However, rRNAs were difficult to align [121,122] and model [123], and it seemed that the mitochondrial genes were biased and full of misleading signal [118]. Single-copy nuclear genes were seen as a solution, but remained difficult to sequence. The standard markers were easier to amplify, because universal primers were available [123,124,125], and both mitochondrial genes and nuclear rRNAs were present in multiple copies in every cell. A review in 2000 [126] called for coordinated efforts in selecting genes that were compatible across studies, and supported the continued use of 18S rRNA and commonly sequenced mitochondrial markers. In a contrasting opinion, in order to move beyond rRNA and mitochondrial genes, a group of workers at the University of Maryland embarked on a programme to locate and sequence single-copy nuclear protein-coding genes [127]. Of 14 ‘promising candidates’ they identified, several (EF-1α, DDC, POLII and, to a lesser extent, PEPCK) saw extensive use in insect intraordinal phylogenetics. They would continue to develop useful protocols for amplifying genes such as wingless, CAD rudimentary and others [128–135]. Additional contributions to the arsenal of nuclear genes for insect phylogenetics soon followed [136–138]. Histone H3 and U2 snRNAs were examined [139,140], with the former used extensively despite the fact that neither gene could recover any reasonable higher level groups [141]. Practically all major higher level insect phylogenetic studies in the past decade have relied, at least in part, on single-copy nuclear genes, and these have now become the dominant markers in transcriptome (box 4) analyses. 
The markers developed by the Maryland workers and others were put to good use across arthropods, among a few orders [142], and within orders such as Lepidoptera, Hymenoptera, Diptera and Coleoptera (see below). However, they were not applied broadly across orders until Wiegmann's 2009 work [47].

    Box 4. Modern terms in molecular phylogenetics.

    High-throughput sequencing. A variety of methods that generate huge amounts of DNA sequence data. These methods were called ‘next generation’ during the Sanger sequencing period.

    Transcriptome. All the genes that are active in a cell, tissue, or whole organism at the time a specimen is collected. These genes are sequenced by isolating messenger RNA from the cells.

    Assembly. High-throughput sequencing typically produces millions of fragments of relatively short sequences. These sequences are aligned, and stitched together into longer fragments called ‘contigs’, which are then sorted into orthologues.

    Orthology prediction. An orthologue is a homologous gene. Gene duplication produces non-homologous, similar-looking genes (paralogues) that can disrupt phylogenetic analyses. Orthology prediction is the process of identifying orthologues and distinguishing them from paralogues.

    Data masking. Identifying ‘bad’ data, and throwing it out. Problematic data come from a variety of sources. Data can be misaligned, randomized, mostly missing or violate model assumptions.

    Partitions. Subsets of the data. Because models are used to analyse data, it is common to subdivide the data into subsets that are assumed to share similar properties so that the models are a closer match to biological reality.

    Quartet mapping. A method of branch support that examines a particular node by sampling replicates of four randomly selected taxa that surround the node of interest. The results summarize the percentage of times that these quartets recover alternative topologies.

    Pipeline. A sequence of analytical procedures. In computer science, a pipeline is a chain of computer programs that perform analytical steps in series.

    Taken as a whole, it is not difficult to see why morphologists would be less than excited by the state of molecular phylogenetics in the early 1990s. In historical context, this was a time when university hiring priorities favoured molecular workers who could sequence a couple of hundred nucleotides from backyard insects, and ‘discover’ relationships that were either already widely accepted or hard to believe. Grant money seemed to be reserved for molecular work. Still, morphological workers seemed to be at an impasse. They agreed on the general outlines of Hennig and Kristensen, and had established Dictyoptera, Mecopterida, Amphiesmenoptera, Antliophora and Neuropterida as monophyletic. However, they were unable to resolve the relationships among the entognathous, palaeopteran, polyneopteran, or holometabolous orders.

    The development of PCR made possible the rapid collection of nucleotide sequence data throughout the 1990s and beyond [143,144]. Most molecular workers at this time recognized the limitations of their own data, and were proposing ways to address them. Two early reviews [123,145] discussed strategies for modelling DNA to account for known biases in the way it evolves. Swofford and co-workers' influential chapters in the ‘Molecular Systematics’ books [146,147] had laid a foundation for understanding the analytical issues, and the programs PAUP* [148] and PAML [149] were available for running model-based (likelihood (box 2)) analyses at a time when computers were finally up to the task of implementing complex substitution models. PAUP is an acronym for ‘Phylogenetic Analysis Using Parsimony’, so the asterisk after the new release referred to ‘and other methods’, such as likelihood and distance. The pull-down menu (graphical user interface or GUI) in PAUP* was an ideal platform for newcomers to learn the complexities of models of DNA evolution and statistically based likelihood phylogeny-building methods. At this time, the need to accommodate biases with either models or differential weighting was becoming obvious [146]. By the mid-1990s the field of insect molecular phylogenetics looked promising, but it had begun to splinter into camps based on analytical methods. There was a brief honeymoon where the logic of cladistic taxonomy was universally adopted, but then co-opted by some who conflated cladistics with parsimony, as they transitioned from intuitive Hennigian methods to computer-based parsimony analyses.

    8. The problem with ‘the Strepsiptera problem’: 1995–2010

    There was probably no question that occupied the minds of insect systematists more during the late 1990s than the ‘Strepsiptera problem’. This is surprising because strepsipterans are neither diverse nor conspicuous. However, like many parasites with highly modified structural features, these fascinating and unusual insects were difficult to place morphologically. Their ribosomal data would prove to be the battleground over which likelihood and parsimony practitioners would argue, which in turn helped to reveal the problems inherent in parsimony. Most morphological studies placed Strepsiptera (figure 4) as the sister taxon to the beetles (Coleoptera), based on hindwing flight (posteromotorism), and a few other characters [72,150–152], or within a subgroup of Coleoptera [153] (see also [154]). The first of the molecular-based studies addressing this (but without published data or analytical detail) was a Scientific Correspondence that appeared in Nature in 1994 by Whiting & Wheeler [155]. They proposed that Strepsiptera were the sister taxon of Diptera (flies; the two groups combined named Halteria [77]). Halteria refers to halteres, the gyroscopic reduced hindwing stubs found in flies that are superficially similar to strepsipteran forewings. They surmised that the grouping of flies with strepsipterans was in itself evidence of a homeotic mutational transformation that resulted in halteres flipping from the third thoracic segment in flies to the second in strepsipterans (figure 4). Most morphologists doubted this assertion. Scepticism came from the molecular perspective as well. Carmean & Crespi [156] responded almost immediately that ‘long branches attract flies’, which was unambiguously demonstrated by Huelsenbeck [157], with a likelihood (box 2) analysis of available data [156] (the Whiting data were not yet public). Chalwatzis et al. [158] were the first team to actually publish an analysis of this problem and make available their data. 
Like Whiting and Wheeler, they used 18S rRNA sequences and recovered Halteria. In a 1996 follow-up work [159], increasing the number of taxa to 26 and including all holometabolous orders, they again recovered Halteria. In addition to parsimony, these authors used the neighbour-joining distance method with a model [160] designed to accommodate nucleotide compositional bias (box 3) among lineages and among-site rate variation (box 3). Strepsipteran 18S was found to be about 1000 nucleotides longer than the next longest 18S and shared extreme and similar AT nucleotide compositional bias with Diptera. All their analyses favoured Halteria, including those designed to correct for the bias they had observed. However, in an analysis they did not show, when site-specific rates were used to correct for among-site rate variation [161,162], the bootstrap (box 2) value for Halteria dropped from 100% to 77%. They cautioned that 18S could be artificially clustering long-branched taxa and looked forward to investigating other genes not linked to rRNA to test their findings.
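
    The kind of compositional check that reveals a shared AT bias can be sketched in a few lines. This is only an illustration under stated assumptions: the two sequences below are made-up toy strings, not real 18S data, and the function names are hypothetical.

```python
from collections import Counter

def base_frequencies(seq):
    """Return the proportion of A, C, G and T in a sequence, ignoring gaps
    and ambiguity codes. Assumes at least one unambiguous base is present."""
    counts = Counter(base for base in seq.upper() if base in "ACGT")
    total = sum(counts.values())
    return {base: counts[base] / total for base in "ACGT"}

def at_content(seq):
    """Return the combined proportion of A and T."""
    freqs = base_frequencies(seq)
    return freqs["A"] + freqs["T"]

# Made-up toy sequences for illustration only (not real data).
sequences = {
    "taxon_balanced": "ACGTACGTACGTACGTACGT",
    "taxon_AT_rich": "ATATTAATAGATTACATATA",
}

at_by_taxon = {name: at_content(seq) for name, seq in sequences.items()}
```

    Taxa whose composition departs strongly from 25% of each base, and especially taxa that depart in the same direction, are candidates for artificial clustering under methods that assume a composition shared by all lineages.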

    Figure 4.

    Figure 4. Main image: electron micrograph of a male Stylops ovinae (Strepsiptera). All insects have a three-segment thorax, each with a pair of legs. Wings, when present, are found on the second and third segments. Strepsipterans have reduced forewings modified as sense organs (arrows) attached to the small middle thoracic segment. Their anterior thoracic segment is greatly reduced. Flies have similarly reduced hindwings, attached to the third thoracic segment. The third thoracic segments of Strepsipterans and beetles are highly expanded, containing the functional wings and associated flight muscles. (Image copyright Hans Pohl, used with permission. Insert: Wikipedia creative commons.)

    The largest dataset of the 1990s exploring the phylogeny of Holometabola was presented in 1997 by Whiting et al. [77]. Approximately 1100 18S and 400 28S rRNA positions were aligned with the multiple sequence alignment (box 3) program Malign [163], and analysed with parsimony. Molecular data were then combined with morphological data taken from the literature. Sensitivity to alignment was explored by evaluating trees both with, and without hypervariable regions. Both beetles and neuropteroids were polyphyletic, owing to contamination. The paper is best remembered for its recovery of Halteria, a hypothesis that they would vigorously defend [74,164–167] until morphology [48,168] and additional genes [47,154,169,170] overturned it 15 years after it had been proposed.

    9. Long-branch distraction?

    Whiting [77,164] rejected the suggestion that Halteria was an artefact of long-branch attraction (box 2), arguing that 18S and 28S corroborated one another. However, these are not independent genes but rather different regions of the same transcript. Countering Whiting's other arguments, Huelsenbeck showed in 1998 [171] that the length of the branches leading to Strepsiptera and Diptera were ‘virtually unparalleled in phylogenetic analysis’. Huelsenbeck's analysis was among the first to apply likelihood to a large number of insect orders. Whiting [77] had argued that the branch leading to the amphiesmenopterans was ‘not far out of range’ of those leading to the Strepsiptera and Diptera. However, this statement missed the central tenet on which long-branch attraction is based. In parsimony analyses, as independent changes are transferred away from the terminal branches leading to both Strepsiptera and Diptera where they occurred, to the internode (box 2) that falsely links them together, the observed terminal branch lengths are underestimates of the real number of independent changes. In other words, parsimony takes two independent changes and assumes they are shared-derived character states. By removing either Strepsiptera or Diptera from the analysis, each remained in the same position relative to the remaining taxa. Their argument was that Strepsiptera could not have been attracted to Diptera given that it ends up in the same place in the tree when Diptera is removed [165]. However, removing either taxon simply caused the remaining long-branch taxon to attract to the next longest branch, the Amphiesmenoptera, a branch that they recognized as almost as long [77] as those leading to Strepsiptera and Diptera (although they had underestimated these branch lengths). Huelsenbeck [171] showed that, given the taxon sampling used by Whiting et al. 
[77], the branches leading to both Strepsiptera and Diptera were long enough to attract one another with parsimony, and that likelihood analyses could not distinguish among hypotheses. It was recognized that taxa at the end of long branches may actually be sister groups [107,157,171,172], but that the rRNA data in hand could not support any conclusion including Halteria. Friedrich & Tautz [173], confirming the observations of Chalwatzis [158,159], showed in 1998 that there had been an extreme change in both substitution rate and compositional bias (box 3) in the stem dipteran lineage that would pose problems for phylogenetic analyses. These biases were further explored in 2000 by Steel et al. [172] and others [174]. Hwang et al. [189] sequenced additional large subunit rRNA fragments in order to test the Halteria hypothesis. Their parsimony analyses recovered Halteria, which they attributed to long-branch attraction (box 2), whereas their likelihood analyses placed the strepsipteran with the scorpionfly. All these authors recommended that the Halteria hypothesis be dropped. So why did it take phylogenomics to settle the issue? It didn't, and this is not the wisdom of hindsight. For likelihood practitioners, it was settled in 1998. It had been conclusively demonstrated that Halteria was the result of inappropriate methods and obvious predictable bias. Halteria has only been found from the analysis of nuclear rRNA data, and contradicted by every other source of data [47,50,154,168,175]. Genomic [175,176] and transcriptomic (box 4) [50,177] analyses now leave very little room for debate: Strepsiptera belongs as sister to the beetles. The debate was never really about Halteria, but rather, about the philosophical merits of parsimony versus likelihood.
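
    The way parsimony converts two independent changes into one apparent shared-derived state can be shown with a minimal Fitch parsimony sketch. The four taxa, the single site and both trees are hypothetical: two long-branch taxa are assumed to have acquired the same base independently, and the tree labelled `assumed_true_tree` is an assumption of the example, not a result from any cited study.

```python
def fitch_length(tree, states):
    """Return the minimum number of changes at one site (Fitch parsimony).

    `tree` is a nested tuple of taxon names, e.g. (("a", "b"), ("c", "d")).
    `states` maps each taxon name to its character state at the site.
    """
    def visit(node):
        if isinstance(node, str):              # tip: state set is fixed
            return {states[node]}, 0
        left_set, left_cost = visit(node[0])
        right_set, right_cost = visit(node[1])
        shared = left_set & right_set
        if shared:                             # children agree: no new change
            return shared, left_cost + right_cost
        return left_set | right_set, left_cost + right_cost + 1

    return visit(tree)[1]

# Hypothetical site: both long-branch taxa independently changed from A to T.
site = {"long1": "T", "long2": "T", "short1": "A", "short2": "A"}

assumed_true_tree = (("long1", "short1"), ("long2", "short2"))
long_branch_tree = (("long1", "long2"), ("short1", "short2"))

steps_true = fitch_length(assumed_true_tree, site)   # two independent changes
steps_wrong = fitch_length(long_branch_tree, site)   # one apparent shared change
```

    Under equal weights, every such convergent site costs one step less on the tree that joins the two long branches, so the more convergence there is, the more strongly parsimony prefers the incorrect grouping.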

    10. Alignment issues

    In addition to disagreements over the merits of parsimony, the methods of nucleotide alignment (box 3) played an important role in insect phylogenetics during the Sanger sequencing period [107,163,178–186]. The definition of cladistics had been transformed, and, in the new sense, was characterized by strict and exclusive adherence to parsimony analyses. The first work in insect molecular systematics that covered all orders came from a group of cladists (in the new sense) who were centred at the American Museum of Natural History in New York [74]. They extended the dataset of Whiting with the same rRNA fragments as in the 1997 work [77], but with additional taxa, particularly outside Holometabola. The principal analytical difference was the implementation of simultaneous alignment and tree building [187]. They called this method ‘direct optimization’ when implemented by their program, POY [179]. The molecular data by themselves, presented in their figs 12a, 13 and 14, suggested many implausible relationships [107]. However, it seems that the morphological data provided a stabilizing scaffold that mediated the misbehaviour of the molecular data. In order to explore the influence of different analytical assumptions, they presented six combined data trees. The analysis that minimized incongruence among datasets (their fig. 11) would seem to be the favoured hypothesis, although this was not explicitly stated. The summary trees of all assumption sets, shown in their fig. 18a,b, were largely unresolved consensus trees (box 3). However, often their results are cited as their fig. 20, which did not come from an analysis but rather was a ‘discussion tree’ created from nodes the authors favoured from different datasets. The deepest parts of their trees seemed robust to analytical assumptions, whereas relationships among polyneopterans and holometabolans were unstable. 
Despite published papers that pointed toward branch effects and compositional bias (box 3) for these data [137,156,157,171,172,188,189], they did not take these problems into consideration, favouring parsimony on philosophical grounds.

    Many of the differences among phylogenetic hypotheses were the result of differing analytical approaches and ambiguous alignment (box 3) of rRNA data. The history of alignment disputes has been described in detail elsewhere [181,182,185], and some researchers likely turned to nuclear single-copy genes simply to avoid the problems of rRNA alignment altogether, and perhaps the tedious bickering from both sides of this issue [190]. POY has since fallen out of favour with systematists, owing to numerous and diverse criticisms [121,183,184,191], although the possibility of simultaneous alignment and tree building remains [192]. The problem with model-based simultaneous alignment and tree building is that it is difficult to create a biologically reasonable model for gaps.

    11. The sensibility of sensitivity

    By 2001, the insect systematics community was strongly divided into parsimony, and likelihood camps. It was another set of parallel universes, with different journals (Systematic Biology versus Cladistics), different heroes (Felsenstein versus Farris), different branch support measures (bootstraps versus jack-knifing and Bremer support (box 2)) and even different computer systems (Mac versus PC) brought about by the platforms of their different programs (PAUP versus Winclada, NONA, and TNT). Many in both camps basically dismissed the ideas from the other side as flawed and without merit. Most morphologists found themselves in the parsimony camp, likely because of tradition and the fact that morphological characters are less amenable to modelling than molecular characters. Much of the error in parsimony analysis could have been mediated by differential weighting of characters, upweighting slow sites, and downweighting fast sites [123,193]. Morphologists had always weighted their data, if only by selection of characters that they deemed reliable. However, with DNA there was little interest in differential weighting, as one side rejected it based on the insistence that equal weights were assumption free, and the other side preferred likelihood, because models mimic differential weights with the added benefit of being grounded in statistics [194]. Weights and other parameters (box 3) upon which phylogenetic conclusions depend must be selected by the user. If their selection is arbitrary, then subjectivity remains, but it is transferred from a thinking person to a machine [181,182,185]. Many in the molecular-parsimony camp were dedicated to POY analyses, and believed that they were removing as much subjectivity as possible. In order to deal with the problem of objectively selecting analytical parameters, they developed a brand of sensitivity analyses [74,195] that was based on incongruence length difference (ILD) tests [196]. 
ILD testing involves comparing subdivisions of the data with combined data, seeking parameters that minimize incongruence. Criticism of ILD testing is beyond the scope of this review, but can be found in many works [182,197–201]. However, even if ILD tests were legitimate, one must still decide which among many parameters should be evaluated, each with an infinite space to explore, with each influencing the behaviour of the others [182]. Grant & Kluge [202], in a particularly radical application of their own view of epistemological consistency, reject the whole idea of sensitivity analyses (box 3), and suggest that all parameters should be equal, and set to 1 on philosophical grounds. Ogden & Whiting [203] applied sensitivity analysis to the ‘Palaeoptera problem’: the phylogenetic positions of dragonflies and damselflies (Odonata) and mayflies (Ephemeroptera) relative to insects that have the ability to fold their wings (Neoptera). They showed that the results were indeed sensitive to input parameters. In a justification for using a single analytical method, and counter to exploring the influence of the application of model-based methods, they stated that they ‘…do not consider congruence among different methodologies to be a suitable measure of robustness because agreement among inferior methods is nebulous at best’. It seems that this attitude was shared by both sides, as model-based analyses were not explored by the cladistics group until Terry & Whiting in 2005 [204], and parsimony analyses were virtually abandoned by the practitioners of likelihood. We see now that short, ancient internodes (box 2) are always sensitive to assumptions and input parameters. Thus, the nodes unseen by Börner in 1904 collapse with sensitivity analyses (box 3) as it was applied, as seen, for example, in fig. 18 in Wheeler et al. [74].
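
    The logic of choosing parameters by minimizing incongruence can be sketched as follows. The sketch assumes the common formulation in which incongruence is the length of the combined-data tree minus the summed lengths of the trees from the separate partitions, rescaled by the combined length; whether a given study used exactly this variant is an assumption. All tree lengths and parameter labels below are hypothetical.

```python
def ild(combined_length, partition_lengths):
    """Extra steps needed when all partitions are forced onto one tree."""
    return combined_length - sum(partition_lengths)

def rescaled_ild(combined_length, partition_lengths):
    """Incongruence expressed as a fraction of the combined tree length."""
    return ild(combined_length, partition_lengths) / combined_length

# Hypothetical most-parsimonious tree lengths under three alignment
# parameter sets (gap cost : change cost). Values are for illustration only.
results = {
    "gap 1:1": {"combined": 1000, "partitions": [400, 550]},
    "gap 2:1": {"combined": 1200, "partitions": [500, 676]},
    "gap 4:1": {"combined": 1500, "partitions": [600, 780]},
}

scores = {
    name: rescaled_ild(r["combined"], r["partitions"])
    for name, r in results.items()
}
preferred = min(scores, key=scores.get)  # parameter set with least incongruence
```

    The sketch also makes the limitation noted above concrete: only the parameter sets that the analyst chose to list are ever compared, so the preferred set is the best of an arbitrary sample rather than of the full parameter space.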

    Concerning the relationships among insect orders, the entire Sanger period provided few if any new insights that were widely agreed upon. Cockroach paraphyly, an apparent exception, had been suggested based on morphology [205]. Part of the lack of resolution came about because the parsimony and likelihood schools were so far apart, and non-specialists could not choose between them. Even a hypothesis that was supported by practitioners on both sides of the analytical divide, Nonoculata (Protura + Diplura; appendix A), now seems to be an error (but see [206]). In addition, the common result of finding snow fleas (Mecoptera: Boreidae) closer to the fleas than other mecopterans is now in question. These are very difficult phylogenetic problems. The current prevailing opinion is that model-based analyses outperform parsimony even when parsimony is weighted to be more realistic [123] (despite the editorial in 2016 in Cladistics [207]). Accepting this premise, the philosophically driven parsimony-based insect molecular phylogenies that dominated the literature in the 1990s and 2000s were an unfortunate detour, especially when compounded by the failure to recognize the errors resulting from DNA compositional bias, non-homogeneous substitution rates, alignment error and the inconsistencies of rRNA analysis with POY [107,171,172,181–184,208].

    12. The likelihood camp

    The basic principles and performance of likelihood [209–215] were laid out by Felsenstein long before they became standard practice. Likelihood was first introduced into phylogenetic systematics by Cavalli-Sforza and Edwards in 1965 but it was not widely applied because user-friendly programs were not available until PHYLIP was developed in 1980 [216]. Even after likelihood programs were available, it took a while for people to develop an understanding of how models of evolution could lead to an estimate of phylogeny. Parsimony was far easier to grasp. Many were uncomfortable with the number of assumptions required for model-based analyses. However, although it was claimed that the assumptions required for equally weighted parsimony were fewer or non-existent, it is clear that they were simply not defined. If they were defined, then they would be exceedingly complex and unacceptably unrealistic [121,201]. Even if there were fewer assumptions, these few would still lead to error with certainty under common branch length combinations [188]. However, a major obstacle to using likelihood was that it was difficult to analyse more than 10 taxa in a reasonable time frame. Sophisticated models of evolution could not be implemented until computers gradually gained the speed to analyse datasets of typical size, more than 10 years after PHYLIP was introduced. Throughout the 1990s, as computational speed increased, phylogenetic methods based on likelihood as an optimality criterion grew in importance and implementation. In 2000, it could still take weeks on a desktop computer to analyse 500 nucleotides for 50 taxa. Bootstrapping or any kind of branch support (box 2) was difficult if not impossible for likelihood analyses with more than 25 taxa until fast maximum-likelihood programs were developed: PhyML [217], Garli [218] and RAxML [219].

    The motivation for implementing likelihood was strong. Felsenstein had demonstrated in 1978 [188] that, under parsimony with some branch length ratios, the addition of data would strengthen support for the wrong tree. He speculated that parsimony would work if rates of evolution were low or sufficiently equal among lineages. Hendy & Penny [220] extended Felsenstein's work to show that neither of these conditions for the success of parsimony would hold once the number of taxa exceeds four. They concluded that rather than unequal rates it was the long branches that were the problem, and introduced the concept of long-branch attraction. The idea that adding more data could not overcome this bias was hard to accept, and we still find the idea expressed that more genes or increased taxon sampling is a panacea. Into the 1980s, most cladists were still basking in the glow of defeating the numerical taxonomists (whom they called pheneticists). Although likelihood is clearly based on individual characters (like parsimony), its statistical underpinnings were incorrectly assumed by some cladists to link it to phenetics (box 2). This is ironic, because, as Tuffley & Steel [221] demonstrated, under certain (unrealistic) models of evolution, likelihood can be equated with parsimony.

    Bayesian analysis, which shares many properties with likelihood, was introduced into phylogenetics in 1999 [222], and could be implemented in the user-friendly program MrBayes [223]. By 2001, many in the likelihood school rapidly adopted MrBayes, because it calculated branch support in the form of posterior probabilities at lightning speed [141]. It was later realized that much longer Bayesian runs were necessary to be sure that the program had converged on the optimal answer, especially when data required complex models of evolution. However, the problems with model-based analysis were not entirely because of ignorance or the lack of computing power. The influence of long-branch attraction was debated [164,165,224], but its ubiquity was not fully understood. Models that did not accommodate key elements of reality, such as among-site rate variation (box 3), could be as error prone as parsimony, without parsimony's comfortable philosophical footing based on the perception that it minimized unjustified assumptions. It was uncomfortable to use a method so dependent on models, if one could not justify which model to use. Model selection became a major focus of phylogenetic studies [225–227]. At first models of evolution were tested manually using likelihood ratio tests [228,229], but this became automated in 1998 with ‘Modeltest’ [230]. It seemed that this program almost always suggested the most complex model, which led to the development of decision theory (reviewed in Sullivan & Joyce [231]), including a stronger penalty for increasing the number of parameters (box 3).
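
    The effect of a stronger parameter penalty can be shown with a short sketch. The criteria used here (the likelihood ratio statistic, AIC and BIC) are standard textbook formulas chosen for illustration; the text above does not say which criteria any particular program applied, and the log-likelihoods, parameter counts and number of sites are hypothetical.

```python
import math


def lrt_statistic(lnl_simple, lnl_complex):
    """Likelihood ratio statistic for two nested models."""
    return 2.0 * (lnl_complex - lnl_simple)


def aic(lnl, n_params):
    """Akaike information criterion: lower is better."""
    return 2.0 * n_params - 2.0 * lnl


def bic(lnl, n_params, n_sites):
    """Bayesian information criterion: penalty grows with log(sample size)."""
    return n_params * math.log(n_sites) - 2.0 * lnl


if __name__ == "__main__":
    # Hypothetical fits to an alignment of 500 sites.
    simple = {"lnl": -5000.0, "k": 1}
    complex_ = {"lnl": -4990.0, "k": 10}
    n_sites = 500

    print("LRT statistic:", lrt_statistic(simple["lnl"], complex_["lnl"]))
    print("AIC simple/complex:",
          aic(simple["lnl"], simple["k"]), aic(complex_["lnl"], complex_["k"]))
    print("BIC simple/complex:",
          bic(simple["lnl"], simple["k"], n_sites),
          bic(complex_["lnl"], complex_["k"], n_sites))
```

    In this made-up case AIC narrowly prefers the parameter-rich model while BIC, whose penalty per parameter is larger, prefers the simple one.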

    In hindsight, if we are to judge by current standards, and our ability to assess accuracy in the light of phylogenomic data, likelihood analyses were both more accurate and philosophically grounded. Most early likelihood practitioners in entomology confined themselves to intraordinal relationships, and their work has become relatively robust, as datasets have become larger. Friedrich & Tautz, in 1995 [232], were among the first to use likelihood to estimate deep arthropod relationships with PHYLIP [216]. Their analysis included three hexapods, and recovered Pancrustacea, and crustacean paraphyly. Likelihood (among other methods) was also used by von Dohlen & Moran [233] in 1995 to demonstrate the paraphyly of ‘Homoptera’. Frati et al. [228] used maximum-likelihood analyses of mitochondrial COII gene data in 1997 to examine relationships among springtails, and demonstrated that including a correction for among-site rate variation (box 3) in the analysis had more of an effect on likelihood scores than the substitution models themselves. Flook & Rowell [234] used likelihood methods to explore the properties of mitochondrial data among orthopterans. Whitfield & Cameron [235] found in 1998 that likelihood outperformed parsimony in their study of hymenopteran 16S rRNA. Lo et al. [236] used likelihood methods in 2000 to demonstrate the paraphyly of cockroaches. In 2001, Kjer et al. were the first entomologists to use Bayesian methods in their study of caddisfly (Trichoptera) phylogeny, and Kjer [107] was the first to include many insect orders with Bayesian methods.

    It was not until the mid-2000s that consensus among entomologists in the USA swung towards likelihood analyses for molecular data, but parsimony analyses are still being published because of cultural/historical factors, and parsimony is still actively favoured by the journal Cladistics [207]. It can take a long time for attitudes to shift, and sometimes recollection is subject to ‘retrospective meaning change’ (see discussion by Hull [237]). As with debates over creationism, or climate change, the fact that there are two sides to an issue does not mean that both sides are equally supported. Parsimony for molecular data seems to be supported by faith. Sometimes progress in science comes, not from evidence or flashes of insight, but through strong personalities fading into retirement. (Paraphrasing Max Planck: ‘Science advances one funeral at a time’.)

    13. The dominance of ribosomal RNA

    Although many papers included small fragments of 28S or histone H3, it was the 18S that dominated results from the Sanger sequencing period, sometimes stabilized by morphological characters [107]. This was partially owing to historical artefact and partially due to the ease of amplifying and sequencing nuclear rRNA. The 18S gene suffers from extensive among-site rate variation [193] and severe alignment problems within some regions, whereas (unlike the 28S) the alignable regions are practically invariant. Kjer [107] explored the properties of the 18S, structurally aligned, using a model-based analysis that accommodated rRNA covariation [238]. His phylogeny was much closer to current consensus than previous parsimony analyses of 18S. In 2005, a large insect phylogeny was presented by Terry & Whiting [207], using histone H3, larger portions of the 18S and 28S, and a modified morphological data matrix from Wheeler et al. [74]. They focused on polyneopterans, including Mantophasmatodea for the first time, and included a Bayesian phylogeny along with their POY-based parsimony [204]. Their Bayesian analysis was a great leap forward. They recovered many of the nodes we now find with larger datasets; many for the first time with molecular data, including Xenonomia, Eukinolabia and Haplocerata (appendix A), which they named, as well as Polyneoptera, Neuropteroidea (Strepsiptera was not included) and Antliophora. In a counterpoint to POY-based analyses, Kjer et al. [141] provided a review of the data of the time, with a commentary on methods. They reported the results of a 15 000 nucleotide multi-gene supermatrix, put together from complete 18S, a large fragment of 28S, EF-1α, histone H3 and mitochondrial 12S, 16S, COI and COII, along with 170 morphological characters from the older sources, such as Hennig, and Kristensen [11,62–64]. Their results came very close to our current consensus, particularly within the polyneopterans.
In all Kjer's analyses, Strepsiptera and Zoraptera were excluded, because these taxa exhibited extreme substitution rate accelerations in their rRNA. He was also suspicious of the published Zoraptera sequences. Zoraptera was resequenced and a modified structural alignment [107] was used by Yoshizawa & Johnson [75] in order to place this difficult taxon. They found it to be sister to Dictyoptera, as did Ishiwata [170] with nuclear single-copy genes. They cited morphological support for this relationship from Boudreaux [69] and Kukalová-Peck [82]. Misof's group [239] provided an insect-specific secondary structural model, and re-evaluated Kjer's [107] analysis of 18S, with increased taxon sampling. They found similar results to those from earlier structural alignments although they found Zoraptera grouped with stoneflies (Plecoptera). This analysis included Strepsiptera, which, as in other likelihood analyses of rRNA [157,189] did not group with Diptera, but instead, in this case as sister to an implausible Diptera + ‘Coleoptera’ group, with the long-branch Diptera acting as a second internal root that rendered beetles paraphyletic. For the first time since 1997 [77], the molecular data recovered Hymenoptera as sister to the rest of Holometabola (appendix A; Aparaglossata). The most thorough exploration of rRNA-based insect phylogeny was completed by von Reumont et al. in 2009 [117]. They used an automated alignment algorithm that incorporated rRNA secondary structural information [186], eliminated randomized sites (phylogenetic noise) with the program Aliscore [240], and used more realistic substitution models. Thus, none of the previous criticisms over manual manipulation of alignments and manual data exclusion could be applied to this study, recovering Nonoculata, Ectognatha, Dicondylia, Pterygota, Chiastomyaria, Neoptera, Holometabola, Aparaglossata, Amphiesmenoptera and Mecopterida.

    Published phylograms (trees with branch lengths proportional to the number of estimated substitutions) [75,117,141,171,172] illustrate the extreme heterogeneity of rRNA substitution rates among orders, and this property causes problems with standard methods [172], even under likelihood. Protura and Diplura share extreme branch lengths relative to neighbouring Collembola and Archaeognatha, and the rRNA of both has extremely long regions of hypervariability that are difficult to align [241]. Phylograms show that Zoraptera, Strepsiptera and Diptera are also extreme. Odonata evolve more slowly than their neighbours in the tree. Ribosomal RNA analyses frequently recover Nonoculata [74,75,107,117,239,242–245], Chiastomyaria [117,141,239], Dermaptera sister to Plecoptera [107,117,239], and mecopteran paraphyly [74,75,77,107,117,141,246]. The consistency of these results despite the differences in alignment and optimality criteria indicates that rRNA supports these relationships when analysed with existing methods, even though much larger datasets now contradict Nonoculata and mecopteran paraphyly.

    14. Other types of data

    Besides rRNAs and the few nuclear protein-coding gene studies, there were other novel character systems explored for insect phylogenetics, such as locations of introns, and mitochondrial gene order. Rokas et al. reported in 1999 that an insertion in a homeobox gene [247], shared by Diptera and Lepidoptera, was not found in Strepsiptera, contradicting Halteria. Carapelli et al. [248] noted that Collembola and Diplura shared the loss of an intron within EF-1α. Intron positions in EF-2 were mapped [249], showing that Coleoptera, Lepidoptera and Diptera shared a derived arrangement that was absent in Hymenoptera, predicting our current understanding. A survey of intron positions in EF-1α [250] found a remarkable tree from only six informative characters, but intron positions did show homoplasy (box 2), largely because of independent loss. A study of ecdysone receptors by Bonneton et al. [251] showed a significant rate acceleration that countered Halteria, as did their sequence analysis. Predel & Roth put their analysis of neuropeptides to use in studies of cockroaches, grasshoppers and Mantophasmatodea [252–255]. Xie et al. tabulated the distributions and lengths of 18S hypervariable regions [241], and they reported that some of them could be used as synapomorphies for insect groups. In addition to updating a secondary structural model for insects, they found that Zoraptera and Dermaptera shared the greatest number of hypervariable regions of identical lengths. Boore et al. [256] examined mitochondrial gene order among arthropods in 1995, and they found that Pancrustacea was supported by a mitochondrial gene order character [257]. After this discovery, it was hoped that mitochondrial gene order might help resolve difficult nodes among insect orders, because it was assumed that gene order was highly conserved and unlikely to be homoplastic (box 2). However, the most controversial internodes are likely to be short. 
We can think of internodes as targets where the size of the target is proportional to the length of time an ancestral lineage exists before it splits. As in archery, small targets are hard to hit. Thus, the probability of hitting extremely short internodes with extremely rare events is extremely low. An understanding of this phenomenon is currently important in genomic studies, where it is hoped that, with an abundance of characters, extremely short internodes may be hit by extremely rare changes in the structure of genomes. Unfortunately, mitochondrial gene order is remarkably conservative in insects, except within Paraneoptera [258–261] and Hymenoptera [262,263], with groups supported by changes in gene order, summarized in a recent review by Cameron [190].
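
    The target analogy can be made concrete with a simple calculation. The sketch below assumes that rare genomic changes arise as a Poisson process with a constant rate along a lineage; that assumption, the function name and the example rate and durations are illustrative and do not come from the studies cited above.

```python
import math


def prob_at_least_one_event(rate_per_myr, internode_myr):
    """Probability that one or more rare events fall on an internode.

    Assumes events follow a Poisson process with a constant rate
    (events per lineage per million years), a simplification.
    """
    if rate_per_myr < 0 or internode_myr < 0:
        raise ValueError("rate and duration must be non-negative")
    # 1 - exp(-rate * time), written with expm1 for accuracy near zero.
    return -math.expm1(-rate_per_myr * internode_myr)


if __name__ == "__main__":
    rate = 0.01  # hypothetical: one rearrangement per 100 million years
    for duration in (1, 5, 50, 200):
        p = prob_at_least_one_event(rate, duration)
        print(f"{duration:>4} Myr internode: P(marked) = {p:.4f}")
```

    Under these assumed values a short internode is very unlikely to be marked by even one rearrangement, whereas a long internode usually is, which is the archery point made above.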

    15. Mitochondrial genomes

    Mitochondrial data have been the subject of two recent reviews [190,264]. The accumulation of mitochondrial genomes continued through the 2000s [265–269], at a slow pace, but picked up rapidly after 2003 with concerted efforts from Cameron, Song, and Whiting [190]. Currently, whole mtDNA genomes are accumulating very rapidly because they can be efficiently targeted with high-throughput (box 4) methods [270], and are often recoverable as accidental ‘by-catch’ in high-throughput sequencing. It was not until preliminary results from the full-scale efforts to sequence entire mitochondrial genomes were published [271–274] that the extent of the problems with mitochondrial data became clear. Cameron and others concluded that mitochondrial data were promising, but that nucleotide compositional bias among lineages, unequal substitution rates among groups, and other long-branch effects must be carefully considered. Many of the relationships recovered with complete mitochondrial genomes were implausible, and they recommended that mitochondrial genomes be combined with other sources of data. Talavera and Vila found the same problems in 2011 [275], and proposed that deep nodes cannot be reconstructed with the methods of the time (which included Bayesian and likelihood analyses). Simon & Hadrys [274] recommended a similarly cautious view, finding many implausible relationships among orders even when using dense taxon sampling, and careful modelling. Chen et al. [276] found that including a projapygid helped recover dipluran monophyly, but they were still unable to recover hexapod monophyly with extensive taxon sampling among basal hexapods and arthropod outgroups. Cameron's more optimistic review of the phylogenetic implications of insect mitochondrial genomics summarized model violations and made thorough recommendations for the appropriate treatment of mitochondrial genomes for phylogenetics. 
The issue of whether mitochondrial data are ‘good’ or ‘bad’ is clearly a gross oversimplification. Many ancient nodes that are accepted and corroborated by other data are recovered from mitochondrial data [274,277–281], and many relationships among polyneopterans are shared between nuclear and mitochondrial analyses [190]. For example, using mtDNA genomes, Wan et al. [281] found many of the nodes that Misof et al. [50] recovered from transcriptomes (box 4) and some of these nodes (figure 5: P,Q,R) have only rarely been seen before. Mitochondrial data consistently recover Mantophasmatodea with Phasmatodea [273,281]. Cameron et al. [282] found Megaloptera sister to Neuroptera, reflecting the results from transcriptomes [50]. The number of nucleotides in any analysis is strongly correlated with branch support. However, as mitochondrial genes are all linked and thus inherited as a unit, once the gene tree is accurately recovered, there is little more in terms of corroboration that the full mitochondrial genomes can add. The motivations and disagreements today, in the era of ‘big data’ phylogenomics, are sometimes centred around those who advocate for more data (in terms of both longer sequences and more taxa), and those who advocate ‘better data’. This disagreement has been with us since the beginning of molecular systematics, and it misses the point that more data, better data and better models are all good things.

    Figure 5. (a) Current consensus, modified from Misof et al. [50]. Previous studies mentioned in this review are numbered and colour coded on the left, with nodes they supported on the right. Red: morphology, without formalized data matrices. Orange: morphology, with computer analysis. Blue: Sanger sequenced data in which rRNA played a predominant role. Black: Sanger sequenced multiple nuclear protein-coding genes. Green: large genomic or transcriptomic data. (b) The sizes of datasets, plotted through time. The y-axis is on a log scale. Colours as in (a) except that Liu & Beckenbach [118] were mitochondrial data. Data size is calculated by multiplying the number of taxa by the number of characters. For works where amino acids were used as characters, we multiplied the number of characters by 3, so that these datasets were comparable to nucleotide datasets. Transcriptome work often has many missing data, so that character numbers were multiplied by the proportion of data present.
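
    The data-size measure described in the caption can be written as a short function. This is a sketch of the stated rule only; the function and parameter names are hypothetical, and the example values are illustrative rather than taken from any study plotted in the figure.

```python
def dataset_size(n_taxa, n_characters, amino_acids=False, proportion_present=1.0):
    """Dataset size following the rule stated in the figure 5 caption.

    size = taxa x characters, with amino-acid characters multiplied by 3
    and character numbers scaled by the proportion of data present.
    """
    if not 0.0 <= proportion_present <= 1.0:
        raise ValueError("proportion_present must lie between 0 and 1")
    characters = n_characters * 3 if amino_acids else n_characters
    return n_taxa * characters * proportion_present


if __name__ == "__main__":
    # A nucleotide dataset: 50 taxa, 500 sites, no missing data.
    print(dataset_size(50, 500))
    # A hypothetical amino-acid dataset: 10 taxa, 1000 sites, half missing.
    print(dataset_size(10, 1000, amino_acids=True, proportion_present=0.5))
```

    Because sizes span several orders of magnitude, values computed this way are best compared on a log scale, as in panel (b).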

    16. Work on individual orders

    This review has given short shrift to the vast majority of insect phylogenetics papers because of our focus on works addressing higher-level insect phylogeny. Given their almost unimaginable diversity, it is impossible for any individual to be considered an expert for all Hexapoda, and most workers spend their careers exploring particular groups. Here we list a sample of the recent advances from various authors in the phylogeny of Odonata [283–290,292], Ephemeroptera [278,293,294], Plecoptera [295], Dermaptera [296], Embioptera [297], Phasmatodea [298], Dictyoptera [236,279,299–301], Mantodea [302], Orthoptera [234,303–306], Hemiptera [307,308], Psocodea [309], Hymenoptera [310–317], Neuropterida [318–320], Coleoptera [321–323], Diptera [324,325], Lepidoptera [326–334], Trichoptera [335–337], Mecoptera [246,338,339] and Siphonaptera [340,341].

    17. Beyond the standard toolbox: multiple genes, transcriptomes and genomes

    Although many useful studies are still published with a few genes and morphology, phylogenetics today frequently involves the analysis of very large datasets. Savard et al. [342], in an early use of genomic phylogenetic resources in 2006, analysed 185 nuclear genes from four holometabolous orders, rooted with a grasshopper and an aphid, and found results that are consistent with our current best estimates: (Hymenoptera, (Coleoptera, (Lepidoptera, Diptera))). At that time, most studies found Hymenoptera to be weakly supported as sister to Mecopterida, the group including Mecoptera, Siphonaptera, Diptera, Trichoptera and Lepidoptera, so the strong support for their alternative result led the authors to suggest that large datasets could resolve long-standing controversies in insect phylogenies. One of the first studies on Holometabola to break free of the standard rRNA and mitochondrial genes for interordinal analyses reported results from six single-copy nuclear genes [47]. A similar study using nine nuclear genes [154] found nearly identical results, both predicting our current understanding of relationships within Holometabola. Even though the datasets were no larger than previous rRNA-dominated analyses, and significantly smaller than the transcriptomic analyses to come (figure 5), the fact that both papers rejected Halteria independent of rRNA gave them extra impact. Three new nuclear protein-coding genes (DPD1, RPB1 and RPB2) were used in 2011 [170], further rejecting Halteria. Sasaki et al. [343] sequenced over 10 000 nucleotides from these same three genes, and focused their attention on the early splits among hexapods, with significant arthropod outgroups, and polyneopterans. They recovered the unusual result of (Protura, ((Collembola, Diplura), Insecta)), which has not been subsequently corroborated.

    18. Data-mining and big, automated phylogeny pipelines

    Behaviourists, ecologists and other biologists rely on phylogenetic trees to understand the evolution of complex characteristics. GenBank is data-rich, and the temptation to create pipelines (box 4) to download, combine, filter and analyse these data to produce a tree is strong. Building upon work by Hunt et al. [344] to generate large datasets from public databases, Peters et al. [312] developed a ‘proof-of-concept’ pipeline that mined GenBank for data from Hymenoptera, in order to construct a phylogeny with over 1000 taxa. The concept worked, but the phylogeny suffered from the quality of the original data in GenBank. Bocak et al. [322] constructed a phylogeny with public databases for more than 8000 beetle species. Again, this study proved that such a thing can work, and supported several disputed internal relationships. Zhou et al. [345] produced a phylogeny of over 16 000 barcode haplotypes from Trichoptera, but this study differed in that constraints were used to insulate the phylogeny from predictable errors. These studies provide evidence that producing huge phylogenies from public databases is feasible. However, based on our experience with genomic and morphological data, we caution that without analytical expertise for the specific properties of the data, as well as the insights of taxonomic specialists for a particular group of insects, it is impossible to reconstruct and evaluate the plausibility of phylogenetic relationships. This idea exemplifies the balance between skilled analyses that produce reasonable phylogenies, and the concern that unjustified or capricious decisions could bias phylogenetic conclusions.

    18.1. The Palaeoptera problem revisited

    An early transcriptomic analysis of seven pterygote orders, rooted with Collembola, grouped the mayflies with Neoptera (the Chiastomyaria hypothesis; appendix A) [346], but the Palaeoptera problem was far from solved, because Regier et al. [347], in a study focusing on arthropods, recovered a contradictory node Palaeoptera. Thomas et al. [348] evaluated the standard Sanger data in 2013, and found support for Palaeoptera. The first very large EST (expressed sequence tags = partial transcriptomes) dataset to evaluate arthropod relationships was published by Meusemann et al. in 2010 [349]. In addition to the size of the dataset, this paper was groundbreaking in terms of filtering the data. Randomized or phylogenetically uninformative sites were algorithmically identified and masked (box 4) with a program called Aliscore [240], and the matrix was optimized with the MARE program [350]. MARE eliminates both genes and taxa that are problematic owing to missing data, resulting in a smaller, but more dense matrix. Meusemann et al. recovered both Palaeoptera and Chiastomyaria using alternative analytical parameters (box 3). In addition, they also found that Hymenoptera was the sister taxon of other Holometabola, and a monophyletic Nonoculata, as in rRNA-dominated analyses.

    A consistent pattern with large datasets is that they tend to recover either Palaeoptera [117], Chiastomyaria or both [50,349], but rarely the morphologically favoured Metapterygota (but see [351]). While Misof et al. [50] reported the monophyly of Palaeoptera, the quartet mapping (box 4) analyses reported in their supplementary materials favoured Chiastomyaria and rejected the morphologically favoured Metapterygota. The resolution of Palaeoptera was predicted to be among the most difficult nodes to recover [352], and even now, with millions of nucleotides applied to the question, it must be considered unresolved (figure 5). We continue to see that the problem nodes from morphology still exist, with continued conflict for the placement of Diplura and Zoraptera and the status of Palaeoptera.

    18.2. Strepsiptera revisited

    Among the first of the truly genomic analyses was Niehuis et al. in 2012 [175], who sequenced the Strepsiptera nuclear genome, and compared it with previously sequenced genomes of two beetles, four hymenopterans, three flies and Bombyx (silkmoth), with two outgroups. They tested four hypotheses regarding the placement of Strepsiptera, and concluded that it belonged with beetles, as originally placed by morphologists. This was further strengthened by McKenna [176], who added a neuropteran genome to the analysis, and by Boussau et al. [177], who added transcriptomes from additional key taxa to genomic sequences, and analysed them with models designed to avoid branch-length effects. They ruled out the possibility of a close relationship between Strepsiptera and either of the beetle families, Rhipiphoridae or Meloidae (among others).

    18.3. Insect phylogeny resolved

    Many laboratories are now collecting large datasets from hybrid capture techniques, such as anchored hybrid enrichment [353], or ultraconserved elements (UCEs) [354]. Both methods allow for the recovery of data from degraded museum specimens. Anchored hybrid enrichment has the advantage that probes can be designed from transcriptomes, making data from the two sources completely combinable. UCEs have the advantage that probes can be designed without the need for sequences from closely related taxa. The Weirauch (Heteroptera), Johnson/Dietrich (Hemiptera and Psocodea), McKenna (Coleoptera), Wiegmann (Diptera), Kawahara (Lepidoptera), Ward, Borowiec, Schultz and Brady (Formicidae) and Kjer/Frandsen (Trichoptera) laboratories currently have large hybrid-enriched datasets in progress, and, according to their conference presentations, these data are largely resolving long-standing problems with strong bootstrap support.

    The Misof et al. insect phylogenomics study based on transcriptomes of 1478 genes [50], published in Science in November 2014, is by far the largest analysis to date of insect relationships and their phylogeny, and provides our current consensus (figure 5). Multiple technical advances occurred between Meusemann et al. in 2010 [349] and the Misof study in 2014 [50]. Both studies involved many of the same authors. Although the Misof study is short in print, one of their strongest contributions is the 200 pages of supplementary materials, which include recommendations for careful assembly (box 4), orthology prediction (box 4), data masking (box 4) and signal optimization. Protein domains were considered as partitions (box 4), and site-specific rate models were developed. Diplura was sister to Insecta, in agreement with Letsch & Simon [355], despite the recovery of Entognatha in other large datasets [347,349]. The 2014 Misof et al. study was the first of the publications from the 1KITE initiative, which has now collected transcriptomes from over 1400 taxa. Subprojects in the works from 1KITE include large datasets targeted at ‘basal hexapods’, Odonata, Polyneoptera, Paraneoptera, Hymenoptera, Coleoptera, Neuropterida, Trichoptera, Lepidoptera and Amphiesmenoptera, along with over 100 side projects dealing with molecular evolution in insects. These side projects put insect phylogenomics at the forefront of the discovery of character systems based on genomic meta-characters, and their impact reaches beyond entomology with development of new phylogenomic approaches. The Misof et al. study has already had a visible impact on phylogenetics of higher order groups by facilitating target enrichment and providing open access data.

    19. Integrated phylogenetics

    Innovative approaches such as µCT [356] and computer-based reconstruction [357], an optimized combined application of different techniques, and the concept of evolutionary morphology [358] have led to a remarkable renaissance in insect morphology in the last two decades, especially in Europe and Japan. Recent years have been characterized by matrices of increasing size, and a distinctly improved documentation of the characters made possible by the use of a broad array of techniques and an optimized workflow [359]. The Bayesian results of the largest morphological character state matrix used in insect systematics up to that time [48] were fully compatible with transcriptomic studies [49,50], indicating that morphology can still play a role in estimating and corroborating molecular phylogenetics.

    Many examples of the value of integrated phylogenetics come from Misof et al. [50]. Perhaps their most unusual result was the grouping of Psocodea with Holometabola, but the possibility that model misspecification had influenced this placement could not be eliminated [50]. Morphological data provide another reason to be sceptical [152]. Zoraptera was not strongly placed in their study either [50], but it was reliably placed in a monophyletic Polyneoptera, as also suggested by recent morphological and embryological studies [79,360]. Despite recent progress, it is obvious that morphology has its limitations. Even large datasets of high quality create partially unsatisfying results, sometimes despite impressive lists of shared-derived character states (synapomorphies) for presumptive clades, as in [360]. Artefactual synapomorphies created by phylogenetic analyses—i.e. ‘cladistic noise’—often suggest results that are in fact insufficiently supported or not supported at all by any convincing features. An example is the ‘clade’ Dictyoptera + (Zoraptera + Plecoptera) supported by a recent parsimony analysis of morphological data by Matsumura et al. [360]. As the authors pointed out, some of the obtained presumptive synapomorphies were obviously the result of misleading redundant evolution. Correlated characters can also cause artefacts in morphology-based phylogenetic reconstructions [361] as addressed in the context of the Palaeoptera problem [362]. Solutions were suggested, based on modified weighting or the exclusion of characters.

    19.1. The closest relatives of hexapods

    A crucial question was apparently inaccessible to morphological approaches but was largely solved with molecular data: the systematic position of Hexapoda. A monophyletic Tracheata (Myriapoda + Hexapoda) was taken for granted in morphology-based studies, with Hexapoda placed either as the sistergroup of Myriapoda (millipedes and centipedes) or as the sister taxon of a myriapod subgroup [363,364]. Molecular datasets of different size and composition, and analysed with different approaches, consistently yielded a clade Pancrustacea (also called Tetraconata), usually with hexapods placed among paraphyletic crustacean lineages [50,257,346,347,349,366–368]. Even though some morphological arguments for Pancrustacea have been presented [369,370], a formal character analysis is still lacking and the morphological evidence is far from convincing. Possible candidates for the closest relatives of Hexapoda within Pancrustacea include the highly specialized relict group Remipedia, Malacostraca, and possibly the miniaturized Cephalocarida [50,347], although other studies contradict this [117,371]. The tremendous morphological gap between these aquatic groups and the terrestrial hexapods hinders meaningful comparisons of morphological characters and hypotheses of homology. It was pointed out by Klass & Kristensen [73] that the monophyly of Hexapoda is not strongly supported morphologically, with basically only one character complex defining it: the regional specialization of the body into head, thorax and abdomen, with the thorax divided into three segments and the abdomen into 11. However, as shown in Beutel et al. [372], the Pancrustacea concept has strong implications for the hexapod groundplan. 
The strongly supported placement of Hexapoda among crustacean groups implies an entire series of additional hexapod autapomorphies, such as terrestrial habits, simplified walking legs, fusion of the second maxillae (labium), the loss of the ventral food rim, the absence of midgut glands and nephridial organs, and others.

    20. Confidence and caution

    Figure 5b shows an exponential growth in the size of datasets since the late 1980s, and we expect this growth to continue. It is tempting to think that every part of insect phylogeny has now been resolved with large datasets. Almost every node has strong bootstrap support. However, bootstrap support was designed to evaluate stochasticity and, with large datasets, stochasticity is reduced or even eliminated. This would be considered a good thing, if models and assumptions were perfect. However, because of the size of current datasets, small biases in the data, or misspecifications of the model can result in strong bootstrap support for error. We predict that model refinement will be a rich source of discovery in the future. For example, the failure to resolve the Palaeoptera problem, as indicated by quartet mapping (box 4) [50], may point towards a true case of conflicting gene tree histories. However, we are reluctant to assume this biological explanation without appropriate analysis. Such a comfortable explanation for misbehaved data, or inappropriate models, can make analytical failures seem like new discoveries. Every caution available at the time was applied by Misof et al. [50], and we find no reason to doubt their results. Moreover, their cautions are reflected in the expansive supplementary material [50]. We would like to emphasize that phylogenies can never be more than hypotheses, subject to the limitations of models and assumptions. This statement is obvious, but bears repeating in the light of enormous datasets that are now available.

    With every advance, from cladistics to Sanger sequencing, and now genomics, we saw a wave of overconfidence that intractable conflicts would be solved, only to learn of new obstacles. We are only beginning to understand the behaviour of large datasets, which is why we cannot write about these innovations from a historical perspective. The discipline will continue to improve. We still look for confirmation and congruence from other sources of data, such as morphology and rare genomic events.

    Both morphological and molecular investigations focused on insect systematics have made tremendous progress in the last decade. Similarly, the investigation of extinct insects has increased its pace with advanced morphological techniques allowing a stunningly detailed reconstruction of amber fossils. Although improvements could still be made in communication across different lines of investigation, it is unlikely that we will see many major revolutions in deep insect phylogeny outside the groups we have flagged as unresolved. The disagreement over parsimony versus likelihood has been resolved. With genomics, we will likely continue to learn about the function of genes and links between developmental and phylogenetic processes and how these processes change over time and across lineages. Optimizing pipelines (box 4) for processing and connecting different sources of evidence is a key target for the future. Such pipelines are one of the main aims of 1KITE and associated projects. Continued integration of different disciplines will likely lead to a much better understanding of the complex evolution of insects, revealing why this group of organisms reached unparalleled species diversity and successfully conquered virtually every terrestrial and freshwater environment on the Earth.

    Appendix A. Names of higher taxa used in the text, and the groups they define. Other groups are indicated in figure 5. Taxa in bold are supported in the current consensus. Taxa in italics have been strongly rejected. Those without indication are [in the opinion of these authors] targets for additional attention. Citations are numbered as in the literature cited section, with the name of the first author and the last two digits of the publication year, for quick reference: [9] = Börner 04; [10] = Hennig 53; [11] = Hennig 69; [14] = Crampton 38; [31] = Wille 60; [34] = Mickoleit 73; [39] = Martynov 25; [47] = Wiegmann 09; [48] = Beutel 11; [49] = Peters 14; [50] = Misof 14; [62] = Kristensen 75; [63] = Kristensen 81; [69] = Boudreaux 79; [72] = Beutel 01; [74] = Wheeler 01; [76] = Wheeler 93; [77] = Whiting 97; [100] = Wipfler 11; [101] = Blanke 12; [107] = Kjer 04; [108] = Field 88; [110] = Turbeville 91; [114] = Wheeler 89; [116] = Pashley 93; [117] = Reumont 09; [140] = Edgecombe 00; [141] = Kjer 06; [152] = Beutel 06; [154] = McKenna 10; [159] = Chalwatzis 96; [170] = Ishiwata 11; [171] = Huelsenbeck 98; [175] = Niehuis 12; [177] = Boussau 14; [204] = Terry 05; [232] = Friedrich 95; [239] = Misof 07; [242] = Luan 05; [243] = Giribet 04; [244] = Gao 08; [245] = Mallatt 09; [248] = Carapelli 00; [249] = Krauss 04; [251] = Bonneton 06; [256] = Boore 95; [281] = Wan 12; [276] = Chen 14; [308] = Cryan 12; [319] = Aspöck 02; [343] = Sasaki 13; [347] = Regier 10; [349] = Meusemann 10; [351] = Simon 12; [355] = Letsch 13; [367] = Cook 01; [373] = Blanke 14; [374] = Beier 69; [375] = Hadrys 12; [376] = Letsch 12; [377] = Savard 06; [378] = Staniczek 00; [379] = Bitsch 04; [380] = Kristensen 97; [381] = Bitsch 00; [382] = Shao 99; [383] = Giribet 01; [384] = Mallatt 06; [385] = Dell'Ampio 09; [386] = Hovmöller 02; [387] = Pisani 04; [388] = Regier 01; [389] = Rota-Stabelli; [390] = Seeger 79.

    taxon | taxa included | evidence | references
    Acercaria | Thysanoptera, Hemiptera, Psocodea | cerci absent, 1 abdominal ganglionic mass, lacinia chisel-like, four Malpighian tubules | [9,11,14,31,39,62,72,74,77]mc [107,141,152,239,351,375]
    Aparaglossata | Holometabola, minus Hymenoptera | loss of paraglossae, ovipositor modified or reduced, max. eight Malpighian tubules | [47–50,77,117,154,170,175,177,239,342,343,349,351,355,375–377]
    Amphiesmenoptera | Lepidoptera, Trichoptera | many (see Kristensen) | generally accepted
    Cercophora | Diplura, Insecta | double claws, 9+9+2 axoneme, cerci | [14,50,72,355]
    Chiastomyaria | Ephemeroptera, Neoptera | indirect flight musculature, copulation with aedeagus | [31,107,117,239,281,355,378]
    Coleopterida | Coleoptera, Strepsiptera | posteromotorism | [9,11,14,31,47–50,62,72,154,170,175,177,343,375,376]
    Condylognatha | Thysanoptera, Hemiptera | mandibular stylet(s) | [9,11,31,50,62,308]
    Dicondylia | Zygentoma, Pterygota | secondary mandibular joint, gonangulum | [9,11,14,31,50,62,72,74]mc [107,117,141,170,276,343,349,347,375]
    Dictyoptera | Blattodea (incl. termites), Mantodea | secondary anterior tentorial bridge, female genital vestibulum, ootheca | generally accepted, although some put roaches with the mantids (based on shared plesiomorphies)
    Dictyoptera+Zoraptera | Dictyoptera, Zoraptera | | [69] tentatively [74]mc [251,355,374]
    Ectognatha (Insecta) | Archaeognatha, Dicondylia | ovipositor, flagellar antenna, Johnston's organ, corpotentorium, terminal filament | generally accepted
    Ellipura | Collembola, Protura | specific entognathy, linea ventralis | [10,11,62,63,72,74]mc [140,343,379–382]
    Entognatha | Collembola, Protura, Diplura | entognathy, eyes partly reduced | [11,31,62,107,117,141,242,248,320,347,349,374,375]
    Halteria | Strepsiptera, Diptera | rRNA (parsimony, NJ), substitution rate similarity, phenetic nucleotide compositional similarity | [74,77,159]
    Haplocerata | Plecoptera, Zoraptera | transcriptomes | [239]
    Hemiptera | Acercaria excl. Psocodea and Thysanoptera | four-segmented labial rostrum, labial endite lobes and palps absent, buccal pump | generally accepted
    Holometabola | Neuropteroidea, Mecopterida, Hymenoptera | complete metamorphosis, characters of larvae | generally accepted
    Mecopterida | Amphiesmenoptera, Antliophora | ecdysone receptors, ovipositor absent, telescoping post-abdomen | [10,11,14,31,47–50,62,72,116,117,141,154,170,175,177,204,251,343,349,351,355,374–378]
    Metapterygota | Odonata, Neoptera | secondary mandibular articulation as ball-and-socket joint, modified mandibular muscles, no subimago | [14,62,72,74]mc [77,114,204,351]
    Neoptera | Pterygota, minus Palaeoptera | wings folded back over abdomen, modified wing base, arolium (?) | generally accepted
    Neuropterida | Raphidioptera, Megaloptera, Neuroptera | third valve of ovipositor with intrinsic muscles | generally accepted
    Neuropteroidea | Neuropterida, Coleopterida | modifications of ovipositor (?) | [36,47,49,50,62,72,141,154,159,170,204,343,375]
    Nonoculata | Protura, Diplura | rRNA, EF-1α introns, eyes lost | [107,141,239,242–245,248,249,349,351,375,376,383–385]
    Palaeoptera | Odonata, Ephemeroptera | bristle-like antennae, aquatic larvae | [9,11,50,101,141,159,170,343,347,349,386]
    Pancrustacea | ‘Crustacea’, Hexapoda | rRNA, mitochondrial gene order, four-partite crystalline cone | [50,76,107,108,110,117,141,232,242,244,245,256,276,343,349,347,384,385,368,387–389]
    Paraneoptera | Acercaria, Zoraptera | 6/4 Malpighian tubules, 3/2 tarsomeres, 2/1 abdominal ganglionic complexes | [11,31,62,152,376]
    Paurometabola | Polyneoptera excl. Plecoptera | fan-like folding of hind wing, enlarged euplantulae, terrestrial larvae | [10,375]
    Polyneoptera | Neoptera excl. Acercaria and Holometabola | tegmina, enlarged anal field of hind wing, euplantulae | [14,31,39,50,343,347,355]
    Psocodea | Psocoptera, Phthiraptera | cibarial water-vapour uptake apparatus, antennal rupture facilitating device | [11,14,31,50,72,74]mc [77,107,170,177,239,343,375,390]
    Pterygota | Winged insects (including secondarily wingless orders) | wings, copulation (?) | generally accepted
    Thysanura, s.l. | Archaeognatha, Zygentoma | phenetic similarity (symplesiomorphies) | [9,375]
    Thysanura, s.s. | Zygentoma | sperm coupling, loss of superlinguae | [50,239,373]
    Tracheata | Myriapoda, Hexapoda | tracheal system, Malpighian tubules, spermatophore, loss of second antenna etc. | [140,380] generally accepted before 1990.
    Xenonomia | Grylloblattodea, Mantophasmatodea | rRNA, transcriptomes | [50,100,141,152,204]

    mc, morphology and combined data.

    Authors' contributions

    K.M.K. and R.G.B. wrote about their respective areas of expertise. C.S. provided comments, fact-checking and additional historical and methodological insights. M.Y. interpreted and summarized Russian works.

    Competing interests

    We have no competing interests.

    Funding

    We received no funding for this study.

    Acknowledgements

    We thank Nicole Tam for preparing the illustrations. Phil Ward, Marek Borowiec, Harald Letsch, Charles Mitter, John Huelsenbeck, Bjorn v. Reumont, Duane McKenna, Günther Pass, Alexander Blanke, Bernhard Misof, and Sabrina Simon provided helpful comments. John Morse provided valuable insights about his advisor, Herbert Ross. K.M.K. thanks the Schlinger endowment for funding.

    Footnotes

    Published by the Royal Society. All rights reserved.

    References