To refer to this article use this url: http://www.ctoz.nl/vol71/nr01/a06


Contributions to Zoology, 71 (1/3) (2002)

Boolean logic and character state identity: pitfalls of character coding in metazoan cladistics

Ronald A. Jenner

Institute for Biodiversity and Ecosystem Dynamics, University of Amsterdam, Mauritskade 57, 1092AD Amsterdam, The Netherlands. Present address: University Museum of Zoology, University of Cambridge, Downing Street, Cambridge CB2 3EJ, United Kingdom

Keywords: metazoan cladistics, Metazoa, character coding, character state identity, Boolean logic, non-additive binary coding, absence/presence coding

Abstract


A critical study of the morphological data sets used for the most recent analyses of metazoan cladistics exposes a rather cavalier attitude towards character coding. Binary absence/presence coding is ubiquitous, but without any explicit justification. This uncompromising application of Boolean logic in character coding is remarkable since several recent investigations have nominated absence/presence coding as the most problematic coding method available for standard cladistic analysis. Moreover, the prevalence of unspecified “absence” character states in the published data sets introduces a discrepancy between the theoretical foundations of phylogenetic parsimony and current practices in metazoan cladistics. Because phylogenetic parsimony assumes transformation of character states, its effective operation breaks down when not all character states are carefully delimited. Examples of resulting meaningless character state transformations are discussed in two categories: 1) when unspecified “absence” states are plesiomorphic; and 2) when unspecified “absence” states are apomorphic (character reversals). To facilitate future progress in metazoan cladistics, the mandatory link between comparative morphology and character coding needs to be reestablished through a more explicit study of morphological variation prior to character coding, and through a more explicitly experimental approach to character coding.

Introduction


The self-evident fact that the structure of the data matrix determines the outcome of a cladistic analysis hardly needs mentioning. Data matrix construction arguably is also the most difficult step of a cladistic analysis, and it is the only anchor that connects a cladogram to the empirical world. However, a remarkable paradox of cladistic practice in metazoan phylogenetics then becomes apparent. This most important and difficult aspect of cladistic analyses has received surprisingly little explicit attention, either theoretical or practical, since the first computer-assisted morphological cladistic analyses of the animal kingdom were published over a decade ago. This observation becomes especially striking when one compares the lack of explicit attention directed towards construction of a robust morphological data set with the intensive efforts to extract phylogenetic signal from a given matrix.

Several authors have noted a general trend in contemporary phylogenetic research where an increase in the emphasis on the phylogenetic analysis of a given data set is paralleled by a decrease in the explicit attention directed towards constructing that morphological data set (Grande & Bemis, 1998; Poe & Wiens, 2000; Rieppel & Kearney, 2002). This imbalance between two necessary aspects of cladistic analyses (character definition and phylogenetic analysis) is succinctly epitomized by the most recent comprehensive study of higher-level arthropod relationships based upon both molecular and morphological data (Giribet et al., 2001). The authors found it worth mentioning that they performed 120 independent phylogenetic analyses by varying sets of parameters and data partitions, “executed in parallel in the 256 processors, totaling 2 months of intense computation time using extremely effective tree search algorithms and an aggressive search strategy, equivalent to 42 years of computing time if analyses had to be conducted in a single-processor machine.” Nevertheless, not a single cladistic character or character state transformation is mentioned in their paper! Instead, the characters are listed only in a supplementary appendix that can exclusively be accessed online.

This imbalance may in some instances be a rather harmless reflection of space limitations in the medium chosen to report the results, as may be expected for the comprehensive study of arthropod relationships of Giribet et al. (2001). However, in other cases, the imbalance may signal a problem in the quality of the compiled morphological data set. Serious concerns about the quality of cladistic data matrices traverse a broad spectrum of taxa; these include parasitic flatworms (Rohde, 1996), hydrozoans (Marques, 1996), crustaceans (Watling, 1999; Fryer, 1999, 2001; Olesen, 2000), arthropods (Shultz & Regier, 2000; Schram & Jenner, 2001), reptiles (Lee, 1995; Rieppel & Reisz, 1999; Rieppel & Zaher, 2000; Rieppel & Kearney, 2002), fishes (Patterson & Johnson, 1997; Grande & Bemis, 1998), and metazoans (Nielsen, 1998; Jenner & Schram, 1999; Jenner, 2001, submitted). This article addresses this problem in detail for cladistic analyses of the Metazoa.

In the following, I will critically review the current practice of data matrix construction by focusing on character coding in the five most recently published cladistic analyses of the Metazoa that: 1) used a morphological data set (sometimes part of a total evidence analysis), and 2) sampled most of the major animal taxa: Giribet et al. (2000) (based on the data set of Zrzavý et al., 1998), Sørensen et al. (2000), Nielsen (2001), Peterson & Eernisse (2001), and Zrzavý et al. (2001). This paper for the first time addresses the critical issue of whether current practice in the coding of characters in metazoan cladistics is consistent with the theoretical underpinnings of phylogenetic parsimony analysis.

Some fundamentals of phylogenetic parsimony: primary homology, transformational homology, and character state identity


First, some terms need to be defined. By character coding I mean the definition (delimitation) of a character and its character states, i.e., the construction of columns in a data matrix. By character scoring I mean the assignment of different character states to the terminal taxa, i.e., the filling in of the columns of a data matrix. Many authors designate both these steps as character coding. Both steps should be rooted in careful morphological study. The selection of characters can be understood as an aspect of character coding as well (see Jenner & Schram, 1999, and especially Jenner, submitted, for discussions of character selection in metazoan cladistics).

Second, it is necessary to summarize the fundamental assumptions of phylogenetic parsimony analysis in order to appreciate the merit of currently adopted character coding schemes in metazoan cladistics. At this point the reader should note that the following selective references to the literature on the theoretical foundations of cladistics are not meant to be anything close to comprehensive; instead, I focus chiefly on the most recent discussions and syntheses.

The very first step in any morphological cladistic analysis is a study of morphology/anatomy in a comparative framework to identify comparable features in different taxa on the basis of conjectures of similarity (De Pinna, 1991; Brower & Schawaroch, 1996; Hawkins et al., 1997; Rieppel & Kearney, 2002). This study establishes so-called hypotheses of primary homology that are codified as characters and character states in a cladistic data matrix. The aim of character coding is to represent as accurately as possible observed organismic variation in a format amenable to cladistic analysis. This is the foundation of all computer-assisted cladistic analyses of metazoan morphology published to date that employ phylogenetic parsimony analysis, also variously known as standard cladistic analysis or phylogenetic systematics. In the context of phylogenetic parsimony analysis, a character is defined as a set of attributes or alternative conditions called character states that vary between taxa, are considered as “the same but different” forms of the same thing (character), and can evolve or transform into each other (e.g., Brower & Schawaroch, 1996; Hawkins et al., 1997; Hawkins, 2000; Maddison & Maddison, 2001). This establishes a framework of transformational homology that forms the conceptual basis of all widely used phylogenetic parsimony software, such as PAUP (Swofford, 2002), Hennig86 (Farris, 1988), and MacClade (Maddison & Maddison, 2002). All metazoan cladistic analyses discussed herein employ these programs. The presumed independence of cladistic characters (Emerson & Hastings, 1998) therefore contrasts with the potential of evolutionary transformation between character states within a character. 
Accordingly, phylogenetic parsimony is implemented by the counting of character state transformations (steps) on a cladogram aimed at minimizing ad hoc hypotheses of homoplasy (not the minimization of natural processes such as evolutionary change, as claimed by some; Kluge, 2001).
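
The counting of steps described here can be made concrete with a minimal sketch. It assumes unordered (Fitch) optimization of a single binary character on a rooted, fully bifurcating tree; the taxa A to D, their scores, and the two topologies are hypothetical and are not taken from any of the data sets discussed in this paper.

```python
# Fitch step counting for one unordered character on a rooted binary tree.
# The trees and character scores below are hypothetical illustrations.

def fitch_steps(tree, states):
    """Return (state_set, steps) for a nested-tuple tree.

    tree   -- a taxon name (str) or a pair (left, right) of subtrees
    states -- dict mapping each taxon name to its scored character state
    """
    if isinstance(tree, str):
        # A terminal taxon contributes its scored state and no steps.
        return {states[tree]}, 0
    left_set, left_steps = fitch_steps(tree[0], states)
    right_set, right_steps = fitch_steps(tree[1], states)
    shared = left_set & right_set
    if shared:
        # The descendants can share a state: no transformation is needed here.
        return shared, left_steps + right_steps
    # No state in common: one character state transformation is counted.
    return left_set | right_set, left_steps + right_steps + 1

tree_1 = ((("A", "B"), "C"), "D")
tree_2 = ((("A", "C"), "B"), "D")
scores = {"A": 1, "B": 1, "C": 0, "D": 0}

steps_1 = fitch_steps(tree_1, scores)[1]
steps_2 = fitch_steps(tree_2, scores)[1]
print(steps_1, steps_2)
```

On these hypothetical data the first topology requires one step and the second requires two, so parsimony prefers the first. Note that the procedure treats state 0 exactly like state 1: both are sets of taxa assumed to share "the same" condition, which is why every state has to be delimited with equal care.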

Out-group comparison is the accepted and universally used method for polarizing character state transformations, and it is employed in all analyses of metazoan cladistics considered here. This logically implies that prior to the phylogenetic analysis, the plesiomorphic and apomorphic character states remain unknown. Thus “all shared, identical character states represent conjectures of potential homology, and count as evidence in phylogenetic analysis, even if they are subsequently discovered to be symplesiomorphic” (Brower & Schawaroch, 1996: 267-268). Since all identified character states are potentially phylogenetically informative and are treated as such by computer-assisted cladistic analyses, and since cladograms are constructed and evaluated on the basis of character state transformations, it is crucial to precisely delimit and define alternative character states on the basis of careful morphological study. In other words, in order to represent morphological variation as accurately as possible, and in order to be able to meaningfully interpret character state transformations within the context of transformational homology, it is crucial to properly establish character state identity (Brower & Schawaroch, 1996). Careful study of morphological similarity (Patterson, 1982; Hawkins et al., 1997; Hawkins, 2000; Rieppel & Kearney, 2002) is imperative for constructing character states that correspond to precisely defined primary homology conjectures. The character states of all characters submitted to standard cladistic analysis need to be carefully delimited, irrespective of whether the coded characters are binary or multistate.

Having so represented the universally accepted theoretical foundations of phylogenetic parsimony, it becomes necessary to ask whether current practices of character coding in metazoan cladistics are consistent with the assumptions of cladistic parsimony.

Seeing the world through Boolean eyes: the prevalence of binary character coding in metazoan cladistics


Coding morphological variation for use in a cladistic analysis of the animal phyla [I use the term “phylum” as a general descriptor of higher-level taxa without any Linnaean rank connotations] is beset with difficulties, leading at least one author to exclaim in apparent desperation that the “choice and interpretation, i.e., coding, of characters pose enormous problems”, and that “the choice and definition of taxa and choice and coding of characters become a complete quagmire” (Nielsen, 2001: 499). In view of these despairing remarks, it is only natural to expect that cladistic data matrix compilation has received ample explicit attention to ensure that the data maximally reflect variation observed in organisms. As will be demonstrated here, the reality is quite different.

Typically we are presented with variations upon the minimally transparent statement that morphological characters “were compiled from the phylogenetic literature” (Zrzavý et al., 1998: 251). A larger section of the paper subsequently discusses aspects of cladogram construction, and finally the resulting topology is compared to topologies supported by other analyses. Strikingly, none of the five cladistic analyses of metazoan morphology published in the new millennium provided any balanced justification for their choice of coding method. For example, Giribet et al. (2000) simply recycled the data set of Zrzavý et al. (1998), thereby incorporating several important shortcomings of the data matrix of the latter into a new phylogenetic analysis (Jenner, 2001). At most, a statement was offered that a choice of coding method had in fact been made, e.g., Nielsen (2001), but for the most part adopted character coding strategies are never convincingly defended with explicit reference to the known strengths and weaknesses of different coding techniques. The most comprehensive and detailed justification for the choice of a coding method is given in Peterson & Eernisse (2001: 173), who “acknowledge that these coding issues are contentious but feel that at the moment this [binary absence/presence coding] is the most conservative coding scheme available.” The reader, however, is left guessing as to why this should be so. This relative neglect of a fundamental aspect of metazoan cladistics is unjustified in view of the insights yielded by more than a decade of theoretical and experimental studies into the logic of character coding (Pimentel & Riggins, 1987; Pogue & Mickevich, 1990; Pleijel, 1995; Wilkinson, 1995; Hawkins et al., 1997; Scotland & Pennington, 2000).

Table 1 summarizes the adopted character codings across the five studies considered here. One is immediately impressed by a strong predilection for Boolean logic. Boolean logic is a useful tool for the ordering of diversity, and it is widely applied in structuring a wide range of phenomena, including concepts and items. Boolean logic is a symbolic logic system based on a form of algebra in which values or statements are reduced to being either true or false, and it functions by means of Boolean operators, the most familiar of which are AND, OR, and NOT. These operators are essential for the proper operation of digital computers, and they are habitually used, for example, to structure the actions of search engines on the internet. The simplest form of Boolean logic is a binary set, each component of which can have a value of “0” or “1.” The characteristics of Boolean algebra that make it so useful for representing complex command strings in computing, also make it suitable for representing the diversity of metazoan morphology in a manner amenable to computer analysis. It is, therefore, perhaps not so surprising to find that the great majority of characters employed in cladistic analyses of the animal phyla are coded according to simple Boolean logic, i.e., as binary characters (96.4% of the total number of characters), and 92.1% of these binary characters are coded as absence/presence (a/p) characters. This contrasts sharply with the statement of one of the chief proponents of a/p coding in cladistics: “Consistently applied a/p coding is rarely seen” (Pleijel, 1995: 313).

 

Table 1. Character coding across five morphological data matrices published since the year 2000. The table summarizes absolute numbers of characters and their relative percentages coded by one of three methods: binary a/p = absence/presence; binary p.h. = binary with paired homologues; multistate. See text for discussion.

| Studies | Binary a/p | Binary p.h. | Multistate | Total/study |
| --- | --- | --- | --- | --- |
| Giribet et al. (2000) | 223 (80.8%) | 42 (15.2%) | 11 (4.0%) | 276 |
| Sørensen et al. (2000) | 64 (97.0%) | - | 2 (3.0%) | 66 |
| Nielsen (2001) | 64 (100%) | - | - | 64 |
| Peterson & Eernisse (2001) | 129 (93.4%) | - | 9 (6.6%) | 138 |
| Zrzavý et al. (2001) | 56 (93.3%) | 4 (6.7%) | - | 60 |
| Total across studies | 536 (88.7%) | 46 (7.6%) | 22 (3.6%) | 604 |

The percentage of characters coded as a/p in metazoan cladistics is also much higher than that observed for a variety of matrices published in botanical journals (less than 1% of 1404 characters in 34 matrices is coded strictly as a/p) as reviewed in Hawkins (2000).
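
The summary percentages quoted in the text for Table 1 can be re-derived from the per-study counts. The short calculation below is only a check on the arithmetic; it assumes that a dash in the table stands for zero characters, and the variable names are introduced here for illustration.

```python
# Counts transcribed from Table 1 as
# (binary a/p, binary paired homologues, multistate).
# Assumption: a dash in the table means zero characters of that kind.
table_1 = {
    "Giribet et al. (2000)": (223, 42, 11),
    "Sørensen et al. (2000)": (64, 0, 2),
    "Nielsen (2001)": (64, 0, 0),
    "Peterson & Eernisse (2001)": (129, 0, 9),
    "Zrzavý et al. (2001)": (56, 4, 0),
}

ap = sum(row[0] for row in table_1.values())
paired = sum(row[1] for row in table_1.values())
multistate = sum(row[2] for row in table_1.values())

binary = ap + paired
total = binary + multistate

pct_binary = round(100 * binary / total, 1)        # binary share of all characters
pct_ap_of_binary = round(100 * ap / binary, 1)     # a/p share of binary characters
pct_ap_of_total = round(100 * ap / total, 1)       # a/p share of all characters

print(total, pct_binary, pct_ap_of_binary, pct_ap_of_total)
```

The three percentages agree with the values of 96.4%, 92.1%, and 88.7% given in the text.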

This preference for binary character coding, especially a/p coding (non-additive binary coding), and the seeming reluctance to code multistate characters to represent the diversity of metazoan morphology is typical for cladistic analyses of the animal phyla in general, such as Schram (1991), Eernisse et al. (1992), Nielsen et al. (1996), and Wallace et al. (1996). However, exceptions do exist. Haszprunar (1996a), for example, coded 11 out of 40 characters multistate (27.5%), and Rouse & Fauchald (1995) coded 4 out of 13 characters (30.8%) multistate.

Can we identify a general reason for this remarkable preference for binary character coding in metazoan cladistics? Is the universal preference for binary character coding, in particular a/p coding, defensible within the context of standard parsimony analysis? Is the choice of a particular coding method inconsequential for the outcome of a cladistic analysis? What is the effect of employing a/p coding for the interpretation of character state transformations in the study of the evolution of animal body plans? The following attempts to provide some answers to these questions.

A/p coding and metazoan cladistics: general problems


The five most recently published cladistic analyses of the Metazoa employ three distinct kinds of character coding:

1) binary a/p coding (non-additive binary coding) (536 characters)

2) binary coding with paired homologues (46 characters)

3) multistate coding (22 characters)

The distinction between binary a/p characters and binary paired homologue characters (categories 1 and 2 above) is not important in the context of transformational homology used in standard cladistic analysis, but it becomes crucial if a taxic view of homology is adopted (Carine & Scotland, 1999; Scotland, 2000a). Although the cladistic analyses of metazoans considered here necessarily operate within a framework of transformational homology, the distinction between a taxic and transformational approach to homology needs to be outlined in some detail in order to unveil a number of serious shortcomings of character coding in these recent analyses of metazoan cladistics.

The logical justification for a/p coding in cladistic analysis is outlined in Patterson (1982), who advocated a taxic approach to homology with the express purpose of using homologies to find monophyletic groups. Patterson’s taxic approach to homology should not be confused with the “taxic” approach to evolutionary theory as formulated earlier by Eldredge (1979), which marries the taxic and transformational components of Patterson’s logic (Farris et al., 2001). “Taxic” in this paper strictly refers to Patterson’s conceptualization. Patterson contrasted the taxic approach with a transformational approach to evolution, which is concerned with evolutionary change of characters. Although Patterson (1982: 36) admitted that “the transformational approach to homology may be more informative and a lot more interesting than the taxic approach,” he nevertheless advocated the taxic perspective because “concentrating on transformations at the expense of taxa is not fruitful” (p. 36). Interestingly, this led to Patterson’s explicit denial of a relationship between taxic homology and evolutionary change: “If phylogeny has to be about evolution, homology has nothing to contribute to it” (Patterson, 1982: 67). This is in stark contrast to the logic of phylogenetic parsimony as outlined above, which is concerned with nothing if not the recovery of the evolutionary history of life.

Under the taxic perspective to homology a distinction can be made between the complement relation of homologues, i.e., the presence of a homologue versus its absence, and the presence of paired homologues, i.e., homologues present in two or more distinct forms (Patterson, 1982; Carine & Scotland, 1999; Scotland, 2000a). The complement relation is the basis of binary a/p coding, while the paired homologue relation underlies the logic of binary paired homologue coding, as well as multistate coding. The distinction between the complement relation and the paired homologue relation is important under a taxic approach to homology because only presence of a character (state) carries potential phylogenetic weight (Carine & Scotland, 1999; Scotland, 2000a). This means that in a/p characters the “absence” character state is empirically empty, and consequently is not strictly defined as a character state, while in paired homologue characters all different “presence” character states furnish potential phylogenetic evidence.

The importance of a distinction between taxic and transformational approaches to homology was recently debated in relation to the value of modified three-taxon analysis (Carine & Scotland, 1999) as an alternative to standard cladistic analysis based upon phylogenetic parsimony for grouping taxa (Carine & Scotland, 1999; Scotland, 2000a; Kluge & Farris, 1999; Farris et al., 2001; Kluge, 2001). While proponents of the taxic approach to homology consider it a strength of a/p coding that no assumptions are made concerning the homology of character states within a character, and thus potential transformations between them (Pleijel, 1995; Carine & Scotland, 1999; Scotland, 2000a, b), this interpretation fundamentally contradicts the theoretical foundations of phylogenetic parsimony analysis, as implemented in programs such as PAUP, Hennig86, and MacClade, where character state transformations assume a central role.

Within the framework of transformational homology the distinction between the complement and paired homologue relations is insignificant because phylogenetic parsimony operates by counting character state transformations, be it between absence and presence of a feature (a/p characters), or between alternative forms of a feature (binary paired homologue and multistate characters). This implies that all character states within a character need to be explicitly constructed on the basis of a careful study of morphological similarity, or else character state transformations may become meaningless. A special difficulty is thus introduced for “absence” character states, in particular when a feature is present in various distinct forms, and is inapplicable for certain taxa. The need to carefully delimit all character states of a character, including the “absence” states, is one feature that distinguishes the transformational from the taxic approach to homology. As will be discussed below, study of character state identity in the most recent analyses of metazoan cladistics indicates that more than 40% (see Appendix) of the included a/p characters have problems with character state identity of “absence” states.

A/p coding is perfectly legitimate when the goal is to express whether a feature is simply absent or present among the taxa of interest. However, when a feature shows morphological variation between terminal taxa a number of different coding decisions can be made. It then becomes crucial to recognize that different coding methods have distinct strengths and weaknesses, and given a certain set of taxa and morphological features different coding methods may yield different phylogenetic results. This is amply illustrated by various detailed studies for diverse taxa and characters where distinctly different cladograms may emerge for the same taxa under different character coding strategies (Rouse & Fauchald, 1997; Rouse, 2001; Strong & Lipscomb, 1999; Forey & Kitching, 2000; Hawkins et al., 1997; Hawkins, 2000).

Although all available coding methods have their idiosyncrasies, a strong conclusion from recent studies is that a/p coding of multistate variation suffers from several flaws that strongly compromise the value of this coding method (Hawkins et al., 1997; Strong & Lipscomb, 1999; Hawkins, 2000; Forey & Kitching, 2000). Among the acknowledged weaknesses of a/p coding are the introduction of: 1) redundant characters, 2) logical dependence between different characters, 3) grouping on the basis of non-homologous absences, and 4) the negation of the central role of comparative morphology in cladistics (see Pleijel, 1995; Wilkinson, 1995; Hawkins et al., 1997; Strong & Lipscomb, 1999; Hawkins, 2000; Forey & Kitching, 2000 for detailed discussions). This study focuses on the latter two categories, specifically discussing difficulties introduced by a/p coding for the identity of character states, and the interpretation of character state transformations.
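
Two of these weaknesses, logical dependence between characters and grouping on the basis of non-homologous absences, can be illustrated with a small sketch. The taxa T1 to T5, the two forms of the feature, and the helper functions are hypothetical and serve only to show what a/p recoding of multistate variation does to the “absence” state.

```python
# Hypothetical observations: the form of a feature shown by each taxon
# (None = the taxon lacks the feature altogether).
observed = {"T1": "form_x", "T2": "form_x", "T3": "form_y", "T4": None, "T5": None}

def ap_recode(observed):
    """Recode multistate variation as one absence/presence column per form."""
    forms = sorted({form for form in observed.values() if form is not None})
    return {
        form: {taxon: int(value == form) for taxon, value in observed.items()}
        for form in forms
    }

def absence_contents(column, observed):
    """Report which observed conditions are lumped in the 0 state of a column."""
    return sorted(
        {observed[taxon] or "feature lacking" for taxon, score in column.items() if score == 0}
    )

ap_columns = ap_recode(observed)

# The "absence" state of the form_x column unites taxa that lack the feature
# with a taxon that possesses it in a different form.
lumped = absence_contents(ap_columns["form_x"], observed)

# Logical dependence: no taxon can be scored "present" in both columns.
never_both = all(
    not (ap_columns["form_x"][t] and ap_columns["form_y"][t]) for t in observed
)
print(lumped, never_both)
```

In this sketch the 0 state of the first column contains both “feature lacking” and “form_y”, so a transformation from 0 to 1 has no single morphological meaning, and a score of 1 in one column fixes the score in the other.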

In view of problems such as these with a/p coding, it is surprising to find that the bulk of characters (88.7%) coded in cladistic analyses of the Metazoa published during the last two years are a/p characters. If morphological variation within the Metazoa is strictly dichotomous, with features either being absent or present, then the uniform adoption of Boolean logic to represent this variation in a data matrix may be entirely justified. Obviously that is not the case. The next section discusses the pitfalls of a/p coding of multistate variation in metazoan cladistics.

The failure of Boolean logic: character state identity and unspecified “absence” character states in metazoan cladistics


Within the context of phylogenetic parsimony and transformational homology it is essential to perform a detailed study of morphological similarity in order to properly delimit all character states within a character as “the same but different” (Hawkins et al., 1997; Rieppel & Kearney, 2002). In the cases of binary coding with paired homologues and multistate coding, it is obvious that a certain degree of morphological study underlies the identification of character states. Both coding types try to accurately represent observed character variation as separate expressly delimited character states within a single character. In contrast, the Boolean reduction of morphological variation that defines a/p coding slights the central role of comparative morphology in phylogenetics (Pogue & Mickevich, 1990; Hawkins et al., 1997; Hawkins, 2000; Forey & Kitching, 2000). This becomes especially clear in recent studies of metazoan cladistics when the “absence” character states are considered.

A striking asymmetry is apparent in the care taken to define “presence” and “absence” character states. The appendix summarizes examples of characters from the five most recent cladistic studies of the Metazoa that suffer from a specific character coding problem: the “absence” character states are unspecified, and they are scored for taxa with very dissimilar morphologies. This fundamentally contradicts the assumption of standard cladistic analysis outlined above that all character states should be conjectures of primary homology rooted in morphological similarity. As De Pinna (1991: 377) concluded, unspecified character states are empirically empty: “conjectures of primary homology that do not conform to the criterion of similarity simply do not exist.” Because only cladogram rooting will determine what the plesiomorphic and apomorphic character states are, states that are not clearly delimited cannot be meaningfully analyzed within the context of transformational homology.

This problem affects more than 40% of all coded a/p characters (Appendix), and this is merely a conservative estimate. For example, it appears to be no problem to code a character for gap junctions a/p for metazoans, and to score all phyla as either having or lacking them: characters 5 in Sørensen et al. (2000) and Nielsen (2001), and character 4 in Peterson & Eernisse (2001). This character appears to be applicable to all animal phyla. Similarly, the adopted coding and scoring of chitinous chaetae/setae in phyla such as Annelida and Brachiopoda is not considered as a problem here. However, because chaetae/setae are modifications of chitinous cuticles, strictly speaking this character is only applicable in taxa that possess cuticles with chitin. Other phyla would then have to be scored for this character as “inapplicable”. Yet, one might conceive of ultrastructurally similar chaetae as being constructed from a non-chitinous cuticle. The examples included in the appendix are restricted to those instances where the unspecified character states are obviously coded and scored as a default whenever the “presence” state did not occur for a phylum. Such a procedure is compatible with a taxic approach to homology as is implemented, for example, in modified three-taxon analysis, where the “absence” states carry no empirical information (Carine & Scotland, 1999; Scotland, 2000a; Williams & Siebert, 2000). Nonetheless, the coding and scoring of such “trash can” character states is incompatible with standard cladistic analysis based upon phylogenetic parsimony. The character state transformations that are at the heart of such analyses are only meaningful when all states are carefully defined.

The lack of attention directed towards defining the “absence” character states is particularly puzzling in view of the obvious care taken to properly code and score subtle variations for some of the characters. For example, for at least 62 characters in the data set of Peterson & Eernisse (2001) it appears that all care in character coding has been directed towards the “presence” states, while the “absence” states are essentially unspecified. It is therefore puzzling to find that for some of their characters morphological study apparently did play an important role in coding and scoring. For example, their character 67 proposes a primary homology of different cell types that are thought to be part of filtration nephridia (terminal cells, podocytes, and nephrocytes, with terminal cells a component of protonephridia, podocytes and nephrocytes components of metanephridial systems, and nephrocytes used to refer to nephridial systems present in arthropods and onychophorans). Phyla lacking any of these cell types are scored as “absent,” while phyla showing any of these cell types are scored as “present.” This scoring is supported by data on the ontogenetic continuity of protonephridia and metanephridia in certain polychaetes and phoronids, and by a continuum in cytological differentiation and function between the different cell types involved (Ruppert & Smith, 1988; Smith & Ruppert, 1988; Bartolomaeus & Ax, 1992; Smith, 1992; Ruppert, 1994; Haszprunar, 1996b). A third character state is erected and uniquely scored for Nematoda, which possess no elements normally part of filtration nephridia. Although Nematoda could simply be scored as “absent” for nephridial cell types, Peterson & Eernisse (2001) instead chose to create a separate character state for them because they felt that “coding nematodes as equivalent to cnidarians, ctenophores et al. is not appropriate” (p. 199). 
Irrespective of the justification of this decision, it clearly shows that for this character Peterson & Eernisse (2001) felt the need to more precisely code the morphological variation otherwise subsumed within an unspecified “absence” state. The coding and scoring of the remaining a/p characters could have benefited from similar attention. This indicates that Peterson & Eernisse (2001) were to a certain extent arbitrary in the amount of care they took to code and score different characters within their data set.

The general problem of unspecified character states in phylogenetic parsimony is the incorrect suggestion of similarity in morphologically dissimilar taxa, and the unsupported assumption that the disparate morphologies united within a trash can character state represent a clear alternative to the other coded character state. Pogue & Mickevich (1990: 353) qualify broad character states as a “common practice obscuring observed variation.” As long as it is realized that propositions of primary homology have to be based upon morphological similarity leading to character states that are “the same but different,” the only effect of lumping distinct morphologies under one state may be to obscure potentially useful variation. In the worst case, character states that are not properly delimited and scored actually propose primary homology of dissimilar morphologies without empirical basis. That would negate the central role of comparative anatomy in cladistics. The unspecified “absence” states discussed here fall into this category.

Problems with unspecified “absence” states can be of two types:

1) taxa for which inapplicable character states are not recognized and are simply scored as “absent”

2) multistate variation of a character is not recognized and part of it is inappropriately forced into the “absence” state of a binary character

A failure to recognize the hierarchical nature of morphological variation may lead to inappropriate character scorings (see Strong & Lipscomb, 1999, and Lee & Bryant, 1999 for recent reviews of the treatment of inapplicable characters). For example, when an attempt is made to code the morphological variation of the lophophore present in phoronids and brachiopods, these characters are logically inapplicable for phyla lacking lophophores. Several approaches have been proposed for coding inapplicable states in cladistic analyses. Although no method is entirely free of interpretational problems, currently the best way to code inapplicable character states is to score them as “?s” or “-s” which are treated the same (Hawkins et al., 1997; Strong & Lipscomb, 1999).

All authors of the cladistic data matrices considered here have properly applied inapplicability scoring for some characters, but none of these studies has done so accurately and consistently for all included data. The data set of Peterson & Eernisse (2001), however, is an exception to even this partial practice: it is the clearest illustration of consistently applied a/p coding with no attempt at all to correct for inapplicable character states. For example, seven synapomorphies for the monophyly of brachiopods (characters 59-65) code for morphological variation of the lophophore. For these characters, taxa lacking this variation are all scored as “absent,” irrespective of whether they possess a lophophore, such as Phoronida, or not. This introduces character dependence and redundancy into the cladistic analysis. Inapplicability coding and scoring should be consistently applied because different treatments of inapplicable data may differentially affect the outcomes of an analysis. Waggoner’s (1996) analysis of arthropods and problematic fossil taxa, and Zrzavý et al.’s (2001) analysis of bilaterian phylogeny potently illustrate that different codings of inapplicable character states may produce substantially different phylogenetic results.
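
The recoding of inapplicable scores argued for above can be sketched in a few lines of Python. The sketch is illustrative only: the three-taxon matrix, the character names, and the helper recode_inapplicable are hypothetical and are not taken from any of the published data sets. It assumes one parent character (lophophore a/p) and two dependent characters that describe lophophore variation, and it replaces the dependent scores with “?” for every taxon that lacks the parent structure.

```python
# Hypothetical mini-matrix; the scores are illustrative, not published data.
# "lophophore" is the parent character; "loph_var_1" and "loph_var_2" stand in
# for dependent characters that describe variation of the lophophore.
matrix = {
    "Brachiopoda": {"lophophore": "1", "loph_var_1": "1", "loph_var_2": "1"},
    "Phoronida":   {"lophophore": "1", "loph_var_1": "0", "loph_var_2": "0"},
    "Annelida":    {"lophophore": "0", "loph_var_1": "0", "loph_var_2": "0"},
}


def recode_inapplicable(matrix, parent, dependents, missing="?"):
    """Return a copy of the matrix in which the dependent characters are
    scored as `missing` for every taxon scored "0" for the parent."""
    recoded = {}
    for taxon, scores in matrix.items():
        new_scores = dict(scores)  # leave the input matrix untouched
        if scores[parent] == "0":
            for character in dependents:
                new_scores[character] = missing
        recoded[taxon] = new_scores
    return recoded


recoded = recode_inapplicable(matrix, "lophophore", ["loph_var_1", "loph_var_2"])
for taxon, scores in recoded.items():
    print(taxon, scores)
```

In this sketch Phoronida keeps its “0” scores, because it possesses a lophophore and the dependent characters are therefore applicable, whereas Annelida is rescored “?” for both dependent characters.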

Unspecified “absence” states may also result from not recognizing the multistate nature of variation. In this case, variation not coded by the “presence” state may be inappropriately united with unrelated morphologies. Such a practice misrepresents available evidence and misses potentially useful phylogenetic information. For example, character 29 in Peterson & Eernisse (2001) codes spiral cleavage a/p, but the phyla scored as “absent” for spiral cleavage exhibit a great heterogeneity of cleavage types that cannot be subsumed within a single character state alternative to spiral cleavage. The scoring of the “absence” state assumes primary homology of cleavage types ranging from the lack of a stereotypical cleavage pattern such as in Cnidaria (Davidson, 1991; Martindale & Henry, 1998; Martin, 1997), bilateral cleavage such as in Urochordata (Jeffery & Swalla, 1997) or even Deuterostomia (Nielsen, 2001), and forms with unique or more difficult to interpret cleavage types such as Nematoda and Acoela (Henry et al., 2000; Nielsen, 2001). Obviously, this character coding and scoring does not properly represent the diversity of metazoan cleavage types. Moreover, since Peterson & Eernisse (2001) do not include any other character on cleavage geometry, phylogenetically significant variation is not coded. The introduction of a multistate character, or the scoring of “?s” would be a more sensible way of representing observed variation in cleavage types for the purposes of phylogenetic parsimony.
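
The difference between the two codings can be made explicit with a minimal Python sketch. The cleavage descriptions for the four phyla follow the text above; the function names and the numeric state labels are hypothetical and serve only as illustration. The sketch lists which taxa each coding places in the same character state, that is, which taxa it proposes as sharing a primary homology.

```python
from collections import defaultdict

# Cleavage descriptions follow the text; state labels are assigned arbitrarily.
cleavage = {
    "Annelida": "spiral",
    "Cnidaria": "no stereotypical pattern",
    "Urochordata": "bilateral",
    "Nematoda": "unique or difficult to interpret",
}


def code_binary(observations, presence):
    """Absence/presence coding: everything that is not `presence` becomes "0"."""
    return {taxon: "1" if obs == presence else "0"
            for taxon, obs in observations.items()}


def code_multistate(observations):
    """Multistate coding: every distinct observation gets its own state."""
    labels = {}
    coded = {}
    for taxon, obs in observations.items():
        if obs not in labels:
            labels[obs] = str(len(labels))
        coded[taxon] = labels[obs]
    return coded


def groups(coded):
    """Map each character state to the sorted list of taxa scored for it."""
    by_state = defaultdict(list)
    for taxon, state in coded.items():
        by_state[state].append(taxon)
    return {state: sorted(taxa) for state, taxa in by_state.items()}


binary_groups = groups(code_binary(cleavage, "spiral"))
multistate_groups = groups(code_multistate(cleavage))
print(binary_groups)
print(multistate_groups)
```

Under the binary coding, Cnidaria, Nematoda, and Urochordata end up in a single “0” state although their cleavage types differ; under the multistate coding no two of the four taxa share a state.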

Many examples included in the appendix exhibit a mixture of problematic “inapplicability” scoring, and the improper binary coding of multistate variation, with the common result of creating default, or trash can, character states that do not reflect comparative morphology. Although such practice may be defensible within the non-evolutionary context of taxic homology, it is incompatible with the purported goals of standard cladistic parsimony analysis (Scotland, 2000a; Kluge, 2001). In order to remove this striking inconsistency between theory and practice in metazoan cladistics, character coding and scoring should receive far more explicit attention than is current practice. The following sections discuss several examples of the difficulty of interpreting character state transformations when “absence” character states are not defined.

Reconstructing body plan evolution with Boolean logic: narrating history without looking back


The optimization of characters on cladograms is an essential source for insights into the evolution of organismic complexity on all taxonomic levels, including the animal phyla (e.g., Valentine, 1997; Jenner, 1999, 2000). The morphological data sets assembled for cladistic analyses of the Metazoa therefore seem to be an ideal source for insights into the origin and diversification of phylogenetically significant parts of animal body plans. Paradoxically, however, the widespread use of a/p characters with unspecified “absence” states severely compromises the value of current data matrices for understanding the evolution of animal body plans.

Since most unspecified “absence” states are optimized as plesiomorphies, the reconstructed ground patterns of stem species (nodes) on a cladogram are for many characters entirely ambiguous. For example, consider the scoring of character 54 in Peterson & Eernisse (2001), non-muscular peritoneal cells in lateral regions of coelom a/p, with Annelida, Echiura, and Sipuncula scored “present,” the remaining taxa as “absent.” The evolution of this apparently complex character is optimized (ambiguous) as a synapomorphy of a clade including these three phyla in addition to Mollusca (Neotrochozoa). According to this scoring all reconstructed stem species up to the clade Neotrochozoa are comparable and plesiomorphic with respect to this character as they lack coeloms with laterally located peritoneal cells. Unfortunately, this particular coding and scoring obscures the most important steps that must have occurred during the evolution of this character, i.e., the evolution of a coelom and peritoneocytes. The sister phylum to the Neotrochozoa is Nemertea, which already possess coelomic cavities, the rhynchocoel and vessels of the circulatory system, as well as peritoneal cells that line the rhynchocoel (Turbeville & Ruppert, 1985; Turbeville, 1991). Thus, the most important components of this ostensibly complex character already evolved at the base of a larger clade that minimally includes Nemertea and Neotrochozoa. However, the origins of these features remain unaddressed by the analysis of Peterson & Eernisse (2001) because no characters are included that code for the presence of either a coelom or peritoneocytes.

For a proper elucidation of character transformations that underlay the evolution of non-muscular peritoneocytes additional characters need to be included in the analysis. For example, the distribution of peritoneocytes within coeloms should either be coded if the lateral location of these cells is the important part of this character, or when the relative prominence and function of the coeloms is important, their multifarious differentiations should be coded (a spacious hydrostatic body coelom or eucoelom sensu Haszprunar (1996a) in annelids, echiurans, and sipunculids, but more restricted and differently specialized in molluscs and nemerteans). This example clearly demonstrates that it is impossible to unambiguously interpret character state transformations, for example as heterochronic shifts or as evolution of genuine novelties, when plesiomorphic “absence” states are unspecified, as is the case for the majority of examples listed in the appendix. Furthermore, the arbitrary selection of input data may strongly impair the effectiveness of a cladistic analysis as a test of alternative hypotheses (Jenner, 2001, 2002; submitted).

Generally, the problem of reconstructing history without specification of antecedent states is a problem for all characters with “absence” as a character state, whether these are unspecified or not. According to Kluge & Farris (1999: 209) a/p coding is “not normally used as input to a parsimony program,” because “doing so can lead to nonsense all zero reconstructed stem species that … have no state whatever” (their italics). This problem is most potently illustrated by Zrzavý et al. (1998) who included an “all-zero” hypothetical outgroup to root their morphological cladogram (see also Zrzavý et al., 2001). In this case, the problem is worsened by the inclusion of binary paired homologue characters in Zrzavý et al.’s (1998) data set. An attempt to interpret the paired homologue characters for this “out-group” conjures up a grotesque organism with a body plan composed of illusory, and in several instances, contradictory features. These include a schizocoel, an anterior/dorsal anus but also lacking an anus, a biphasic life cycle but at the same time lacking a primary larva, dominant asexual reproduction, post-adult molting but also lacking cuticular molting, hollow polyp tentacles, uniramous limbs, and a non-centralized nervous system. Needless to say, such a chimeric outgroup does not help to determine the polarity of character change.
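
How an “all zero” stem species arises can be illustrated with a small Python sketch of the Fitch down-pass for unordered characters. This is an illustration under stated assumptions, not a reconstruction of any published analysis: the four taxa A to D, the tree, the three a/p characters, and the function name fitch_root_states are all hypothetical, and the optimization used in the studies cited here may differ.

```python
def fitch_root_states(tree, scores):
    """Fitch down-pass for one unordered character on a rooted binary tree.

    `tree` is a nested 2-tuple of taxon names; `scores` maps taxon to state.
    Returns the state set reconstructed at the root and the number of steps.
    """
    steps = 0

    def downpass(node):
        nonlocal steps
        if isinstance(node, str):          # terminal taxon
            return {scores[node]}
        left, right = (downpass(child) for child in node)
        shared = left & right
        if shared:                         # children agree: keep shared states
            return shared
        steps += 1                         # children disagree: one step
        return left | right

    return downpass(tree), steps


# Hypothetical tree and a/p scores; each "presence" occurs in one lineage only.
tree = ((("A", "B"), "C"), "D")
characters = {
    "char_1": {"A": "1", "B": "1", "C": "0", "D": "0"},
    "char_2": {"A": "0", "B": "0", "C": "1", "D": "0"},
    "char_3": {"A": "1", "B": "0", "C": "0", "D": "0"},
}

root = {name: fitch_root_states(tree, scores)
        for name, scores in characters.items()}
for name, (states, steps) in root.items():
    print(name, states, steps)
```

Every character is reconstructed as “0” at the root, so the hypothetical stem species is described entirely by absences, none of which specifies what the organism actually possessed.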

The interpretation of character state transformations (steps) on a cladogram breaks down when only “presence” character states are defined. A final illustration of the problem is the introduction of circularity into cladistic analysis. In what must be regarded as one of the most meticulously researched cladistic studies published during the 20th century, Grande & Bemis (1998: ix) consider the “a priori assumption of “primitiveness” for a character state” as one of the diagnostic features of “authoritarianism in systematic biology” in generations past. Strikingly, this problem is also prevalent in recent cladistic analyses of the Metazoa. Not specifying the “absence” state of a character can only be taken to suggest that the phylogenetically informative derived character states are already known before the congruence test. Optimizing characters on the resulting cladograms indeed indicates that for most a/p characters in studies of metazoan cladistics it is the “presence” state which is the derived state. Such a procedure logically foregoes the function of outgroup taxa to distinguish plesiomorphic and apomorphic character states. The a priori assumption of the evolutionary polarity of a character introduces unwarranted circularity into the cladistic analysis, and it makes out-groups effectively non-functional.

As illustrated above, in order to yield meaningful character state transformations, the current penchant for Boolean character coding with unspecified “absence” states should be thoroughly reconsidered. As will be discussed in the next section, this becomes especially clear when unspecified “absences” attain phylogenetic significance through character reversal.

When congruence becomes meaningless: unspecified “absence” states and uninformative character reversals


A cladistic analysis subjects primary homology propositions to a character congruence test in order to separate corroborated secondary homologies from provisionally refuted homologies, i.e., homoplasies (De Pinna, 1991). However, when the character states are not based upon morphological similarity, secondary homologies supported by character congruence become meaningless. As discussed above, unspecified “absences” are most often resolved as the plesiomorphic character states, but the logical flaws of unspecified character states are most clearly expressed when unspecified states are optimized as apomorphies, i.e. when they attain phylogenetic significance. Several examples illustrate the fallacies of grouping phyla on the basis of unsupportable homology of dissimilar morphologies. All character state transformations discussed below are optimized on the morphological cladograms of their respective studies, except as noted otherwise.

Character 26 in Nielsen (2001) codes for the absence or presence of mesoderm derived from the archenteron. Archenteron derived mesoderm is scored for Ctenophora, Deuterostomia (Echinodermata, Pterobranchia, Enteropneusta, Chordata, Brachiopoda, and Phoronida), and Chaetognatha, and a reversal to “absence” of this source of mesoderm supports the monophyly of a clade of all protostomes minus chaetognaths. The phylogenetic significance of this character transformation, however, is compromised by the morphological disparity of the scored absences. The adopted scoring implies that the absence of mesoderm derived from the archenteron in phyla including Gastrotricha, Nematoda, Ectoprocta, Annelida, and Arthropoda is uniquely homologous, and a proper alternative to archenteron-derived mesoderm. This interpretation is scarcely supported by morphological evidence. For example, in addition to micromere-derived ecto-mesoderm, endo-mesoderm has been reported to arise from mesentoblast 4d in phyla such as Annelida. However, even accepting spiral cleavage in the arthropod ground pattern (but see Scholtz, 1997), the sources of mesoderm are not in agreement with those in the trochozoans, i.e., mesoderm does not arise from mesentoblast 4d (Anderson, 1973; Siewing, 1979; Scholtz, 1997; Nielsen, 2001). Furthermore, nematodes lack any 4d-mesoderm, and according to their cell lineage (see table 37.1 in Nielsen, 2001) it must be concluded that they possess only ecto-mesoderm. Endo-mesoderm also appears to be absent from gastrotrichs (table 35.1 in Nielsen, 2001), although this is a much more tentative conclusion based on studies older than those for nematodes. Although an archenteric origin of mesoderm is so far unknown for ectoprocts, the precise source of their larval, and especially adult mesoderm, remains poorly known (Reed, 1991; Zimmer, 1997; Lüter, 2000; Nielsen, 2001). 
Consequently, uniting all these phyla on the basis of simply lacking archenteron derived mesoderm, while they do not show any unique morphological similarity that could justify scoring them for the same character state, fails to reflect comparative morphology. Character 20 in Giribet et al. (2000) (based on the data set of Zrzavý et al., 1998, and optimized on the total evidence topology), and character 36 in Peterson & Eernisse (2001) also code for the archenteric origin of mesoderm with a/p characters, and the same nonsensical transformations to an unspecified state are found in their studies (ambiguous for Peterson & Eernisse, 2001).

Moreover, the frequently adopted dichotomous coding of mesoderm source in the Metazoa (mesoderm from archenteron versus mesoderm from 4d, blastopore rim, or ecto-mesoderm) is more reflective of order created by the very process of character coding than that it faithfully reflects observed organismic variation (Jenner, 2002). This problem principally results from the use of different criteria to diagnose the origin of mesoderm in protostomes and deuterostomes. Whereas the origin of mesoderm in protostome phyla such as nematodes and molluscs is determined by the onset of mesodermal cell fate specification, which becomes apparent during early cleavage, in deuterostome phyla mesoderm origin is typically pinpointed by the onset of morphological differentiation, primarily the differentiation of coelomic pouches such as in echinoderms. However, when initial mesoderm specification is also used as a criterion in deuterostomes, e.g., echinoderms, then it becomes clear that mesoderm specification is already established during blastula stages, long before the first signs of morphological differentiation of mesoderm become apparent (Davidson et al., 1998; Davidson, 2001; Sweet et al., 1999). This lowers the confidence we might have in one of the characters that has been regarded in support of a dichotomy between protostomes and deuterostomes (see Nielsen et al., 1996; Nielsen, 2001; Sørensen et al., 2000; Zrzavý et al., 1998, 2001).

Character 45 in Nielsen (2001) codes for the absence or presence of an adult brain derived from, or associated with, the larval apical organ. The reversal of this character at the base of the Bilateria is one of the four unambiguous synapomorphies of Bilateria in Nielsen (2001). Comparison of the nervous system morphologies of the bilaterian phyla clearly reveals that there is no empirical basis for this transformation. Among the phyla scored as lacking an adult brain either derived from, or associated with, the larval apical organ are: a) taxa without adult brains but with larval apical organs, such as Echinodermata; b) taxa with adult nerve concentrations that are formed separate from the larval apical organ, such as Entoprocta and Phoronida; c) taxa with clear adult brains but without any larval apical organ, such as Chaetognatha and Gnathostomulida; and d) taxa with a larval apical organ and a brain derived from it, such as Cephalochordata.

The condition in cephalochordates merits some clarification. Although cephalochordates are usually not interpreted as possessing larvae comparable to those found in other invertebrates, e.g., Nielsen (1998), all cladistic analyses published to date that included a character coding for the presence or absence of an apical organ, have scored Cephalochordata as possessing one (Nielsen et al., 1996; Zrzavý et al., 1998; Sørensen et al., 2000; Nielsen, 2001; Peterson & Eernisse, 2001; characters 21, 144, 27, 19, and 45 respectively). This scoring is largely based upon the work of T. C. Lacalli, in particular his proposed homology of the frontal eye complex in “larval” amphioxus and the apical organs that are widespread in invertebrate larvae (Lacalli, 1994, 1996). However, although this postulated homology has become widely incorporated into cladistic data matrices, none of the cladistic studies carried the character scoring to its logical conclusion. In order to maintain logical consistency throughout the data matrix, the scoring of an apical organ in Cephalochordata would have to be accompanied by the scoring of the adult brain being derived from or associated with the apical organ because the larval central nervous system, including the frontal eye complex, is retained in the adult. It must be concluded that the reversal of character 45 in Nielsen (2001) to an unspecified character state in taxa with very dissimilar nervous system ontogenies and morphologies cannot be a reliable bilaterian synapomorphy.

Similar conclusions about the cladistic insignificance of meaningless character state transformations to unspecified states scored for phyla with dissimilar morphologies apply to other characters as well, such as characters 5 and 68 in Giribet et al. (2000) (optimized on the total evidence topology, with mapping optimization criterion dependent). A reversal to the unspecified “absence” state of character 5 (radial cleavage a/p) inappropriately unites phyla with distinct cleavage types, including phyla with spiral, e.g., Annelida, Mollusca, and non-spiral types, e.g., Rotifera, Acanthocephala. Character 68 (tri-radial or star-shaped pharynx a/p) unites phyla with very different organizations of the anterior ends of their digestive systems on the basis of a reversal to the unspecified “absence” state. Here the morphologies incorrectly proposed to be uniquely homologous range from the complete lack of a pharynx, such as is found in some acoels (Rieger et al., 1991), to the presence of a primarily non-muscular esophagus as observed for Cycliophora (Funch & Kristensen, 1997), to the presence of well-developed muscular pharynges with cuticular hard parts characteristic of phyla such as Gnathostomulida and Rotifera (Lammert, 1991; Clément & Wurdak, 1991). Thus in both these examples, the character state transformations merely supply spurious clade support.

As discussed above, the problem is much more widespread than these cases of character reversals. For all 230 a/p characters with unspecified “absence” states found in the five data sets studied here (Appendix), potential phylogenetic significance is ascribed to unspecified character states unsupported by morphological evidence. Such cavalier treatment of cladistic character coding diagnoses an important weakness of current practice in metazoan cladistics. Increased attention to the morphological basis of our data sets and concomitant re-coding of characters is needed to remedy these problems.

Reasserting the central role of comparative morphology in metazoan cladistics


The discrepancies between theory and practice in metazoan cladistics noted above are striking. The predominance of a/p character coding, and the frequent coding of unspecified “absence” character states can only be accommodated under an assumption of taxic homology where only “presence” states have potential phylogenetic significance. However, all published morphological analyses of metazoan cladistics operate by counting character state transformations within the context of transformational homology. This demands that all character states are clearly delimited on the basis of a careful analysis of morphological similarity. Unspecified “absence” states prevent any straightforward interpretation of character change in more than a third of the a/p characters included in the five analyses of metazoan cladistics that inaugurated the new millennium. This dissociation between morphological study and character coding and scoring is worrying. The universal adoption of Boolean logic in character coding has in many cases resulted in the bending of morphological variation to fit within the confines of binary characters.

Contemporary morphological analyses of metazoan cladistics are heavily skewed. Typically, disproportionate attention is given to the extraction of phylogenetic signal from a given data set, and the resulting cladograms are usually discussed strictly in terms of their topology. In contrast, the sole empirical anchor of these studies receives surprisingly little explicit attention, both before the analysis of character congruence (data matrix compilation), and especially after (dynamics of character state transformations). This unjustifiably cavalier attitude towards the data may be considered as a defining weakness of contemporary morphological metazoan cladistics. This problem manifests itself in various guises, ranging from uncritical character coding and scoring (Jenner, 2001, 2002; this paper), to the rather arbitrary selection of characters and taxa, thereby strongly impairing the power of cladistic analyses to test phylogenetic hypotheses (Jenner & Schram, 1999; and especially Jenner, submitted).

One important step in the right direction is the explicit study of organismic variation prior to character coding and scoring. Increased attention to comparative morphology, and more explicit attention to character coding is needed to rebalance contemporary metazoan cladistics. The partitioning of variation into discrete states is often not straightforward, and each choice of character coding should therefore be explicitly defended. Instead of bending nature’s variation to fit into ill-defined character states, character coding should instead aim to accommodate recognized variation. Of course, complex morphological variation may provide room for different coding strategies, the relative merits of which can be debated. For example, authors may differ in the number of character states assigned to a particular character depending upon the terminal taxa included in the analysis, or their individual perception of the potential phylogenetic significance of the character states. In other cases authors may differ in their decisions to capture morphological variation either in one multistate, or multiple binary characters (see Jenner, 2002 for examples and references).

Given the differential effects of different coding choices on the outcome of cladistic analyses, a more explicitly experimental attitude towards data matrix construction is imperative. In studies of molecular phylogenetics, experimental manipulations of data sets are commonly employed to assess the robustness of the outcomes of an analysis in terms of varying input parameters and assumptions. For example, it is commonplace to assess the effects of various weighting schemes for transitions/transversions, or insertions/deletions on the results of the analysis. Introduction of such an explicit experimental approach would also be a valuable asset in morphological cladistics as it would facilitate a better understanding of the robustness of phylogenetic conclusions both with regard to the data matrix used, and the results of other analyses. Two examples from metazoan cladistics will illustrate the importance of an experimental approach to character coding.

Recently it has been debated how to code one potential morphological synapomorphy of the Ecdysozoa. Schmidt-Rhaesa et al. (1998) discussed the cuticle structure characteristic of Ecdysozoa: Panarthropoda and Introverta (sensu Nielsen, 2001: Nematoida and Scalidophora). The complex ecdysozoan cuticle was defined as tri-layered with a tri-laminate epicuticle, a proteinaceous exocuticle, and a chitinous endocuticle (Schmidt-Rhaesa et al., 1998: 274). Should this be coded as a single complex character (Zrzavý, 2001), or should it be deconstructed into several features that may exhibit a less than total congruence (Wägele et al., 1999; Wägele & Misof, 2001)? This is not a trivial question. For example, Gastrotricha possess some but not all of the components of this complex character. Although they are not included in Schmidt-Rhaesa et al.’s Ecdysozoa clade, they do possess a tri-laminate epicuticle, but as part of a bi-layered cuticle (Ruppert, 1991; Lemburg, 1998). Furthermore, they possess a proteinaceous basal fibrous or granular layer, which may be comparable to the proteinaceous exocuticle (median layer) of panarthropods and scalidophorans (Lemburg, 1998). Chitin has so far not been convincingly demonstrated in gastrotrich cuticles (Neuhaus et al., 1996). So while it is true that a complex cuticle including all components may be a unique synapomorphy of Ecdysozoa (Schmidt-Rhaesa et al., 1998; Zrzavý, 2001), this complex character appears in reality to be a character complex comprised of different features with distinct evolutionary histories (Lemburg, 1998).
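
The two coding options can be contrasted in a short Python sketch. The component scores follow the description above, with two assumptions that should be read as such: the proteinaceous basal layer of gastrotrichs is scored as comparable to the exocuticle, and chitin in gastrotrichs is scored “0” to stand in for “not convincingly demonstrated.” The function and variable names are hypothetical, and the sketch is not the coding used in any of the cited matrices.

```python
components = [
    "trilaminate_epicuticle",
    "proteinaceous_layer",
    "chitinous_endocuticle",
]

# Scores follow the text; see the stated assumptions for Gastrotricha.
observations = {
    "Panarthropoda": {"trilaminate_epicuticle": 1, "proteinaceous_layer": 1,
                      "chitinous_endocuticle": 1},
    "Introverta":    {"trilaminate_epicuticle": 1, "proteinaceous_layer": 1,
                      "chitinous_endocuticle": 1},
    "Gastrotricha":  {"trilaminate_epicuticle": 1, "proteinaceous_layer": 1,
                      "chitinous_endocuticle": 0},  # assumption: not demonstrated
}


def code_as_single_complex(observations):
    """One a/p character: "1" only when every component is present."""
    return {taxon: "1" if all(features[c] for c in components) else "0"
            for taxon, features in observations.items()}


def code_as_components(observations):
    """One a/p character per component, in the order given in `components`."""
    return {taxon: "".join(str(features[c]) for c in components)
            for taxon, features in observations.items()}


single = code_as_single_complex(observations)
split = code_as_components(observations)
for taxon in observations:
    print(taxon, single[taxon], split[taxon])
```

Coded as a single complex character, Gastrotricha is scored “0” and shares nothing with the ecdysozoan taxa; coded as separate components, it is scored “110” and shares two of the three features with them.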

If the Gastrotricha represents the sister taxon to the Ecdysozoa, as maintained by Schmidt-Rhaesa et al. (1998), then at the level of Ecdysozoa a tri-laminate epicuticle and a proteinaceous exocuticle are plesiomorphies inherited from the last common ancestor of Gastrotricha plus Ecdysozoa, while a chitinous endocuticle is the only uniquely derived condition supporting ecdysozoan monophyly. Coding a single complex character removes potential phylogenetic evidence (trilaminate epicuticle and proteinaceous basal layer) for uniting Gastrotricha either with Introverta alone (yielding a monophyletic Cycloneuralia sensu Nielsen, 2001) or Ecdysozoa, while subdivision of this cuticle character complex would reduce the empirical weight of the ecdysozoan synapomorphies. The choice between these alternatives is not immediately obvious (partly because Gastrotricha remain a perennial phylogenetic problematicum), but different coding decisions clearly embody differences in the phylogenetic significance of a given range of organismic variation, and consequently each decision has to be carefully justified. The practical significance of these considerations will be illustrated by the following character coding experiment.

Zrzavý et al. (2001) coded the complex cuticle character (21) suggested as an ecdysozoan autapomorphy by Schmidt-Rhaesa et al. (1998) as discussed above (defined by Zrzavý et al. as molted cuticle with epicuticle, exocuticle, and endocuticle, with sclerotization). This complex character unambiguously supported Ecdysozoa in the analysis of Zrzavý et al. (2001). However, Gastrotricha were united with Gnathostomulida at the base of the protostomes (see Jenner, 2002 for discussion of the characters supporting this sister group relationship). I performed two coding experiments for cuticle structure based on the data presented in Lemburg (1998). In the first experiment, I added one character to the data set of Zrzavý et al. (2001) to code for a tri-laminate epicuticle with a proteinaceous layer as is shared between Gastrotricha and Introverta plus Panarthropoda, while character 21 in Zrzavý et al. (2001) was retained to represent the new addition of a chitinous endocuticle. In the second experiment, I added two characters to separate the variation for the tri-laminate epicuticle and the proteinaceous layer, while again retaining character 21. Both alternatives were analyzed with a heuristic search, 100 random addition replicates, and TBR branch swapping.

The first experiment yielded a strict consensus tree identical to that obtained for the unmodified data set (Fig. 1). The Ecdysozoa was supported, and Gastrotricha remained a sister group to Gnathostomulida. However, in the second experiment, the clade of Gastrotricha and Gnathostomulida collapsed, Chaetognatha were removed as a sister group of the Ecdysozoa (as supported in the first experiment and with the unmodified data set), and the positions of these phyla in addition to the clade Ecdysozoa remained unresolved within the protostomes (Fig. 1). This indicates that the choice to code complex cuticle morphology in these phyla either as a single a/p character, or as several a/p characters may influence the results in an important way.


Fig. 1. Strict consensus trees resulting from an analysis of the data set of Zrzavý et al. (2001) with two modifications: 1) addition of a character coding for a tri-laminate epicuticle with a proteinaceous layer as found in gastrotrichs, introvertans and panarthropods, 2) addition of two characters to code the variation for the tri-laminate epicuticle and the proteinaceous layer separately, resulting in the less resolved consensus tree. See text for discussion.

Being aware of the effects of different interpretations of a certain complex of morphological variation is thus essential when the aim is to test alternative phylogenetic hypotheses, such as for Gastrotricha vis-à-vis the ecdysozoan phyla.

A final example illustrates the importance of character coding decisions for resolving the phylogenetic placement of Gnathostomulida in the animal kingdom. The sister taxon to the gnathostomulids remains uncertain (see Jenner, 2002 for a comprehensive discussion of the phylogenetic position of gnathostomulids within the Metazoa). The two most important competing hypotheses for the placement of Gnathostomulida among the Metazoa based on morphological data are the Plathelminthomorpha and Gnathifera hypotheses. Gnathostomulids are the sister group to a monophyletic Platyhelminthes according to the Plathelminthomorpha hypothesis, and to the Syndermata (Rotifera, including Seison, and Acanthocephala) according to the Gnathifera hypothesis. In terms of the number of independent studies and synapomorphies, the Plathelminthomorpha hypothesis appears to be the best supported hypothesis (Jenner, 2002). However, the monophyly of Gnathifera is supported by characters of a better quality than those supporting the Plathelminthomorpha hypothesis, principally because they are more detailed and unique. The two most important gnathiferan synapomorphies are:

1) the presence of jaw elements with tube-like support rods composed of electron lucent material surrounding an electron-dense core

2) the presence of cross-striated pharyngeal muscles that attach to the jaw elements through epithelial cells

Although the first character is included in many cladistic analyses, the second character is restricted to the analyses of Nielsen (2001) and Sørensen et al. (2000). Despite the unique ultrastructural similarities of gnathiferan jaw elements, the inclusion of only the first character in a computer-assisted cladistic analysis does not guarantee a monophyletic Gnathifera, as is illustrated by the studies of Wallace et al. (1996), Zrzavý et al. (1998, 2001), Giribet et al. (2000), and Peterson & Eernisse (2001). Interestingly, the only cladistic analyses that sampled taxa broadly enough to test the Plathelminthomorpha hypothesis against the Gnathifera hypothesis, and that supported the latter, are Sørensen et al. (2000) and Nielsen (2001). These studies are also unique in including a separate character on the mode of pharyngeal muscle attachment. It then becomes important to study both the effect of, and the justification for, coding the jaws and their muscle attachment as either one character or two separate characters. I performed two experiments.

First, I re-analyzed the original data matrix of Nielsen (2001), which supported the Gnathifera hypothesis, with a heuristic search (100 random addition replicates, TBR branch swapping), excluding character 64 as Nielsen did for his strict consensus. Then I re-analyzed the matrix while also excluding the character (35) coding for the mode of pharyngeal muscle attachment. These two analyses yielded exactly the same strict consensus of the same four MPTs with a monophyletic Gnathifera, a result identical to that reported by Nielsen (2001, fig. 56.1).

Secondly, I re-analyzed the original data matrix of Peterson & Eernisse (2001), which supported the Plathelminthomorpha hypothesis, with the same analysis parameters as in the first experiment. Subsequently I introduced a new character into their matrix coding for the pharyngeal muscle attachment type found in gnathiferans and scored accordingly. The first analysis yielded the 20 MPTs and well-resolved strict consensus found by Peterson & Eernisse (2001) (Fig. 2). In sharp contrast, the re-analysis with the second potential gnathiferan autapomorphy resulted in quite a dramatic collapse of the strict consensus tree, leaving one huge polytomy for Bilateria (Fig. 2). The only clades that were retained are Ecdysozoa, Eutrochozoa, Deuterostomia, Brachiopoda + Phoronida and Platyhelminthes. The relationships between these and all other bilaterian phyla remained unresolved. Furthermore, monophyly of Plathelminthomorpha was no longer supported, and the position of Gnathostomulida and Rotifera remained entirely unresolved.

Fig. 2. Strict consensus trees resulting from an analysis of: 1) the unmodified data set of Peterson & Eernisse (2001); 2) the data set of Peterson & Eernisse (2001) with the addition of a character coding for cross-striated pharyngeal muscles that attach to jaw elements through epithelial cells as found in gnathiferans (less resolved consensus tree). See text for discussion.

In this context, it becomes an important question whether the separate coding of a character on the attachment of muscles to pharyngeal hard parts through epithelial cells is justified, as is done in the matrices of Sørensen et al. (2000) and Nielsen (2001) but not in the other studies. Cross-striated pharynx muscles that attach to cuticular jaw elements are also found in Micrognathozoa (Kristensen & Funch, 2000). Naturally, when comparable pharyngeal hard parts are lacking in other taxa, they should logically be scored as ‘inapplicable’ for mode of pharyngeal muscle attachment, but neither Sørensen et al. (2000) nor Nielsen (2001) adopted this scoring. Sørensen et al. (2000) and Nielsen (2001) also score this feature as present in Annelida. However, it is only found in eunicid polychaetes, which are unlikely to be representative of the annelidan ground pattern (Rouse & Fauchald, 1997). If we nevertheless choose to accept this scoring, we have to confront an interesting issue. Cross-striated body muscles (as opposed to cross-striated pharyngeal muscles) in taxa such as kinorhynchs, loriciferans, cycliophorans, and possibly nematodes (Wright, 1991, fig. 28) also do not attach directly to the cuticle, but rather through the intermediate of an epidermal cell (Kristensen & Higgins, 1991; Funch & Kristensen, 1997; Neuhaus, Kristensen & Peters, 1997b). Similarly, somatic muscles of Micrognathozoa always attach through epidermal cells to the, in this case intracellular, skeletal plates that are located in the lateral and dorsal body regions (Kristensen & Funch, 2000). In fact, a survey of muscle attachment types throughout the Metazoa reveals that the attachment of muscles to the cuticle through intermediate epithelial cells is much more widespread. 
It has, for example, been reported for arthropod muscles, tardigrade stylet muscles, the beak muscles in cephalopods (through beccublasts), gastrotrich muscles, ectoproct muscles (attachment to ectocyst), and chaetognath head muscles (Ruppert, 1991; Mellon, 1992; Dewel, Nelson & Dewel, 1993; Budelmann, Schipp & Boletzky, 1997; Mukai et al., 1997; Shinn, 1997). Recognizing the widespread distribution of this mode of muscle attachment is important for properly evaluating the phylogenetic significance of cross-striated muscle attachment to pharyngeal hard parts in gnathiferans. In fact, it suggests that muscle attachment to the cuticle through epithelial cells may be a plesiomorphy at the level of Gnathifera (a possibility to be tested in future cladistic analyses). This reasoning would lessen the probability that this type of muscle attachment is a novel autapomorphy of Gnathifera to be coded independently from the presence of pharyngeal hard parts.

These experiments illustrate that single changes in a data matrix can have profound effects on the outcome of a cladistic analysis, and the same change for two taxa in two different matrices can have entirely different effects. This underlines the importance of explicitly justifying character coding decisions, and an experimental approach to character coding allows novel insights into the stability of cladistic results that would otherwise remain hidden. For example, Rouse & Fauchald (1997) and Rouse (2001) coded alternative data matrices with only binary a/p or multistate characters to resolve polychaete and siboglinid (formerly the phyla Pogonophora and Vestimentifera) relationships, respectively. These studies convincingly show that the results derived from the a/p and multistate data sets may differ substantially in both topology and resolution (for further examples of the effects of different coding methods in the context of animal relationships see Jenner & Schram, 1999; Jenner, 2001; Donoghue et al., 2000).
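The collapse of a strict consensus after the addition of a single character can be illustrated with a toy sketch in Python. It is not a re-analysis of any published matrix: the taxon labels (G, P, S, out), the binary characters, and their scorings are hypothetical, and the three candidate topologies are simply enumerated by hand rather than found by a heuristic search. Character lengths are counted with Fitch (unordered) parsimony, and the strict consensus is taken as the set of clades shared by all shortest trees.

```python
# Toy illustration only: taxa, characters and scorings are hypothetical.
TREES = {
    "GP": ("out", (("G", "P"), "S")),
    "GS": ("out", (("G", "S"), "P")),
    "PS": ("out", (("P", "S"), "G")),
}

SUPPORTS_GP = {"out": 0, "G": 1, "P": 1, "S": 0}
SUPPORTS_GS = {"out": 0, "G": 1, "P": 0, "S": 1}


def fitch(tree, character):
    """Return (state set, steps) for one unordered character on a rooted binary tree."""
    if isinstance(tree, str):
        return {character[tree]}, 0
    left_states, left_steps = fitch(tree[0], character)
    right_states, right_steps = fitch(tree[1], character)
    shared = left_states & right_states
    if shared:
        return shared, left_steps + right_steps
    return left_states | right_states, left_steps + right_steps + 1


def clades(tree):
    """Return (leaves of this subtree, all clades found in it)."""
    if isinstance(tree, str):
        return frozenset([tree]), set()
    leaves, found = frozenset(), set()
    for child in tree:
        child_leaves, child_clades = clades(child)
        leaves |= child_leaves
        found |= child_clades
    found.add(leaves)
    return leaves, found


def shortest_trees(matrix):
    """Names of the topologies with minimal total length for a list of characters."""
    lengths = {name: sum(fitch(tree, char)[1] for char in matrix)
               for name, tree in TREES.items()}
    best = min(lengths.values())
    return sorted(name for name, length in lengths.items() if length == best)


def strict_consensus(tree_names):
    """Clades present in every named tree, excluding the trivial clade of all taxa."""
    clade_sets = [clades(TREES[name])[1] for name in tree_names]
    all_taxa = clades(TREES[tree_names[0]])[0]
    return set.intersection(*clade_sets) - {all_taxa}


matrix_before = [SUPPORTS_GP, SUPPORTS_GP, SUPPORTS_GS]
matrix_after = matrix_before + [SUPPORTS_GS]  # one added character

for label, matrix in (("before", matrix_before), ("after", matrix_after)):
    mpts = shortest_trees(matrix)
    consensus = strict_consensus(mpts)
    print(label, mpts, sorted(sorted(clade) for clade in consensus))
```

With the three initial characters a single topology is shortest and the G + P clade is recovered. Adding the fourth character makes two topologies equally short, so the strict consensus retains only the ingroup as a whole.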

Conclusions


The current lack of explicitness in the construction of our morphological data sets can only be labeled as unscientific. Morphological data sets differ profoundly between studies, which in itself is no reason for despair. However, because they differ chiefly in myriad details that are not made explicit, it becomes virtually impossible to make a reasoned choice between alternative hypotheses. This becomes especially clear when the efficacy of different studies is evaluated with respect to hypothesis testing (Jenner, 2002; submitted).

The universal preference for binary character coding, in particular a/p coding, in metazoan cladistics cannot be explained by the superiority of this coding method. In fact, within the context of phylogenetic parsimony a/p coding is currently the most severely criticized coding method, as is discussed above. There are no indications that this realization has permeated into the general consciousness of metazoan systematists yet. Alternative coding methods do exist, for example the conventional coding of Hawkins et al. (1997). The explicit connection of this coding method with comparative morphology is much better justified than for a/p coding (Hawkins et al., 1997). Alternatively, other authors have advised the use of different coding methods, such as step matrices that quantify character state transformation costs (Maddison, 1993; Forey & Kitching, 2000). However, current limitations of phylogenetic software, and the acknowledged idiosyncrasies of different character coding methods prevent an easy solution. As a result the merits of different coding strategies currently remain at the center of debate. For example, although conventional coding is favored by several authors (Hawkins et al., 1997; Strong & Lipscomb, 1999), it can introduce interpretational difficulties associated with inapplicable data when PAUP (Swofford, 2002) or Hennig86 (Farris, 1988) are used to analyze the data. In contrast, the program NONA by P. Goloboff minimizes these problems (Strong & Lipscomb, 1999).

Nevertheless, this leaves unexplained why the vast majority of characters is coded as absent/present. At this point, only speculation can be offered because explicit justification is never provided. One part of an explanation for our willingness to adopt Boolean logic in character coding may be our undeniable preference for ordering complexity by dichotomous division (Wilson, 1998: 169; Gould, 2000). Interestingly, the roots of this preference can be traced back to the systematic biology of pre-evolutionary times when dichotomous division was the favored method of classification (Mayr, 1982, 1995). Dichotomous division ordered organismic diversity by separating groups from each other on the basis of possessing a certain feature or not. The resemblance to binary a/p coding is immediately obvious.

A second partial explanation for the observed preference for a/p coding may stem from the fact that only half the character states (the “presence” states) need to be delimited. Taxa lacking the “presence” state can then simply be scored as a default irrespective of their morphology. Although for some characters an attempt is made to differentiate among the taxa lacking the “presence” state, for example through inapplicability scoring, typically this is not carried out consistently (if at all; see Peterson & Eernisse, 2001). This dissociation between comparative morphology and character coding and scoring fundamentally contradicts the premises of phylogenetic parsimony. Clearly, we have to go back to basics.
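The consequence of leaving the “absence” state undelimited can be made concrete with a small Python sketch. It assumes four placeholder taxa with hypothetical morphologies, loosely modeled on a "cuticle molted" character. The two coding functions are illustrative stand-ins for a/p coding and for a two-character scheme with inapplicable scores in the spirit of conventional coding; they are not taken from any published matrix.

```python
# Hypothetical morphologies for placeholder taxa (not real scorings).
MORPHOLOGY = {
    "taxon_A": {"cuticle": True, "molted": True},
    "taxon_B": {"cuticle": True, "molted": False},
    "taxon_C": {"cuticle": False, "molted": None},
    "taxon_D": {"cuticle": False, "molted": None},
}


def code_absence_presence(morph):
    """One a/p character 'cuticle molted': 1 if present, otherwise the default 0."""
    return "1" if morph["cuticle"] and morph["molted"] else "0"


def code_with_inapplicables(morph):
    """Two characters: cuticle (0/1) and molting (0/1, or '-' when no cuticle exists)."""
    if not morph["cuticle"]:
        return ("0", "-")
    return ("1", "1" if morph["molted"] else "0")


def taxa_by_state(coding):
    """Group taxa by the score they receive under a given coding function."""
    groups = {}
    for taxon, morph in MORPHOLOGY.items():
        groups.setdefault(coding(morph), []).append(taxon)
    return groups


ap_groups = taxa_by_state(code_absence_presence)
split_groups = taxa_by_state(code_with_inapplicables)

print("a/p coding:", ap_groups)
print("two characters with inapplicables:", split_groups)
```

Under the a/p scheme the "0" state lumps together a taxon with a non-molted cuticle and taxa that lack a cuticle altogether, whereas the two-character scheme keeps these conditions apart.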

References


Anderson DT. 1973. Embryology and phylogeny in annelids and arthropods. Pergamon Press, Oxford.

Bartolomaeus T, Ax P. 1992. Protonephridia and metanephridia – their relation within the Bilateria. Zeitschr. zool. Syst. Evolut.-forsch. 30: 21-45.

Brower AVZ, Schawaroch V. 1996. Three steps of homology assessment. Cladistics 12: 265-272.

Budelmann BU, Schipp R, Boletzky S von. 1997. Cephalopoda. In: Harrison FW, Kohn AJ, eds. Microscopic anatomy of invertebrates. Vol. 6A. Mollusca II. New York: Wiley-Liss, 119-414.

Carine MA, Scotland RW. 1999. Taxic and transformational homology: different ways of seeing. Cladistics 15: 121-129.

Clément P, Wurdak E. 1991. Rotifera. In: Harrison FW, Ruppert EE, eds. Microscopic anatomy of invertebrates. Vol. 4. Aschelminthes. New York: Wiley-Liss, 219-297.

Davidson EH. 1991. Spatial mechanisms of gene regulation in metazoan embryos. Development 113: 1-26.

Davidson EH. 2001. Genomic regulatory systems. Development and evolution. San Diego, Academic Press.

Davidson EH, Cameron RA, Ransick A. 1998. Specification of cell fate in the sea urchin embryo: summary and some proposed mechanisms. Development 125: 3269-3290.

Dewel RA, Nelson DR, Dewel WC. 1993. Tardigrada. In: Harrison FW, Rice ME, eds. Microscopic anatomy of invertebrates. Vol. 12. Onychophora, Chilopoda, and lesser Protostomata. New York: Wiley-Liss, 143-183.

Donoghue PCJ, Forey PL, Aldridge RJ. 2000. Conodont affinity and chordate phylogeny. Biol. Rev. 75: 191-251.

Eernisse DJ, Albert JS, Anderson FE. 1992. Annelida and Arthropoda are not sister taxa: a phylogenetic analysis of spiralian metazoan morphology. Syst. Biol. 41: 305-330.

Eldredge N. 1979. Alternative approaches to evolutionary theory. Bull. Carnegie Mus. Nat. Hist. 12: 7-19.

Emerson SB, Hastings PA. 1998. Morphological correlations in evolution: consequences for phylogenetic analysis. Quart. Rev. Biol. 73: 141-162.

Farris JS. 1988. Hennig86. Port Jefferson, New York.

Farris JS, Kluge AG, Laet JE de. 2001. Taxic revisions. Cladistics 17: 79-103.

Forey PL, Kitching IJ. 2000. Experiments in coding multistate characters. In: Scotland R, Pennington RT, eds. Homology and systematics. Coding characters for phylogenetic analysis. London: Taylor & Francis, 54-80.

Fryer G. 1999. A comment on a recent phylogenetic analysis of certain orders of the branchiopod Crustacea. Crustaceana 72: 1039-1049.

Fryer G. 2001. The elucidation of branchiopod phylogeny. Crustaceana 74: 105-114.

Funch P, Kristensen RM. 1997. Cycliophora. In: Harrison FW, Woollacott RM, eds. Microscopic anatomy of invertebrates. Lophophorates, Entoprocta, and Cycliophora. New York: Wiley-Liss, 409-474.

Giribet G, Distel DL, Polz M, Sterrer W, Wheeler WC. 2000. Triploblastic relationships with emphasis on the acoelomates and the position of Gnathostomulida, Cycliophora, Plathelminthes, and Chaetognatha: a combined approach of 18S rDNA sequences and morphology. Syst. Biol. 49: 539-562.

Giribet G, Edgecombe GD, Wheeler WC. 2001. Arthropod phylogeny based on eight molecular loci and morphology. Nature 413: 157-161.

Gould SJ. 2000. Deconstructing the “science wars” by reconstructing an old mold. Science 287: 253-261.

Grande L, Bemis WE. 1998. A comprehensive phylogenetic study of amiid fishes (Amiidae) based on comparative skeletal anatomy. An empirical search for interconnected patterns of natural history. Society of Vertebrate Paleontology Memoir 4.

Haszprunar G. 1996a. The Mollusca: coelomate turbellarians or mesenchymate annelids? In: Taylor JD, ed. Origin and evolutionary radiation of the Mollusca. Oxford: Oxford University Press, 1-28.

Haszprunar G. 1996b. The molluscan rhogocyte (pore-cell, Blasenzelle, cellule nucale), and its significance for ideas on nephridial evolution. J. Moll. Stud. 62: 185-211.

Hawkins JA. 2000. A survey of primary homology assessment: different botanists perceive and define characters in different ways. In: Scotland R, Pennington RT, eds. Homology and systematics. Coding characters for phylogenetic analysis. London: Taylor & Francis, 22-53.

Hawkins JA, Hughes CE, Scotland RW. 1997. Primary homology assessment, characters and character states. Cladistics 13: 275-283.

Henry JQ, Martindale MQ, Boyer BC. 2000. The unique developmental program of the acoel flatworm, Neochildia fusca. Dev. Biol. 220: 285-295.

Jeffery WR, Swalla BJ. 1997. Tunicates. In: Gilbert SF, Raunio AM, eds. Embryology. Constructing the organism. Sunderland: Sinauer Associates, 331-364.

Jenner RA. 1999. Metazoan phylogeny as a tool in evolutionary biology: current problems and discrepancies in application. Belg. J. Zool. 129: 245-262.

Jenner RA. 2000. Evolution of animal body plans: the role of metazoan phylogeny at the interface between pattern and process. Evol. Devel. 2: 208-221.

Jenner RA. 2001. Bilaterian phylogeny and uncritical recycling of morphological data sets. Syst. Biol. 50: 730-742.

Jenner RA. 2002. Macroevolution of animal body plans. Evaluating alternative hypotheses. Ph.D. thesis, Universiteit van Amsterdam.

Jenner RA. submitted. Unleashing the force of cladistics? Metazoan phylogenetics and hypothesis testing. Integ. Comp. Biol. (formerly Amer. Zool.)

Jenner RA, Schram FR. 1999. The grand game of metazoan phylogeny: rules and strategies. Biol. Rev. 74: 121-142.

Kitching IJ, Forey PL, Humphries CJ, Williams DM. 1998. Cladistics. The theory and practice of parsimony analysis. Oxford: Oxford University Press.

Kluge AG. 2001. Parsimony with and without scientific justification. Cladistics 17: 199-210.

Kluge AG, Farris JS. 1999. Taxic homology = overall similarity. Cladistics 15: 205-212.

Kristensen RM, Funch P. 2000. Micrognathozoa: a new class with complicated jaws like those of Rotifera and Gnathostomulida. J. Morph. 246: 1-49.

Kristensen RM, Higgins RP. 1991. Kinorhyncha. In: Harrison FW, Ruppert EE, eds. Microscopic anatomy of invertebrates. Vol. 4. Aschelminthes. New York: Wiley-Liss, 377-404.

Lacalli TC. 1994. Apical organs, epithelial domains, and the origin of the chordate central nervous system. Amer. Zool. 34: 533-541.

Lacalli TC. 1996. Landmarks and subdomains in the larval brain of Branchiostoma: vertebrate homologs and invertebrate antecedents. Israel J. Zool. 42: 131-146.

Lammert V. 1991. Gnathostomulida. In: Harrison FW, Ruppert EE, eds. Microscopic anatomy of invertebrates. Vol. 4. Aschelminthes. New York: Wiley-Liss, 19-39.

Lee MSY. 1995. Historical burden in systematics and the interrelationships of ‘parareptiles.’ Biol. Rev. 70: 459-547.

Lee D-C, Bryant HN. 1999. A reconsideration of the coding of inapplicable characters. Cladistics 15: 373-378.

Lemburg C. 1998. Electron microscopical localization of chitin in the cuticle of Halicryptus spinulosus and Priapulus caudatus (Priapulida) using gold-labelled wheat germ agglutinin: phylogenetic implications for the evolution of the cuticle within the Nemathelminthes. Zoomorphology 118: 137-158.

Lüter C. 2000. The origin of the coelom in Brachiopoda and its phylogenetic significance. Zoomorphology 120: 15-28.

Maddison WP. 1993. Missing data versus missing characters in phylogenetic analysis. Syst. Biol. 42: 576-580.

Maddison DR, Maddison WP. 2001. MacClade 4. Version 4.02. Sunderland: Sinauer Associates, Inc.

Marques AC. 1996. A critical analysis of a cladistic study of the genus Eudendrium (Cnidaria: Hydrozoa), with some comments on the family Eudendriidae. J. Comp. Biol. 1: 141-162.

Martin VJ. 1997. Cnidarians, the jellyfish and hydras. In: Gilbert SF, Raunio AM, eds. Embryology. Constructing the organism. Sunderland: Sinauer Associates, 57-86.

Martindale MQ, Henry JQ. 1998. The development of radial and biradial symmetry: the evolution of bilaterality. Amer. Zool. 38: 672-684.

Mayr E. 1982. The growth of biological thought. Massachusetts: The Belknap Press of Harvard University Press.

Mayr E. 1995. Systems of ordering data. Biol. Phil. 10: 419-434.

Mellon D. 1992. Connective tissue and supporting structures. In: Harrison FW, Humes AG, eds. Microscopic anatomy of invertebrates. Vol. 10. Decapod Crustacea. New York: Wiley-Liss, 77-116.

Mukai H, Terakado K, Reed CG. 1997. Bryozoa. In: Harrison FW, Woollacott RM, eds. Microscopic anatomy of invertebrates. Lophophorates, Entoprocta, and Cycliophora. New York: Wiley-Liss, 45-206.

Neuhaus B, Kristensen RM, Lemburg C. 1996. Ultrastructure of the cuticle of the Nemathelminthes and electron microscopical localization of chitin. Verhandlungen der Deutschen Zoologischen Gesellschaft 89: 221.

Neuhaus B, Kristensen RM, Peters W. 1997. Ultrastructure of the cuticle of Loricifera and demonstration of chitin using gold-labelled wheat germ agglutinin. Acta Zool. 78: 215-225.

Nielsen C. 1998. Morphological approaches to phylogeny. Amer. Zool. 38: 942-952.

Nielsen C. 2001. Animal evolution. Interrelationships of the living phyla. Oxford: Oxford University Press.

Nielsen C, Scharff N, Eibye-Jacobsen D. 1996. Cladistic analyses of the animal kingdom. Biol. J. Linn. Soc. 57: 385-410.

Olesen J. 2000. An updated phylogeny of the Conchostraca – Cladocera clade (Branchiopoda, Diplostraca). Crustaceana 73: 869-886.

Patterson C. 1982. Morphological characters and homology. In: Joysey KA, Friday AE, eds. Problems of phylogenetic reconstruction. London: Academic Press, 21-74.

Patterson C, Johnson GD. 1997. The data, the matrix, and the message: comments on Begle’s “Relationships of the osmeroid fishes.” Syst. Biol. 46: 358-365.

Peterson KJ, Eernisse DJ. 2001. Animal phylogeny and the ancestry of bilaterians: inferences from morphology and 18S rDNA gene sequences. Evol. Devel. 3: 170-205.

Pimentel RA, Riggins R. 1987. The nature of cladistic data. Cladistics 3: 201-209.

Pinna MCC de. 1991. Concepts and tests of homology in the cladistic paradigm. Cladistics 7: 367-394.

Pleijel F. 1995. On character coding for phylogeny reconstruction. Cladistics 11: 309-315.

Poe S, Wiens JJ. 2000. Character selection and the methodology of morphological phylogenetics. In: Wiens JJ, ed. Phylogenetic analysis of morphological data. Washington: Smithsonian Institution Press, 20-36.

Pogue MG, Mickevich MF. 1990. Character definitions and character state delineations: the bête noire of phylogenetic inference. Cladistics 6: 319-361.

Reed CG. 1991. Bryozoa. In: Giese AC, Pearse JS, Pearse VB, eds. Reproduction of marine invertebrates. Vol. VI. Echinoderms and lophophorates. California: The Boxwood Press, 85-245.

Rieger RM, Tyler S, Smith III JPS, Rieger GE. 1991. Platyhelminthes: Turbellaria. In: Harrison FW, Bogitsh BJ, eds. Microscopic anatomy of invertebrates. Vol. 3. Platyhelminthes and Nemertinea. New York: Wiley-Liss, 7-140.

Rieppel O, Reisz RR. 1999. The origin and early evolution of turtles. Ann. Rev. Ecol. Syst. 30: 1-22.

Rieppel O, Zaher H. 2000. The braincases of mosasaurs and Varanus, and the relationships of snakes. Zool. J. Linn. Soc. 129: 489-514.

Rieppel O, Kearney M. 2002. Similarity. Biol. J. Linn. Soc. 75: 59-82.

Rohde K. 1996. Robust phylogenies and adaptive radiations: a critical examination of methods used to identify key innovations. Am. Nat. 148: 481-500.

Rouse GW. 2001. A cladistic analysis of Siboglinidae Caullery, 1914 (Polychaeta, Annelida): formerly the phyla Pogonophora and Vestimentifera. Zool. J. Linn. Soc. 132: 55-80.

Rouse GW, Fauchald K. 1995. The articulation of annelids. Zool. Scr. 24: 269-301.

Rouse GW, Fauchald K. 1997. Cladistics and polychaetes. Zool. Scr. 26: 139-204.

Ruppert EE. 1991. Gastrotricha. In: Harrison FW, Ruppert EE, eds. Microscopic anatomy of invertebrates. Vol. 4. Aschelminthes. New York: Wiley-Liss, 41-109.

Ruppert EE. 1994. Evolutionary origin of the vertebrate nephron. Amer. Zool. 34: 542-553.

Ruppert EE, Smith PR. 1988. The functional organization of filtration nephridia. Biol. Rev. 63: 231-258.

Schmidt-Rhaesa A, Bartolomaeus T, Lemburg C, Ehlers U, Garey JR. 1998. The position of the Arthropoda in the phylogenetic system. J. Morph. 238: 263-285.

Scholtz G. 1997. Cleavage, germ band formation and head segmentation: the groundpattern of the Euarthropoda. In: Fortey RA, Thomas RH, eds. Arthropod relationships. London: Chapman & Hall, The Systematics Association Special Volume Series 55, 317-332.

Schram FR. 1991. Cladistic analysis of metazoan phyla and the placement of fossil problematica. In: Simonetta A, Conway Morris S, eds. The early evolution of Metazoa and the significance of problematic taxa. 35-46.

Schram FR, Jenner RA. 2001. The origin of Hexapoda: a crustacean perspective. In: Deuve T, ed. Origin of Hexapoda. Ann. Soc. entomologique de France 37: 243-264.

Scotland RW. 2000a. Taxic homology and three-taxon statement analysis. Syst. Biol. 49: 480-500.

Scotland RW. 2000b. Homology, coding and three-taxon statement analysis. In: Scotland R, Pennington RT, eds. Homology and systematics. Coding characters for phylogenetic analysis. London: Taylor & Francis, 145-182.

Scotland R, Pennington RT. 2000. Homology and systematics. Coding characters for phylogenetic analysis. London: Taylor & Francis.

Shinn GL. 1997. Chaetognatha. In: Harrison FW, Ruppert EE, eds. Microscopic anatomy of invertebrates. Vol. 15. Hemichordata, Chaetognatha, and the invertebrate chordates. New York: Wiley-Liss, 103-220.

Shultz JW, Regier JC. 2000. Phylogenetic analysis of arthropods using two nuclear protein-encoding genes supports a crustacean + hexapod clade. Proc. Roy. Soc. Lond. B 267: 1011-1019.

Siewing R. 1979. Homology of cleavage types. Fortschr. Zool. Syst. Evolut. 1: 7-18.

Smith PR. 1992. Polychaeta: excretory system. In: Harrison FW, Gardiner SL, eds. Microscopic anatomy of invertebrates. Vol. 7. Annelida. New York: Wiley-Liss, 71-108.

Smith PR, Ruppert EE. 1988. Nephridia. In: Westheide W, Hermans CO, eds. Microfauna Marina. Stuttgart: Gustav Fischer Verlag, 231-262.

Sørensen MV, Funch P, Willerslev E, Hansen AJ, Olesen J. 2000. On the phylogeny of the Metazoa in the light of Cycliophora and Micrognathozoa. Zool. Anz. 239: 297-318.

Strong EE, Lipscomb D. 1999. Character coding and inapplicable data. Cladistics 15: 363-371.

Sweet HC, Hodor PG, Ettensohn CA. 1999. The role of micromere signalling in Notch activation and mesoderm specification during sea urchin embryogenesis. Development 126: 5255-5265.

Swofford DL. 2002. PAUP. Sunderland, Sinauer Associates, Include Publishing.

Turbeville JM. 1991. Nemertinea. In: Harrison FW, Bogitsh BJ, eds. Microscopic anatomy of invertebrates. Vol. 3. Platyhelminthes and Nemertinea. New York: Wiley-Liss, 285-328.

Turbeville JM, Ruppert EE. 1985. Comparative ultrastructure and the evolution of nemertines. Am. Zool. 25: 53-71.

Valentine JW. 1997. Cleavage patterns and the topology of the metazoan tree of life. Proc. Natl. Acad. Sci. 94: 8001-8005.

Wägele JW, Erikson E, Lockhart P, Misof B. 1999. The Ecdysozoa: artifact or monophylum? J. Zool. Syst. Evol. Res. 37: 211-223.

Wägele JW, Misof B. 2001. On quality of evidence in phylogeny reconstruction: a reply to Zrzavý’s defence of the ‘Ecdysozoa’ hypothesis. J. Zool. Syst. Evol. Research 39: 165-176.

Waggoner BM. 1996. Phylogenetic hypotheses of the relationships of arthropods to Precambrian and Cambrian problematic fossil taxa. Syst. Biol. 45: 190-222.

Wallace RL, Ricci C, Melone G. 1996. A cladistic analysis of pseudocoelomate (aschelminth) morphology. Invert. Biol. 115: 104-112.

Watling L. 1999. Toward understanding the relationships of the peracaridan orders: the necessity of determining exact homologies. In: Schram FR, Vaupel Klein JC von, eds. Crustaceans and the biodiversity crisis. Leiden: Brill, 73-89.

Wilkinson M. 1995. A comparison of two methods of character construction. Cladistics 11: 297-308.

Williams DM, Siebert DJ. 2000. Characters, homology and three-item analysis. In: Scotland R, Pennington RT, eds. Homology and systematics. Coding characters for phylogenetic analysis. London: Taylor & Francis, 183-208.

Wilson EO. 1998. Consilience. The unity of knowledge. London, Abacus.

Wright KA. 1991. Nematoda. In: Harrison FW, Ruppert EE, eds. Microscopic anatomy of invertebrates. Vol. 4. Aschelminthes. New York: Wiley-Liss, 111-195.

Zimmer RL. 1997. Phoronids, brachiopods, and bryozoans, the lophophorates. In: Gilbert SF, Raunio AM, eds. Embryology. Constructing the organism. Sunderland: Sinauer Associates, 279-305.

Zrzavý J. 2001. Ecdysozoa versus Articulata: clades, artifacts, prejudices. J. Zool. Syst. Evol. Research 39: 159-163.

Zrzavý J, Mihulka S, Kepka P, Bezdek A, Tietz D. 1998. Phylogeny of the Metazoa based on morphological and 18S ribosomal DNA evidence. Cladistics 14: 249-285.

Zrzavý J, Hypsa V, Tietz DF. 2001. Myzostomida are not annelids. Molecular and morphological support for a clade of animals with anterior sperm flagella. Cladistics 17: 170-198.

Acknowledgements


This work was supported by grant 805-33.431-P from the Earth and Life Sciences Foundation (ALW) of the Netherlands Organization for Scientific Research (NWO). I thank Alison Cole and Frederick Schram for their incisive comments.

Appendix


 

Unspecified “absence” states in metazoan cladistics. Examples of unspecified “absence” character states for a/p characters selected from the most recently published cladistic analyses of the Metazoa. Note that the percentages are ‘conservative’ estimates. Delimitation of “absence” states may be problematic for more characters, depending on how strictly one adheres to the similarity criterion for determining primary homologies.

The appendix lists 230 a/p characters with unspecified “absence” states (42.9% of all a/p characters; 38% of all characters).
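As a rough consistency check, the totals implied by these figures can be back-calculated from each count and its percentage. The short Python sketch below assumes that the quoted percentages are rounded values, so the results are approximate. The helper name `implied_total` is introduced here for illustration only, and the per-study figures are those given in the headings of this appendix.

```python
def implied_total(count, percent):
    """Back-calculate the total implied by a count and its (rounded) percentage."""
    return round(count / (percent / 100))


# Overall figures: 230 characters with unspecified "absence" states.
implied_ap_characters = implied_total(230, 42.9)  # all a/p characters
implied_all_characters = implied_total(230, 38)   # all characters

# Per-study counts and percentages as given in the appendix headings.
per_study = {
    "Nielsen (2001)": (33, 51.6),
    "Zrzavy et al. (2001)": (21, 35),
    "Sorensen et al. (2000)": (31, 47),
}
implied_per_study = {
    study: implied_total(count, percent)
    for study, (count, percent) in per_study.items()
}

print(implied_ap_characters, implied_all_characters, implied_per_study)
```

On this reckoning the surveyed data sets would contain roughly 536 a/p characters and roughly 605 characters in total, with the caveat that rounding in the quoted percentages leaves some slack in both numbers.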

Absence unspecified in Nielsen (2001) (33 characters: 51.6%):


7: synapses with acetylcholine: taxa with and without nerve cells

10: monociliate epithelia: taxa with unciliated and multiciliated epithelia

11: multiciliate epithelia: taxa with unciliated and monociliated epithelia

12: general body cuticle with collagen: taxa lacking cuticle and with chitinous cuticle

13: general body cuticle with chitin: taxa lacking cuticle and with collagenous cuticle

14: cuticle molted: taxa lacking cuticle and with a non-molted cuticle

16: gonads with separate gonoducts: taxa lacking gonads and with gonads without separate gonoducts

17: gametes pass through coelom and metanephridia: taxa with or without coelom

18: spiral cleavage with 4d-mesoderm: taxa with different cleavage types

19: larvae with ciliated apical sense organ: taxa lacking larvae and with larvae without apical sense organs

21: larvae or adults with downstream-collecting ciliary system: taxa lacking ciliary feeding systems and with upstream-collecting systems

22: larvae or adults with upstream-collecting ciliary system: taxa lacking ciliary feeding systems and with downstream-collecting systems

27: body segmented with serially repeated organs developed from 4d-mesoderm: unsegmented with 4d-mesoderm and segmented animals lacking 4d-mesoderm

28: body with segments added successively from a teloblastic growth zone: unsegmented and segmented animals

29: segmental longitudinal muscles developed from rows of mesodermal pockets from the archenteron: taxa with and without archenteron derived mesoderm

30: body archimeric: coelomate and non-coelomate animals

31: tentaculated mesosome: animals with and without mesosome

33: mouth terminal, pharynx radial: animals with various mouth positions and various pharynx constructions

35: pharynx with cross-striated muscles attached to jaws by epithelial cells: taxa with and without muscles attached to epithelial cells

37: introvert with teeth, spines and scalids: taxa with and without introvert

38: non-eversible mouth cone with cuticular ridges and spines: taxa with and without introvert

39: pharyngeal gill slits: taxa with and without pharynges

41: notochord: taxa with and without chorda

44: limbs articulated with intrinsic muscles: taxa with and without limbs

45: adult brain derived from or associated with larval apical organ: taxa with and without adult brains, and with or without apical organs

46: ventral longitudinal nerve cord, paired or secondarily fused: taxa with and without centralized nerve cords

48: brain collar shaped: taxa with brains of diverse construction and without brains

49: proto-, deuto- and tritocerebrum: taxa with brains of diverse construction and without brains

50: haemal system: taxa with and without coeloms

51: mixocoel: taxa with and without coeloms

53: haemal system with axial complex: taxa with and without coeloms

56: metanephridia with coelomic compartment restricted to a sacculus: taxa with and without metanephridia

58: cleavage bilateral: taxa with dissimilar cleavage types

Absence unspecified in Zrzavý et al. (2001) (21 characters: 35%):


1: spiral quartet cleavage: taxa with dissimilar cleavage types

2: 4d-mesoderm: taxa with dissimilar mesoderm sources

7: archimeric architecture: taxa with and without coeloms

8: dimeric body architecture: taxa of very dissimilar body architecture

10: mesoderm formed from archenteron: taxa with dissimilar mesoderm sources

15: epidermal locomotory ciliature highly reduced or absent: taxa with and without locomotory cilia

21: molted cuticle with epicuticle, exocuticle, and endocuticle, with sclerotization: taxa without cuticle and with cuticle but non-molting

22: cuticle containing α-chitin: taxa with different cuticle compositions and taxa lacking a cuticle

24: compound cilia: taxa with and without multiciliate epidermal cells

28: terminal mouth with radial pharynx: taxa with and without terminal mouths and various pharynx architectures

42: sperm mitochondrial interpolation: taxa with and without sperm

43: male sex reduced or absent: hermaphroditic taxa without males and gonochoristic taxa with males

44: retroperitoneal gonads with gonocoel: taxa with and without coeloms and peritoneum

45: adult brain derived from/associated with apical organ: taxa with and without apical organs and adult brains

46: dorsal nerve concentration/brain behind apical organ/apical pole: taxa with and without apical organs and adult brains

47: collar-shaped pharyngeal brain: taxa with brains of diverse construction and lacking brains

48: orthogonal nervous system: taxa with and without centralized nerve concentrations, cords or ganglia

49: caudal ganglion: taxa with and without caudally located ganglia

51: protostome apical organ: taxa with and without apical organs

56: frontal gland system: taxa with and without frontally located glands

60: association with crustaceans: taxa living in all places except in association with crustaceans

Absence unspecified in Sørensen et al. (2000) (31 characters: 47%):


7: synapses with acetylcholine: taxa with and without nerve cells

14: general body cuticle with collagen: taxa with different cuticle composition and without cuticle

15: general body cuticle with chitin: taxa with different cuticle composition and without cuticle

17: cuticle molted: taxa with and without cuticle

18: trunk without cuticle: taxa with and without cuticle

21: gonads with separate gonoducts: taxa with different gonad organizations and gamete outlets

22: gametes pass through coelom and metanephridia: taxa with different gamete outlets and with or without coeloms

25: sperm with anteriorly inserted flagellum: taxa with aflagellar sperm and flagellum attached in position other than anteriorly

26: spiral cleavage with 4d-mesoderm: taxa with and without mesoderm and with different cleavage types

29: larva or adult with downstream-collecting ciliary bands of compound cilia on multiciliate cells: taxa without epidermal cilia or monociliated epidermal cells

30: larva or adult with upstream-collecting ciliary bands with single cilia on monociliate cells: taxa without epidermal cilia or multiciliated epidermal cells

35: body segmented with serially repeated organs developed from 4d-mesoderm (or ectomesoderm): taxa with and without 4d-mesoderm

36: body with successively added segments developed from a teloblastic growth zone: segmented taxa without teloblastic growth and unsegmented taxa

37: body with segmented longitudinal musculature developed from rows of mesodermal pockets from the archenteron: taxa with and without archenteron-derived mesoderm

38: body archimeric: taxa with and without coeloms

39: with tentaculated mesosome: taxa with and without mesosome

41: mouth terminal, pharynx radial: taxa with and without terminal mouths and with diverse pharynx constructions

43: pharynx with cross-striated muscles, attached to jaw elements by epithelial cells: taxa with and without muscles attached to epithelial cells

45: introvert with spines, teeth, and scalids: taxa with and without introverts

46: non-inversible mouth cone with cuticular ridges and spines: taxa with and without introverts

49: notochorda: taxa with and without chorda

52: limbs articulated with intrinsic muscles: taxa with and without limbs

53: adult brain derived from or associated with larval apical organ/apical pole: taxa with and without apical organs and adult brains

55: dorsal nerve concentration/brain behind apical organ/apical pole: taxa with and without apical organs

57: brain collar shaped: taxa with brains of diverse construction and without brains

58: proto-, deuto- and tritocerebrum: taxa with brains of diverse construction and without brains

59: haemal system: taxa with and without coeloms

60: mixocoel: taxa with and without coeloms

61: heart with coelomic pericardium: taxa with and without coeloms

62: haemal system with axial complex: taxa with and without coeloms

66: metanephridia with coelomic compartment restricted to a sacculus: taxa with and without metanephridia

Absence unspecified in Peterson & Eernisse (2001) (71 characters: 51.4%):


The majority of characters show inapplicability problems with respect to the outgroups (Fungi, Choanoflagellata). Apart from those, many characters also present problems within the Metazoa:

7: water-canal system: taxa with and without choanocytes

10: “acoelomorph” type of ciliary rootlet: taxa with and without cross-striated ciliary rootlets

14: densely multiciliated epidermis: taxa without ciliated epidermis and with monociliate epidermis

15: distinct “step” in cilia: taxa with and without locomotory cilia

19: spermatozoa without accessory centriole: taxa with and without sperm

21: perforatorium: taxa with and without acrosomes

23: gonads present with gametes passing through coelom and metanephridia: taxa with and without gonads and diverse gamete release mechanisms

29: spiral cleavage: taxa with distinct cleavage types

30: annelid cross: taxa with and without spiral cleavage

31: molluscan cross: taxa with and without spiral cleavage

36: endomesoderm derived from gut: taxa with and without endomesoderm

38: 4d endomesoderm: taxa with endomesoderm from diverse sources

39: mesodermal germ bands derived from 4d: taxa with and without 4d-derived mesoderm

40: lateral coelom derived from mesodermal bands: taxa with and without mesodermal bands (irrespective of source) and lateral coeloms

43: somatoblast: taxa with and without spiral cleavage

46: apical organ with muscles extending to the hyposphere: taxa with and without apical organ

47: pretrochal anlagen: taxa with and without trochophore larvae

48: prototroch: taxa with and without larvae

49: metatroch: taxa with and without trochophore larvae

50: adoral ciliary band: taxa with and without trochophore larvae

51: telotroch: taxa with and without larvae

52: neurotroch: taxa with and without trochophore larvae

53: neotroch: taxa with and without larvae

54: nonmuscular peritoneal cells in lateral regions of coelom: taxa with and without coeloms and peritoneum

55: trimery: taxa with and without coeloms

56: mesocoelomic ducts and pores: taxa with and without mesocoels

57: ciliated extensions of the mesocoel: taxa with and without mesocoels

58: lophophore: taxa with and without mesocoels

59, 61-65: characters coding for variation in lophophore morphology: taxa with and without lophophore

68: protonephridia with channel cell completely surrounding lumen: taxa with and without protonephridia

69: axial complex: taxa with and without coeloms and hemal systems

70: hydropore: taxa with and without protocoel

71: paired hydropores: taxa with and without protocoel

72: metanephridia open through metacoel: taxa with and without metanephridia and metacoels

73: metanephridia with coelomic compartment restricted to sacculus: taxa with and without metanephridia

75: mantle sinuses with gonads: taxa with and without mantles

76: inner epithelium secreting periostracum: taxa with and without mantles

77: calcareous valves, which rotate about a hinge axis: taxa with and without mantles

78: cuticle with chitin: taxa without cuticles and with different cuticle compositions

79: trilaminate epicuticle: taxa with and without cuticles

80: trilayered epicuticle: taxa with and without cuticles

81: collagenous basal layer: taxa with and without cuticles

83: ecdysis: taxa with and without cuticles

86: head divided into three segments: taxa with and without segments and heads

87: terminal mouth: taxa with and without terminal mouths

89: oral cone: taxa with and without introvert

93: digestive gut without cilia: taxa with and without digestive gut

94: anus: taxa with and without digestive tract

97: synapticules: taxa with and without pharyngeal gill bars

100: acetylcholine used as a neurotransmitter: taxa with and without nerve cells

101: nerve cells organized into distinct ganglia: taxa with and without nerve cells

102: circumpharyngeal brain with anterior and posterior rings of perikarya separated by a ring of neuropil: taxa with brains of different constructions and without brains

103: ventral nervous system: taxa with and without centralized nervous concentrations

104: circumesophageal nerve ring: taxa with and without centralized nervous concentrations

106: dorsal nervous cord/ganglion associated with the mesosome: taxa with and without mesosomes

109: tanycytes: taxa with and without introverts

114: closed circulatory system with dorsal and ventral blood vessels: taxa with and without hemal systems

128-136: characters coding for differentiations of Hox genes and clusters: taxa with and without Hox genes

Absence unspecified in Giribet et al. (2000) (data set from Zrzavý et al., 1998) (74 characters: 26.8%):


5: radial cleavage: taxa with distinct cleavage types

6: spiral cleavage: taxa with distinct cleavage types

7: spiral-quartet cleavage: taxa with distinct cleavage types

14: blastopore forming mouth and anus by fusion of lateral lips: taxa with different blastopore fates

15: blastopore forming anus: taxa with distinct blastopore fates

18: body segmented with serially repeated organs developed from 4d-mesoderm or ectomesoderm: taxa with and without segments and different mesoderm sources

19: entomesoblast (4d/2d): taxa with and without spiral cleavage

20: mesoderm formed from archenteron: taxa with distinct cleavage types

23: metameric coelomic cavities: taxa with and without coeloms

24: teloblastic segment-forming zone: taxa with and without segments

26: segmented longitudinal musculature developed from archenteric mesodermal pouches: taxa with and without archenteron-derived mesoderm

30: coelom: taxa with and without mesoderm

31: gonocoel: taxa with disparate gonad constructions and lacking gonads

32: eucoelomatic condition: taxa with and without coeloms

33: coelomocytes: taxa with and without coeloms

35: haemal system: taxa with and without coeloms

36: mixocoel: taxa with and without coeloms

37: bilaterally paired coelomic primordia: taxa with and without coeloms

38: heart with coelomic pericardium: taxa with and without coeloms

42: haemal system with axial complex: taxa with and without hemal systems and coeloms

50: metanephridia with coelomic compartments restricted to sacculus: taxa with and without metanephridia

51: serially repeated nephridiopores: taxa with and without nephridia

57: ultrafiltration through podocytes: taxa with and without podocytes

65: mouth/esophagus with spiny/toothed cuticle consisting of crystalline chitin: taxa with and without cuticle

75: tanycytes: taxa with and without introvert

86: bipartite body: taxa with distinct body architectures

88: introvert with scalids: taxa with and without introverts

89: non-inversible mouth cone: taxa with and without introvert

93: tripartite body and coelom: taxa with and without coeloms

94: lophophore: taxa with and without mesocoels

99: articulated and segmented limbs: taxa with and without limbs

108: gonads with separate gonoducts: taxa with distinct gonad architectures and gamete outlets

109: gametes pass through coelom and metanephridia: taxa with and without coeloms and metanephridia

112: permanent gonopore: taxa with and without gonads

113: gonopericardial system: taxa with and without coelomic pericardium

117: filiform sperm: taxa with diverse forms of sperm

118: acanthocephalan type of sperm: taxa with diverse forms of sperm

119: clitellate type of sperm: taxa with diverse forms of sperm

127: males: taxa with males and hermaphroditic taxa without males (present state)

134: planula larva: taxa with and without larvae

136: larvae/adults with downstream-collecting compound cilia: taxa with monociliate epidermal cells and with nonciliate epidermal cells

137: larvae/adults with upstream-collecting single cilia: taxa lacking ciliated epidermal cells and with multiciliated epidermal cells

138: trochophora: taxa with and without larvae

143: dipleurula larva: taxa with and without larvae

145: adult brain derived from/associated with larval apical organ/apical pole: taxa with and without adult brains and apical organs

146: larval apical organ incorporated into brain: taxa with and without adult brains and apical organs

153: polyp pharynx: taxa with and without polyps

157: endodermal ring canal in medusa: taxa with and without medusae

161: podocysts: taxa with and without polyps

162: pedalia: taxa with and without medusae

186: compound cilia: taxa without epidermal cilia and with monociliate cells

193: cuticle simple or two-layered: taxa with and without cuticle scored for simple cuticle

194: collagenous cuticle: taxa with differently constructed cuticle and lacking cuticle

195: chitinous cuticle: taxa with differently constructed cuticle and lacking cuticle

196: cuticular molting: taxa with and without cuticle

199: dorsal cuticle with aragonite spicules: taxa with and without cuticle

202: cuticular sclerite formation: taxa with and without cuticle

203: sclerotization of cuticle with tannin proteins: taxa with and without cuticle

204: myelinic cuticle: taxa with and without cuticle

214-216, 219: characters coding for nematocyst differentiations: taxa with and without nematocysts

230-231: characters coding for synapses with different neurotransmitters: taxa with and without nerve cells

235: single pair of ventral cords: taxa with and without ventral nerve cords

237: cerebral ganglion: taxa with and without centralized nerve concentrations

239: dorsal neural tube: taxa with ventral nerve cords and without nerve concentrations

240: collar-shaped brain: taxa with diverse brain constructions

241: proto-, deuto- and tritocerebrum: taxa with diverse brain constructions

244: caudal ganglion: taxa with and without caudally located ganglia

248: lateral nerve cords: taxa with and without nerve cords

252: number of statoliths: taxa with and without statoliths

253: number of statocyst parietal cells: taxa with and without statocysts