Zoological Journal of the Linnean Society (1982), 74: 267-275. With 2 figures

Compatibility analysis and its applications

WALTER J. LE QUESNE

Anne Cottage, 70 Lye Green Road, Chesham, Buckinghamshire HP5 3NB

Accepted for publication June 1981

A two-state character is defined as uniquely derived if it has evolved only once in the history of a group, without subsequent reversal. Two independent characters cannot both be uniquely derived if all four possible combinations (or all three excluding that of the two ancestral forms) occur. A number of ways of choosing compatible sets of uniquely derived characters are discussed and used to derive possible unrooted and rooted trees. The results are related to those chosen on parsimony criteria, using data for orthopteroid groups, and the assumptions of both methods are compared. Application of compatibility analysis to the moth genera Teldenia and Argodrepana is also discussed. Compatibility and parsimony methods are complementary rather than exclusive of each other.

KEY WORDS: cladistics - polarity - parsimony - numerical - taxonomy - incompatibility.

CONTENTS

Introduction
The concept of incompatibility
Compatibility analysis
    The character-pair matrix
    Constructing the network
    The coefficient of character-state randomness
    The normal deviate
    Finding the largest possible set
    Rooted cladograms
    Application to multistate characters
    Use of reference taxa outside the study group
    Application to a subset of the study group
Application to the genus Teldenia
Application to Kamp’s orthopteroid data
Value of compatibility methods
    Philosophical considerations
    Use of more than one technique of cladogram construction
References

0024-4082/82/030267+09 $02.00/0 © 1982 The Linnean Society of London

INTRODUCTION

The most direct method of attempting computer analysis of any problem is to try to define the successive steps involved in one’s instinctive approach to it. The classical method of suggesting a phylogeny normally depended on an implicit assumption that a chosen character (or group of characters) had evolved only once, without subsequent reversal, and on the use of that character to indicate the pattern of evolution. However, the publication of widely diverse trees for the same group by different authors made taxonomists wonder if more objective methods might not settle the problems more decisively, though with hindsight it would appear that the improvements are often rather limited.

THE CONCEPT OF INCOMPATIBILITY

Table 1. Derivation of two independent uniquely derived characters

                          CHARACTER 1   CHARACTER 2
Ancestral state                A             A
First derived                  B             A
                      or       A             B
Second derived (after BA)      A             B
                      or       B             B      (but not both)
Second derived (after AB)      B             A
                      or       B             B      (but not both)

About 13 years ago, as a taxonomist with no practical experience of numerical methods, the thought occurred to me that by consideration of two independent two-state characters together one could get an indication of the possibility of their both being uniquely derived, that is, having evolved once without subsequent reversal. The test depends on the realization that the species ancestral to the whole group being studied had both characters in the ancestral state, represented in Table 1 by A, where B represents an evolved state.
The first change which occurred produced state B in one or other of these characters. The second change might have occurred either among species of the original AA combination or among those of the evolved AB (or BA) combination, producing one or other of the extra combinations shown in either case, but not both, and so making a total of three combinations. Thus, it is impossible to obtain all four combinations together without either a character evolving twice or reversal of an already evolved character. The logical consequence is that if all four combinations of character-states for the two characters are found, then one or other of the characters is not uniquely derived (or possibly neither is), as pointed out in my original paper on this topic (Le Quesne, 1969). Estabrook et al. (1976) introduced the term ‘compatible’ for pairs of characters that passed this test. It does not matter for this compatibility test which is the ancestral and which the derived state. Unfortunately, an incompatibility does not tell us which of the two characters fails to be uniquely derived.

COMPATIBILITY ANALYSIS

The character-pair matrix

In my 1969 paper, I suggested making up what I called the ‘character-pair’ matrix, showing whether or not the two characters of each possible pair were compatible with each other, and then eliminating the character with the largest number of incompatibilities by drawing vertical and horizontal lines through its row and column. The remaining incompatibilities are counted again and the process is repeated until none remain. Another simple method, suggested in a subsequent paper (Le Quesne, 1972), was to accept the character with the smallest number of incompatibilities, eliminate those incompatible with it in similar fashion, recount, and again accept and eliminate until all the incompatibilities have been removed.
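The pairwise test and the two elimination procedures just described can be sketched in a few lines of Python. This is only an illustrative sketch, not the author's program: the function names, the coding of each character as a string of 'A'/'B' states (one letter per taxon), the tie-breaking rule (lowest index first) and the small data set are all assumptions introduced here.

```python
from itertools import combinations

def incompatible(c1, c2):
    """True if all four state combinations (AA, AB, BA, BB) occur among the taxa."""
    return len(set(zip(c1, c2))) == 4

def pair_matrix(chars):
    """Symmetric character-pair matrix; True marks an incompatible pair."""
    n = len(chars)
    m = [[False] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        m[i][j] = m[j][i] = incompatible(chars[i], chars[j])
    return m

def eliminate_worst(chars):
    """First procedure: repeatedly drop the character with the most
    incompatibilities, recounting each time, until none remain."""
    m = pair_matrix(chars)
    kept = set(range(len(chars)))
    while kept:
        counts = {i: sum(m[i][j] for j in kept) for i in kept}
        worst = max(sorted(kept), key=lambda i: counts[i])  # ties: lowest index
        if counts[worst] == 0:
            break
        kept.remove(worst)
    return sorted(kept)

def accept_best(chars):
    """Second procedure: accept the character with the fewest incompatibilities,
    eliminate everything incompatible with it, recount and repeat."""
    m = pair_matrix(chars)
    remaining = set(range(len(chars)))
    accepted = []
    while remaining:
        counts = {i: sum(m[i][j] for j in remaining) for i in remaining}
        best = min(sorted(remaining), key=lambda i: counts[i])  # ties: lowest index
        accepted.append(best)
        remaining -= {best} | {j for j in remaining if m[best][j]}
    return sorted(accepted)

# Hypothetical data: four characters scored over six taxa (invented for illustration).
chars = ["AAABBB", "AABBBB", "ABABAB", "AAAABB"]
print(eliminate_worst(chars))  # [0, 1, 3]
print(accept_best(chars))      # [0, 1, 3]
```

In this invented example the third character shows all four combinations with each of the others, so both procedures discard it and retain the same mutually compatible set. With real data the two procedures need not agree.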
Constructing the network

After elimination by either of these techniques, one is left with a set of characters each of which divides the set of organisms under study into two groups depending on their character-state. To turn this information into a cladogram, I found it most convenient to start with the selected character which divides the taxa under study into two groups as nearly equal as possible; all the other selected characters will then be found to split off a section of one or other of these two groups, enabling one to build up the tree. The convenience of this approach masks the fact that there is no evidence for the position of the root, as pointed out by Felsenstein (1975): thus the term ‘network’ might be more appropriate. The fact that a set of characters which are all compatible when taken on a pairwise basis is compatible as a whole was rigorously proved by McMorris (1975).

The coefficient of character-state randomness

I also suggested (Le Quesne, 1969, 1972) calculation of the number of incompatibilities to be expected if each character was represented by the observed number of A and B character-states, but these were distributed in a random fashion: the calculated total number of these I called $P_s$. The ratio obtained by dividing the actual number of incompatibilities (denoted by $N_x$) by the value of $P_s$, expressed as a percentage, is called the ‘coefficient of character-state randomness’, and clearly will be zero for a group of completely compatible characters. In most cases studied, the figures have been in the range 67-96% (Le Quesne, 1975), but when the figures are above 90% there are difficulties in getting meaningful cladograms. This coefficient of character-state randomness can be applied to the whole of the data or just to the pairings between any one character and each of the others.
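The ratio described above can be sketched as follows. The text does not reproduce the original calculation of the expected number, so this sketch makes an explicit assumption: ‘random’ is taken to mean that the B states of each character are scattered over the taxa by an independent random permutation, which makes the number of taxa showing BB hypergeometric. The function names and the 'A'/'B' string coding are likewise illustrative only.

```python
from itertools import combinations
from math import comb

def p_incompatible(n, b1, b2):
    """Probability that two characters with b1 and b2 B states among n taxa show
    all four combinations, ASSUMING states are placed at random (hypergeometric)."""
    favourable = 0
    # k = number of taxa with BB; k >= 1 and k <= min(b1, b2) - 1 keep BB, BA, AB present.
    for k in range(1, min(b1, b2)):
        if n - b1 - b2 + k >= 1:  # at least one taxon with AA
            favourable += comb(b1, k) * comb(n - b1, b2 - k)
    return favourable / comb(n, b2)

def ccsr(chars):
    """Coefficient of character-state randomness: 100 * N_x / P_s."""
    n = len(chars[0])
    n_x = 0    # observed number of incompatible pairs
    p_s = 0.0  # expected number under the random-placement assumption
    for c1, c2 in combinations(chars, 2):
        n_x += len(set(zip(c1, c2))) == 4
        p_s += p_incompatible(n, c1.count("B"), c2.count("B"))
    return 100.0 * n_x / p_s if p_s > 0 else 0.0

print(p_incompatible(4, 2, 2))  # 4 of the 6 equally likely arrangements, i.e. 2/3
print(ccsr(["AABB", "AABB"]))   # 0.0 for fully compatible characters
```

Under this assumption a pair in which either character has a state found in only one taxon contributes nothing to the expected total, since such a pair can never show all four combinations.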
If this method is used to give a value for each character, it is possible to select the character with the lowest value and to eliminate characters incompatible with this, carrying on until a mutually compatible set is obtained. This constitutes a third selection method.

The normal deviate

A fourth method, described in my 1972 paper, depends on calculation of a normal deviate, using the formula

$$\mathrm{N.D.} = \frac{P_s - N_x}{\left(P_s\,(n_o - P_s)\right)^{1/2}}$$

in which $N_x$ and $P_s$ are as previously defined and $n_o$ represents the number of valid comparisons (those involving characters with two or more examples of both character-states). This criterion can again be applied to a single character or to a group of characters: where a number of characters are completely correlated, the information from them can be reinforced by use of the combined normal deviate. Again, we can select the character with the largest positive normal deviate, eliminate characters incompatible with this, choose the remaining character with the highest positive value, and so on. The normal deviate can also be calculated for a set of mutually compatible characters, giving a figure related to the likelihood of the assemblage occurring by chance, as in Table 2. A single Fortran program has been written which produces networks by all four of the above selection methods.

Finding the largest possible set

A fifth criterion for selection of a set of compatible characters is to find the largest possible set. I am rather sceptical of the ‘biggest is best’ concept, although it has been developed by Estabrook et al. (1977); my scepticism may rest on the outcome of its application to the orthopteroid data which I studied, and is thus a rather subjective view. My suspicion is that it might bring out parallelisms based on function.

Rooted cladograms

All the methods described so far require no assumptions as to which is the primitive and which the derived state: that is why unrooted trees are produced.
If we know the actual direction of change in each character, we can make a somewhat more stringent compatibility test. Returning to Table 1, we see that for two uniquely derived characters only AA and two other combinations occur; i.e. the occurrence of all three combinations other than AA (AB, BA and BB) is sufficient for an incompatibility (Le Quesne, 1979). This is the basis for the ‘cliques’ proposed by Estabrook et al. (1977) and will lead to finding compatible synapomorphies.

Application to multistate characters

The methods which I have developed apply only to two-state characters, but Estabrook et al. (1977) have proposed methods for detecting incompatibilities in multistate characters. However, I prefer breaking these up into sets of two-state characters (e.g. for a base sequence into (a) A or not A, (b) C or not C, etc.), since one character-state may be uniquely derived from the ancestral one and another derived several times from it: in such circumstances one could easily ‘throw away the baby with the bath-water’ using Estabrook’s method. It should be noted that the choice of unrooted trees in either compatibility or ‘parsimony’ methods is unaffected by ‘singularities’ (i.e. character-states found in only one taxon under study), and this constraint can substantially reduce the number of two-state characters to be tested.

Use of reference taxa outside the study group

In practice, we usually deduce the primitive state by reference to organisms closely related to, but outside, the study group. Thus, a practical alternative is to add one or more of these reference species to the data matrix, which essentially gives a direction to the tree and supplies a root. An example is the moth genus Teldenia, as discussed below.

Application to a subset of the study group
I have previously referred (Le Quesne, 1975) to some diaspid data where the relationships within a cluster of ten species out of the original 26 were not clear from the initial analysis, but when these ten were studied on their own the relationships became clear. It is thus often very valuable in this work to make changes in the set of taxa under analysis.

APPLICATION TO THE GENUS TELDENIA

The methods discussed above have been applied to data for the moth genus Teldenia published by Wilkinson (1967). As seen from Table 2, application to the genus Teldenia on its own gave a different cladogram for each of the five methods of selection, very low normal deviates and a coefficient of character-state randomness of over 94%, clearly suggesting that not much confidence could be put in the results. However, by combining these data with those for the related genus Argodrepana a substantially clearer picture could be obtained.

Table 2. Teldenia and Argodrepana: results of application of five elimination methods

                               Argodrepana   Teldenia         Argodrepana + Teldenia
No. of characters              23            36               51
No. of species                 7             32               39
No. of different selections    1             5                2
No. of characters selected     19            7-11             19-22
Normal deviate                 +15.5         +2.09 to +4.02   +17.4 to +17.8
C.C.S.R. (all data)            34.7%         94.1%            72.7%

The two alternative cladograms are shown in Fig. 1. In this case, the partition into these genera was supported by seven characters (numbers 8, 11, 16, 19, 60, 64 and 66), while five other characters (numbers 9, 13, 14, 46 and 52) represented groupings within Argodrepana. Thus, one can with confidence supply a root at the point where the two genera separate. Within Teldenia, characters 1, 2, 22, 58, 87 and 95 are selected for both alternative trees. Cladogram A also depends on characters 27, 62, 63 and 69, while cladogram B is supported by character 44. The latter character is one of wing pattern, and thus must remain rather suspect, though having the advantage that all the species are included in the cladogram.
Three of the characters supporting cladogram A are based on male genitalia, but this scheme has the disadvantage that not all of the species can be placed.

Figure 1. Two cladograms (Cladogram A and Cladogram B) obtained by compatibility analysis of data for the genera Teldenia and Argodrepana.

APPLICATION TO KAMP’S ORTHOPTEROID DATA

Mickevich (1978) has applied a useful objective test to various numerical methods by use of pairs of data matrices for the same study group of organisms. (Incidentally, the method which she ascribes to me is founded more closely on Estabrook’s work, based on finding the largest clusters of mutually compatible characters.) Following her philosophy, I have recently been analysing the two separate data matrices on orthopteroids published by Kamp (1973) in an attempt to place the aberrant group Grylloblattodea. The results obtained are shown in Fig. 2, using the five selection methods which I have discussed above, on both the separate matrices and the combined data. I have also found the minimum number of steps required for each network as a parsimony criterion, using a method based on that of Fitch (1971, 1975).

Figure 2. Cladograms selected on compatibility analysis and parsimony grounds from Kamp’s data on orthopteroids. Roman numerals indicate the selection methods leading to each cladogram based on the various data sets. A, Acrididae; B, Blattaria; D, Dermaptera; G, Gryllidae; Gb, Grylloblattodea; M, Mantodea; P, Phasmida; T, Tettigoniidae.

                               Giles data   Blackith data   Combined data
C.C.S.R. (all data)            86.1%        88.3%           88.7%
No. of characters              76           56              132
No. of characters selected     18-21        14-16           30-33

In fact, seven different networks have been selected, but they fall into two groups, the upper ones fitting the Giles data on both compatibility and parsimony grounds (the most parsimonious number of steps in each case being underlined). The bottom three networks fit the Blackith data better using both techniques, while the combined data fit in with the first, second and fourth networks. (Here methods I and V do not distinguish between two possibilities; the methods are numbered in the order in which they are mentioned above and as designated in my 1972 paper.) From these results we may conclude that in general compatibility and what are traditionally termed ‘parsimony’ methods will not lead to widely disparate answers. It may be noted, incidentally, that the Grylloblattodea have been put on the left of the network in each case and that three possibilities emerge for their closest relatives. The coefficients of character-state randomness are between 86 and 89%, making conclusions not very firm: in every case, after the singularities have been excluded, the number of characters selected is less than 30% of the total number.
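The step counts used above as a parsimony criterion can be illustrated with the set-intersection counting rule usually attributed to Fitch (1971). The sketch below is the textbook form of that rule, not the author's own program, which is described only as ‘based on’ Fitch's method. The nested-tuple tree coding, the function names, the taxon labels t1 to t4 and the data are all assumptions invented for illustration. An unrooted network can be rooted on any branch for this purpose, since the count does not depend on where the root is placed.

```python
def fitch_steps(tree, states):
    """Return (possible ancestral states, minimum steps) for one character.
    `tree` is a taxon name or a pair (left, right) of subtrees."""
    if isinstance(tree, str):
        return {states[tree]}, 0
    left, right = tree
    left_set, left_steps = fitch_steps(left, states)
    right_set, right_steps = fitch_steps(right, states)
    common = left_set & right_set
    if common:
        # Subtrees agree on at least one state: no extra change needed here.
        return common, left_steps + right_steps
    # Subtrees disagree: one change must be placed on this part of the tree.
    return left_set | right_set, left_steps + right_steps + 1

def total_steps(tree, matrix):
    """Sum the minimum step counts over all characters.
    `matrix` maps each taxon to a string with one state letter per character."""
    n_chars = len(next(iter(matrix.values())))
    return sum(
        fitch_steps(tree, {taxon: row[i] for taxon, row in matrix.items()})[1]
        for i in range(n_chars)
    )

# Hypothetical data: three two-state characters for four taxa.
matrix = {"t1": "AAA", "t2": "AAA", "t3": "BBA", "t4": "BBB"}
tree_1 = (("t1", "t2"), ("t3", "t4"))
tree_2 = (("t1", "t3"), ("t2", "t4"))
print(total_steps(tree_1, matrix))  # 3
print(total_steps(tree_2, matrix))  # 5
```

On these invented data the first arrangement needs fewer steps and would be preferred on parsimony grounds. The same kind of comparison, applied to each candidate network, gives step counts of the sort reported in Fig. 2.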
VALUE OF COMPATIBILITY METHODS

Philosophical considerations

Finally, why use compatibility methods? I feel that they are close to the implicit, sometimes subconscious, judgments long made by taxonomists when assessing relationships. The fundamental philosophical question that separates compatibility and ‘parsimony’ methods is whether all characters are equal in their information content. Classical taxonomists tend to think in terms of ‘stable’ and ‘unstable’ characters, and hope to find some in the former category that are good indicators of the history of the group. The idea of an ‘unstable character’ is often applied to those that are adaptive to ecological circumstances. This is less obviously true with base or amino-acid sequences, but may still be significant if the biological function of the protein coded for by the gene is not known.

Use of more than one technique of cladogram construction

Moreover, use of a number of different techniques helps to indicate the degree of confidence one can put in the cladograms produced. When a number of techniques give similar results, one naturally feels happier than when each gives a very different conclusion, so the various compatibility and ‘parsimony’ tests available should be regarded as complementary.

REFERENCES

ESTABROOK, G. F., JOHNSON, C. S. Jr. & McMORRIS, F. R., 1976. A mathematical foundation for the analysis of cladistic character compatibility. Mathematical Biosciences, 29: 181-187.
ESTABROOK, G. F., STRAUCH, J. G. & FIALA, K. L., 1977. An application of compatibility analysis to Blackith’s data on Orthopteroid insects. Systematic Zoology, 26: 269-276.
FELSENSTEIN, J., 1975. Discussion of preceding presentation. In G. F. Estabrook (Ed.), Proceedings of the Eighth International Conference on Numerical Taxonomy: 428. San Francisco: W. H. Freeman.
FITCH, W. M., 1971. Toward defining the course of evolution: minimum change for a specific tree topology. Systematic Zoology, 20: 406-416.
FITCH, W. M., 1975. Toward finding the tree of maximum parsimony. In G. F. Estabrook (Ed.), Proceedings of the Eighth International Conference on Numerical Taxonomy: 189-230. San Francisco: W. H. Freeman.
KAMP, J. W., 1973. Numerical classification of the Orthopteroids, with special reference to the Grylloblattodea. Canadian Entomologist, 105: 1235-1249.
LE QUESNE, W. J., 1969. A method of selection of characters in numerical taxonomy. Systematic Zoology, 18: 201-205.
LE QUESNE, W. J., 1972. Further studies based on the uniquely derived character concept. Systematic Zoology, 21: 281-288.
LE QUESNE, W. J., 1975. Discussion of preceding presentations. In G. F. Estabrook (Ed.), Proceedings of the Eighth International Conference on Numerical Taxonomy: 416-429. San Francisco: W. H. Freeman.
LE QUESNE, W. J., 1979. Compatibility analysis and the uniquely derived character concept. Systematic Zoology, 28: 92-94.
McMORRIS, F. R., 1975. Compatibility criteria for cladistic and qualitative taxonomic characters. In G. F. Estabrook (Ed.), Proceedings of the Eighth International Conference on Numerical Taxonomy: 399-415. San Francisco: W. H. Freeman.
MICKEVICH, M. F., 1978. Taxonomic congruence. Systematic Zoology, 27: 143-158.
WILKINSON, C., 1967. A taxonomic revision of the genus Teldenia Moore (Lepidoptera: Drepanidae, Drepaninae). Transactions of the Royal Entomological Society of London, 119: 303-362.