DNA sequences provide an easy way to quantify phylogenetic characters and promise to revolutionize systematics by clarifying phylogenetic relationships between taxa. Has this promise been realized? Probably not to the extent some have hoped for, but sequence data are certainly interesting and instructive to analyze.
Recently, two groundbreaking works by Warren et al. (2008, 2009) have been published. Warren, Ogawa and Brower obtained DNA sequences for segments of three genes, one mitochondrial (COI) and two nuclear (EF1 and wg1), for over 200 skipper species representing all major phylogenetic lineages worldwide. These sequences were analyzed with cladistic approaches, and the taxonomic hypotheses suggested by this phylogenetic analysis were presented. The scale and precision of this study are nothing short of amazing, and this work is a goldmine for future generations to explore. Here is one example of what can be done with it.
DNA sequences are particularly convenient for phylogenetic reconstruction and evolutionary tree building. The branching order in such a tree suggests the genealogy of species. However, due to significant noise in DNA data (that is, signals other than phylogenetic ones) and our still poor understanding of how mutations happen, this branching order is very hard to figure out with any kind of certainty.
Morphological features of specimens are easy to visualize and compare; they are good for phenetic classification, which groups organisms by similarity rather than by genealogy. Since there is usually a good correlation between phenetic and phylogenetic classifications (think molecular clock!), a phenetic approach can result in reasonable phylogenetic hypotheses. However, it is more difficult to score morphological features in the way needed for phylogenetic reconstruction.
Here, for the sake of curiosity, we explore the possibility of analyzing DNA data in a phenetic way, the way one would compare the morphology of insects in a drawer. The main difficulty is how to convert a string of letters, which is what DNA is for the purposes of evolutionary analysis, into something that can be seen in 3D.
The difference between the DNA sequences of every pair of species can be quantified and expressed as a "distance" between the two species. When the DNA strings are identical, this distance is 0; when their similarity is close to random, the distance is very large. The DNA sequences of the three genes under consideration (COI, EF1 and wg1) are quite similar across all animal species. In fact, human DNA sequences for these genes are not that different from butterfly sequences. Thus the sequences are strongly similar overall, and distances between skipper species are not very large. However, these distances differ from one pair of species to another. Skippers that are similar to each other, for example Megathymus and Agathymus, are separated by a short distance. More different-looking species, like Megathymus and Gesta, have a longer DNA distance between them. This has been known for quite a while, and such distances offer a convenient way of building evolutionary trees. However, we are not after the trees here.
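To make the idea of a DNA "distance" concrete, here is a minimal sketch in Python. It is not necessarily the measure used for the map in this article; it simply assumes two common textbook choices: the proportion of differing sites (p-distance) and its Jukes-Cantor correction, which grows without bound as the similarity approaches that of random sequences. The function names and the three short toy sequences are made up for illustration and are not real skipper data.

```python
import math

def p_distance(seq_a, seq_b):
    """Fraction of aligned sites at which two sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    # Compare only sites where both sequences have an unambiguous base.
    pairs = [(a, b) for a, b in zip(seq_a.upper(), seq_b.upper())
             if a in "ACGT" and b in "ACGT"]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(a != b for a, b in pairs) / len(pairs)

def jukes_cantor(p):
    """Jukes-Cantor corrected distance for an observed p-distance."""
    if p >= 0.75:
        # Similarity is no better than random: distance is unbounded.
        return math.inf
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)

# Hypothetical aligned fragments (NOT real data).
toy = {
    "species_A": "ATGGCATTAC",
    "species_B": "ATGGCATTAT",
    "species_C": "ATGACGTTGT",
}

names = sorted(toy)
for i, x in enumerate(names):
    for y in names[i + 1:]:
        p = p_distance(toy[x], toy[y])
        print(x, y, round(p, 3), round(jukes_cantor(p), 3))
```

Identical strings give a distance of 0, and the more sites differ, the larger the distance. Computing this for every pair of species yields the table of distances that the rest of the analysis starts from.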
If one knows the distances between all the cities on a map, the map can be reconstructed; that is, from the distances between cities we can compute their locations. The same can be done with the distances between skipper species measured by DNA comparison, giving us a map of skippers. The space needed for such a map has many dimensions (more than 50 for ~200 species of skippers), so for visualization purposes we simplify it and show the map in 3D.
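The city analogy can be sketched in a few lines of Python. One standard technique for turning distances into coordinates is classical multidimensional scaling (MDS). We assume it here for illustration only, since the exact procedure behind the skipper map is not specified in this article. The function name classical_mds, its parameters and the four "cities" are hypothetical. The sketch uses plain power iteration to find the leading axes, which is adequate for a toy example but not for production use.

```python
import math

def classical_mds(dist, dims=2, iters=500):
    """Coordinates (up to rotation/reflection) from a distance matrix."""
    n = len(dist)
    # Double-center the squared distances to get the Gram matrix B.
    sq = [[dist[i][j] ** 2 for j in range(n)] for i in range(n)]
    row_mean = [sum(r) / n for r in sq]
    grand_mean = sum(row_mean) / n
    b = [[-0.5 * (sq[i][j] - row_mean[i] - row_mean[j] + grand_mean)
          for j in range(n)] for i in range(n)]

    def matvec(m, v):
        return [sum(m[i][j] * v[j] for j in range(n)) for i in range(n)]

    coords = [[0.0] * dims for _ in range(n)]
    for k in range(dims):
        # Power iteration for the largest remaining eigenpair of B.
        v = [math.sin(i + k + 1.0) for i in range(n)]  # deterministic start
        for _ in range(iters):
            w = matvec(b, v)
            norm = math.sqrt(sum(x * x for x in w))
            if norm < 1e-12:
                break
            v = [x / norm for x in w]
        lam = sum(x * y for x, y in zip(v, matvec(b, v)))
        if lam < 1e-9:
            break  # no further axis carries positive variance
        for i in range(n):
            coords[i][k] = math.sqrt(lam) * v[i]
        # Deflate: remove the axis just found before looking for the next.
        b = [[b[i][j] - lam * v[i] * v[j] for j in range(n)]
             for i in range(n)]
    return coords

# Four made-up "cities" at the corners of a 4 x 3 rectangle.
cities = [(0.0, 0.0), (4.0, 0.0), (4.0, 3.0), (0.0, 3.0)]
distances = [[math.dist(p, q) for q in cities] for p in cities]

# Pretend the coordinates are unknown and rebuild the map from distances.
recovered = classical_mds(distances, dims=2)
for point in recovered:
    print([round(c, 3) for c in point])
```

The recovered map may come out rotated or mirrored relative to the original, because distances alone do not fix the orientation, but all pairwise distances are preserved. For the skippers, the same call with dims=3 on the table of DNA distances would keep only the three axes that capture the most variation, which is the kind of simplification needed to view a high-dimensional cloud in 3D.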
The result is the following constellation (or cloud) of skippers. Each number on the "movie" images below is a skipper species for which Warren and co-workers obtained DNA sequences. Only DNA data were used to generate this map. If two numbers are close to each other, the DNA sequences of these species are very similar; if the numbers are further apart, the sequences are less similar. So this is a phenetic map, not a phylogenetic one. The difference matters because there could be two "sisters", one of whom changed very rapidly (e.g. gained weight) and became less similar to her sister than that sister is to her cousin. In a phylogenetic approach, the sisters will be grouped together because they are more closely related, but in a phenetic approach the unchanged sister will be grouped with her cousin, because she is more similar to the cousin than to her own sister.
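The sisters-and-cousin situation can be put into numbers with a small sketch. The tree ((A, B), C) and its branch lengths below are entirely hypothetical: A and B are sisters, C is their cousin, and A is the one who changed rapidly.

```python
# Hypothetical branch lengths (arbitrary units) on the tree ((A, B), C).
# A and B are sisters; A sits on a long branch because it changed rapidly.
branch = {"A": 5.0, "B": 1.0, "C": 1.0}
internal = 1.0  # branch separating the A+B ancestor from C's lineage

def tree_distance(x, y):
    """Path length between two tips of the tree ((A, B), C)."""
    d = branch[x] + branch[y]
    # Any path to or from C also crosses the internal branch.
    if "C" in (x, y):
        d += internal
    return d

def nearest(x):
    """The tip most similar to x, i.e. its phenetic neighbor."""
    others = [t for t in branch if t != x]
    return min(others, key=lambda t: tree_distance(x, t))

for tip in branch:
    print(tip, "is most similar to", nearest(tip))
```

With these made-up numbers, B is 6 units from her sister A but only 3 units from her cousin C, so a similarity-based map places B next to C even though the genealogy pairs B with A.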
Anyway, below is this map (think a drawer with skippers!), shown in several projections for easier observation. It is an oddly shaped cloud: skippers populate the space in a very non-random fashion. In all projections it has a Y (or T) shape to it, most prominent in projection 1. Click on an image to display it in a different browser window for easier comparison.
Here is a list of the species coded by numbers in the above images. Color codes the subfamily as defined in Warren et al. (2009). What do you think? Is the "molecular" cloud of skippers consistent with the cladistic analysis of Warren et al. (2009)? How does it agree with morphology?
Some conclusions that can be reached looking at the skipper cloud are:
1.) There is a reasonably good separation between subfamilies as they were defined in Warren et al. (2009).
2.) Three main groups are quite pronounced: Coeliadinae (red), Pyrginae (green) and Hesperiinae (blue). These are subfamilies with the largest number of species sequenced by Warren et al. (2009).
3.) Although only DNA data and no morphological data were used to plot the "cloud", the distinction between the groups in it is very much consistent with morphology.