Extensive data on ringed and radio-tagged birds over a 3-year period indicate low migration rates (Galbusera et al.). As discussed in Background on clustering methods, it is currently common to use distance-based clustering methods to visualize genotype data of this kind. To permit a comparison between that type of approach and our own method, we begin by showing a neighbor-joining tree of the bird data (Figure 3).

Inspection of the tree reveals that the Chawia and Mbololo individuals represent (somewhat) distinct clusters. Several individuals (marked by asterisks) appear to be classified with other groups.

The tree illustrates several shortcomings of distance-based clustering methods. First, it would not be possible (in this case) to identify the appropriate clusters if the labels were missing. Second, since the tree does not use a formal probability model, it is difficult to ask statistical questions about features of the tree, for example: Are the individuals marked with asterisks actually migrants, or are they simply misclassified by chance? Is there evidence of population structure within the Ngangao group (which appears from the tree to be quite diverse)?

Figure 3: Neighbor-joining tree of individuals in the T. Each tip represents a single individual. C, M, N, and Y indicate the populations of origin (Chawia, Mbololo, Ngangao, and Yale, respectively). Using the labels, it is possible to group the Chawia and Mbololo individuals into (somewhat) distinct clusters, as marked.

Choice of K for Taita thrush data: To choose an appropriate value of K for modeling the data, we performed a series of independent runs of the Gibbs sampler at a range of values of K.

The distance matrix for the neighbor-joining tree was computed as follows (Mountain and Cavalli-Sforza 1997).
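The exact distance formula is not reproduced in this text. As a rough illustration only, a shared-allele distance of this general kind can be sketched as follows, assuming diploid genotypes and assuming that each locus contributes 1 minus half the number of alleles the two individuals have in common, averaged over loci. The function names and the toy genotypes are hypothetical and are not taken from the thrush data.

```python
from collections import Counter


def shared_alleles(g1, g2):
    """Count the alleles (0, 1 or 2) that two diploid genotypes share."""
    # Multiset intersection: ("A", "A") vs ("A", "B") shares one allele.
    return sum((Counter(g1) & Counter(g2)).values())


def pairwise_distance(ind_a, ind_b):
    """Average over loci of 1 - shared/2 (assumed form, see lead-in).

    Loci where either genotype is None (missing) are skipped.
    """
    total, used = 0.0, 0
    for g1, g2 in zip(ind_a, ind_b):
        if g1 is None or g2 is None:
            continue
        total += 1.0 - shared_alleles(g1, g2) / 2.0
        used += 1
    if used == 0:
        raise ValueError("no locus is typed in both individuals")
    return total / used


def distance_matrix(individuals):
    """Symmetric matrix of pairwise distances with a zero diagonal."""
    n = len(individuals)
    d = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d[i][j] = d[j][i] = pairwise_distance(individuals[i], individuals[j])
    return d


# Hypothetical toy data: three individuals typed at two loci.
toy = [
    [("A", "A"), ("B", "C")],
    [("A", "B"), ("B", "C")],
    [("C", "C"), ("D", "D")],
]
matrix = distance_matrix(toy)
```

A matrix of this form is the only input a neighbor-joining procedure needs, which is why the resulting tree carries no probability model for the genotypes themselves.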

After running numerous medium-length runs to investigate the behavior of the Gibbs sampler (using the diagnostics described in Choice of K for simulated data), we again chose to use a burn-in period of 30,000 iterations and to collect data for 10^6 iterations.
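To make the role of the burn-in and of the comparison across K concrete, here is a minimal sketch. It assumes that each run can be reduced to a single estimate of ln Pr(X | K) and that K has a uniform prior, so that Bayes' rule gives Pr(K | X) by normalizing exp(ln Pr(X | K)) over K. The function names and every number below are hypothetical placeholders, not results from the thrush data.

```python
import math


def post_burn_in(trace, burn_in):
    """Return the sampler trace with the first `burn_in` iterations dropped."""
    if burn_in >= len(trace):
        raise ValueError("burn-in is as long as the whole trace")
    return trace[burn_in:]


def posterior_over_k(log_evidence):
    """Pr(K | X) under a uniform prior on K.

    `log_evidence` maps each K to an estimate of ln Pr(X | K).
    Subtracting the maximum before exponentiating avoids underflow.
    """
    m = max(log_evidence.values())
    weights = {k: math.exp(v - m) for k, v in log_evidence.items()}
    z = sum(weights.values())
    return {k: w / z for k, w in weights.items()}


# Hypothetical estimates of ln Pr(X | K) for K = 1..5.
hypothetical = {1: -3050.0, 2: -3000.0, 3: -2980.0, 4: -2990.0, 5: -3010.0}
posterior = posterior_over_k(hypothetical)
best_k = max(posterior, key=posterior.get)
```

Because the estimates enter through an exponential, differences of a few log units between values of K are enough to concentrate almost all of the posterior mass on one value, which is why consistency between independent runs matters.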

We ran three to five independent simulations of this length for each K between 1 and 5 and found that the independent runs produced highly consistent results. Given these results, we now focus our subsequent analysis on the model with three populations.

Clustering results for Taita thrush data: Figure 4 shows a plot of the clustering results for the individuals in the sample, assuming that there are three populations (as inferred above).

We did not use (and indeed, did not know) the sampling locations of individuals when we obtained these results.

All of the points in the extreme corners (some of which may be difficult to resolve in the picture) are assigned. We return to this data set in Incorporating population information to consider the question of whether the individuals that seem not to cluster tightly with others sampled from the same location are the product of migration.
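The figure itself is not reproduced here, but the mention of extreme corners suggests a triangle plot in which each corner stands for one of the three populations. Under that assumption, a vector of estimated membership proportions (q1, q2, q3) summing to 1 can be mapped to a point in an equilateral triangle by barycentric coordinates, as in the sketch below. The corner layout and the function names are illustrative choices.

```python
import math

# Corners of an equilateral triangle with unit side; corner k stands for
# population k (an arbitrary layout chosen for this sketch).
CORNERS = [(0.0, 0.0), (1.0, 0.0), (0.5, math.sqrt(3) / 2)]


def to_triangle(q):
    """Map membership proportions (q1, q2, q3) to x, y plot coordinates."""
    if len(q) != 3 or min(q) < 0 or abs(sum(q) - 1.0) > 1e-9:
        raise ValueError("q must hold three non-negative values summing to 1")
    x = sum(qk * cx for qk, (cx, _) in zip(q, CORNERS))
    y = sum(qk * cy for qk, (_, cy) in zip(q, CORNERS))
    return x, y


def nearest_corner(q):
    """Index of the population with the largest membership proportion."""
    return max(range(3), key=lambda k: q[k])


# An individual assigned entirely to one population lands on that corner;
# an evenly mixed individual lands at the centre of the triangle.
corner_point = to_triangle((0.0, 1.0, 0.0))
centre_point = to_triangle((1 / 3, 1 / 3, 1 / 3))
```

Points that fall away from every corner are the ones worth a second look, since they correspond to individuals whose estimated membership is split between populations.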

Inferring the value of K, the number of populations, for the T.

This may reflect the presence of population structure within the continental groupings, although in this case the additional populations do not form discrete clusters and so are difficult to interpret. Again it is interesting to contrast our clustering results with the neighbor-joining tree of these data (Figure 6).

While our method finds it quite easy to separate the two continental groups into the correct clusters, it would not be possible to use the neighbor-joining tree to detect distinct clusters if the labels were not present. The data set of Jorde also contains a set of individuals of Asian origin (which are more closely related to Europeans than are Africans). Neither the neighbor-joining method nor our method differentiates between the Europeans and Asians with great accuracy using this data set.

The results presented so far have focused on testing how well our method works. We now turn our attention to some further applications of this method. Our clustering results (Figure 4) confirm that the three main geographic groupings in the thrush data set (Chawia, Mbololo, and Ngangao) represent three genetically distinct populations. Individual 2 is also identified as a possible outlier on the neighbor-joining tree (Figure 3).

Given this, it is natural to ask whether these apparent outliers are immigrants (or descendants of recent immigrants) from other populations.

