- Computation of different diversity indices
- Number of polymorphic loci or number of segregating sites
- Number of different haplotypes
- Number of alleles per locus
- Number of nucleotide sites with:
  - substitutions
  - transitions
  - transversions
  - insertions or deletions (indels)
- Nucleotide frequencies in a sample of DNA sequences
- Estimation of distances between molecular haplotypes
  - Number of pairwise differences
  - Proportion of pairwise differences
  - Jukes and Cantor correction for multiple hits per site, with or without gamma-correction for heterogeneous mutation rates (Jukes and Cantor, 1969)
  - Kimura 2-parameter correction for transition-transversion bias, with or without gamma-correction for heterogeneous mutation rates (Kimura, 1980)
  - Tajima-Nei correction for unequal substitution rates among nucleotides, with or without gamma-correction for heterogeneous mutation rates (Tajima and Nei, 1984)
  - Tamura's correction for transition-transversion bias and unequal substitution rates among nucleotides, with or without gamma-correction for heterogeneous mutation rates (Tamura, 1992)
  - Tamura-Nei's gamma-correction for heterogeneous mutation rates (Tamura and Nei, 1993)
- Estimation of maximum-likelihood allele frequencies, with or without a recessive allele
- Estimation of maximum-likelihood multi-locus haplotype frequencies
  - Using a gene counting method when the gametic phase is known
  - Using an EM (expectation-maximization) algorithm when the gametic phase is unknown, or in the presence of recessive alleles
  - Standard deviations are obtained by a bootstrap procedure (Efron 1982)
- Estimation of the mutation parameter θ = 4Nu, and its confidence interval, from:
  - the observed number of alleles (haplotypes) k and the sample size n
  - the observed number of segregating sites S and the sample size n
  - the sample homozygosity (H)
  - the mean number of pairwise differences (π) between all pairs of haplotypes in the sample
- Generation of expected allele (haplotype) frequencies under the infinite allele model, conditional on sample size n and observed number of alleles (haplotypes) k, using a simulation procedure adapted from Stewart (1977).
- Estimation of sample molecular diversity (mean number of pairwise site differences, π).
- Estimation of sample nucleotide diversity (mean heterozygosity per nucleotide site).
- Estimation of sample heterozygosity and sample homozygosity (unbiased estimates).
- Computation of the distribution of the number of pairwise differences between all pairs of chromosomes in the sample.
- Exact test of Hardy-Weinberg equilibrium, using a Markov chain approach modified from Guo and Thomson (1992).
- Exact test of linkage disequilibrium between any pair of loci when the gametic phase is known, using a Markov chain approach.
- Likelihood ratio test of linkage disequilibrium when gametic phase is unknown (Chi-square approximation)
- Likelihood ratio test of linkage disequilibrium when gametic phase is unknown (non-parametric test based on permutation of alleles among haplotypes)
- Ewens exact test of selective neutrality, using a procedure adapted from Slatkin (1994), applicable to any number of alleles per locus and any sample size.
- Ewens-Watterson F-test (Watterson 1978) of selective neutrality based on sample autozygosity (F)
- Tajima's selective neutrality test (Tajima 1989a) based on the comparison between the sample mean number of pairwise differences (π) and the number of segregating sites (S).
- Search for shared alleles or haplotypes between populations
- Population genetic structure is estimated from haplotypic data using an analysis of molecular variance (AMOVA) framework (Excoffier et al. 1992) with a maximum of four hierarchical levels:
  - alleles (or haplotypes) within individuals
  - individuals within demes
  - demes within populations
  - populations
- The AMOVA framework allows the estimation of unbiased fixation indices (Weir and Cockerham 1984, Weir 1990) for any combination of these four sources of variability. The following data types can be accommodated:
  - RFLPs
  - DNA sequences
  - Microsatellite data
  - Standard data (allele frequencies)
- Population genetic structure is estimated from genotypic data for the same molecular data types and hierarchical levels, using the approach described in Michalakis and Excoffier (1996).
- The significance of the fixation indices is tested using non-parametric permutation approaches. Different permutation schemes are implemented when testing the different fixation indices, depending on the given hierarchical structure.
- Pairwise FST values, coancestry coefficients and Nm estimates can be computed for all pairs of populations. Their significance is also tested by a non-parametric permutation approach. Pairwise FST values can then be translated into divergence times between populations.
- Exact test of population differentiation based on the comparison of haplotype or genotype frequencies
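
Several of the quantities listed above are simple to compute by hand for a small sample. The following is a minimal Python sketch, not the program's own implementation. It assumes aligned DNA sequences of equal length with no gaps or ambiguity codes. The function names and the toy alignment are hypothetical. It computes the number of segregating sites S, the mean number of pairwise differences π, the estimate of θ from S and n (S divided by the sum of 1/i for i = 1 to n-1), and the Jukes-Cantor and Kimura 2-parameter distances without gamma-correction.

```python
from itertools import combinations
from math import log

# Purine <-> purine and pyrimidine <-> pyrimidine changes are transitions.
TRANSITIONS = {frozenset("AG"), frozenset("CT")}


def segregating_sites(seqs):
    """Count alignment columns that contain more than one nucleotide."""
    return sum(1 for column in zip(*seqs) if len(set(column)) > 1)


def pairwise_differences(a, b):
    """Count sites at which two aligned sequences differ."""
    return sum(1 for x, y in zip(a, b) if x != y)


def mean_pairwise_differences(seqs):
    """Average number of differences over all pairs of sequences (pi)."""
    pairs = list(combinations(seqs, 2))
    return sum(pairwise_differences(a, b) for a, b in pairs) / len(pairs)


def theta_from_segregating_sites(seqs):
    """Estimate theta as S / a1, where a1 = sum_{i=1}^{n-1} 1/i."""
    n = len(seqs)
    a1 = sum(1.0 / i for i in range(1, n))
    return segregating_sites(seqs) / a1


def jukes_cantor_distance(a, b):
    """Jukes-Cantor distance: d = -(3/4) ln(1 - 4p/3), p = proportion of differences."""
    p = pairwise_differences(a, b) / len(a)
    return -0.75 * log(1.0 - 4.0 * p / 3.0)


def kimura_2p_distance(a, b):
    """Kimura 2-parameter distance from transition (P) and transversion (Q) proportions."""
    diffs = [frozenset((x, y)) for x, y in zip(a, b) if x != y]
    P = sum(1 for d in diffs if d in TRANSITIONS) / len(a)
    Q = sum(1 for d in diffs if d not in TRANSITIONS) / len(a)
    return -0.5 * log(1.0 - 2.0 * P - Q) - 0.25 * log(1.0 - 2.0 * Q)


# Hypothetical toy alignment of four haplotypes, ten sites each.
sample = ["ACGTACGTAC", "ACGTACGTAT", "ACGAACGTAC", "GCGTACGTAC"]

S = segregating_sites(sample)
pi = mean_pairwise_differences(sample)
theta_S = theta_from_segregating_sites(sample)
d_jc = jukes_cantor_distance(sample[0], sample[1])
d_k2p = kimura_2p_distance(sample[0], sample[1])
print(S, pi, theta_S, d_jc, d_k2p)
```

For this toy alignment, S is 3 and π is 1.5. Both distance functions raise a math domain error when sequences are too divergent for the correction to be defined, so real data would need that case handled explicitly.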