****************************************************************************************
*Estimation of the parameters of a demographic expansion from the mismatch distribution*
****************************************************************************************

The method is the one described in

Schneider, S., and L. Excoffier. 1999. Estimation of demographic parameters from the
distribution of pairwise differences when the mutation rates vary among sites:
Application to human mitochondrial DNA. Genetics 152:1079-1089.

Please read this paper CAREFULLY before proceeding.

Steps to perform:
-----------------

1) Compute the H file for a given sequence length and a given mutation rate
   distribution (see below for what it contains).
   This is done with the program CALC_H.EXE (see below).

2) Estimate the demographic parameters from the mismatch distribution.
   This is done with the program EXPDEMOG.EXE (see below).

CALC_H.EXE
==========

CALC_H.EXE computes the H coefficients appearing in eqs. 8 and 9 of Schneider and
Excoffier (1999). These H coefficients (or, more exactly, H_m(i, j)) are the conditional
probabilities of observing i differences given that j mutations have occurred in the
ancestry of two sequences of length m. The coefficients are computed by recurrence,
which takes a long time for long sequences.

Input for CALC_H.EXE
--------------------

When running the program CALC_H.EXE, you will be asked to provide:

1) The mutation model you want to consider (ALWAYS use the Gamma rates model)
2) The length of your sequence
3) The alpha parameter of the Gamma distribution of mutation rates
4) The transition rate

Optionally, you may be requested to enter a file with arbitrary mutation rates in
your sequence.

Output of CALC_H.EXE
--------------------

A file will be produced, stating the length of the sequence and the mutation model.
Use that file as a parameter for the program EXPDEMOG.EXE.

EXPDEMOG6.EXE
=============

This program performs the actual estimation of the parameters Tau=2Tu, Theta0=2N0u
and Theta1=2N1u, with

T  being the number of generations since the onset of the instantaneous expansion
N0 being the size of the ancestral population (before the expansion)
N1 being the size of the new population (after the expansion)
u  being the mutation rate per generation for the whole sequence

Input file for EXPDEMOG6.EXE
----------------------------

This file contains information on the DNA mutation pattern and the observed mismatch
distribution.

Format:

L1: Sequence length
L2: alpha (shape) parameter of the Gamma distribution of mutation rates.
    It must be the same value that you used for computing the H file
L3: Name of the precomputed H file
L4: Initial values (guesstimates) for the three parameters Tau, Theta0, and Theta1,
    in that order
L5 and following: Mismatch distribution (the number of pairs showing 0, 1, 2, 3, ...
    differences)

Running EXPDEMOG6.EXE
---------------------

To run EXPDEMOG.EXE, you need

1) to have prepared an input file, as described above
2) to define how many bootstrap replicates will be performed for estimating
   confidence intervals around the estimates

We shall assume that you have computed the H files under the n-gamma rates model,
which is our mutation model No. 5.

Then, to launch the estimation of the parameters, type:

expdemog.exe <input file> <number of bootstraps> <mutation model> <transition proportion>

where

<input file>            is the name of the input file you have prepared
<number of bootstraps>  is the number of bootstrap replicates to be performed for
                        estimating confidence intervals around the estimates
<mutation model>        should be No. 5
<transition proportion> is the proportion of substitutions that are transitions

Example:
--------

expdemog6.exe inputExample.msm 1000 5 0.9

Output file of expdemog6.exe
============================

A file with the same name as the input file but with the suffix "_res.xl" will be
created. It contains the estimates of the three parameters Tau, Theta0, and Theta1,
their confidence intervals, and the mismatch distribution for those parameters.
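
Preparing the input file by script (illustrative sketch)
--------------------------------------------------------

The input file format described above can be assembled with a short script. The
Python sketch below is not part of the distribution; it only illustrates one way
to count pairwise differences among aligned sequences and to lay out the five
parts of the input file. The function names, the example sequences, the H file
name "h_example.txt" and the starting values are all hypothetical. It also assumes
that the three initial values on line 4 are separated by spaces and that the
mismatch counts are written one per line, which this README does not specify;
check against a working input file before relying on it.

```python
from itertools import combinations


def mismatch_distribution(sequences):
    """Return counts[k] = number of sequence pairs showing k differences.

    Assumes the sequences are aligned and of equal length; gaps and
    ambiguous bases are compared as ordinary characters.
    """
    length = len(sequences[0])
    if any(len(s) != length for s in sequences):
        raise ValueError("sequences must be aligned to the same length")
    counts = [0] * (length + 1)
    for a, b in combinations(sequences, 2):
        diffs = sum(1 for x, y in zip(a, b) if x != y)
        counts[diffs] += 1
    # Drop empty classes beyond the largest observed number of differences.
    while len(counts) > 1 and counts[-1] == 0:
        counts.pop()
    return counts


def format_input(seq_length, alpha, h_file, tau, theta0, theta1, counts):
    """Lay out the input file: length, alpha, H file, initial values, counts.

    Assumption: space-separated initial values, one mismatch count per line.
    """
    lines = [
        str(seq_length),
        str(alpha),
        h_file,
        f"{tau} {theta0} {theta1}",
    ]
    lines.extend(str(c) for c in counts)
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical toy data, for illustration only.
    seqs = ["ACGTACGT", "ACGTACGA", "ACGAACGA", "TCGAACGA"]
    counts = mismatch_distribution(seqs)
    text = format_input(len(seqs[0]), 0.4, "h_example.txt", 3.0, 0.1, 100.0, counts)
    print(text, end="")
```

With the four toy sequences above, the six pairs give the mismatch counts 0, 3, 2, 1
for 0, 1, 2 and 3 differences. The alpha value written to the file must still match
the one used when the H file was computed with CALC_H.EXE.
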

Enjoy

Contact:

Laurent Excoffier
CMPG
University of Berne
Baltzerstrasse 6
3012 Berne
Switzerland

email: laurent.excoffier@zoo.unibe.ch