****************************************************************************************
*Estimation of the parameters of a demographic expansion from the mismatch distribution*
****************************************************************************************
The method is the one described in
Schneider, S., and L. Excoffier. 1999. Estimation of demographic parameters
from the distribution of pairwise differences when the mutation rates vary among
sites: Application to human mitochondrial DNA. Genetics 152:1079-1089.
Please read this paper CAREFULLY before proceeding.
Steps to perform:
-----------------
1) Compute the H file for a given sequence length and a given mutation
rate distribution (see below for what it contains).
This is done with the program CALC_H.EXE (see below).
2) Estimate the demographic parameters from the mismatch distribution.
This is done with the program EXPDEMOG6.EXE (see below).
CALC_H.EXE
==========
CALC_H.EXE will allow you to compute the H coefficients appearing in eqs. 8 and 9
of Schneider and Excoffier (1999).
These H coefficients (more exactly, H_m(i, j)) are the conditional probabilities of
observing i differences given that j mutations have occurred in the
ancestry of two sequences of length m. We compute these coefficients recursively,
which takes a long time for long sequences.
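To give an idea of what a recursive computation of such conditional probabilities can look like, here is a minimal Python sketch. It is NOT the recursion of Schneider and Excoffier (1999) and not what CALC_H.EXE does: it assumes a deliberately simplified model in which every site has only two states and all sites mutate at the same rate, so that each additional mutation hits a uniformly chosen site and either creates a difference or erases one. The function name h_coefficients is hypothetical.

```python
def h_coefficients(m, max_mutations):
    """Return a table h where h[j][i] = P(i differences | j mutations).

    ASSUMPTION (simplification, not the published model): two-state sites,
    equal mutation rates, sequence length m.
    """
    h = [[0.0] * (m + 1) for _ in range(max_mutations + 1)]
    h[0][0] = 1.0  # no mutation, hence no difference
    for j in range(max_mutations):
        for i in range(m + 1):
            p = h[j][i]
            if p == 0.0:
                continue
            if i < m:
                # the new mutation hits one of the m - i identical sites
                h[j + 1][i + 1] += p * (m - i) / m
            if i > 0:
                # the new mutation hits one of the i differing sites
                h[j + 1][i - 1] += p * i / m
    return h


if __name__ == "__main__":
    table = h_coefficients(10, 3)
    print(table[2][0], table[2][2])
```

In this sketch the table has (max_mutations + 1) x (m + 1) entries, and every row is built from the previous one, which gives a feel for why the cost grows with the sequence length.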
Input for CALC_H.EXE
--------------------
When running the program CALC_H.EXE, you will be asked to provide:
1) The mutation model you want to consider (ALWAYS use the Gamma rates model)
2) The length of your sequence
3) The alpha parameter of the Gamma distribution of mutation rates
4) The transition rate
Optionally, you may be requested to enter a file with arbitrary mutation rates in your sequence.
Output of CALC_H.EXE
--------------------
A file will be produced, stating the length of the sequence and the mutation model.
Give the name of that file to the program EXPDEMOG6.EXE (on line 3 of its input file).
EXPDEMOG6.EXE
=============
This program will actually perform the estimation of the parameters Tau=2Tu, Theta0=2N0u
and Theta1=2N1u, with
T being the number of generations since the onset of the instantaneous expansion
N0 being the size of the ancestral population (before the expansion)
N1 being the size of the new population (after the expansion)
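As a worked illustration of these definitions, the sketch below converts estimates of Tau, Theta0, and Theta1 back into T, N0, and N1. The symbol u is not defined in this file; the sketch assumes it is the mutation rate per generation for the whole sequence, and that its value is supplied by the user. The function name and the numbers in the usage lines are made up for the example.

```python
def expansion_parameters(tau, theta0, theta1, u):
    """Invert Tau=2Tu, Theta0=2N0u, Theta1=2N1u for T, N0, N1.

    ASSUMPTION: u is the per-generation mutation rate of the whole
    sequence; it must be supplied by the user.
    """
    if u <= 0:
        raise ValueError("u must be positive")
    return {
        "T": tau / (2.0 * u),       # generations since the expansion
        "N0": theta0 / (2.0 * u),   # population size before the expansion
        "N1": theta1 / (2.0 * u),   # population size after the expansion
    }


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    print(expansion_parameters(tau=3.0, theta0=1.0, theta1=100.0, u=0.0015))
```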
Input file for EXPDEMOG6.EXE
----------------------------
This file contains information on the DNA mutation pattern and the
observed mismatch distribution.
Format:
L1: Sequence length
L2: alpha (shape) parameter of the Gamma distribution of mutation rates.
It must be the same value that you used for computing the H file
L3: Name of the precomputed H file
L4: Initial values (guesstimates) for the three parameters
Tau, Theta0, and Theta1, in that order.
L5 and following: Mismatch distribution (the number of pairs showing 0, 1, 2, 3, ...
differences)
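The following Python sketch shows one way to obtain a mismatch distribution from a set of aligned sequences and to lay out the lines of an input file. The exact layout is an assumption: this file does not say how the three initial values are separated or whether the counts go one per line, so the sketch assumes space-separated values on line 4 and one count per line from line 5 onwards. Function names, the H file name, and all numeric values are hypothetical.

```python
from itertools import combinations


def mismatch_distribution(sequences):
    """Count the pairs of sequences showing 0, 1, 2, ... differences."""
    if len({len(s) for s in sequences}) > 1:
        raise ValueError("sequences must be aligned to the same length")
    # Every differing character counts as one difference (gaps included).
    diffs = [sum(a != b for a, b in zip(s1, s2))
             for s1, s2 in combinations(sequences, 2)]
    counts = [0] * (max(diffs) + 1 if diffs else 1)
    for d in diffs:
        counts[d] += 1
    return counts


def input_file_lines(seq_length, alpha, h_file, tau, theta0, theta1, mismatch):
    """Build the lines of an input file.

    ASSUMPTION: space-separated initial values on line 4 and one mismatch
    count per line afterwards; check against a working input file.
    """
    lines = [str(seq_length), str(alpha), h_file, f"{tau} {theta0} {theta1}"]
    lines.extend(str(c) for c in mismatch)
    return lines


if __name__ == "__main__":
    seqs = ["ACGT", "ACGA", "TCGA"]          # toy alignment
    dist = mismatch_distribution(seqs)       # pairs with 0, 1, 2 differences
    print("\n".join(input_file_lines(4, 0.26, "h_file.txt", 3.0, 1.0, 100.0, dist)))
```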
Running EXPDEMOG6.EXE
---------------------
To run EXPDEMOG6.EXE, you need
1) to have prepared an input file, as described above
2) to decide how many bootstrap replicates will be performed for estimating confidence
intervals around the estimates
We shall assume that you have computed the H files under the n-gamma
rates model, which is our mutation model No. 5.
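Then, to launch the estimation of the parameters, type:
expdemog6.exe <input file> <no. of bootstraps> <mutation model> <transition proportion>
where
<input file> is the name of the input file you have prepared.
<no. of bootstraps> is the number of bootstrap replicates to be performed for
estimating confidence intervals around the estimates
<mutation model> should be No. 5
<transition proportion> is the proportion of substitutions that are transitions.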
Example:
--------
expdemog6.exe inputExample.msm 1000 5 0.9
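When many data sets have to be analysed, the command line shown in this example can be assembled from a script. The Python sketch below only builds the argument list, in the order used in the example (input file, number of bootstraps, mutation model, proportion of transitions); the function name is hypothetical, and the executable name is a parameter so that it can be adapted to the copy of the program you have.

```python
def expdemog_command(input_file, n_bootstraps, transition_proportion,
                     model=5, executable="expdemog6.exe"):
    """Return the argument list for one run, mirroring the example above."""
    if n_bootstraps < 0:
        raise ValueError("the number of bootstraps cannot be negative")
    if not 0.0 <= transition_proportion <= 1.0:
        raise ValueError("the transition proportion must lie in [0, 1]")
    return [executable, input_file, str(n_bootstraps),
            str(model), str(transition_proportion)]


if __name__ == "__main__":
    cmd = expdemog_command("inputExample.msm", 1000, 0.9)
    print(" ".join(cmd))
```

The resulting list can be passed to subprocess.run(cmd, check=True) from the Python standard library to launch the program.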
Output file of expdemog6.exe
============================
A file with the same name as the input file but with the suffix "_res.xl" will be created,
and will contain estimates of the three parameters Tau, Theta0, and Theta1,
confidence intervals, as well as the mismatch distribution for those parameters.
Enjoy
Contact:
Laurent Excoffier
CMPG
University of Berne
Baltzerstrasse 6
3012 Berne
Switzerland
email: laurent.excoffier@zoo.unibe.ch