fast sequential Markov coalescent simulation of genomic data under complex evolutionary models
While preserving all the simulation flexibility of simcoal2,
fastsimcoal is now implemented under a faster continuous-time sequential
Markovian coalescent approximation, allowing it to efficiently generate
genetic diversity for different types of markers along large genomic
regions, for both present and ancient samples. It includes a parameter
sampler allowing its integration into Bayesian or likelihood parameter
estimation procedures.
fastsimcoal can handle very complex evolutionary scenarios including an
arbitrary migration matrix between samples, historical events allowing
for population resize, population fusion and fission, admixture events,
changes in migration matrix, or changes in population growth rates. The
time of sampling can be specified independently for each sample,
allowing for serial sampling in the same or in different populations.
Different markers, such as DNA sequences, SNPs, STRs (microsatellites)
or multi-locus allelic data, can be generated under a variety of mutation
models (e.g. finite- and infinite-site models for DNA sequences,
stepwise or generalized stepwise mutation models for STR data,
infinite-allele model for standard multi-allelic data).
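To make the STR mutation models mentioned above more concrete, here is a minimal Python sketch of a stepwise mutation model (each mutation changes the repeat count by one unit) and of a generalized stepwise model (the step size is geometrically distributed). It is only an illustration and not fastsimcoal's own code: the function name `mutate_str`, the parameter `geom_p` and the lower bound `min_repeats` are assumptions introduced for this example.

```python
import random


def mutate_str(repeat_count, n_mutations, rng, geom_p=0.0, min_repeats=1):
    """Apply n_mutations stepwise mutations to an STR repeat count.

    geom_p = 0.0 gives the strict stepwise model (steps of exactly 1).
    geom_p > 0.0 gives a generalized stepwise model in which each extra
    repeat unit is added to the step with probability geom_p.
    Clamping at min_repeats is an assumption made for this sketch.
    """
    for _ in range(n_mutations):
        step = 1
        # Generalized model: keep extending the step with probability geom_p.
        while geom_p > 0.0 and rng.random() < geom_p:
            step += 1
        # Gains and losses of repeats are equally likely.
        direction = 1 if rng.random() < 0.5 else -1
        repeat_count = max(min_repeats, repeat_count + direction * step)
    return repeat_count


if __name__ == "__main__":
    rng = random.Random(42)
    print(mutate_str(20, 5, rng))              # strict stepwise model
    print(mutate_str(20, 5, rng, geom_p=0.3))  # generalized stepwise model
```

Setting `geom_p` to zero recovers the strict stepwise model, so a single function covers both cases.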
fastsimcoal can simulate data in genomic regions with arbitrary
recombination rates, thus allowing for recombination hotspots of
different intensities at any position. fastsimcoal implements a new
approximation to the ancestral recombination graph in the form of a
sequential Markov coalescent, allowing it to very quickly generate
genetic diversity for >100 Mb genomic segments.
fastsimcoal2 now allows one to estimate demographic parameters from
the (joint) site frequency spectrum (SFS), using
simulations to compute the expected SFS and a robust method for the
maximization of the composite likelihood.
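As a rough illustration of the quantities involved, the Python sketch below builds an unfolded SFS from a small set of haplotypes and evaluates a multinomial composite log-likelihood of the observed SFS given expected SFS probabilities. It is a simplified sketch under stated assumptions, not the procedure implemented in fastsimcoal2: the function names, the 0/1 haplotype encoding (1 = derived allele) and the use of natural logarithms are choices made for this example, and in practice the expected probabilities would come from simulations.

```python
import math


def site_frequency_spectrum(haplotypes):
    """Unfolded SFS: entry i counts the sites with i derived alleles.

    haplotypes is a list of equal-length sequences of 0/1 values,
    where 1 denotes the derived allele (an assumption of this sketch).
    """
    n = len(haplotypes)
    n_sites = len(haplotypes[0])
    sfs = [0] * (n + 1)
    for site in range(n_sites):
        derived = sum(h[site] for h in haplotypes)
        sfs[derived] += 1
    return sfs


def composite_log_likelihood(observed_sfs, expected_probs):
    """Multinomial composite log-likelihood, up to an additive constant.

    Computes sum_i m_i * log(p_i), where m_i is the observed count in
    SFS entry i and p_i its expected probability.
    """
    total = 0.0
    for m, p in zip(observed_sfs, expected_probs):
        if m == 0:
            continue  # empty entries do not contribute
        if p <= 0.0:
            return float("-inf")  # observed entry deemed impossible
        total += m * math.log(p)
    return total


if __name__ == "__main__":
    haps = [[0, 1, 1, 0],
            [0, 1, 0, 0],
            [1, 1, 0, 0]]
    obs = site_frequency_spectrum(haps)
    print(obs)  # [1, 2, 0, 1]
    print(composite_log_likelihood(obs, [0.25, 0.5, 0.0, 0.25]))
```

The likelihood is called composite because sites are treated as independent even when they are linked.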
new version of fastsimcoal2: fsc28 (September 2023)
The main new feature of fsc28 is the ability to deal with the spatial
and temporal heterogeneity of samples, through the introduction of the
notion of SFS pools.
New syntax in the .tpl files to deal with sample heterogeneity. We introduce the concept of SFS pools, where the SFS of different samples can be computed as a pool. This makes it possible to account for any spatial or temporal heterogeneity. New keyword “sfspool” in the deme size section.
Possibility to record the deme of origin of chromosome segments when implementing an admixture event, so that it is possible to simulate chromosome painting. New keyword “recordAdmOrigin” in historical events.
New command line options (-y and -z) to fine-tune the parameter estimation procedure.
Other changes and bug corrections:
When simulating several data files with definition files (.def), the SFSs are written to different files, either in separate directories with the -j option, or in the same directory without the -j option.
Program was crashing when simulating exponential growth and migration. Bug found by Jason Weir.
Optimisation of computations when estimating data from multidimensional SFS.
Incorrect likelihood when computed from the maxL.par files, as compared to that computed during parameter estimation, in case of population growth. Bug found by Kyle Lewald.
Incorrect simulations from par files when some demes are explicitly killed. Bug found by Kyle Lewald.
See this page for a complete list of changes since the first fastsimcoal release.
benchmarks
Comparisons with other coalescent simulation programs such as ms, simcoal2 or MaCS can be found here.
getting started
A quick overview of how to get started with fastsimcoal can be found here (but it is better to read the manual first).
additional scripts for the preparation of input files and the analysis of results
A series of R and bash scripts, described and made available here,
have been developed by several people from our group to facilitate the
analysis of the results of fsc27 as well as the preparation of input files.
running fsc27 on a mac
I have realized (thanks to Melissa Wilson Sayre) that the plain version
of fsc26 and later will not run on Mac OS X unless you have
installed a recent version of gcc.
This is because fsc26 and later versions are multithreaded and use Intel's
libraries based on OpenMP, which are no longer distributed with recent
versions of Mac OS X.
So, to be able to run fsc28 on your Mac, you first need to install a
recent version of gcc.
Uncompress the tar archive with the command
gunzip gcc-10.2-bin.tar.gz
Install gcc 10.2 in /usr/local with the command
sudo tar -xvf gcc-10.2-bin.tar -C /
problems running fsc27 on old versions of linux (kernel too old)
It seems that fsc27 (and more recent versions) cannot run on old
Linux versions with an old kernel, potentially because it needs OpenMP
libraries that have to be dynamically linked to the program.
A Google group on fastsimcoal
(https://groups.google.com/forum/#!forum/fastsimcoal) has been created
to promote discussion or allow queries on any aspect of fastsimcoal.
Please use it!
citation
fastsimcoal2:
Excoffier, L., Kapopoulou, A., Marchi, N. (2023) Demogenomic inference from
spatially and temporally heterogeneous samples. Molecular Ecology
Resources. https://doi.org/10.1111/1755-0998.13877
Excoffier, L., Marchi, N., Marques, D. A., Matthey-Doret, R.,
Gouy, A., Sousa, V. C. (2021) fastsimcoal2: demographic inference
under complex evolutionary scenarios. Bioinformatics 37: 4882-4885.
Excoffier, L., Dupanloup, I., Huerta-Sánchez, E., Sousa, V. C., and Foll, M.
(2013) Robust demographic inference from genomic and SNP data. PLOS
Genetics 9(10): e1003905.
fastsimcoal:
Excoffier, L. and Foll, M. (2011) fastsimcoal: a continuous-time
coalescent simulator of genomic diversity under arbitrarily complex
evolutionary scenarios. Bioinformatics 27: 1332-1334.