fast sequential Markov coalescent simulation of genomic data under complex evolutionary models
While preserving all the simulation flexibility of simcoal2,
fastsimcoal is now implemented under a faster continuous-time sequential
Markovian coalescent approximation, allowing it to efficiently generate
genetic diversity for different types of markers along large genomic
regions, for both present and ancient samples. It includes a parameter
sampler allowing its integration into Bayesian or likelihood parameter
estimation procedures.
fastsimcoal can handle very complex evolutionary scenarios including an
arbitrary migration matrix between samples, historical events allowing
for population resize, population fusion and fission, admixture events,
changes in migration matrix, or changes in population growth rates. The
time of sampling can be specified independently for each sample,
allowing for serial sampling in the same or in different populations.
Different markers, such as DNA sequences, SNPs, STRs (microsatellites) or
multi-locus allelic data can be generated under a variety of mutation
models (e.g. finite- and infinite-site models for DNA sequences,
stepwise or generalized stepwise mutation models for STR data,
infinite-allele model for standard multi-allelic data).
fastsimcoal can simulate data in genomic regions with arbitrary
recombination rates, thus allowing for recombination hotspots of
different intensities at any position. fastsimcoal implements a new
approximation to the ancestral recombination graph in the form of a
sequential Markov coalescent, allowing it to very quickly generate
genetic diversity for >100 Mb genomic segments.
fastsimcoal2 now allows one to estimate demographic parameters from
the (joint) site frequency spectrum (SFS), using simulations to compute
the expected SFS and a robust method for the maximization of the
composite likelihood.
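As a rough illustration of the idea, the sketch below computes a multinomial composite log-likelihood of an observed SFS given an expected SFS. It is not fastsimcoal2's implementation: the function name, the toy counts, and the small floor applied to entries with zero expected probability are all assumptions made for this example, and monomorphic sites are ignored.

```python
import math

def composite_log10_likelihood(observed_sfs, expected_sfs, floor=1e-10):
    """Multinomial composite log10-likelihood of an observed SFS.

    observed_sfs: counts of polymorphic sites per frequency class.
    expected_sfs: expected counts or proportions per class
                  (e.g. averaged over many simulations).
    floor: hypothetical lower bound used for classes whose expected
           probability is zero, to avoid log(0).
    """
    if len(observed_sfs) != len(expected_sfs):
        raise ValueError("observed and expected SFS must have the same length")
    total = sum(expected_sfs)
    if total <= 0:
        raise ValueError("expected SFS must contain positive entries")
    log_lik = 0.0
    for count, expected in zip(observed_sfs, expected_sfs):
        p = max(expected / total, floor)  # normalise to a probability
        log_lik += count * math.log10(p)
    return log_lik

# Toy data (hypothetical): three frequency classes.
observed = [50, 30, 20]
good_fit = [0.5, 0.3, 0.2]   # expected SFS matching the observed proportions
poor_fit = [0.2, 0.3, 0.5]   # expected SFS with the classes swapped

lik_good = composite_log10_likelihood(observed, good_fit)
lik_poor = composite_log10_likelihood(observed, poor_fit)
```

In this toy case the expected SFS that matches the observed proportions gets the higher composite likelihood, which is the quantity a parameter search would try to maximize.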
new version of fastsimcoal2: fsc27 (.09) (October 2022)
fsc27 introduces several improvements for the simulation of
large recombining segments and highly subdivided populations, a new
output file format, and a new syntax of .est files facilitating the
description of complex scenarios. It also corrects several bugs.
fsc2705 corrects a bug occurring in the presence of exponential
growth and migration, which was introduced in the released version 2.7.
The previous version 2.6 did not have this bug. People
who have been simulating scenarios with both migration and exponential
growth are encouraged to redo their computations with the new version,
even if the program did not crash.
fsc2709 corrects several bugs (detected by Kyle Lewald):
The likelihood estimated from par files was incorrect in case of population growth.
Problems occurred when simulating arrays of demes using a single migration matrix after some demes were explicitly killed with historical events.
A new version of fsc27 for Linux (fsc27093) was made available for download
on October 13th 2022. It corrects a bug that caused fsc to get stuck
randomly at the beginning of the computations. The problem
was due to the gcc compiler, not to the code. The Windows and Mac versions were
unaffected. Results of runs that did not get stuck are OK.
what's new in fastsimcoal2709 (compared to ver 2.6)
New features
New syntax in the .est files. It is now possible to include previously
defined simple parameters as search range delimiters. The keyword
paramInRange needs to be specified at the end of lines containing such
parameters.
New keyword in .par or .tpl file: absoluteResize. It allows a given sink
population to take a new absolute size, independently of its previous
size. It eliminates the need to compute this resize as a complex
parameter in the .est file.
The [RULES] section has been suppressed from input files. It is simply
not read anymore. These rules have become obsolete given the new syntax
of .est files described above.
SNP data types are not considered anymore, as they led to biased
simulations. Use short segments of DNA and the -sX option to generate
X SNPs instead.
Simulation of large and sparsely occupied structured populations has
been optimized and can be up to 10 times faster than in the previous
version. There is very little gain for simulations with a small number
of migration-connected demes, though.
Simulation of large recombining chromosomes has been optimized when
using large values of the -k option.
Generation of a genotype table (.gen file) as an alternative output to
Arlequin (-G option). The additional -g option allows one to generate
diploid genotypes (coded as 0, 1 or 2) instead of haploid genotypes
(coded as 0 or 1).
Possibility to “kill” demes, so as to make them inaccessible to
migration. Setting a sink deme size to zero (using a sink resize of zero
in a historical event) will now prevent further migration to this deme.
This is useful as one can keep the same migration matrix after the
disappearance of some demes (e.g. due to population fusion backward in
time).
Comments are now possible at the end of any line
of .est files.
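The genotype coding mentioned for the -G and -g options in the list above can be sketched as follows. This is only an illustration of how a 0/1/2 diploid code relates to 0/1 haploid codes: the function name is hypothetical, and the pairing of consecutive haploid genomes into one diploid individual is an assumption of this example, not a description of how fsc27 writes its .gen files.

```python
def haploid_to_diploid(haploid_row):
    """Collapse one SNP's haploid genotypes (0/1) into diploid genotypes (0/1/2).

    Assumption for this sketch: consecutive pairs of haploid genomes
    belong to the same diploid individual, so the diploid genotype is
    the number of alleles coded 1 carried by that pair.
    """
    if len(haploid_row) % 2 != 0:
        raise ValueError("need an even number of haploid genotypes")
    if any(g not in (0, 1) for g in haploid_row):
        raise ValueError("haploid genotypes must be coded 0 or 1")
    return [haploid_row[i] + haploid_row[i + 1]
            for i in range(0, len(haploid_row), 2)]

# One SNP typed in six haploid genomes, i.e. three diploid individuals.
snp = [0, 1, 1, 1, 0, 0]
diploid = haploid_to_diploid(snp)  # [1, 2, 0]
```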
Other changes and bug corrections:
When a deme size goes to zero (e.g. due to negative growth), a warning
is only produced if the deme is occupied (thanks to David Marques for
requesting this change).
Corrected bug when computing the likelihood with ghost populations and
a single sampled deme.
Corrected bug (found by David Marques) with options --noSingleton and
--foldedSFS in the presence of ghost populations (the max est lhood was
larger than the max obs lhood).
Corrected bug occurring when computing the next recombination position
in case of very small recombination rates (thanks to Silvert Martin).
Corrected important bug (thanks to David Marques) occurring when
population growth is introduced at a given point in a population of
initially constant size. The population size was adjusted as if there
had been growth since generation zero.
Corrected bug (thanks to Yu Sugihara) when generating diversity based
on random parameters and using the -Ex option when x > 1.
Corrected bug (thanks to Jason Weir) when simulating scenarios with both
migration and exponential growth. It led to program crashes and
incorrect migration patterns.
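The correction listed above for growth introduced at a given point in a population of initially constant size can be illustrated with a small calculation. The sketch assumes exponential size change of the form N(t) = N0 * exp(r * (t - T)) for times t older than the onset T, with t counted in generations backward in time. The function names and numerical values are hypothetical, and this is not fastsimcoal2's code.

```python
import math

def size_with_growth_onset(n0, rate, onset, t):
    """Deme size at time t when growth only applies beyond the onset.

    The size stays at n0 up to the onset, then changes exponentially
    for the (t - onset) generations during which growth is active.
    """
    if t <= onset:
        return n0
    return n0 * math.exp(rate * (t - onset))

def size_growth_from_zero(n0, rate, t):
    """Size obtained if growth is (wrongly) applied since generation zero."""
    return n0 * math.exp(rate * t)

# Hypothetical scenario: constant size of 10,000 for 100 generations,
# then growth at rate -0.01 per generation (backward in time).
n0, rate, onset = 10_000, -0.01, 100

expected = size_with_growth_onset(n0, rate, onset, 150)  # 50 generations of growth
as_if_from_zero = size_growth_from_zero(n0, rate, 150)   # 150 generations of growth
```

The two values differ by a factor exp(rate * onset), which is why results obtained with growth starting after generation zero were affected.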
See this page for a complete list of changes since the first fastsimcoal release.
benchmarks
Comparisons with other coalescent simulation programs such as ms,
simcoal2 or MaCS can be found here.
getting started
A quick overview of how to get started with fastsimcoal can be found
here (but it is better to read the manual first).
additional scripts for the preparation of input files and the analysis of results
A series of R and bash scripts, described and made available here,
have been developed by several people from our group to facilitate the
analysis of the results of fsc27 as well as the preparation of input files.
running fsc27 on a mac
I have realized (thanks to Melissa Wilson Sayre) that the plain version
of fsc26 and later will not run on Mac OS X unless you have installed a recent
version of gcc.
This is because fsc26 and above are multithreaded and use Intel's libraries
based on OpenMP, which are no longer distributed with recent versions
of Mac OS X.
So to be able to run fsc27 on your Mac, you first need to install a
recent version of gcc.
Extract the tar archive with the command
gunzip gcc-10.2-bin.tar.gz
Install gcc ver 10.2 in /usr/local with the command
sudo tar -xvf gcc-10.2-bin.tar -C /
problems running fsc27 on old versions of linux (kernel too old)
It seems that fsc27 is not able to run on old Linux versions with an old
kernel, potentially because it needs OpenMP libraries that have to be
dynamically linked to the program.
A Google group on fastsimcoal
(https://groups.google.com/forum/#!forum/fastsimcoal) has been created
to promote discussion and allow queries on any aspect of fastsimcoal.
Please use it!
citation
fastsimcoal2: Excoffier, L., Marchi, N., Marques, D. A., Matthey-Doret, R.,
Gouy, A., Sousa, V. C. (2021) fastsimcoal2: demographic inference
under complex evolutionary scenarios. Bioinformatics 37: 4882-4885.
Excoffier, L., Dupanloup, I., Huerta-Sánchez, E., Sousa, V. C., and Foll, M.
(2013) Robust demographic inference from genomic and SNP data. PLOS
Genetics 9(10): e1003905.
fastsimcoal:
Excoffier, L. and Foll, M. (2011) fastsimcoal: a continuous-time
coalescent simulator of genomic diversity under arbitrarily complex
evolutionary scenarios. Bioinformatics 27: 1332-1334.