A program to define the genetic structure of populations by a simulated annealing approach
SAMOVA 1.0 implements an approach to define groups of populations that are geographically homogeneous and maximally differentiated from each other. As a by-product, it also leads to the identification of genetic barriers between these groups. The method is based on a simulated annealing procedure that aims at maximizing the proportion of total genetic variance due to differences between groups of populations (SAMOVA, Spatial Analysis of MOlecular VAriance). The method is described in Dupanloup, Schneider and Excoffier (2002).
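To give a feel for the kind of search involved, here is a minimal, generic simulated annealing sketch in Python. It is not the SAMOVA implementation: the function names `anneal_groups` and `toy_score`, the cooling schedule, the move rule and the data are all assumptions made for illustration. The real program maximizes the proportion of total genetic variance due to differences between groups and takes geography into account, as described in Dupanloup, Schneider and Excoffier (2002); the toy objective below only mimics the "share of variance between groups" idea on made-up numbers.

```python
import math
import random


def anneal_groups(n_pops, k, score, steps=2000, t0=1.0, cooling=0.995, seed=0):
    """Search for an assignment of n_pops populations to k groups with a high score."""
    rng = random.Random(seed)
    # Start from a random assignment in which every group is non-empty.
    current = list(range(k)) + [rng.randrange(k) for _ in range(n_pops - k)]
    rng.shuffle(current)
    current_score = score(current)
    best, best_score = current[:], current_score
    temperature = t0
    for _ in range(steps):
        # Propose moving one randomly chosen population to a random group.
        candidate = current[:]
        candidate[rng.randrange(n_pops)] = rng.randrange(k)
        if len(set(candidate)) == k:  # reject moves that empty a group
            new_score = score(candidate)
            delta = new_score - current_score
            # Always accept improvements; accept a worse state
            # with probability exp(delta / temperature).
            if delta >= 0 or rng.random() < math.exp(delta / temperature):
                current, current_score = candidate, new_score
                if current_score > best_score:
                    best, best_score = current[:], current_score
        temperature *= cooling  # geometric cooling (an assumption)
    return best, best_score


def toy_score(values):
    """Toy objective: share of the total sum of squares explained by group means."""
    grand = sum(values) / len(values)
    total = sum((v - grand) ** 2 for v in values)

    def score(assignment):
        between = 0.0
        for group in set(assignment):
            members = [v for v, g in zip(values, assignment) if g == group]
            mean = sum(members) / len(members)
            between += len(members) * (mean - grand) ** 2
        return between / total

    return score


values = [0.1, 0.2, 0.15, 5.0, 5.1, 4.9]  # made-up data, one number per population
best, best_score = anneal_groups(len(values), 2, toy_score(values))
print(best, round(best_score, 3))
```

Running several independent annealing processes from different seeds and keeping the best result is the usual way to reduce the risk of stopping in a poor local optimum.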
SAMOVA 1.0 runs on Windows. There is no Linux or Mac version yet.
SAMOVA 1.0 needs two input files. The first one (*.geo) must contain the geographic coordinates of the sampling localities of your populations. The second one (*.arp) is an Arlequin input file containing the genetic data sampled in your populations. The Arlequin file must have the SAME NAME as the geographic file, but with the .arp extension. The order of the populations in the two input files MUST BE THE SAME!
The file containing the geographic coordinates of the sampling localities of your populations must have the .geo extension.
Important notice: SAMOVA 1.0 does not work if two sampling localities have the same geographical coordinates.
The geographical input file must be structured as follows. Each line corresponds to one population and must contain five fields separated by tab characters:
- an integer giving the number of the line in the file
- the name of your population within quotes
- the longitude of your sampling point
- the latitude of your sampling point
- an integer (for example, 1).
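Because a malformed .geo file is an easy mistake to make, it can help to check the file before starting a run. The following Python sketch is not part of SAMOVA; the function name `check_geo` and the sample data are hypothetical, and it assumes that the population name is enclosed in double quotes and that the file contains no blank lines between populations. It checks the five tab-separated fields described above and the rule that no two localities may share the same coordinates.

```python
def check_geo(text):
    """Return a list of problems found in the text of a .geo file (empty if none)."""
    problems = []
    seen = {}  # (longitude, latitude) -> first line using these coordinates
    lines = [ln for ln in text.splitlines() if ln.strip()]
    for lineno, line in enumerate(lines, start=1):
        fields = line.split("\t")
        if len(fields) != 5:
            problems.append(
                f"line {lineno}: expected 5 tab-separated fields, found {len(fields)}"
            )
            continue
        index, name, lon, lat, last = (f.strip() for f in fields)
        if index != str(lineno):
            problems.append(f"line {lineno}: first field should be {lineno}, found {index!r}")
        # Assumption: names are wrapped in double quotes.
        if not (len(name) >= 2 and name[0] == name[-1] == '"'):
            problems.append(f"line {lineno}: population name is not within quotes")
        try:
            coords = (float(lon), float(lat))
        except ValueError:
            problems.append(f"line {lineno}: longitude or latitude is not a number")
            continue
        if coords in seen:
            problems.append(f"line {lineno}: same coordinates as line {seen[coords]}")
        else:
            seen[coords] = lineno
        if not last.lstrip("+-").isdigit():
            problems.append(f"line {lineno}: last field is not an integer")
    return problems


# Hypothetical two-population file, for illustration only.
sample = '1\t"PopA"\t6.14\t46.20\t1\n2\t"PopB"\t7.45\t46.95\t1\n'
print(check_geo(sample))  # → []
```

To check a real file, read it first, for example with `check_geo(open("inputdata.geo").read())`.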
Examples of input files are given below:
When the SAMOVA window disappears from your screen, the computations are finished. The run time depends on the number of populations you have and on the number of simulated annealing processes you wish to perform.
The program needs the following information to run:
- the name of the input files (for example: inputdata; in this case, the directory containing the program MUST also contain the 2 input files used by SAMOVA, and these files MUST be called inputdata.geo and inputdata.arp)
- the number K of groups of populations you wish to define (the final structure defined by SAMOVA will contain K groups)
- the number of simulated annealing processes you wish to perform (100 seems a good choice)
- the type of molecular distance between haplotypes you want to compute (SAMOVA, like AMOVA, is based on a matrix of distances between the haplotypes observed in the whole set of samples). You can choose between pairwise differences between haplotypes (for DNA data) and the sum of squared size differences between haplotypes (for microsatellite data).
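The two distance options listed above can be illustrated with a short Python sketch. This is not SAMOVA or Arlequin code: the function names and the example haplotypes are hypothetical, haplotypes are assumed to have equal length, and the handling of missing data is not covered. A DNA haplotype is represented as a string of nucleotides, and a microsatellite haplotype as a list of allele sizes (repeat counts), one per locus.

```python
def pairwise_differences(hap_a, hap_b):
    """Number of positions at which two equal-length haplotypes differ."""
    if len(hap_a) != len(hap_b):
        raise ValueError("haplotypes must have the same length")
    return sum(1 for a, b in zip(hap_a, hap_b) if a != b)


def sum_squared_size_differences(hap_a, hap_b):
    """Sum over loci of the squared difference in allele size."""
    if len(hap_a) != len(hap_b):
        raise ValueError("haplotypes must have the same number of loci")
    return sum((a - b) ** 2 for a, b in zip(hap_a, hap_b))


def distance_matrix(haplotypes, distance):
    """Square matrix of distances between all pairs of haplotypes."""
    return [[distance(x, y) for y in haplotypes] for x in haplotypes]


# Made-up haplotypes, for illustration only.
print(pairwise_differences("ACGTAC", "ACGTTC"))                   # → 1
print(sum_squared_size_differences([12, 8, 15], [10, 8, 18]))     # → 13
print(distance_matrix(["AA", "AT", "TT"], pairwise_differences))  # → [[0, 1, 2], [1, 0, 1], [2, 1, 0]]
```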
SAMOVA creates the following output files:
- SAMOVA_results_arlequin.txt: the genetic structure defined by SAMOVA as well as the fixation indices corresponding to this group structure and their significance level evaluated by 1,000 permutations of populations among groups.
- SAMOVA.log: this file contains all the steps performed by SAMOVA 1.0 and, in case of problems, the location of the problems.
- SAMOVA_finalstructure.arp: an Arlequin project file created by appending the genetic structure defined by SAMOVA to the input Arlequin project file.
- SAMOVA_results.ps: this file (EPS) can be read with GSview for Windows; it contains a map of the sampling points and the barriers between the groups of populations defined by SAMOVA.
- Arlequin.log: this file is generated during the computation of the fixation indices corresponding to the genetic structure defined by SAMOVA. It contains all the run-time WARNINGS and ERRORS encountered during these computations.
- SAMOVA 1.0 was developed on Windows XP and may encounter problems when running on later versions of Windows. If this happens, run SAMOVA 1.0 in compatibility mode for Windows XP. To do so:
  - right-click on the SAMOVA 1.0 icon
  - click Properties
  - in the Properties dialog box, click the Compatibility tab
  - select "Run this program in compatibility mode for" and choose Windows XP from the list
- Tab characters MUST be used as the separator in .geo files. If you use spaces instead, SAMOVA 1.0 will not run properly.
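If a .geo file was typed with spaces, it can be repaired mechanically. The following Python sketch is only a suggestion and is not part of SAMOVA; the function name `to_tab_separated` and the example line are hypothetical. It assumes the five-field layout described earlier, with the population name in double quotes (which lets names containing spaces survive the conversion).

```python
import re

# index, "quoted name", longitude, latitude, integer, separated by any whitespace
_GEO_LINE = re.compile(r'^\s*(\S+)\s+("[^"]*")\s+(\S+)\s+(\S+)\s+(\S+)\s*$')


def to_tab_separated(line):
    """Rewrite one whitespace-separated .geo line with single tab separators."""
    match = _GEO_LINE.match(line)
    if match is None:
        raise ValueError(f"cannot parse .geo line: {line!r}")
    return "\t".join(match.groups())


# Made-up line typed with spaces; the quoted name itself contains a space.
fixed = to_tab_separated('1 "North Site"  6.14 46.20 1')
print(repr(fixed))  # → '1\t"North Site"\t6.14\t46.20\t1'
```

Lines that are already tab-separated pass through unchanged, so the function can be applied to every line of a file.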
- Dupanloup, I., Schneider, S., Excoffier, L. (2002) A simulated annealing approach to define the genetic structure of populations. Molecular Ecology 11(12): 2571-2581.
- Excoffier, L., Smouse, P., Quattro, J.M. (1992) Analysis of molecular variance inferred from metric distances among DNA haplotypes: application to human mitochondrial DNA restriction data. Genetics 131: 479-491.
- Excoffier, L., Lischer, H.E.L. (2010) Arlequin suite ver 3.5: A new series of programs to perform population genetics analyses under Linux and Windows. Molecular Ecology Resources 10: 564-567.
Isabelle Dupanloup, CMPG, Institute of Ecology and Evolution, University of Bern