Geneclust is an R package (interfaced with Fortran) that implements a Bayesian clustering algorithm for spatial population genetics. It simulates and detects population structure, without assuming predefined populations, from individual multilocus genetic data sampled at distinct geographical locations. I implemented this R package during my PhD in collaboration with my co-supervisor Gilles Guillot.
Although the concept of population refers here to genetic structure only, it is often realistic to assume that populations are spatially organized. To this end, Geneclust implements a spatially explicit Bayesian hierarchical model based on a mixture of sub-populations characterized by continuous spatial variations of their allele frequencies. To account for the spatial "continuity" of the multilocus genetic data, the model uses a Hidden Markov Random Field for discrete random variables as the prior distribution on cluster memberships. In addition to this spatial prior, a second distinctive feature of Geneclust is that it can account for departures, caused by inbreeding, from the standard hypothesis that sub-populations are at (or close to) Hardy-Weinberg equilibrium; inbreeding implies an excess of homozygous genotypes compared with a panmictic population.
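As a rough illustration of the two ingredients described above, the following R sketch evaluates an unnormalized Potts-type log-prior on cluster labels and the genotype probabilities at a biallelic locus under an inbreeding coefficient. The function names (potts_log_prior, genotype_probs), the parameter names (psi, f) and the toy graph are illustrative assumptions and are not part of the Geneclust API. The inbreeding formula is the standard textbook one, and the exact parameterization used inside Geneclust may differ.

```r
# Unnormalized log-prior of a Potts model: psi times the number of
# neighbouring pairs (edges) whose two individuals share the same label.
potts_log_prior <- function(labels, edges, psi) {
  same <- labels[edges[, 1]] == labels[edges[, 2]]
  psi * sum(same)
}

# Genotype probabilities at a biallelic locus with allele frequency p
# and inbreeding coefficient f (f = 0 gives Hardy-Weinberg proportions).
genotype_probs <- function(p, f) {
  q <- 1 - p
  c(AA = p^2 + f * p * q,
    Aa = 2 * p * q * (1 - f),
    aa = q^2 + f * p * q)
}

# Toy neighbourhood graph on 4 individuals (hypothetical data).
edges  <- rbind(c(1, 2), c(2, 3), c(3, 4), c(1, 3))
labels <- c(1, 1, 2, 2)

potts_log_prior(labels, edges, psi = 0.5)  # edges (1,2) and (3,4) are concordant
genotype_probs(p = 0.3, f = 0)             # Hardy-Weinberg proportions
genotype_probs(p = 0.3, f = 0.2)           # homozygotes in excess
```

A larger psi makes configurations in which neighbours share a label more probable, which is how the spatial interaction parameter encodes the degree of spatial organization.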
Given individual geographical locations, the program builds a network structure (based on a Delaunay graph) that describes the prior spatial relationships between individuals. Then, given individual multilocus genotypes, Geneclust:
1- returns graphical displays of the geographical cluster assignments of individuals and of the posterior membership probabilities
2- provides an estimator of the spatial interaction parameter (quantifying the degree of spatial organization of the sub-populations)
3- provides an estimator of the inbreeding coefficient associated with each sub-population (i.e., the probability that two homologous genes are identical by descent)
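The construction of a Delaunay-based neighbourhood network from sampling locations can be sketched with the deldir package, which Geneclust depends on. This is only an illustration of the idea, not Geneclust's internal code: the coordinates are hypothetical, and the variable names (tri, edges, neighbours) are assumptions made for this example.

```r
library(deldir)  # add-on package listed among the Geneclust dependencies

# Hypothetical sampling locations of five individuals.
x <- c(0.0, 1.0, 0.1, 1.2, 0.5)
y <- c(0.0, 0.1, 1.0, 1.1, 0.6)

# Delaunay triangulation of the locations.
tri <- deldir(x, y)

# Each row of delsgs is one Delaunay edge; ind1 and ind2 index the
# two individuals that the edge connects.
edges <- as.matrix(tri$delsgs[, c("ind1", "ind2")])

# Neighbour list: for each individual, those it shares an edge with.
neighbours <- lapply(seq_along(x), function(i) {
  sort(c(edges[edges[, 1] == i, 2], edges[edges[, 2] == i, 1]))
})
```

Two individuals are treated as neighbours when a Delaunay edge joins them, so the graph adapts to irregular sampling designs without requiring a distance threshold.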
To install and run Geneclust
Step 1: Download and Install R
Under Windows:
1- Click here
2- Download the executable R-x.x.x-win32.exe
3- Launch this executable
Step 2: Install Geneclust
1- Launch R
2- Type install.packages("Geneclust")
3- Follow the instructions
Geneclust depends on the add-on packages deldir, fields and spatial. These must also be installed, as they do not belong to the R base distribution. All packages can be installed at once from the command line by typing: install.packages(c("Geneclust","deldir","fields","spatial"))
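To install only what is missing, a small helper along the following lines can be used. The function name missing_packages is a hypothetical name chosen for this sketch; it relies only on base R's requireNamespace and install.packages.

```r
# Return the subset of `pkgs` that cannot be loaded from the current
# R library (requireNamespace loads a namespace without attaching it).
missing_packages <- function(pkgs) {
  available <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!available]
}

needed <- c("Geneclust", "deldir", "fields", "spatial")
to_install <- missing_packages(needed)

# Uncomment to install whatever is missing:
# if (length(to_install) > 0) install.packages(to_install)
```

The install call is left commented out so that the check can be run without modifying the local library.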
To download the package source, the Mac OS X binary or the Windows binary of Geneclust, click here
1- Geneclust manual [pdf]
2- On the model (and sub-models) implemented in Geneclust
* Ancelet S., François O., Guillot G. (2007). Hidden Markov Random Fields and the Genetic Structure of the Scandinavian Brown Bear Population. Journal de la SFdS [preprint]
* François O., Ancelet S., Guillot G. (2006). Bayesian Clustering using Hidden Markov Random Fields in Spatial Population Genetics. Genetics 174: 805-816 [preprint]
3- On the implementation of MCMC inference on the spatial interaction parameter of a Potts-Dirichlet model
* Green P., Richardson S. (2002). Hidden Markov models and disease mapping. Journal of the American Statistical Association 97(460): 1055-1070 [pdf]