Overview



Geneclust is an R package (interfaced with Fortran) that implements a Bayesian clustering algorithm for spatial population genetics. It simulates and detects population structure (without assuming predefined populations) from individual multilocus genetic data sampled at distinct geographical locations. I implemented this R package during my PhD in collaboration with my co-supervisor Gilles Guillot.

Although the concept of population refers here to genetic structure only, it is often realistic to assume that populations are spatially organized. To this end, Geneclust implements a spatially explicit Bayesian hierarchical model based on a mixture of sub-populations characterized by continuous spatial variations of their allele frequencies. To account for the spatial "continuity" of the multilocus genetic data, the model uses a Hidden Markov Random Field for discrete random variables as the prior distribution on cluster memberships. In addition to this spatial prior, a second distinctive feature of Geneclust is that it can account for departures from the standard hypothesis that sub-populations are at (or close to) Hardy-Weinberg equilibrium when these departures are caused by inbreeding (which implies an excess of homozygous genotypes compared with panmixia).
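
For readers who prefer symbols, the two ingredients described above can be sketched in generic notation. The symbols c_i, psi, f_{ka} and F_k are labels chosen here for illustration only; the exact parameterization used by Geneclust is the one given in the reference papers listed at the bottom of this page.

```latex
% Potts-type hidden Markov random field prior on the cluster labels
% c = (c_1, ..., c_n); "i ~ j" means that individuals i and j are
% neighbours in the spatial graph, and psi is the interaction parameter.
\pi(c \mid \psi) = \frac{1}{Z(\psi)}
  \exp\Big( \psi \sum_{i \sim j} \mathbf{1}\{c_i = c_j\} \Big)

% Genotype probabilities at one locus in sub-population k, with allele
% frequencies f_{ka} and inbreeding coefficient F_k.
P(aa) = f_{ka}^2 + F_k \, f_{ka} (1 - f_{ka})
P(ab) = 2 \, f_{ka} f_{kb} \, (1 - F_k), \qquad a \neq b
```

With psi = 0 the prior carries no spatial dependence, and larger values of psi favour neighbouring individuals sharing a cluster. With F_k = 0 the genotype probabilities reduce to Hardy-Weinberg proportions, and F_k > 0 shifts probability from heterozygous to homozygous genotypes.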

Given individual geographical locations, the program builds a network structure (based on a Delaunay graph) that describes the prior spatial relationships between individuals. Then, given individual multilocus genotypes, Geneclust:
   1-  returns graphical displays of the geographical cluster assignments of individuals and of their posterior membership probabilities
   2-  provides an estimator of the spatial interaction parameter (quantifying the degree of spatial organization of the sub-populations)
   3-  provides an estimator of the inbreeding coefficient associated with each sub-population (i.e., the probability that the two homologous genes of an individual are identical by descent)
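
To make the Delaunay neighbourhood structure concrete, here is a minimal sketch that uses the deldir package (one of the Geneclust dependencies) directly. It is not Geneclust's own code: the coordinates and cluster labels are simulated, and the variable names are chosen here for illustration.

```r
library(deldir)  # listed on this page as a Geneclust dependency

set.seed(1)
n <- 20
# Made-up sampling locations in the unit square (assumption for illustration).
coords <- data.frame(x = runif(n), y = runif(n))

# Delaunay triangulation. The delsgs component has one row per edge;
# columns ind1 and ind2 give the indices of the two points joined.
tri <- deldir(coords$x, coords$y)
edges <- tri$delsgs[, c("ind1", "ind2")]

# Hypothetical cluster labels, used only to illustrate the neighbour statistic.
labels <- sample(1:2, n, replace = TRUE)

# Number of neighbouring pairs that fall in the same cluster.
same_cluster_pairs <- sum(labels[edges$ind1] == labels[edges$ind2])

cat("Delaunay edges:", nrow(edges), "\n")
cat("Same-cluster neighbour pairs:", same_cluster_pairs, "\n")
```

The count of same-cluster neighbour pairs is the kind of quantity that a spatial interaction parameter weights in a Potts-type prior: the more spatially organized the clusters, the larger this count relative to the total number of edges.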



To install and run Geneclust

 


    Step 1:  Download and Install R

 

    Under Windows:
             1- Click here
             2- Download the executable R-x.x.x-win32.exe
             3- Launch this executable


    Step 2: Install Geneclust

 

             1- Launch R              
             2- Type install.packages("Geneclust")
 
             3- Follow the instructions.

 

 

Geneclust depends on the add-on packages deldir, fields and spatial. They must also be installed, as they do not belong to the R base distribution. All packages can be installed at the same time from the R command line by typing:
   install.packages(c("Geneclust","deldir","fields","spatial"))
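
If some of these packages may already be present, a small helper can list only those that still need installing. This is a sketch: missing_packages is a hypothetical helper name introduced here, not part of R or Geneclust. It relies only on the base R function installed.packages.

```r
# Hypothetical helper: returns the names in 'pkgs' that are not yet installed.
missing_packages <- function(pkgs) {
  setdiff(pkgs, rownames(installed.packages()))
}

required <- c("Geneclust", "deldir", "fields", "spatial")
to_install <- missing_packages(required)

if (length(to_install) > 0) {
  message("Packages still to install: ", paste(to_install, collapse = ", "))
} else {
  message("All required packages are already installed.")
}
```

The missing packages can then be installed with install.packages(to_install).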

 


To download the Geneclust package source, Mac OS X binary, or Windows binary, click here



 

Reference papers

 

 

1-  Geneclust manual [pdf]

2-  On the model (and sub-models) implemented in Geneclust

      * Ancelet S., François O., Guillot G. (2007) Hidden Markov Random Fields and
         the Genetic Structure of the Scandinavian Brown Bear Population.
         Journal de la SFdS [preprint]

      * François O., Ancelet S., Guillot G. (2006) Bayesian Clustering using Hidden
         Markov Random Fields in Spatial Population Genetics.
         Genetics 174: 805-816 [preprint]

 



3-  On the implementation of MCMC inference on the spatial interaction parameter of a Potts-Dirichlet model

      * Green P., Richardson S. (2002) Hidden Markov Models and
         Disease Mapping.
         Journal of the American Statistical Association 97(460): 1055-1070 [pdf]

   
Last Updated on Wednesday, 30 September 2009 13:38