Improved statistical approaches for the analysis of biodiversity using genetic and spatial data
Information
- Funding country: France
- Acronym: GenoSpace
- URL: -
- Start date: 10/3/2016
- End date: -
- Budget: 136,163 EUR
Funding
| Name | Role | Start | End | Amount |
|---|---|---|---|---|
| AAPG - Generic call for proposals [Appel à projets générique] 2016 | Grant | 10/3/2016 | - | 136,164 EUR |
Abstract
It is stating the obvious to say that we live on a planet made up of continuous landscapes. Yet there are genuine barriers to developing sound statistical models that accommodate continuous spatial information and genetic data in a satisfactory way. In fact, the dominant spatial models in population genetics rely on the crude assumption that populations are divided into discrete demes. Other approaches make predictions about the spatial distribution of individuals that are generally not supported by biological evidence. Current limitations in the available models and inference techniques hamper our understanding of biodiversity in space and time. They are thus the main focus of our project. Recent advances in theoretical population genetics have produced a new model, the spatial Lambda-Fleming-Viot model, that alleviates the limitations of current methods. This model treats the habitat as a truly continuous area and allows for a stationary distribution of individuals in time and space. A straightforward probabilistic description of the ancestral locations and genealogical relationships between sampled individuals is also available, thereby defining a simple way to calculate the likelihood of this model (the probability of the data given the model parameters). Yet this likelihood involves many latent variables, i.e., parameters that are not of primary biological interest but must be accounted for in order to evaluate the likelihood. It is therefore not clear whether the spatial Lambda-Fleming-Viot model is amenable to parameter inference. We have implemented and tested a prototype of a Bayesian sampler that estimates the posterior distribution of the model's parameters from geo-referenced genetic data.
Preliminary results indicate that, when combined with state-of-the-art statistical inference techniques, this new model provides accurate estimates of population density and dispersal range, two parameters that cannot be estimated separately with most traditional approaches. These promising results suggest that the spatial Lambda-Fleming-Viot model can serve as a sound basis for tackling important biological questions. In particular, in this project we will assess the impact of non-homogeneous landscapes on the migration of individuals. We will also investigate how to detect variation in population density across space and over the course of evolution. Alongside these extensions of the original model, mathematical simplifications of the likelihood function will be examined. We have in fact identified analytical "shortcuts" that should considerably simplify, and therefore speed up, the calculations. The extended and improved models and inference techniques developed in this project will be applied to the analysis of large population genomics datasets from two flagship species of considerable economic importance: the harlequin ladybird and the spotted wing drosophila. We will quantify levels of gene flow and population densities throughout their respective habitats, thereby gaining insight into the biology of these organisms. Software applications implementing the most relevant approaches developed in this project will be produced. These applications will be thoroughly tested through extensive simulations and then made available to a wide scientific audience.