Some Experiments with the Coalescent - Raazesh Sainudiin

First, I will focus on the main ideas in "Experiments with the Site Frequency Spectrum", which appeared in
The Inaugural Issue in Algebraic Biology, Bulletin of Mathematical Biology, 73.4, 829-872 (2011) (jointly with eight co-authors), with the following abstract:
"Evaluating the likelihood function of parameters in highly-structured population genetic models from extant deoxyribonucleic
acid (DNA) sequences is computationally prohibitive. In such cases, one may approximately infer the parameters from summary
statistics of the data such as the site-frequency-spectrum (SFS) or its linear combinations. Such methods are known as approximate
likelihood or Bayesian computations. Using a controlled lumped Markov chain and computational commutative algebraic methods,
we compute the exact likelihood of the SFS and many classical linear combinations of it at a non-recombining locus that is neutrally
evolving under the infinitely-many-sites mutation model. Using a partially ordered graph of coalescent experiments around the SFS,
we provide a decision-theoretic framework for approximate sufficiency. We also extend a family of classical hypothesis tests of
standard neutrality at a non-recombining locus based on the SFS to a more powerful version that conditions on the topological
information provided by the SFS."

Then, I will present some generalisations of these ideas (jointly with Tanja Stadler and Amandine Veber) in
"Finding the best resolution for the Kingman-Tajima coalescent: theory and applications",
Journal of Mathematical Biology, 70, 1207-1247 (2015), and in
"Full likelihood inference from the site frequency spectrum based on the optimal tree resolution",
Theoretical Population Biology, 124, 1-15 (2018), which seem to be having some impact today
in theoretical and applied population genomics (including machine-learning and "AI" approaches to predictive tasks).

In the last few minutes, I will outline some mathematical challenges in combinatorial stochastic control processes for statistical inference
over the same probability space for empirical genome-wide phenomena based on work done jointly with Bhalchandra Thatte and Amandine
Veber (and current work with Berzunza-Ojeda) on population pedigree processes with coalescence, recombination and breeding behaviour in
"Ancestries of a Recombining Diploid Population",
Journal of Mathematical Biology, 72.1, 363-408 (2016).

Paul Thévenin, Probability Theory and Combinatorics

12 May 2022



