© 2018 by Sonia Huelsenbeck.

Statistical Phylogenetics

Darwin founded the field of evolutionary biology on the concept that all organisms are related to one another through an unknown evolutionary tree. This tree is called a "phylogeny," and there is probably more interest in reconstructing phylogenies today than at any time in the past. The field has become quite sophisticated. A beginner is confronted with a plethora of terms (MCMC, JC69, Bayesian inference, birth-death processes, GTR, Gamma-distributed rate variation, and maximum likelihood, to name just a few) along with numerous computer programs for estimating trees from DNA or other types of data. How does the novice navigate this complicated field?

This workshop will introduce you to the theory and practice of statistical phylogenetics. You will be taught by faculty from world-class universities, all of whom are experts in the methods they teach.

Workshop Details

The course will be taught from March 10, 2020 to March 20, 2020, Sunday excluded. Expect to be in class for both the morning and afternoon sessions each day. Don't worry, however, because there will be numerous opportunities to socialize with other students and course faculty. We plan to have an end-of-course potluck party, along with another celebration at a tapas bar midway through the course.

The tuition for the course is deliberately low, at only 400 Euros. Tuition does not include lodging. You will be required to find a place to stay if you are from outside of Madrid.

What makes this workshop different from the other workshops you could attend, aside from the absurdly low tuition? The faculty in this workshop regularly lecture at such prestigious workshops as the Workshop on Molecular Evolution (Woods Hole, Massachusetts), EMBO (Hinxton, England, and Heraklion, Crete), and Bodega Bay (California). MadPhylo differs from these excellent workshops in several important respects. First, we coordinate our lectures tightly, even to the point of using the same styles in our presentations. Other workshops make concessions to the schedules of the faculty, so lectures are often given in an order that doesn't make pedagogical sense. Second, we tightly integrate our theory lectures with hands-on practicals using the program RevBayes. You will learn the theory and how to apply it on the same day. Third, and finally, the course will be held at the Royal Botanical Garden in Madrid, one of the best cities in the world. Not only will you learn how to analyze your data, you will learn how to do so in one of the most magnificent places in the world!

Course Syllabus

In the schedule below, L denotes a lecture and P denotes a hands-on practical.

Part 1: The basics of Bayesian estimation of phylogeny using RevBayes

Objectives: The student should understand how to interpret phylogenetic trees, including the combinatorics of trees, how trees can be rooted, and the meaning of branch lengths. The student will learn the basics of probability (joint, marginal, and conditional probabilities and Bayes’s theorem). The student will learn how maximum likelihood and Bayesian inference work and understand the differences between the two. The student should be able to explain the likelihood function and its role in estimation. The student will learn how to choose among models using likelihood ratios and Bayes factors. The student will understand the phylogenetic model and how to calculate likelihoods under such a model. The student will learn the rudiments of Markov chain Monte Carlo. Finally, the student will learn the rudiments of RevBayes and how to perform a simple phylogenetic analysis using the program.

Day 1 (Monday, June 10, 2019)

Morning: L, Background, terms, concepts; P, Terminal, bash, installing and using RevBayes

Afternoon: L, Probability, estimation and MCMC; P, MCMC

Day 2 (Tuesday, June 11, 2019)

Morning: L, The phylogenetic model (continuous-time Markov models and likelihood calculation)

Afternoon: P, Simulating DNA sequence evolution on a tree, using a die; L/P, Playing with simulated data

Day 3 (Wednesday, June 12, 2019)

Morning: L, Model selection; P, Model selection

Afternoon: L, MCMC Diagnostics; P, MCMC Diagnostics; Student consultations

Day 4 (Thursday, June 13, 2019)

Morning: P, Perform a Bayesian MCMC analysis of your own choosing!

Afternoon: No lectures

Part 2: Comparative phylogenetic analysis

Objectives: The student will learn how to investigate three major classes of problems using RevBayes: (1) biogeography, (2) diversification models, and (3) continuous-trait models.

Day 5 (Friday, June 14, 2019)

Morning: Discrete and continuous characters

Afternoon: Discrete and continuous characters

Day 6 (Saturday, June 15, 2019)

Morning: L, Biogeography; P, Biogeography

Afternoon: L, Biogeography; P, Biogeography

Day 7 (Sunday, June 16, 2019)

No lectures

Day 8 (Monday, June 17, 2019)

Morning: L, Diversification Models; P, Diversification Models

Afternoon: L, Diversification Models; P, Diversification Models, simple and EBD

Part 3: Addressing questions in molecular evolution in a phylogenetic context

Objectives: The student will extend his or her basic knowledge of Bayesian phylogenetics to (1) understand how codon models can be used to detect the footprint of natural selection, (2) date speciation events on a tree, and (3) understand how lineage sorting complicates phylogenetic analysis of multiple genes. The student will learn about the Dirichlet process prior model and how it has been used in molecular phylogenetics.

Day 9 (Tuesday, June 18, 2019)

Morning: Divergence time dating, all day, all the time!

Afternoon: Divergence time dating, all day, all the time!

Day 10 (Wednesday, June 19, 2019)

Morning: P, More divergence time dating!; L, Codon models and the CAT model

Afternoon: L, Gene-tree/Species-trees; P, Species tree estimation using RevBayes

Evening: Course party!

Lecture Slides