Instructor: Brendan Barrett (bbarrett@ab.mpg.de)
Lectures: Uploaded and pre-recorded, 1-2 per week
Discussion and Lab: In person (Bücklestrasse 5, 5th floor common area) and online (Zoom, once weekly on Mondays at 13:30)
This course teaches data analysis, but it focuses on scientific models. The unfortunate truth about data is that nothing much can be done with it, until we say what caused it. We will prioritize conceptual, causal models and precise questions about those models. We will use Bayesian data analysis to connect scientific models to evidence. And we will learn powerful computational tools for coping with high-dimension, imperfect data of the kind that biologists and social scientists face.
Online, flipped instruction. You will watch the author's pre-recorded online lectures. We'll meet in person and online once a week for 1.5 hours to discuss the material and work on exercises together.
We'll use the 2nd edition of <Statistical Rethinking>. If you require a PDF of the book and are enrolled in the course, reach out to me.
Registration: limited to 30 people.
There are 16[17] weeks of instruction. Links to lecture recordings will appear in this table. Weekly problem sets are assigned on Mondays and due the next Monday, when we discuss the solutions in the weekly online meeting. The first, unofficial meeting is an introduction, and no problem set will be assigned. During that meeting I will be available to help troubleshoot any issues you are having with installing cmdstanr or rethinking.
Full lecture playlist: <Statistical Rethinking 2023 Playlist>
Note about slides: In some browsers, the slides don't show correctly. If points are missing from plots, download the slides PDF instead of viewing in browser.
| Week ## | Meeting date | Reading | Lectures |
|---|---|---|---|
| Week 00 | 26 May | Chapter 1 | [1] <Science Before Statistics> <Slides> |
| Week 01 | 02 June | Chapters 2,3 | [2] <Garden of Forking Data> <Slides> |
| Week 02 | 16 June | Chapter 4 | [3] <Geocentric Models> <Slides> [4] <Categories and Curves> <Slides> |
| Week 03 | 30 June | Chapter 5 | [5] <Elemental Confounds> <Slides> |
| Week 04 | 07 July | Chapter 6 | [6] <Good and Bad Controls> <Slides> |
| Week 05 | 29 July TUESDAY | Chapters 7,8 | [7] <Overfitting> <Slides> |
| Week 06 | 04 Aug | Chapter 9 | [8] <MCMC> <Slides> |
| Week 07 | 25 Aug | Chapter 10 | [9] <Modeling Events> <Slides> |
| Week 08 | 08 Sept | Chapter 11 | [10] <Counts and Confounds> <Slides> [11] <Ordered Categories> <Slides> |
| Week 09 | 15 Sept | Chapter 12 | [12] <Multilevel Models> <Slides> |
| Week 10 | 22 Sept | Chapter 13 | [13] <Multilevel Adventures> <Slides> |
| Week 11 | 29 Sept | Chapter 13 | [14] <Correlated Features> <Slides> |
| Week 12 | 06 Oct | Chapter 14 | [15] <Social Networks> <Slides> |
| Week 13 | 13 Oct | Chapter 14 | [16] <Gaussian Processes> <Slides> |
| Week 14 | 27 Oct | Chapter 15 | [17] <Measurement> <Slides> [18] <Missing Data> <Slides> |
| Week 15 | 17 Nov | Chapter 16 | [19] <Generalized Linear Madness> <Slides> [20] <Horoscopes> <Slides> |
| Week 16 | 03 Nov | Chapter 17 and Presentations | [20] <Horoscopes> <Slides> |
This course involves a lot of scripting. Students can engage with the material using either the original R code examples or one of several conversions to other computing environments. The conversions are not always exact, but they are rather complete. Each option is listed below. I will teach in R. Feel free to try one of the other language conversions below, but I will be of limited help with coding issues in those languages.
For those who want to use the original R code examples in the print book, you need to install the rethinking R package. The code is all on GitHub (https://github.com/rmcelreath/rethinking/), and there are additional details about the package there, including information about using the more up-to-date cmdstanr instead of rstan as the underlying MCMC engine.
The <Tidyverse/brms> conversion is very high quality and complete through Chapter 14.
The <Python/PyMC3> conversion is quite complete.
The <Julia/Turing> conversion is not as complete, but is growing fast and presents the Rethinking examples in multiple Julia engines, including the great <TuringLang>.
There are several other conversions. See the full list at https://xcelab.net/rm/statistical-rethinking/.
I will also post problem sets and solutions. Check the folders at the top of the repository. After our weekly discussion about the lectures and book chapters, we will work through problem sets together and go over answers at the end of the following week. Some will be from the book, and some I will make up using real data pulled from my work, my colleagues' work, or open-access repositories associated with published papers. Preparing homework as R Markdown (.Rmd) or Quarto (.qmd) files will be useful to you and your classmates when sharing results and thinking.
If you are a student taking the course for credit, you must complete one evaluated assignment.
- Use a structural causal model (SCM) or directed acyclic graph (DAG) to help formulate the models you wish to analyze in one of your dissertation chapters or projects.
- Simulate data based on your DAG or SCM, and recover the parameters of interest to show that you can answer the question you propose with a realistic sampling regime/design.
- AND/OR run an analysis of data you have collected, using the skills you learned in class.
The goal is to generate a product that will be useful for you or make your science better.
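The simulate-and-recover step of the assignment can be sketched roughly as follows. This is a minimal illustration, not the course's method: it assumes a hypothetical three-variable DAG (Z -> X, Z -> Y, X -> Y) with made-up effect sizes, and it uses plain least squares in standard-library Python rather than the Bayesian models and R tooling used in class. The names `simulate`, `slope`, `residuals`, and `TRUE_EFFECT` are all invented for this example.

```python
import random


def slope(x, y):
    """Least-squares slope from regressing y on a single predictor x."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def residuals(x, y):
    """Residuals of y after a least-squares regression on x."""
    b = slope(x, y)
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    return [yi - (my + b * (xi - mx)) for xi, yi in zip(x, y)]


def simulate(n, b_xy, b_zx, b_zy, seed=1):
    """Draw n observations from the assumed DAG: Z -> X, Z -> Y, X -> Y."""
    rng = random.Random(seed)
    z = [rng.gauss(0, 1) for _ in range(n)]
    x = [b_zx * zi + rng.gauss(0, 1) for zi in z]
    y = [b_xy * xi + b_zy * zi + rng.gauss(0, 1) for xi, zi in zip(x, z)]
    return z, x, y


# Hypothetical "true" causal effect of X on Y that we try to recover.
TRUE_EFFECT = 0.5
z, x, y = simulate(n=2000, b_xy=TRUE_EFFECT, b_zx=1.0, b_zy=1.0)

# Naive estimate ignores Z, so the backdoor path X <- Z -> Y stays open.
naive = slope(x, y)

# Adjusted estimate partials Z out of both X and Y, which is equivalent to
# the coefficient on X in a least-squares regression of Y on X and Z.
adjusted = slope(residuals(z, x), residuals(z, y))

print(f"true effect:       {TRUE_EFFECT}")
print(f"naive estimate:    {naive:.3f}")
print(f"adjusted estimate: {adjusted:.3f}")
```

With these assumed effect sizes, the naive estimate should land well above the true effect, while the adjusted estimate should fall close to it. Changing `n` is a cheap way to see whether a planned sample size is large enough to recover the effect you care about.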