A [maybe] better way to learn Gaussian Processes
A note on the approach and resources for learning Gaussian Processes. This is what my younger self would have loved to know 6 months ago.
- 1. Pre-knowledge
- 2. Gaussian Process [Once Gaussian always Gaussian]
- Small notes
1. Pre-knowledge
Before moving to Gaussian Processes, a Bayesian non-parametric method, one should be familiar with parametric Bayesian models.
- Firstly, I would start with Richard McElreath's Statistical Rethinking: watch his lectures on YouTube, read the book, and do the exercises. The homework solutions coded in PyMC are here, thanks to Gabriel B.C. I prefer Python and PyMC, so I use the PyMC implementation of the book.
- Secondly, Bayesian Analysis with Python (second edition) by Osvaldo Martin is a really good book for learning Bayesian data analysis with PyMC.
- Thirdly, Probabilistic Programming and Bayesian Methods for Hackers: An introduction to Bayesian methods and probabilistic programming. This one really helps in seeing how Bayesian methods are used in different applications.
I also found the PyMCon 2020 talk My Journey in Learning and Relearning Bayesian Statistics by Ali Akbar Septiandri really helpful.
2. Gaussian Process [Once Gaussian always Gaussian]
The lecture videos and notes from the Machine Learning for Intelligent Systems course at Cornell University are a great introduction to common kernels such as the Linear, Polynomial, Radial Basis Function (RBF, aka Gaussian) and Exponential kernels.
Note that not every function K(⋅,⋅) → R can be used as a kernel. The matrix K(xi,xj) has to correspond to real inner products after some transformation x→ϕ(x), which is the case if and only if K is positive semi-definite.
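The positive semi-definiteness requirement can be checked numerically on a sample of points. Below is a minimal sketch with NumPy, not taken from the Cornell notes: the kernel parameterisations, the helper names (`squared_distances`, `is_psd`, `not_a_kernel`) and the tolerance are my own assumptions for illustration. It builds the Gram matrix for each kernel and looks at the smallest eigenvalue.

```python
import numpy as np


def squared_distances(X, Z):
    # Pairwise squared Euclidean distances ||x - z||^2, clipped at 0 against round-off.
    d2 = (X**2).sum(axis=1)[:, None] + (Z**2).sum(axis=1)[None, :] - 2.0 * X @ Z.T
    return np.maximum(d2, 0.0)


def linear_kernel(X, Z):
    # K(x, z) = x^T z
    return X @ Z.T


def polynomial_kernel(X, Z, degree=3):
    # K(x, z) = (1 + x^T z)^degree
    return (1.0 + X @ Z.T) ** degree


def rbf_kernel(X, Z, lengthscale=1.0):
    # K(x, z) = exp(-||x - z||^2 / (2 * lengthscale^2))
    return np.exp(-squared_distances(X, Z) / (2.0 * lengthscale**2))


def exponential_kernel(X, Z, lengthscale=1.0):
    # K(x, z) = exp(-||x - z|| / lengthscale) (one common parameterisation)
    return np.exp(-np.sqrt(squared_distances(X, Z)) / lengthscale)


def not_a_kernel(X, Z):
    # Symmetric but not PSD: the Gram matrix is non-zero with zero trace,
    # so at least one eigenvalue must be negative.
    return -squared_distances(X, Z)


def is_psd(K, rel_tol=1e-8):
    # A symmetric matrix is PSD when all eigenvalues are >= 0 (up to round-off).
    eigvals = np.linalg.eigvalsh(K)
    return bool(eigvals.min() >= -rel_tol * np.abs(eigvals).max())


rng = np.random.default_rng(0)
X = rng.normal(size=(20, 2))  # 20 random points in 2D

candidates = {
    "linear": linear_kernel,
    "polynomial": polynomial_kernel,
    "rbf": rbf_kernel,
    "exponential": exponential_kernel,
    "not_a_kernel": not_a_kernel,
}
results = {name: is_psd(k(X, X)) for name, k in candidates.items()}
for name, ok in results.items():
    print(f"{name:>12}: PSD = {ok}")
```

Passing this check on one set of points does not prove that a function is a valid kernel, since the condition has to hold for every finite set of inputs. A single failure, as with `not_a_kernel`, is enough to rule a function out.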
Later, to learn more on kernels,
2.2. Introduction to Gaussian Process
A Primer on Gaussian Processes for Regression Analysis by Chris Fonnesbeck (PyData NYC 2019, YouTube link, notebooks on GitHub link) is a great place to start learning about GPs. He introduces the topic with a simple regression problem, then moves to a simple Gaussian Process model using PyMC.
To understand more about Gaussian Processes, I found this lecture on Gaussian Processes from Cornell University really helpful. Many thanks to Kilian Weinberger for uploading his notes as well as the lecture videos publicly.
2.3. Gaussian Process Summer Schools
The Gaussian Process Summer Schools are a great place to learn various topics on GPs. The materials and slides can be found on the gpschool GitHub, while the recordings are published on YouTube.
I would suggest starting with the 2017 Gaussian Process Summer School, as that year has a comprehensive introduction to GPs along with other topics. However, if you want more up-to-date topics on GPs, just watch the more recent workshops.
2.4. Deep dive into GP
- Chapter 5 of Carl Edward Rasmussen and Christopher K. I. Williams, “Gaussian Processes for Machine Learning”, MIT Press, 2006; the PDF version of the book is here
- The Kernel Cookbook: Advice on Covariance functions by David Duvenaud here
- PyMC examples of GP: https://github.com/pymc-devs/pymc-examples/tree/main/examples/gaussian_processes
Deep dive into GPs by implementing one from scratch. Building GPs with scipy is a good way to understand deeply how GPs work. I think it also gives more insight into multivariate normal distributions.
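As a starting point for such a from-scratch implementation, here is a minimal sketch of GP regression with an RBF kernel using NumPy and scipy. It follows the standard Cholesky-based posterior equations (as in Rasmussen and Williams, chapter 2). The function names, the toy sine data and the fixed hyperparameters are assumptions for illustration, and no hyperparameter fitting is done.

```python
import numpy as np
from scipy.linalg import cho_factor, cho_solve


def rbf_kernel(xa, xb, lengthscale=1.0, variance=1.0):
    # Squared-exponential covariance between two sets of 1D inputs.
    sq_dists = (xa[:, None] - xb[None, :]) ** 2
    return variance * np.exp(-0.5 * sq_dists / lengthscale**2)


def gp_posterior(x_train, y_train, x_test, noise_std=0.1):
    # Covariance blocks of the joint Gaussian over training and test outputs.
    K = rbf_kernel(x_train, x_train) + noise_std**2 * np.eye(len(x_train))
    K_s = rbf_kernel(x_train, x_test)
    K_ss = rbf_kernel(x_test, x_test)

    # Solve with a Cholesky factor instead of inverting K explicitly.
    chol = cho_factor(K, lower=True)
    alpha = cho_solve(chol, y_train)

    # Conditioning a multivariate normal gives the posterior mean and covariance.
    mean = K_s.T @ alpha
    cov = K_ss - K_s.T @ cho_solve(chol, K_s)
    return mean, cov


# Toy data: noisy observations of sin(x) (made up for this example).
rng = np.random.default_rng(42)
x_train = np.linspace(-3.0, 3.0, 15)
y_train = np.sin(x_train) + 0.1 * rng.normal(size=x_train.shape)
x_test = np.linspace(-3.0, 3.0, 50)

mean, cov = gp_posterior(x_train, y_train, x_test)
std = np.sqrt(np.clip(np.diag(cov), 0.0, None))  # pointwise posterior std

print("max |posterior mean - sin(x)|:", np.max(np.abs(mean - np.sin(x_test))))
print("largest posterior std:", std.max())
```

The posterior here is the conditional of one large multivariate normal, which is where the multivariate normal insight comes from. Comparing the output with the same model in PyMC is a useful sanity check.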
Small notes
At the beginning, it is kind of difficult to understand and work with GPs. It takes resilience. I have watched and re-watched some videos and played with the notebooks several times.
Knowing GPs helps in understanding parametric Bayesian models and distributions better, especially the multivariate normal distribution.
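The link to the multivariate normal can be made concrete with the conditioning rule, which is the "once Gaussian always Gaussian" property that GP prediction relies on. The sketch below uses a made-up bivariate normal (the mean, covariance and observed value are arbitrary choices of mine). It compares the closed-form conditional of x2 given x1 with a rough Monte Carlo estimate from samples whose x1 falls near the observed value.

```python
import numpy as np

# A made-up bivariate normal over (x1, x2).
mu = np.array([0.0, 1.0])
Sigma = np.array([[1.0, 0.8],
                  [0.8, 2.0]])
x1_obs = 1.5  # value we condition on

# Closed-form conditional x2 | x1 = x1_obs, which is again Gaussian.
cond_mean = mu[1] + Sigma[1, 0] / Sigma[0, 0] * (x1_obs - mu[0])
cond_var = Sigma[1, 1] - Sigma[1, 0] * Sigma[0, 1] / Sigma[0, 0]

# Rough Monte Carlo check: keep joint samples whose x1 is close to x1_obs.
rng = np.random.default_rng(1)
samples = rng.multivariate_normal(mu, Sigma, size=200_000)
near = np.abs(samples[:, 0] - x1_obs) < 0.05
mc_mean = samples[near, 1].mean()
mc_var = samples[near, 1].var()

print(f"analytic:    mean = {cond_mean:.3f}, var = {cond_var:.3f}")
print(f"monte carlo: mean = {mc_mean:.3f}, var = {mc_var:.3f}")
```

The Monte Carlo numbers only approximate the analytic ones, since they come from a finite sample in a small window around the observed value.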