The Bayesian Approach to Linear Regression

I’ve long been interested in the Bayesian approach to statistical modeling and machine learning.  I recently dug in deep to an example in Bishop’s text to re-create his results and paraphrase my understanding of it.  Much like how you would want an online learning system to work, this work shows in detail how each data point updates the posterior distribution and becomes the next iteration’s prior.  In turn, you can observe the uncertainty in the model’s estimate improving.

What I like most about the Bayesian treatment is that you get a full probability distribution for your model parameters, not just a point estimate of its value.  This allows you to ensemble many different models, each of which is weighted by its respective probability.  This is a far more robust estimate than the common Maximum Likelihood approach.

If you find the treatment useful, feel free to fork the GitHub code and play with it yourself.  Find the notebook here.

Bayesian estimates within +/- 1 stdev from the mean using only two data points

Bayesian estimates within +/- 1 stdev from the mean using only two data points

Leave a Reply

Your email address will not be published. Required fields are marked *