Bayes’ Theorem: Breaking It Down Using PyMC3 Modelling


This edition of Bayesian Analysis with Python introduces some basic concepts of Bayesian inference along with practical implementations in Python using PyMC3, a state-of-the-art open-source probabilistic programming framework for exploratory analysis of Bayesian models.


Frequentist vs. Bayesian approaches for inferential statistics are interesting viewpoints worth exploring. Given the task at hand, it is always better to understand the applicability, advantages, and limitations of the available approaches.

Bayesian and Frequentist Approaches

The Bayesian Approach:

The Bayesian approach is based on the idea that, given the data and a probabilistic model (which we assume describes the data well), we can infer the posterior distribution of the model’s parameters.
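As a rough illustration of what "finding the posterior distribution of a parameter" means, here is a minimal sketch that uses a grid approximation rather than PyMC3. The coin-flip setup, the uniform prior, and the function name `grid_posterior` are all assumptions made for this example; they do not come from the original analysis.

```python
# Grid approximation of a posterior (illustrative sketch only).
# Hypothetical setup: a coin with unknown bias theta, 7 heads seen in 10 flips.

def grid_posterior(heads, flips, grid_size=101):
    """Return (grid, posterior) for a coin's bias under a uniform prior."""
    # Candidate values of theta, evenly spaced between 0 and 1.
    grid = [i / (grid_size - 1) for i in range(grid_size)]
    prior = [1.0] * grid_size                      # uniform prior over theta
    tails = flips - heads
    # Binomial likelihood up to a constant that cancels on normalization.
    likelihood = [t ** heads * (1 - t) ** tails for t in grid]
    unnormalized = [l * p for l, p in zip(likelihood, prior)]
    total = sum(unnormalized)                      # plays the role of the evidence
    posterior = [u / total for u in unnormalized]
    return grid, posterior

grid, posterior = grid_posterior(heads=7, flips=10)
mode = grid[posterior.index(max(posterior))]
mean = sum(t * p for t, p in zip(grid, posterior))
print(f"posterior mode ~ {mode:.2f}, posterior mean ~ {mean:.3f}")
```

The result is a full distribution over the parameter rather than a single point estimate, which is the key difference from a purely frequentist fit. A grid like this only scales to one or two parameters, which is why sampling and approximation methods are needed for realistic models.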

Naive Bayes Algorithm for Classification

Discussions of Bayesian machine learning models require a thorough understanding of probability concepts and Bayes’ theorem, so we discuss the theorem first. Bayes’ theorem gives the probability of an event given that another, related event has already occurred. Suppose we have a dataset with 7 features/attributes/independent variables (x1, x2, x3,…, x7); we call this data tuple X. In Bayesian terminology, X is known as the evidence. y is the dependent variable/response variable (i.e., the class in the classification problem), and H is the hypothesis that the tuple X belongs to class C. Mathematically, Bayes’ theorem is then stated as:

P(H|X) = P(X|H) P(H) / P(X)

Here P(H|X) is the posterior probability of H conditioned on X, and the terms on the right-hand side are:

  1. P(X|H) is the probability of X conditioned on H and is also known as the ‘Likelihood’.
  2. P(H) is the prior probability of H. This is the fraction of samples belonging to each class out of the total number of samples.
  3. P(X) is the prior probability of evidence (data tuple X), described by measurements made on a set of attributes (x1, x2, x3,…, x7).
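To make these terms concrete, the following sketch computes class posteriors from priors and likelihoods. The class names `spam` and `ham`, the label counts, and the likelihood values are hypothetical numbers chosen for illustration, and the helper names are assumptions of this example.

```python
from collections import Counter

def class_priors(labels):
    """P(H): fraction of samples belonging to each class."""
    counts = Counter(labels)
    total = len(labels)
    return {c: n / total for c, n in counts.items()}

def class_posteriors(priors, likelihoods):
    """P(H|X) = P(X|H) P(H) / P(X), with P(X) summed over all classes."""
    evidence = sum(likelihoods[c] * priors[c] for c in priors)   # P(X)
    return {c: likelihoods[c] * priors[c] / evidence for c in priors}

# Hypothetical training labels: 3 spam and 7 ham samples.
labels = ["spam"] * 3 + ["ham"] * 7
priors = class_priors(labels)

# Hypothetical likelihoods P(X|H) of one observed tuple X under each class.
likelihoods = {"spam": 0.8, "ham": 0.1}

post = class_posteriors(priors, likelihoods)
for cls, p in post.items():
    print(f"P({cls} | X) = {p:.4f}")
```

Even though `ham` has the larger prior here, the evidence X is so much more likely under `spam` that the posterior favours `spam`. In a Naive Bayes classifier the likelihood P(X|H) would itself be a product of per-feature probabilities, which this sketch takes as given.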

  • VI: Variational inference tries to find the best approximation of the posterior distribution from a parametric family. It uses an optimization process over the family’s parameters to find the best approximation. In PyMC3, we can use Automatic Differentiation Variational Inference (ADVI), which tries to minimize the Kullback–Leibler (KL) divergence between a distribution from the chosen parametric family and the target posterior distribution.
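The idea of "optimizing over parameters to minimize the KL divergence" can be sketched with a toy problem. This is not PyMC3's ADVI: it is a hand-rolled example that assumes the target is a known Gaussian, so the KL divergence and its gradients have closed forms. The target values, learning rate, and function names are all assumptions made for illustration.

```python
import math

def kl_gaussians(m, s, mu_p, sigma_p):
    """KL(q || p) for q = N(m, s^2) and p = N(mu_p, sigma_p^2)."""
    return (math.log(sigma_p / s)
            + (s ** 2 + (m - mu_p) ** 2) / (2 * sigma_p ** 2)
            - 0.5)

def fit_gaussian_vi(mu_p, sigma_p, lr=0.05, steps=2000):
    """Gradient descent on KL(q || p) over the parameters (m, log s) of q."""
    m, log_s = 0.0, 0.0                            # start from q = N(0, 1)
    for _ in range(steps):
        s = math.exp(log_s)
        grad_m = (m - mu_p) / sigma_p ** 2         # dKL/dm
        grad_log_s = -1.0 + s ** 2 / sigma_p ** 2  # dKL/d(log s)
        m -= lr * grad_m
        log_s -= lr * grad_log_s
    return m, math.exp(log_s)

# Hypothetical target distribution standing in for a posterior.
target_mean, target_sd = 2.0, 0.5
m, s = fit_gaussian_vi(target_mean, target_sd)
final_kl = kl_gaussians(m, s, target_mean, target_sd)
print(f"fitted q: mean={m:.4f}, sd={s:.4f}, KL={final_kl:.2e}")
```

Optimizing `log s` instead of `s` keeps the standard deviation positive without any constraint handling. In a real model the posterior is not available in closed form, so methods such as ADVI work with stochastic gradient estimates of a related objective instead of the exact KL used here.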

Prior Selection: Where is the prior in data, and from where do I get one?

Bayesian modeling offers a way to include prior information in the modeling process. If we have domain knowledge or an informed guess about the weight values of the independent variables, we can make use of this prior information. This is unlike the frequentist approach, which estimates the weight values of the independent variables from the data alone. According to Bayes’ theorem, the posterior is proportional to the likelihood multiplied by the prior, so the choice of prior directly shapes the resulting estimates.
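How strongly a prior pulls on a weight can be seen in a small closed-form case. The sketch below assumes a single-weight model y = w·x + noise with a known noise standard deviation and a Normal prior on w, for which the posterior is also Normal. The data points, prior settings, and the function name `posterior_weight` are hypothetical.

```python
def posterior_weight(x, y, prior_mean, prior_sd, noise_sd=1.0):
    """Posterior mean and sd of w in y = w*x + noise, with w ~ N(prior_mean, prior_sd^2)."""
    prior_precision = 1.0 / prior_sd ** 2
    data_precision = sum(xi * xi for xi in x) / noise_sd ** 2
    post_precision = prior_precision + data_precision
    weighted_sum = (prior_precision * prior_mean
                    + sum(xi * yi for xi, yi in zip(x, y)) / noise_sd ** 2)
    return weighted_sum / post_precision, post_precision ** -0.5

# Hypothetical data, roughly following y = 2x.
x = [1.0, 2.0, 3.0, 4.0]
y = [2.1, 3.9, 6.2, 7.8]

# Estimate that uses the data alone (least squares without an intercept).
least_squares = sum(xi * yi for xi, yi in zip(x, y)) / sum(xi * xi for xi in x)

# A vague prior barely moves the estimate; a tight prior around 0 dominates it.
weak_mean, weak_sd = posterior_weight(x, y, prior_mean=0.0, prior_sd=100.0)
strong_mean, strong_sd = posterior_weight(x, y, prior_mean=0.0, prior_sd=0.1)

print(f"least squares: {least_squares:.3f}")
print(f"weak prior   : {weak_mean:.3f} (sd {weak_sd:.3f})")
print(f"strong prior : {strong_mean:.3f} (sd {strong_sd:.3f})")
```

With the vague prior the posterior mean sits almost on top of the data-only estimate, while the tight prior shrinks it toward the prior mean. This is why a prior should reflect genuine domain knowledge: a confident but wrong prior can override a small dataset.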

Modelling Using PyMC3 Library for Bayesian Inferencing


This blog is an attempt to discuss the concepts of Bayesian inference and its implementation using PyMC3. It starts with the decades-old Frequentist vs. Bayesian debate and moves on to the backbone of Bayesian modeling, Bayes’ theorem. With the foundations in place, it discusses why posterior distributions of continuous variables are intractable to evaluate, along with the solutions offered by sampling and approximation methods, viz. MCMC and VI. It then examines the strong connection between the posterior, the prior, and the likelihood, taking into consideration the data at hand. Next, it covers Bayesian linear regression modeling using PyMC3, along with the interpretation of results and graphs. Lastly, it discusses why and when to use Bayesian linear regression.




Affine is a provider of analytics solutions that works with global organizations to solve their strategic and day-to-day business problems.