Animating Normal Distributions with Python
Updated 12/5/2021
I’ve been fascinated by 3Blue1Brown’s incredible math videos, and was happy to learn that the engine Grant Sanderson uses to generate his elegant animations is an open-source Python library called manim. To generate some animations of my own, I used the Manim Community fork to explore one of the core components of statistics: the normal distribution. More specifically, I animated the probability density functions (PDFs) of normal distributions, observing how adjustments to the parameters of these distributions, such as mean and variance, influence their probability density functions. I started with two-dimensional plotting, exploring normal distributions of single variables (“univariate” normal distributions). But much more interesting was rendering three-dimensional probability density functions of bivariate normal distributions, where parameter adjustments to the distributions of two variables stretch and squish the resultant 3D surfaces.
Univariate Normal Distribution
Starting in two dimensions, we can observe what happens when adjustments are made to the mean and standard deviation of a univariate normal distribution. A general form of the PDF can be written as:
$$f(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)$$
where µ (mu) represents the distribution mean and σ (sigma) represents the distribution standard deviation.
In Python, this function can be defined as:
import math

def PDF_normal(x, mu, sigma):
    '''
    General form of probability density function of univariate normal distribution
    '''
    return math.exp(-((x-mu)**2)/(2*sigma**2))/(sigma*math.sqrt(2*math.pi))
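As a quick sanity check on this function, the sketch below compares it against `statistics.NormalDist` from the Python standard library and estimates the area under the curve, which should be close to 1 for any valid PDF. The helper names `max_abs_difference` and `riemann_area`, the sample points, and the integration bounds are illustrative choices, not part of the original code.

```python
import math
from statistics import NormalDist


def PDF_normal(x, mu, sigma):
    """Probability density of a univariate normal distribution at x."""
    return math.exp(-((x - mu) ** 2) / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))


def max_abs_difference(mu, sigma, xs):
    """Largest gap between PDF_normal and statistics.NormalDist over xs (hypothetical helper)."""
    reference = NormalDist(mu, sigma)
    return max(abs(PDF_normal(x, mu, sigma) - reference.pdf(x)) for x in xs)


def riemann_area(mu, sigma, lo=-10.0, hi=10.0, steps=20_000):
    """Midpoint-rule estimate of the area under the curve on [lo, hi] (hypothetical helper)."""
    width = (hi - lo) / steps
    return sum(PDF_normal(lo + (i + 0.5) * width, mu, sigma) for i in range(steps)) * width


# Sample points spanning the x_range used for the animations
xs = [x / 10 for x in range(-50, 51)]
print(max_abs_difference(0, 1, xs))  # should be at floating-point noise level
print(riemann_area(0, 1))            # should be very close to 1
```

If both numbers look right, the hand-written formula agrees with the standard library and can be plotted with some confidence.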
Adjustments to Mean
Inheriting from Manim’s Scene class, we can start constructing the first animation. A ValueTracker is defined to track mu, the normal distribution’s mean.
from manim import *

class AdjustMu(Scene):
    '''
    Scene to observe how adjustments to the mean of a normal distribution
    influence the shape of its probability density function
    '''
    def construct(self):
        ax = Axes(
            x_range = [-5, 5, 1],
            y_range = [0, 0.5, 0.1],
            axis_config = {'include_numbers':True}
        )
        # Initialize mu (distribution mean) ValueTracker to 0
        mu = ValueTracker(0)
Then, LaTeX is defined and positioned in the scene. Note that this code is a continuation of the construct() function, so the indentation should align accordingly:
        # Text to display distribution mean
        mu_text = MathTex(r'\mu = ').next_to(ax, UP, buff=0.2).set_color(YELLOW)
        # Always redraw the decimal value for mu for each frame
        mu_value_text = always_redraw(
            lambda: DecimalNumber(num_decimal_places=2)
            .set_value(mu.get_value())
            .next_to(mu_text, RIGHT, buff=0.2)
            .set_color(YELLOW)
        )
Notice that mu_value_text uses always_redraw() to ensure that the content is redrawn each frame, allowing for continuous animation. The line .set_value(mu.get_value()) continuously grabs the current value for mu to write it in as a DecimalNumber.
The always_redraw() function is used again to continuously animate the PDF curve itself, using the previously defined PDF_normal() function:
        # Define PDF curve, always redraw for each frame
        curve = always_redraw(
            lambda: ax.plot(
                lambda x: PDF_normal(x, mu.get_value(), 1), color=YELLOW)
        )
With the setup complete, the animation itself is written using mu.animate.set_value() to animate adjustments to mu:
        # Start animation
        self.add(ax, mu_text, mu_value_text)
        self.play(Create(curve))
        self.play(
            mu.animate.set_value(2), run_time=1,
            rate_func=rate_functions.smooth
        )
        self.wait()
        self.play(
            mu.animate.set_value(-2), run_time=1.5,
            rate_func=rate_functions.smooth
        )
        self.wait()
        self.play(
            mu.animate.set_value(0), run_time=1,
            rate_func=rate_functions.smooth
        )
        self.play(Uncreate(curve))
This is the resultant animation:
See the full code for the AdjustMu 2D scene on GitHub.
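The tracker-and-redraw pattern in this scene can be mimicked without manim to see what happens on each frame. The sketch below is a toy model and not manim's implementation: `ToyTracker`, `smoothstep`, and `animate_tracker` are hypothetical names, the frame rate is arbitrary, and `smoothstep` is only a stand-in for an ease-in/ease-out rate function such as `rate_functions.smooth`.

```python
class ToyTracker:
    """Hypothetical stand-in for a value tracker: it just holds one number."""

    def __init__(self, value=0.0):
        self._value = float(value)

    def get_value(self):
        return self._value

    def set_value(self, value):
        self._value = float(value)


def smoothstep(t):
    """Simple ease-in/ease-out curve on [0, 1]; a stand-in, not manim's `smooth`."""
    return t * t * (3 - 2 * t)


def animate_tracker(tracker, target, run_time, fps, redraw):
    """Move the tracker to target over run_time seconds, calling redraw once per frame."""
    start = tracker.get_value()
    n_frames = int(round(run_time * fps))
    frames = []
    for frame in range(1, n_frames + 1):
        alpha = smoothstep(frame / n_frames)
        tracker.set_value(start + (target - start) * alpha)
        # Anything built inside redraw() sees the tracker's value for this frame
        frames.append(redraw())
    return frames


mu = ToyTracker(0)
labels = animate_tracker(
    mu, target=2, run_time=1, fps=10,
    redraw=lambda: f"mu = {mu.get_value():.2f}",
)
print(labels[0], "...", labels[-1])
```

The point of the pattern is that the label and the curve never store a copy of mu; they are rebuilt from the tracker every frame, so animating the tracker alone is enough to move everything that depends on it.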
Adjustments to Standard Deviation
A very similar approach can be taken to animate adjustments to the normal distribution’s standard deviation. A different ValueTracker can be initialized with:
        # Initialize sigma (distribution standard deviation) ValueTracker to 1
        sigma = ValueTracker(1)
Different LaTeX is defined to animate the decimal number text, and the curve function is now:
        # Define PDF curve, always redraw for each frame
        curve = always_redraw(
            lambda: ax.plot(
                lambda x: PDF_normal(x, 0, sigma.get_value()), color=YELLOW)
        )
After writing out a few adjustments to sigma, the resultant animation is:
See the full code for the AdjustSigma 2D scene on GitHub.
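The effect of sigma on the curve can also be checked numerically. Evaluating the univariate PDF at x = mu gives a peak height of 1/(sigma * sqrt(2 * pi)), so a smaller sigma produces a taller, narrower curve. The sigma values below are illustrative and are not taken from the linked scene; `Y_AXIS_TOP` mirrors the upper bound of the y_range used for the Axes in the AdjustMu scene, on the assumption that similar axes are reused.

```python
import math


def peak_height(sigma):
    """Maximum of the univariate normal PDF, reached at x = mu."""
    return 1 / (sigma * math.sqrt(2 * math.pi))


Y_AXIS_TOP = 0.5  # assumed: same y_range upper bound as the AdjustMu axes

for sigma in (0.5, 0.8, 1, 1.5, 2):
    peak = peak_height(sigma)
    status = "fits" if peak <= Y_AXIS_TOP else "runs past the top of the axes"
    print(f"sigma = {sigma}: peak = {peak:.3f} ({status})")
```

Under that assumption, sigma values below roughly 0.8 push the peak above 0.5, which is worth keeping in mind when choosing the range of values to animate.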
Bivariate Normal Distribution
The bivariate normal distribution is a joint probability distribution of two variables that obey the normal distribution. These variables, say x_1 and x_2, each have their own mean and standard deviation. The correlation between the two variables, ρ (rho), is also accounted for.
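The general PDF of the bivariate normal distribution can be written as:
$$f(x_1, x_2) = \frac{1}{2\pi\sigma_1\sigma_2\sqrt{1-\rho^2}} \exp\left(-\frac{1}{2(1-\rho^2)}\left[\left(\frac{x_1-\mu_1}{\sigma_1}\right)^2 - 2\rho\left(\frac{x_1-\mu_1}{\sigma_1}\right)\left(\frac{x_2-\mu_2}{\sigma_2}\right) + \left(\frac{x_2-\mu_2}{\sigma_2}\right)^2\right]\right)$$
where µ (mu) represents the variable mean, σ (sigma) represents the variable standard deviation, and ρ (rho) represents the correlation between the two variables (-1 < ρ < 1).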
In Python, this can be defined as:
def PDF_bivariate_normal(x_1, x_2, mu_1=0, mu_2=0, sigma_1=1, sigma_2=1, rho=0):
    '''
    General form of probability density function of bivariate normal distribution
    '''
    normalizing_const = 1/(2 * math.pi * sigma_1 * sigma_2 * math.sqrt(1 - rho**2))
    exp_coeff = -(1/(2 * (1 - rho**2)))
    A = ((x_1 - mu_1)/sigma_1)**2
    B = -2 * rho * ((x_1 - mu_1)/sigma_1) * ((x_2 - mu_2)/sigma_2)
    C = ((x_2 - mu_2)/sigma_2)**2
    return normalizing_const * math.exp(exp_coeff*(A + B + C))
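Two properties make handy checks on this function. When rho is 0, the joint density should equal the product of the two univariate densities, and for any valid rho the volume under the surface should be close to 1. The sketch below tests both; `grid_volume`, the sample point, the parameter values, and the integration bounds are illustrative assumptions rather than part of the original code.

```python
import math


def PDF_normal(x, mu, sigma):
    """Univariate normal PDF, as defined earlier in the post."""
    return math.exp(-((x - mu) ** 2) / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))


def PDF_bivariate_normal(x_1, x_2, mu_1=0, mu_2=0, sigma_1=1, sigma_2=1, rho=0):
    """Bivariate normal PDF, same formula as the definition above."""
    normalizing_const = 1 / (2 * math.pi * sigma_1 * sigma_2 * math.sqrt(1 - rho ** 2))
    exp_coeff = -(1 / (2 * (1 - rho ** 2)))
    A = ((x_1 - mu_1) / sigma_1) ** 2
    B = -2 * rho * ((x_1 - mu_1) / sigma_1) * ((x_2 - mu_2) / sigma_2)
    C = ((x_2 - mu_2) / sigma_2) ** 2
    return normalizing_const * math.exp(exp_coeff * (A + B + C))


def grid_volume(rho, half_width=8.0, steps=400):
    """Midpoint-rule estimate of the volume under the surface with mu = 0, sigma = 1."""
    cell = 2 * half_width / steps
    centers = [-half_width + (i + 0.5) * cell for i in range(steps)]
    total = sum(PDF_bivariate_normal(u, v, rho=rho) for u in centers for v in centers)
    return total * cell * cell


# With rho = 0 the joint density factors into the two univariate densities
joint = PDF_bivariate_normal(0.5, -1.2, mu_1=1, mu_2=-2, sigma_1=1.5, sigma_2=0.7)
product = PDF_normal(0.5, 1, 1.5) * PDF_normal(-1.2, -2, 0.7)
print(joint, product)

# The total volume should stay close to 1 even when the variables are correlated
print(grid_volume(0.6))
```

Note that rho values of exactly 1 or -1 make `math.sqrt(1 - rho**2)` zero, so the function divides by zero there; that matches the strict inequality -1 < ρ < 1 in the definition.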
Standard Bivariate Normal
The PDF_bivariate_normal function can be used in three-dimensional manim renderings. Instead of inheriting from the Scene class, we can inherit from the ThreeDScene class and use ThreeDAxes:
class StandardBivariateNormal(ThreeDScene):
    '''
    Plots the surface of the probability density function of the standard
    bivariate normal distribution
    '''
    def construct(self):
        ax = ThreeDAxes(
            x_range = [-4, 4, 1],
            y_range = [-4, 4, 1],
            z_range = [0, 0.2, 0.1]
        )
        x_label = ax.get_x_axis_label(r'x_1')
        y_label = ax.get_y_axis_label(r'x_2', edge=UP, buff=0.2)
        z_label = ax.get_z_axis_label(r'\phi(x_1, x_2)', buff=0.2)
        axis_labels = VGroup(x_label, y_label, z_label)
The standard bivariate normal distribution is a specific case of the bivariate normal distribution where µ = 0 and σ = 1 for both variables. If the correlation between the two variables of the standard bivariate normal distribution is zero (ρ = 0), the general form of the PDF can be simplified to:
$$\phi(x_1, x_2) = \frac{1}{2\pi} \exp\left(-\frac{x_1^2 + x_2^2}{2}\right)$$
Note the default argument values for PDF_bivariate_normal(); these represent this standard case. To animate this surface with manim, a Surface object can be defined with:
        distribution = Surface(
            lambda u, v: ax.c2p(u, v, PDF_bivariate_normal(u, v)),
            resolution=(42, 42),
            u_range=[-3.5, 3.5],
            v_range=[-3.5, 3.5],
            fill_opacity=0.7
        )
After coloring the surface with a cool-to-hot color ramp and animating some adjustments to camera angles, the resultant animation is:
See the full code for the StandardBivariateNormal 3D scene on GitHub.
Adjustments to Means
As with the value adjustments made in the univariate normal distribution animations, ValueTracker objects can be used to animate adjustments to the surface. For example, to animate adjustments to the means of each variable:
        # Initialize ValueTrackers to adjust means
        mu_1 = ValueTracker(0)
        mu_2 = ValueTracker(0)
Then adjustments to both mu_1 and mu_2 can be made to see how each influences the resultant PDF surface:
See the full code for the AdjustMu 3D scene on GitHub.
Adjustments to Standard Deviations
Animations that alter values for sigma_1 and sigma_2 demonstrate how the standard deviations of each variable in the distribution play off each other to squeeze and stretch the surface:
See the full code for the AdjustSigma 3D scene on GitHub.
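The squeezing and stretching also changes the height of the surface. Evaluating the bivariate PDF at its mean gives a peak of 1/(2 * pi * sigma_1 * sigma_2 * sqrt(1 - rho^2)), so shrinking either standard deviation raises the peak. The sketch below tabulates a few cases; the sigma pairs are illustrative, and `Z_AXIS_TOP` mirrors the z_range upper bound from the StandardBivariateNormal axes on the assumption that similar axes are reused (the linked scene may choose a different range).

```python
import math


def bivariate_peak(sigma_1=1, sigma_2=1, rho=0):
    """Height of the bivariate normal PDF at its mean (mu_1, mu_2)."""
    return 1 / (2 * math.pi * sigma_1 * sigma_2 * math.sqrt(1 - rho ** 2))


Z_AXIS_TOP = 0.2  # assumed: z_range upper bound from the StandardBivariateNormal axes

for sigma_1, sigma_2 in ((1, 1), (2, 1), (0.5, 1), (0.5, 0.5)):
    peak = bivariate_peak(sigma_1, sigma_2)
    status = "fits" if peak <= Z_AXIS_TOP else "exceeds the z-axis range"
    print(f"sigma_1 = {sigma_1}, sigma_2 = {sigma_2}: peak = {peak:.3f} ({status})")
```

Halving both standard deviations quadruples the peak, so an animation that narrows the surface may need a taller z-axis or a rescaled surface to stay in frame.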
Adjustments to Correlation
Adjustments to the correlation value rho produce interesting results too, where correlations squeeze the surface and pull it at 45-degree angles:
See the full code for the AdjustRho 3D scene on GitHub.
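The 45-degree pull can be seen in the numbers as well. With both standard deviations equal to 1, a positive rho raises the density along the line x_1 = x_2 and lowers it along x_1 = -x_2, and a negative rho does the reverse. The sketch below compares two points at the same distance from the origin; the value rho = 0.7 and the sample points are illustrative choices.

```python
import math


def PDF_bivariate_normal(x_1, x_2, mu_1=0, mu_2=0, sigma_1=1, sigma_2=1, rho=0):
    """Bivariate normal PDF, same formula as defined earlier in the post."""
    normalizing_const = 1 / (2 * math.pi * sigma_1 * sigma_2 * math.sqrt(1 - rho ** 2))
    exp_coeff = -(1 / (2 * (1 - rho ** 2)))
    A = ((x_1 - mu_1) / sigma_1) ** 2
    B = -2 * rho * ((x_1 - mu_1) / sigma_1) * ((x_2 - mu_2) / sigma_2)
    C = ((x_2 - mu_2) / sigma_2) ** 2
    return normalizing_const * math.exp(exp_coeff * (A + B + C))


def diagonal_densities(rho):
    """Density at (1, 1) on the line x_1 = x_2 and at (1, -1) on the line x_1 = -x_2."""
    along = PDF_bivariate_normal(1, 1, rho=rho)
    across = PDF_bivariate_normal(1, -1, rho=rho)
    return along, across


for rho in (-0.7, 0, 0.7):
    along, across = diagonal_densities(rho)
    print(f"rho = {rho}: density at (1, 1) = {along:.4f}, at (1, -1) = {across:.4f}")
```

With rho = 0 the two densities match and the surface is rotationally symmetric; any nonzero rho breaks that symmetry along one of the two diagonals, which is the ridge visible in the animation.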
Closing Thoughts
Animations produced with manim can foster some visual intuition for complicated math concepts. Animating probability density functions can demonstrate how a normal distribution’s parameters influence the probability of certain values. These animations could easily be tweaked to represent real datasets, perhaps comparing empirical observations to the theoretical PDF surface. Picture a bunch of floating data points around one of these three-dimensional “bell curves”!
Resources
- Prefer interactive visuals? Here’s a cool interactive bivariate distribution on GeoGebra prepared by Dovid Fein.
- Interested in more details regarding the bivariate normal distribution? Check out this page from Penn State’s Applied Multivariate Statistical Analysis course (STAT 505).
- Penn State’s Introduction to Probability Theory (STAT 414) course is also a great free online resource.