The basics of Monte Carlo integration

Implementation in python.

Published in Towards Data Science · 7 min read · Oct 26, 2020

We all remember the integrals we had to compute manually in high school. To do so, we had to carry out a series of more or less complex operations to find the antiderivative’s expression before evaluating it at the bounds of the desired interval and subtracting. However, you may also remember that integrating a function is graphically equivalent to calculating the area under the curve of that function. The principle of numerical integration rests on this second statement: the idea is to estimate the integral of a function over a defined interval, knowing only the function’s expression. For such an aim, Monte Carlo methods are a great help. Monte Carlo integration is a technique for numerical integration using random numbers.

Let’s try to integrate a univariate function f. We will denote by F the value of the integral.

F = ∫_a^b f(x) dx

As we said in the introduction, this integral can be interpreted as the area below the function’s curve. Let’s take the following function as an example:

[Image: expression of the example function f]

Figure 1 is the graphical representation of f and the integral we’d like to compute.

[Figure 1: graph of f, with the integral to compute shown as the gray area under the curve]

Let’s say a = -2 and b = 5. If we take a random point x_i between a and b, we can multiply f(x_i) by (b - a) to get the area of a rectangle of width (b - a) and height f(x_i). The idea behind Monte Carlo integration is to approximate the integral value (gray area on figure 1) by the averaged area of rectangles computed for randomly picked x_i. This is illustrated in figure 2 below.

[Figure 2: rectangles of width (b - a) and height f(x_i) built from randomly picked x_i]

By adding up the areas of the rectangles and averaging the sum, the resulting number gets closer and closer to the actual value of the integral: rectangles that are too large compensate for those that are too small. So the empirical mean of f(x), scaled by (b - a), is a good estimator for the integral. This idea is formalized in the following formula, the Monte Carlo estimator (N is the number of random draws for X):

F_N = (b - a) · (1/N) · Σ_{i=1}^{N} f(X_i)

In this formula, X is a random variable, and so is F_N. The law of large numbers tells us that as N approaches infinity, the Monte Carlo estimator converges in probability to F, the true value of the integral:

lim_{N→∞} P( |F_N − F| ≥ ε ) = 0, for every ε > 0

Another important result we get from the Monte Carlo estimator is its variance: 𝜎^2 / N, where 𝜎 is the standard deviation of the function values and N is the number of samples x_i. This means we need 4 times more samples to halve the error of the estimator, since the error scales as 𝜎 / √N.
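This scaling (4 times more samples for half the error) is easy to check empirically. A minimal sketch, where the integrand cos(x) and the interval [-2 ; 5] are arbitrary choices for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

def mc_estimate(f, a, b, n):
    """Crude Monte Carlo estimate of the integral of f over [a, b]."""
    x = rng.uniform(a, b, n)
    return (b - a) * f(x).mean()

# Empirical standard deviation of the estimator over repeated runs,
# for N and 4N samples: the error should roughly halve.
f = np.cos  # arbitrary example integrand
err_n = np.std([mc_estimate(f, -2, 5, 1_000) for _ in range(500)])
err_4n = np.std([mc_estimate(f, -2, 5, 4_000) for _ in range(500)])

print(err_4n / err_n)  # roughly 0.5: 4x the samples, half the error
```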

In the following two parts, we will give concrete examples of implementation for integration in python (2D and 3D), first with the crude method, and then we will explain a variance-reduction technique: importance sampling.

The crude Monte Carlo method is the most basic application of the concept described above:

  • Random draws x_i are made over X following a uniform law.
  • We compute the sum of f(x_i), multiply it by (b-a) and divide by the number of samples N.

To illustrate the process, let’s take a concrete use case: we want to integrate the beta distribution function beta(x, 50, 50) over the interval [0 ; 0.55], as described in figure 3 below. If X is a random variable that follows this beta law, this integral corresponds to P(X <= 0.55), which can be calculated exactly with the cumulative distribution function (c.d.f.) of the beta(x, 50, 50) density: it gives 0.8413. We will try to approach this number through Monte Carlo integration.

[Figure 3: the beta(x, 50, 50) density, with the integral over [0 ; 0.55] highlighted]

Monte Carlo estimator

As described earlier, we will make N = 10 000 random draws of x_i using the uniform distribution. Then, we compute the integral estimation and the associated error using the following formulas:

F_N = (b - a) · (1/N) · Σ_{i=1}^{N} f(x_i)        error = (b - a) · 𝜎 / √N

[Figure 4: the f(x_i) values obtained from the 10 000 uniform draws of x_i]

The blue points correspond to the 10 000 values of f(x_i) computed from the uniform draws we made over X.

We can already notice that because the draws were made uniformly over X (i.e. the horizontal axis), the points spread out and overlap less where the slope of the curve is steep.
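A possible implementation of the crude method for this use case, using scipy’s beta distribution (the random seed is arbitrary):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)

a, b = 0.0, 0.55              # integration interval
f = stats.beta(50, 50).pdf    # integrand: the beta(x, 50, 50) density

N = 10_000
x = rng.uniform(a, b, N)      # uniform draws of x_i over [a, b]
fx = f(x)

estimate = (b - a) * fx.mean()                  # Monte Carlo estimator
error = (b - a) * fx.std(ddof=1) / np.sqrt(N)   # standard error of the estimator

print(estimate, "+/-", error)                   # close to the exact value
print(stats.beta(50, 50).cdf(0.55))             # exact value: 0.8413...
```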

Geometric estimation of the area under the curve

Another way to compute the integral is to make a geometric approximation of it. By using uniform random draws over both the x and y axes, we map a 2D rectangle that corresponds to the desired range [x_min ; x_max] and compute the ratio of points under the curve to the total number of points drawn in the rectangle. Multiplied by the area of the rectangle, this ratio converges to the area under the curve as N, the number of draws, grows. This idea is illustrated in figure 5.

[Figure 5: uniform draws over a 2D rectangle; the fraction of points under the curve estimates the area]
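A sketch of this geometric (hit-or-miss) approach on the same beta(x, 50, 50) example; the height of the bounding rectangle is an assumption, chosen just above the function’s maximum (≈ 8, reached at x = 0.5):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)

f = stats.beta(50, 50).pdf
x_min, x_max = 0.0, 0.55
y_max = 8.5                        # assumed bound, just above the peak of f

N = 100_000
x = rng.uniform(x_min, x_max, N)   # uniform draws over the 2D rectangle
y = rng.uniform(0.0, y_max, N)

under = y < f(x)                        # points falling under the curve
box_area = (x_max - x_min) * y_max
estimate = box_area * under.mean()      # hit ratio times rectangle area

print(estimate)  # close to 0.8413
```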

The idea behind Importance Sampling is very simple: as the error of the Monte Carlo estimator is proportional to the standard deviation of f(x) divided by the square root of the number of samples, we should find a cleverer method to choose x_i than the uniform law. Importance Sampling is in fact a way to draw x_i that reduces the variance of f(x). Indeed, for the function beta(x, 50, 50) we noticed that many x_i over [0 ; 0.55] give f(x) ≈ 0, while only a few values of x around 0.5 give f(x) ≈ 8.

The key idea of Importance Sampling is to use a probability density function (p.d.f.) to draw samples over X that gives high values of f(x) a better chance of being computed, and then to divide each f(x) by the probability density of drawing that x (i.e. the value of the p.d.f.), so that oversampled regions are not overcounted.

Following this idea, the integral we try to approximate becomes:

F_N = (1/N) · Σ_{i=1}^{N} f(X_i) / g(X_i), with X_i = G⁻¹(u_i) and u_i drawn uniformly over [0 ; 1]

with:

  • N: number of samples
  • f: function to integrate
  • g: p.d.f. chosen for the random draws over X
  • G: the c.d.f. associated with g (its inverse G⁻¹ turns uniform draws into draws from g)

And the variance used to compute the error of the estimator becomes:

𝜎² = Var( f(X) / g(X) ), with X drawn from g

Back to our use case with the beta(x, 50, 50) distribution function: it seems that using a normal distribution centered at 0.5 as the g function could help us. The links between f and g are shown in figure 6 below.

[Figure 6: the beta(x, 50, 50) function f and the Gaussian p.d.f. g centered at 0.5]

Figure 7 below shows the results of using this method to integrate beta(x, 50, 50) over [0 ; 0.55]. The density of the points also shows the relevance of using the Gaussian p.d.f. rather than the uniform law to choose x_i: the x_i points are concentrated in the areas of interest, where f(x_i) != 0.

[Figure 7: results of the Importance Sampling integration; the x_i are concentrated where f(x_i) != 0]
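A possible implementation of Importance Sampling for this example; the standard deviation of g (0.05, roughly matching the spread of beta(x, 50, 50)) is an assumption, and draws falling outside the integration interval [0 ; 0.55] get a zero weight:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

f = stats.beta(50, 50).pdf
mu, sigma = 0.5, 0.05            # assumed parameters of the proposal g
g = stats.norm(mu, sigma)

N = 10_000
x = rng.normal(mu, sigma, N)     # draws of x_i from g

# Weighted integrand f(x)/g(x); draws outside [0, 0.55] contribute zero.
w = np.where((x >= 0.0) & (x <= 0.55), f(x) / g.pdf(x), 0.0)

estimate = w.mean()
error = w.std(ddof=1) / np.sqrt(N)

print(estimate, "+/-", error)  # close to 0.8413, with a smaller error than the crude method
```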

Methods comparison

Here is the summary of integral values and relative errors: the Importance Sampling method reduces the error by a factor of 4.

[Table: integral values and relative errors for the crude method and Importance Sampling]

Figure 8 shows an example of a 3D joint distribution mapped onto a 2D space (x_A, x_B). This example is taken from the analysis of Bayesian A/B tests. The blue contour plot corresponds to beta distribution functions for 2 different variants (A and B). The idea is to compute the probability that variant B is better than variant A by calculating the integral of the joint posterior f (the blue contour plot on the graph) for the x_A and x_B values that lie above the orange line (i.e. values where x_B >= x_A).

[Figure 8: joint posterior of variants A and B over (x_A, x_B), with the line x_B = x_A in orange]

Thus, let’s compute the integral that corresponds to the volume below the blue 3D bell for values of x_A and x_B that respect x_B >= x_A (above the orange line on the graph). This time we will draw N = 100 000 samples.
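The exact posterior parameters for variants A and B are not given here, so the sketch below uses hypothetical beta(40, 60) and beta(45, 55) posteriors to illustrate the crude 2D method; a direct-sampling estimate of P(x_B >= x_A) serves as a cross-check:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

# Hypothetical posteriors for the two variants (illustrative parameters).
beta_A = stats.beta(40, 60)
beta_B = stats.beta(45, 55)

def joint(xa, xb):
    """Joint posterior density, assuming the variants are independent."""
    return beta_A.pdf(xa) * beta_B.pdf(xb)

N = 100_000
xa = rng.uniform(0.0, 1.0, N)   # uniform draws over the unit square
xb = rng.uniform(0.0, 1.0, N)

# Crude 2D estimator: the domain area is 1, and the integrand is
# zeroed where the constraint x_B >= x_A does not hold.
fx = np.where(xb >= xa, joint(xa, xb), 0.0)
estimate = fx.mean()
error = fx.std(ddof=1) / np.sqrt(N)

# Cross-check: direct sampling from the two posteriors.
p_direct = (rng.beta(45, 55, N) >= rng.beta(40, 60, N)).mean()
print(estimate, p_direct)  # the two estimates should agree
```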

With the crude method:

Integral value: 0.7256321849276118
Calculation error: 0.01600479134298051

With Importance Sampling:

Integral value: 0.7119836088517515
Calculation error: 0.0018557917512560286

The Importance Sampling method yielded an almost x10 more precise result with the same number of samples.

The crude method and Importance Sampling belong to a large family of Monte Carlo integration methods. You now have all the material to implement these two techniques in python with nothing more than the usual libraries, numpy and scipy. Keep in mind that Monte Carlo integration is particularly useful for higher-dimensional integrals. With the examples we went through, we got a x10 improvement in precision for the 3D case, versus a x4 improvement for the 2D case.
