As we said in the previous post, despite their many applications, Monte Carlo methods all fall roughly into three families: numerical integration, optimization, and the generation of probability distributions. Let’s look at three examples, one for each family.
1) Monte Carlo Integration
In the previous post we discussed a Monte Carlo method to calculate an approximate value for the area of a circular sector. The trick was to count how many points, drawn at random within the unit square, fell inside the circular sector.
You can easily guess how this approach can be generalized to calculate the areas of other shapes.
From calculus we know that an area can also be calculated using integrals:
$$A =\displaystyle \int_a^b f(x) dx$$
So when we use the Monte Carlo method to find areas, we are really just calculating an approximation of an integral!
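To make the connection concrete, here is a minimal Python sketch (my own illustration, not from the original post): the integral of f over [a, b] is approximated by averaging f at uniformly drawn points and multiplying by the length of the interval. The test function and sample size are arbitrary choices.

```python
import random

def mc_integrate(f, a, b, n_samples=100_000):
    """Approximate the integral of f over [a, b] with the sample-mean estimator."""
    total = 0.0
    for _ in range(n_samples):
        x = random.uniform(a, b)  # uniform draw in [a, b]
        total += f(x)
    return (b - a) * total / n_samples  # (average value of f) * (interval length)

# Example: the integral of x^2 over [0, 1] is exactly 1/3
print(mc_integrate(lambda x: x * x, 0.0, 1.0))
```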
This approach is not very useful when calculating integrals in a single dimension. In this case, there are more efficient deterministic methods.
However, if you want to calculate integrals over many dimensions, deterministic methods become less efficient, while Monte Carlo methods become more useful.
What causes this strange phenomenon?
For a given degree of precision, deterministic methods require a number of elementary steps that grows exponentially with the number of dimensions. On the other hand, the error you expect from a Monte Carlo estimate depends only on the total number of random draws, not on the number of dimensions: it shrinks roughly as $1/\sqrt{N}$, where $N$ is the number of draws.
For example, to calculate a 100-dimensional integral, deterministic approaches would require such a huge number of calculations as to be practically impossible, while Monte Carlo methods are still applicable.
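As a rough sketch of why the number of dimensions barely matters on the Monte Carlo side (again an illustrative example of mine, with an arbitrary test function), the same estimator works unchanged over a 100-dimensional unit hypercube, where even a coarse grid of 10 points per axis would already require 10^100 function evaluations:

```python
import random

def mc_integrate_hypercube(f, dim, n_samples=100_000):
    """Approximate the integral of f over the unit hypercube [0, 1]^dim."""
    total = 0.0
    for _ in range(n_samples):
        point = [random.random() for _ in range(dim)]  # one random point in dim dimensions
        total += f(point)
    return total / n_samples  # the hypercube has volume 1

# Example: the integral of x_1 + ... + x_100 over [0, 1]^100 is exactly 50
print(mc_integrate_hypercube(sum, dim=100))
```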
But wait! In what cases is it necessary to calculate integrals with so many dimensions?
It typically happens in statistical mechanics, a branch of theoretical physics that studies systems with many degrees of freedom, for example, a system composed of many particles enclosed in a box. To calculate the value taken by macroscopic quantities such as temperature or pressure, it is necessary to perform integrals on a space with a number of dimensions proportional to the number of particles.
You can understand that the number of dimensions of these statistical mechanics integrals can be very large. In cases like these, Monte Carlo methods are used more than deterministic methods.
2) Numerical Optimization
Various algorithms are used to find the local minima of a function. Typically, these algorithms:
- start at a given point
- check in which direction the function takes a smaller value with respect to the current one
- following this direction, move to a new point where the function takes a value smaller than at the current point
Applying these steps repeatedly, these algorithms eventually reach a minimum (at that point, the second step can no longer find a direction in which the function decreases).
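Here is a minimal Python sketch of such a greedy local search (a simplified illustration of mine, using a fixed step size rather than anything adaptive):

```python
def local_search(f, start, step=0.1, max_iters=1000):
    """Greedy local search: repeatedly move to a nearby point with a lower value."""
    current = list(start)
    for _ in range(max_iters):
        best_neighbor, best_value = None, f(current)
        # Try a small step forward and backward along each coordinate.
        for i in range(len(current)):
            for delta in (step, -step):
                candidate = list(current)
                candidate[i] += delta
                value = f(candidate)
                if value < best_value:
                    best_neighbor, best_value = candidate, value
        if best_neighbor is None:  # no direction improves: we are at a (local) minimum
            break
        current = best_neighbor
    return current

# Example: minimize (x - 2)^2 + (y + 1)^2 starting from the origin
print(local_search(lambda p: (p[0] - 2) ** 2 + (p[1] + 1) ** 2, [0.0, 0.0]))
```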
However, what if we have a function with many local minima and want to find the point that globally minimizes the function?
A local search process may stop at any of the many local minima of the function. How can an algorithm understand if it has found one of the many local minima or the global minimum?
Actually, there is no way to exactly establish this. The only option is to explore different areas of the search domain to increase the likelihood of finding the global minimum among the various local minima.
Many methods have been developed to realize this idea of “exploring” the domain. Among the most popular are genetic algorithms, which are inspired by biological evolution.
These are Monte Carlo methods in which a starting population of points is created and then evolved: pairs of points are combined to generate new ones (the coordinates of the two parents are mixed to obtain the coordinates of a new point). In this pairing process, random “genetic mutations” occur, so the pairing gives a randomized result. As successive generations of points are simulated, a process of natural selection intervenes that keeps only the best points, i.e., those giving the lowest values of the function to be minimized.
In the meantime, the algorithm also keeps track of the best “individual” found so far.
Continuing with this process, the points tend to move toward local minima while exploring many areas of the optimization domain. At some point the process is stopped (usually by setting a limit on the number of generations) and the best individual is taken as the estimate of the global minimum (usually this is then used as the starting point for a local optimization algorithm that refines the result).
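The sketch below is a deliberately minimal Python version of this scheme; the specific crossover, mutation, and selection rules are simplistic choices of mine, and the Rastrigin test function is just a standard example with many local minima.

```python
import math
import random

def genetic_minimize(f, dim, pop_size=50, n_generations=200,
                     bounds=(-5.0, 5.0), mutation_scale=0.3):
    """Tiny genetic algorithm: crossover of coordinates, random mutation, selection."""
    low, high = bounds
    # Starting population of random points.
    population = [[random.uniform(low, high) for _ in range(dim)]
                  for _ in range(pop_size)]
    best = min(population, key=f)
    for _ in range(n_generations):
        children = []
        for _ in range(pop_size):
            # Pairing: each child's coordinates are mixed from two parent points...
            parent1, parent2 = random.sample(population, 2)
            child = [random.choice(pair) for pair in zip(parent1, parent2)]
            # ...and perturbed by a random "genetic mutation".
            child = [c + random.gauss(0.0, mutation_scale) for c in child]
            children.append(child)
        # Natural selection: keep only the best points among parents and children.
        population = sorted(population + children, key=f)[:pop_size]
        # Keep track of the best "individual" ever seen.
        best = min(best, population[0], key=f)
    return best

# Rastrigin function: many local minima, global minimum at the origin.
def rastrigin(p):
    return 10 * len(p) + sum(x * x - 10 * math.cos(2 * math.pi * x) for x in p)

print(genetic_minimize(rastrigin, dim=2))
```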
3) Generating Probability Distributions
The last kind of application concerns the generation of probability distributions that can’t be derived through analytical methods.
Example: estimate the probability distribution of damage caused by tornadoes in the United States over a period of one year.
In this type of analysis there are two sources of uncertainty: how many tornadoes will occur in a year and how much damage each tornado causes. Even if you can assign a probability distribution to each of these two levels, it’s not always possible to combine them analytically into an annual loss distribution.
It’s much simpler to do a Monte Carlo simulation like this:
- draw a random number from the distribution of the number of annual events
- if you got n events in the previous step, take n draws from the distribution of the loss caused by a single event
- add up those n draws to obtain a value representing the annual loss caused by n events
Repeating these three steps many times, you generate a sample of annual losses that you can use to estimate the probability distribution you couldn’t derive analytically.
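As an illustration, here is a minimal Python sketch of such a simulation. The choice of a Poisson distribution for the yearly number of events, a lognormal distribution for the single-event loss, and all parameter values are assumptions of mine for the example, not real tornado statistics.

```python
import numpy as np

rng = np.random.default_rng(seed=0)

def simulate_annual_losses(n_years=100_000, mean_events_per_year=3.0,
                           loss_mu=10.0, loss_sigma=1.0):
    """Generate a Monte Carlo sample of total annual losses.

    Frequency: Poisson number of events per year; severity: lognormal loss per
    event. Both distributions and all parameters are illustrative assumptions.
    """
    annual_losses = []
    for _ in range(n_years):
        # Step 1: draw the number of events in the year.
        n_events = rng.poisson(mean_events_per_year)
        # Step 2: draw a loss for each single event.
        losses = rng.lognormal(loss_mu, loss_sigma, size=n_events)
        # Step 3: the annual loss is the sum of the single-event losses.
        annual_losses.append(losses.sum())
    return np.array(annual_losses)

sample = simulate_annual_losses()
print("mean annual loss:", sample.mean())
print("99th percentile:", np.percentile(sample, 99))
```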
If you liked this post, consider sharing it or signing up for the newsletter.