
Uncertainty analysis and Monte Carlo methods

Sean Salleh
August 15, 2013

Uncertainty analysis is often a prominent part of studies in sectors such as the environment. The uncertainty itself is determined by a number of elements: the measurements available for use as input data, the identification of extreme or limit values of such data, knowledge of the distribution of the data and the mechanisms affecting it, and any additional expert opinion that can usefully be added. Uncertainty in the data itself may come from the definition of what data is to be collected or used, from the natural variability of the process generating the data, from measuring or sampling the data, or from the use of reference data with incomplete descriptions.

Example of combining two uncertainty distributions to generate a third one

Image source: www.fda.gov

When should you use Monte Carlo simulation?

Uncertainty propagation equations exist for situations that allow their use: typically normally or Poisson distributed uncertainties that are relatively small, with no significant correlation between the factors defining the model. Outside this simpler case, or when uncertainties are larger, Monte Carlo simulation is a technique that handles non-normal distributions, complex algorithms and correlations between the input factors of the model in question. A distribution is first determined for each parameter (see below). Values are then generated from each distribution and used as input for the model to produce output. These two steps are repeated as many times as is reasonably necessary to build up an outcome curve or distribution in its own right.
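The two repeated steps can be sketched in a few lines of Python with NumPy. This is only an illustration under stated assumptions: the `exposure` model, the parameter names and the distribution settings are hypothetical and are not taken from any particular study or from Analytica.

```python
import numpy as np

def monte_carlo(model, samplers, n_runs=10_000, seed=0):
    """Propagate input uncertainty through `model` by repeated sampling."""
    rng = np.random.default_rng(seed)
    # Step 1: generate n_runs values for each input from its distribution.
    inputs = {name: draw(rng, n_runs) for name, draw in samplers.items()}
    # Step 2: feed the generated values to the model; the array of results
    # approximates the output distribution.
    return model(**inputs)

# Hypothetical model (an assumption for illustration only).
def exposure(concentration, intake_rate):
    return concentration * intake_rate

# Hypothetical input distributions (parameter values are made up).
samplers = {
    "concentration": lambda rng, n: rng.lognormal(mean=0.0, sigma=0.5, size=n),
    "intake_rate": lambda rng, n: rng.triangular(1.0, 2.0, 4.0, size=n),
}

output = monte_carlo(exposure, samplers)
low, median, high = np.percentile(output, [5, 50, 95])
print(f"median={median:.2f}, 90% interval=({low:.2f}, {high:.2f})")
```

The result is not a single number but an array of outcomes, which can be summarized by percentiles or plotted as a histogram.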

Choosing an input distribution

Simple example of a lognormal distribution

Image source: commons.wikimedia.org

Distributions vary according to the data being modeled. Human height approximately follows a normal distribution, whereas concentrations of chemicals in the environment are often better described by a lognormal distribution. If there are known boundaries, these distributions may also be expressed in a truncated form. Where information is lacking about the processes that generate the data, other possibilities exist. The uniform distribution (for example, the position of rain drops falling on a wire) gives equal probability to all values in a given range. The triangular distribution sets upper and lower limits and a preferred value somewhere between them.
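As a rough sketch, each of these input distributions can be sampled with NumPy's random generator. All parameter values below are illustrative assumptions, and `truncated_normal` is a hypothetical helper that uses simple rejection sampling, which is one of several ways to truncate a distribution.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 5_000

# Illustrative parameter values only; real values come from data or expert opinion.
normal = rng.normal(loc=170.0, scale=10.0, size=n)        # e.g. height in cm
lognormal = rng.lognormal(mean=0.0, sigma=1.0, size=n)    # e.g. a concentration
uniform = rng.uniform(low=0.0, high=1.0, size=n)          # equal probability in range
triangular = rng.triangular(left=2.0, mode=3.0, right=6.0, size=n)

def truncated_normal(rng, loc, scale, lower, upper, size):
    """Rejection sampling: redraw any value that falls outside [lower, upper]."""
    out = rng.normal(loc, scale, size)
    bad = (out < lower) | (out > upper)
    while bad.any():
        out[bad] = rng.normal(loc, scale, int(bad.sum()))
        bad = (out < lower) | (out > upper)
    return out

truncated = truncated_normal(rng, 170.0, 10.0, 150.0, 200.0, n)
print(truncated.min(), truncated.max())
```

Rejection sampling is easy to read but becomes slow when the allowed range covers only a small part of the distribution.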

Modeling according to sensitivity

Preliminary sensitivity analyses help to identify suitable models and to decide which parameters to include or exclude. Even when enough computing power is available to handle large numbers of parameters, conducting Monte Carlo based sensitivity analyses to find out which parameters to focus on helps to avoid unnecessary data gathering effort. It also leads to a more meaningful propagation and definition of overall uncertainty, which remains the measure of the ‘goodness’ of the model’s result.
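One common way to carry out a Monte Carlo based sensitivity analysis is to rank-correlate each sampled input with the model output. The sketch below assumes that approach; the three inputs and the additive model are hypothetical, and other sensitivity measures could be substituted.

```python
import numpy as np

def ranks(x):
    # Rank of each value (0 = smallest); continuous samples should have no ties.
    return np.argsort(np.argsort(x))

def rank_correlation(x, y):
    # Spearman-style rank correlation computed from the ranks of x and y.
    return np.corrcoef(ranks(x), ranks(y))[0, 1]

rng = np.random.default_rng(1)
n = 5_000

# Hypothetical inputs: "a" is very uncertain, "b" less so, "c" is unused.
inputs = {
    "a": rng.normal(10.0, 3.0, n),
    "b": rng.normal(10.0, 1.0, n),
    "c": rng.uniform(0.0, 1.0, n),
}
output = inputs["a"] + inputs["b"]  # hypothetical model that ignores "c"

sensitivity = {name: rank_correlation(x, output) for name, x in inputs.items()}
for name, rho in sorted(sensitivity.items(), key=lambda kv: -abs(kv[1])):
    print(f"{name}: {rho:+.2f}")
```

Inputs with a correlation near zero contribute little to the output uncertainty and are candidates for fixing at a nominal value rather than gathering more data.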

Sampling with Simple Random Sampling or Latin Hypercube Sampling

Whether a Monte Carlo simulation uses SRS (Simple Random Sampling) or LHS (Latin Hypercube Sampling) depends on the available sample size. The cross-over zone may be taken to be in the ‘few thousands’: a larger sample size suggests using SRS, a smaller one suggests LHS. The difference with LHS is that the distribution for a parameter is first divided into sections of equal probability, and one random number is then selected per section to form the total input into the model. When the use of LHS is indicated, the simulation converges more rapidly on the final outcome.

If you’d like to know how Analytica, the modeling software from Lumina, can give you the right options for both uncertainty analysis and Monte Carlo modeling, then try a free evaluation of Analytica to see what it can do for you.
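The stratification step that distinguishes LHS from SRS can be sketched as follows. This is a minimal illustration, not Analytica's implementation: `lhs_uniform`, `to_normal` and `mean_estimates` are hypothetical helper names, and the comparison simply estimates the mean of a standard normal input many times with each method.

```python
import numpy as np
from statistics import NormalDist

def lhs_uniform(rng, n_samples, n_params):
    """Latin hypercube sample on the unit cube, shape (n_samples, n_params)."""
    # One random point inside each of n_samples equal-probability sections.
    offsets = rng.random((n_samples, n_params))
    u = (np.arange(n_samples)[:, None] + offsets) / n_samples
    # Shuffle each column so sections pair up at random across parameters.
    for j in range(n_params):
        u[:, j] = rng.permutation(u[:, j])
    return u

def to_normal(u, mu=0.0, sigma=1.0):
    """Map uniform values to a normal distribution via the inverse CDF."""
    inv = NormalDist(mu, sigma).inv_cdf
    u = np.clip(u, 1e-12, 1.0 - 1e-12)  # inv_cdf is undefined at exactly 0 or 1
    return np.array([inv(p) for p in u])

def mean_estimates(sampler, n_samples=100, repeats=200, seed=0):
    """Repeat a small simulation and collect the estimated mean each time."""
    rng = np.random.default_rng(seed)
    return np.array([to_normal(sampler(rng, n_samples)).mean()
                     for _ in range(repeats)])

srs = mean_estimates(lambda rng, n: rng.random(n))
lhs = mean_estimates(lambda rng, n: lhs_uniform(rng, n, 1)[:, 0])
print(f"spread of mean estimate: SRS={srs.std():.4f}, LHS={lhs.std():.4f}")
```

With the same number of samples per run, the LHS estimates should scatter less from run to run than the SRS estimates, which is what faster convergence means in practice.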

Sean Salleh

Sean Salleh is a data scientist with experience in guiding marketing strategy from building marketing mix models, forecasting models, scenario planning models, and algorithms. He is passionate about consumer technologies and resource management. He has master's degrees in Operations Research from University of California Irvine and Mathematics from Northeastern University.