Simulation Models for Risk Analysis Workshop

Welcome Back From Spring Break
• Brief Review
– Forecasting for 3 weeks
– Simulation
• Motivation for building simulation models
• Steps for developing simulation models
• Stochastic variables and why they are included in models
• What a financial simulation model is used for
• Parametric Distributions (Normal, Uniform, Bernoulli)
Test Results 2016
Mean 81.28, Std Dev 10.38, Range 59-100
[Figures: CDF of Grades for Test 2; Histogram of Grades for Test 2; PDF of Grades for Test 2. Horizontal axis: Test 2 score, 55 to 100; vertical axis: probability.]
Test Results 2015
Mean 82.61, Std Dev 10.96, Range 52-99
[Figures: CDF for Exam 2; Histogram for Exam 2; PDF Approximation for Exam 2. Horizontal axis: Exam 2 score, 50 to 100; vertical axis: probability.]
Non-Parametric vs. Parametric Distributions
• Non-Parametric Distributions: no fixed functional
form that depends on parameters, for example:
– Discrete Uniform
– Empirical
– GRKS
– Triangle
• Parametric Distributions (covered last lecture)
– Fixed form, shape dependent on parameters
– Uniform, Normal, Beta, Gamma, Bernoulli
Discrete (Uniform) Empirical
• The Discrete Empirical distribution is used when only
fixed values can occur
– Each value has an equal probability of being drawn
– No interpolation between observed values
• Examples of Discrete Empirical distributions
– Number of laborers who show up for work
– Number of steers on a cattle truck
– Simulating a fair die: 1, 2, 3, 4, 5, 6
– Letter grades: A, B, C, D, F
Discrete (Uniform) Empirical Distribution
[Figures: PDF for DE(3, 4, 6, 7) and CDF for DE(3, 4, 6, 7). Horizontal axis: X at 3, 4, 6, 7; CDF vertical axis: 0 to 1 in steps of .25.]
PDF and CDF for a Discrete Uniform Distribution.
- Parameters for a DE(x1, x2, x3, …, xn) are based on history
- Discrete Empirical means that each observed value, Xi, has
an equal probability of being observed
Row   A    B    C
1     10
2     12
3     20
4     15
5     13
=DEMPIRICAL(A1:A5) function in Simetar
Discrete Uniform Empirical
• Simulate this type of random variable two
ways in Simetar
– Discrete empirical with equal probabilities:
=DEMPIRICAL(A1:A5)
– Alternatively, simulate the DE using a uniform standard
deviate (USD) and IF statements
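The USD-and-IF approach can be sketched outside the spreadsheet. This is a minimal Python sketch, not Simetar's implementation: the function name discrete_empirical and the seed are illustrative assumptions, and the data are the five values shown in A1:A5. Multiplying the uniform draw by n and truncating plays the role of the chain of IF statements.

```python
import random

def discrete_empirical(values, u):
    """Map a uniform draw u in [0, 1] to one of the values, each with probability 1/n."""
    n = len(values)
    index = min(int(u * n), n - 1)  # guard against u == 1.0
    return values[index]

history = [10, 12, 20, 15, 13]   # cells A1:A5 from the slide
rng = random.Random(42)          # fixed seed, chosen only for repeatability
draws = [discrete_empirical(history, rng.random()) for _ in range(10_000)]
share_of_20 = draws.count(20) / len(draws)  # should be close to 1/5
```

Only the five observed values can ever be drawn; there is no interpolation between them.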
Discrete Empirical -- Alphanumeric
• =RANDSORT(I1:I5)
• To randomly shuffle 5 names: highlight 5 cells,
type =RANDSORT(I1:I5), then
press Ctrl+Shift+Enter
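For comparison, the same kind of random shuffle can be sketched in Python. The names below are hypothetical stand-ins for cells I1:I5, and random.sample is used as an analogue of RANDSORT, not as a description of how Simetar works internally.

```python
import random

names = ["Ann", "Bob", "Cho", "Dee", "Eli"]  # hypothetical stand-ins for cells I1:I5
rng = random.Random(7)                        # fixed seed, for repeatability only
shuffled = rng.sample(names, k=len(names))    # random ordering without replacement
```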
Empirical Distribution
• An empirical distribution is defined entirely by the observed
data; no distributional shape is assumed
• Parameters to simulate an empirical distribution
– Forecasted values: means (Ῡ) or forecasts (Ŷ)
– Calculate percentage deviation from the mean or forecast = (Yi- Ŷi) / Ŷi
– Sort the deviations from the mean (or forecast) from low to high
– Assign a cumulative probability to each sorted deviate (usually assume
equal probability for each data point)
• Cumulative probabilities go from 0.0 to 1.0; named F(xi) or F(Si)
– Assume the distribution is continuous, so interpolate between the
observed points
• Use the Inverse Transform formula to simulate the
distribution
• This requires simulation of a USD to use in the interpolation
• Use Emp icon to estimate parameters in Simetar
PDF and CDF for an Empirical Dist.
[Figures: Probability Density Function, f(x) plotted against X from min to max; Cumulative Distribution Function, F(x) from 0.0 to 1.0 plotted against X from min to max.]
We interpolate the dark black line in the CDF from the discrete CDF and
use it to approximate a continuous distribution with the Inverse
Transform method
Inverse Transform for Simulating an Empirical Distribution
[Figure: CDF with F(x) from 0.0 to 1.0 on the vertical axis and sorted values Y1 through Y7 on the horizontal axis.]
• Start with a random USD, e.g., U(0,1) = 0.45
• Interpolate along the Ỹ axis using the USD value
• The stochastic Ỹi is derived by linear interpolation
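The inverse transform step can be illustrated with a short Python sketch. The seven sorted values and the equally spaced F(x) are hypothetical, and the helper name emp is only an illustrative stand-in for the interpolation idea, not Simetar's EMP function.

```python
from bisect import bisect_right

def emp(sorted_values, cum_probs, u):
    """Inverse transform: linearly interpolate the sorted values at probability u."""
    if u <= cum_probs[0]:
        return sorted_values[0]
    if u >= cum_probs[-1]:
        return sorted_values[-1]
    j = bisect_right(cum_probs, u)           # first index with cum_probs[j] > u
    p0, p1 = cum_probs[j - 1], cum_probs[j]
    x0, x1 = sorted_values[j - 1], sorted_values[j]
    return x0 + (x1 - x0) * (u - p0) / (p1 - p0)

# Hypothetical sorted observations Y1..Y7 with equally spaced F(x) from 0.0 to 1.0.
y_sorted = [12.0, 15.0, 21.0, 24.0, 30.0, 33.0, 41.0]
f_x = [i / (len(y_sorted) - 1) for i in range(len(y_sorted))]

usd = 0.45                                   # the USD value used on the slide
y_stochastic = emp(y_sorted, f_x, usd)       # falls between Y3 = 21 and Y4 = 24
```

With these numbers the USD of 0.45 lies 70% of the way from F = 2/6 to F = 3/6, so the draw lies 70% of the way from 21 to 24.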
Using the Empirical Distribution
• Empirical distribution should be used if the
– Random variable is continuous over its range,
– You have fewer than 20 observations for the variable, and/or
– You cannot easily estimate parameters for the true PDF
• Simulate crop yields as an Empirical distribution when
you have fewer than 20 historical values
– Assume we have 10 observed yields:
• Yield can be any positive value, not only discrete values
• We don’t have enough observations to test for
normality
• We know the 10 random values were observed with a
probability of 1/10, or one observation each year
– So F(x) goes from 0.0 to 1.0 in equal increments
Simulating Empirical Distributions
• Empirical distribution is “best” simulated as percent
deviations from mean or trend:
percent deviates from mean = (Yt – Ῡt )/Ῡt
• Parameters are:
– Mean of the data is either Ῡt or Ŷt
– Sorted deviations from mean or forecasted Ŷ are
St = Sort [(Yt – Ῡt )/Ῡt ]
or
St = Sort [(Yt – Ŷt)/ Ŷt ]
– Probabilities for the St values are called F(St) or F(xi)
and MUST range from 0.0 to 1.0
• Use the parameters to simulate random variable Ỹ:
Ỹ = Ῡt * (1 + EMP(St, F(St), [USD]) )
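The parameter steps and the simulation formula above can be sketched in Python. This is only a sketch under stated assumptions: the 10 yields are hypothetical, the helper emp is an illustrative linear-interpolation routine (not Simetar's EMP), and F(St) is assumed to rise in equal steps from 0.0 to 1.0.

```python
import random
from bisect import bisect_right
from statistics import mean

def emp(sorted_values, cum_probs, u):
    """Inverse transform: linearly interpolate the sorted values at probability u."""
    if u <= cum_probs[0]:
        return sorted_values[0]
    if u >= cum_probs[-1]:
        return sorted_values[-1]
    j = bisect_right(cum_probs, u)
    p0, p1 = cum_probs[j - 1], cum_probs[j]
    x0, x1 = sorted_values[j - 1], sorted_values[j]
    return x0 + (x1 - x0) * (u - p0) / (p1 - p0)

# Hypothetical history of 10 crop yields (not from the lecture).
yields = [42.0, 55.0, 38.0, 61.0, 47.0, 52.0, 35.0, 58.0, 49.0, 44.0]
y_bar = mean(yields)

# S_t: fractional deviations from the mean, sorted from low to high.
s_t = sorted((y - y_bar) / y_bar for y in yields)
# F(S_t): equal steps from 0.0 to 1.0 (an assumed convention).
f_s = [i / (len(s_t) - 1) for i in range(len(s_t))]

rng = random.Random(2016)  # fixed seed, for repeatability only
simulated = [y_bar * (1 + emp(s_t, f_s, rng.random())) for _ in range(500)]
```

Because the deviates are bounded by the smallest and largest observations, every simulated yield stays within the historical range.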
Empirical Distribution -- No Trend
• Given a random variable, Ỹ, with 11 observations
• Develop the parameters for simulating the variable using the mean to forecast
the deterministic component:
• Parameter for deterministic component is
the mean or the second column
• Calculate the stochastic component or ê as:
êi = Yi – Ῡ
• Convert residuals to fractional deviations of
the forecast mean value: Devi = êi / Ῡ
• Sort the Devi values from low to high (Si)
and assign the probabilities of Si or F(Si)
• Simulate Ỹ in two steps:
Stoch Devi = EMP(Si , F(x), [USD] )
Stoch ỸT+i = ῩT+i * (1 + Stoch Devi)
• Note: Devi = (Yi – Ῡ) / Ῡ; rearranging terms gives
(Ῡ * Devi) = Yi – Ῡ
so
Ỹi = Ῡ + (Ῡ * Devi)
Empirical Dist. -- With Trend
Parameters for EMP() if deterministic component is the trend forecast
• Calculate the stochastic component
or ê as:
êi = Yi – Ŷi
• Convert residual to fractional deviate of
forecast value: Devi = êi / Ŷi
• Sort the Devi values from low to high (Si)
and calculate the probabilities of Si or F(Si)
• Simulate Ỹ as follows:
Stoch Devi = EMP(Si, F(x), [USD] )
ỸT+i = ŶT+i * (1 + Stoch Devi)
• Derived from: Stoch Devi = (Yi - Ŷi) / Ŷi
or
Yi – Ŷi = (Ŷi * Stoch Devi)
or
Ỹi = Ŷi + (Ŷi * Stoch Devi)
• ŶT+i could have been developed from a
structural or time series equation, in which case the êi
are the residuals from the regression
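A trend-based version can be sketched the same way. Everything here is an illustrative assumption: the eight historical values are hypothetical, the trend is a simple linear OLS fit on time, emp is a stand-in interpolation helper rather than Simetar's EMP, and F(Si) is assumed to rise in equal steps from 0.0 to 1.0.

```python
import random
from bisect import bisect_right

def emp(sorted_values, cum_probs, u):
    """Inverse transform: linearly interpolate the sorted values at probability u."""
    if u <= cum_probs[0]:
        return sorted_values[0]
    if u >= cum_probs[-1]:
        return sorted_values[-1]
    j = bisect_right(cum_probs, u)
    p0, p1 = cum_probs[j - 1], cum_probs[j]
    x0, x1 = sorted_values[j - 1], sorted_values[j]
    return x0 + (x1 - x0) * (u - p0) / (p1 - p0)

# Hypothetical history for periods t = 1..8 (not from the lecture).
history = [50.0, 54.0, 53.0, 59.0, 61.0, 60.0, 66.0, 69.0]
t = list(range(1, len(history) + 1))

# Ordinary least squares for the linear trend Y-hat_t = a + b * t.
t_bar = sum(t) / len(t)
y_bar = sum(history) / len(history)
b = (sum((ti - t_bar) * (yi - y_bar) for ti, yi in zip(t, history))
     / sum((ti - t_bar) ** 2 for ti in t))
a = y_bar - b * t_bar
fitted = [a + b * ti for ti in t]

# Fractional deviates from the trend, sorted, with assumed equal-step F(S).
s = sorted((yi - fi) / fi for yi, fi in zip(history, fitted))
f_s = [i / (len(s) - 1) for i in range(len(s))]

forecast = a + b * (len(history) + 1)   # Y-hat for period T+1
rng = random.Random(1)                  # fixed seed, for repeatability only
draws = [forecast * (1 + emp(s, f_s, rng.random())) for _ in range(500)]
```

The stochastic values are centered on the trend forecast for period T+1 and carry the same relative deviations seen around the historical trend.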
3 Ways to Simulate Emp Distribution
• Let: Si be in B1:B10 and F(x) in A1:A10
• If Si are expressed as actual values
=EMP(B1:B10)
• If Si are residuals from the mean or OLS
= Ῡ + EMP(B1:B10, A1:A10)
• If Si are fractional deviates from mean or
trend: Si = (ê / Ŷ)
= Ŷ * (1 + EMP(B1:B10, A1:A10))
Memorize these 3 formulas. They are very important!
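A small Python sketch can show why the three forms agree when the deterministic component is the historical mean. The data are hypothetical, emp is an illustrative interpolation helper (not Simetar's EMP), and F(x) is assumed to rise in equal steps from 0.0 to 1.0. Linear interpolation is unchanged by shifting or rescaling the sorted values, so the same USD yields the same stochastic value in all three cases.

```python
from bisect import bisect_right
from statistics import mean

def emp(sorted_values, cum_probs, u):
    """Inverse transform: linearly interpolate the sorted values at probability u."""
    if u <= cum_probs[0]:
        return sorted_values[0]
    if u >= cum_probs[-1]:
        return sorted_values[-1]
    j = bisect_right(cum_probs, u)
    p0, p1 = cum_probs[j - 1], cum_probs[j]
    x0, x1 = sorted_values[j - 1], sorted_values[j]
    return x0 + (x1 - x0) * (u - p0) / (p1 - p0)

# Hypothetical 10 observations standing in for the data behind B1:B10.
y = [42.0, 55.0, 38.0, 61.0, 47.0, 52.0, 35.0, 58.0, 49.0, 44.0]
y_bar = mean(y)
f_x = [i / (len(y) - 1) for i in range(len(y))]   # stands in for A1:A10

actual = sorted(y)                                 # Si as actual values
residuals = sorted(v - y_bar for v in y)           # Si as residuals from the mean
fractions = sorted((v - y_bar) / y_bar for v in y) # Si as fractional deviates

usd = 0.45
way_1 = emp(actual, f_x, usd)
way_2 = y_bar + emp(residuals, f_x, usd)
way_3 = y_bar * (1 + emp(fractions, f_x, usd))
```

If a forecast other than the historical mean replaces the mean in the third form, the results differ by design: the draw is rescaled around the new forecast.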
Simulating an Emp Distribution
• Advantages of Emp Distribution
– It lets the data define the shape of the distribution
– Does not force an assumed distribution shape on the
variable
– The larger the number of observations in the sample, the
more closely Emp approximates the “true” distribution
– Avoids assuming a parametric distribution
• Disadvantages of Emp Distribution
– It has finite min and max values
– It does not adhere to known probabilities and parameters
– Parameters can be difficult to estimate without Simetar
Simulating an Emp Distribution
• Advantages of specifying the Si’s as
fractional deviates for forecasted values
– Guarantees the “relative risk” for a random
variable is the same as the historical period
• Coefficient of Variation for the simulated data is
constant over time: CVt = (σ / Ῡt) * 100
– Allows you to use any mean (Ŷ or Ῡ) for the
simulated planning horizon and simulated
values have same CV as the historical period
• Historical Ῡ can be 100 and the mean for the
forecast period Ŷ can be 150 and the Ỹ values will
have the same CV as the historical data.
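The constant-CV property can be checked numerically. In this sketch the fractional deviates are hypothetical, emp is an illustrative interpolation helper (not Simetar's EMP), and the same seed is reused so both runs see the same USDs. The means of 100 and 150 follow the example on the slide.

```python
import random
from bisect import bisect_right
from statistics import mean, pstdev

def emp(sorted_values, cum_probs, u):
    """Inverse transform: linearly interpolate the sorted values at probability u."""
    if u <= cum_probs[0]:
        return sorted_values[0]
    if u >= cum_probs[-1]:
        return sorted_values[-1]
    j = bisect_right(cum_probs, u)
    p0, p1 = cum_probs[j - 1], cum_probs[j]
    x0, x1 = sorted_values[j - 1], sorted_values[j]
    return x0 + (x1 - x0) * (u - p0) / (p1 - p0)

# Hypothetical sorted fractional deviates and assumed equal-step F(x).
fractions = [-0.27, -0.21, -0.09, -0.02, 0.00, 0.02, 0.08, 0.14, 0.21, 0.27]
f_x = [i / (len(fractions) - 1) for i in range(len(fractions))]

def simulate(mean_value, n=500, seed=3):
    """Simulate n values as mean_value * (1 + stochastic fractional deviate)."""
    rng = random.Random(seed)
    return [mean_value * (1 + emp(fractions, f_x, rng.random())) for _ in range(n)]

def cv(values):
    """Coefficient of variation in percent."""
    return pstdev(values) / mean(values) * 100

cv_hist = cv(simulate(100.0))      # historical mean of 100
cv_forecast = cv(simulate(150.0))  # forecast mean of 150
```

Scaling the mean scales the standard deviation by the same factor, so the two CV values match up to floating-point rounding.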
Example of Assuming a Distribution
GRKS Distribution
• When we have insufficient historical data to
estimate the parameters of a
parametric or Empirical distribution
– Need to use expert opinion or
– Use the limited data to define a distribution
– Some people resort to a triangle distribution,
but it under-represents values near the min and max
• GRKS distribution developed to simulate
random variables with limited data
GRKS Distribution
• Gray, Richardson, Klose and Schumann
(GRKS) distribution requires three
parameters
– Minimum: 97.5% of observations are greater than this
parameter
– Middle: average or median, 50% of the observations
will be less than this parameter
– Maximum: 97.5% of the values are less than this
parameter
• Parameters are generally set based on expert
opinion or limited data (fewer than 10
observations)
GRKS Distribution
• Advantage over triangle distribution
– Recognizes that there is a small probability of a value
lower (or greater) than what we have observed in the
past or the expert’s expectations
– Triangle distribution is generally parameterized by
asking experts what is:
• the lowest value we can expect 1 year out of 10
• the highest value we can expect 1 year out of 10
– The problem is that the triangle distribution will
simulate the min or max only about 2% of the time, when
these values should be observed 10% of the time
based on the experts' responses to the questions!
GRKS Distribution
• Results of Using GRKS option in Simetar to
estimate the parameters
GRKS Distribution
• Simulate the GRKS using the F(x) and Sorted X
values using =EMP(Sx, F(x))
• Results for the parameters are presented here
Simetar Simulation Results for 500 Iterations. 11:24:08 PM 3/16/2016 (2 sec.). © 2016.
Variable   Sheet1!F21
Mean       54.06372     Minimum   20     Prob(x<20)    0.024575
StDev      20.58581     Middle    50     Prob(x<50)    0.501085
CV         38.07693     Maximum   100    Prob(x<100)   0.977384
Min        6.080955
Max        123.3198
Triangle Distribution (20, 50, 100)
• Note that values near the minimum (x < 25) are observed
only about 1% of the time
• Note that values near the maximum (x > 95) are observed
less than 1% of the time
• Values <= middle are observed only about 37.5% of the time
Simetar Simulation Results for 500 Iterations. 11:31:00 PM 3/16/2016 (3 sec.). © 2016.
Variable   Sheet1!F23
Mean       56.66571     Minimum   20     Prob(x<25)   0.011892
StDev      16.52264     Middle    50     Prob(x<50)   0.375633
CV         29.1581      Maximum   100    Prob(x<95)   0.994634
Min        21.46559
Max        98.17278
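The simulated triangle probabilities can be checked against the closed-form triangular CDF. This is a sketch that assumes the three parameters (20, 50, 100) are the minimum, mode, and maximum; the formula is the standard triangular CDF, not something taken from the Simetar output.

```python
def triangle_cdf(x, low, mode, high):
    """CDF of a triangular distribution with the given minimum, mode, and maximum."""
    if x <= low:
        return 0.0
    if x >= high:
        return 1.0
    if x <= mode:
        return (x - low) ** 2 / ((high - low) * (mode - low))
    return 1.0 - (high - x) ** 2 / ((high - low) * (high - mode))

low, mode, high = 20.0, 50.0, 100.0
p_below_25 = triangle_cdf(25.0, low, mode, high)        # 25 / 2400
p_below_50 = triangle_cdf(50.0, low, mode, high)        # 30 / 80
p_above_95 = 1.0 - triangle_cdf(95.0, low, mode, high)  # 25 / 4000
```

The exact values are roughly 0.010, 0.375, and 0.006, in line with the 500-iteration results of 0.011892, 0.375633, and 1 - 0.994634.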
GRKS and Triangle Distributions
[Figure: CDF for the GRKS and Triangle Distributions (20, 50, 100). Horizontal axis: 0 to 140; vertical axis: Prob, 0 to 1; series: GRKS, Triangle.]
GRKS Distribution
• Easy to modify the GRKS distribution to
represent any subjective risk or random
variable. This makes the dist. very flexible.
• From the Simetar Toolbar click on
GRKS Distribution and fill in the menu
• Edit table of deviates for Xs and F(Xs) to
change the distribution shape to conform to
your subjective expectations
• Simulate distribution using =EMP(Si , F(x))
GRKS Distribution
• The GRKS menu asks for
– Minimum
– Middle
– Maximum
– No. of intervals in Std Deviations beyond the
min and max. I like 4 intervals to give more
flexibility for customizing the distribution.
– Always request a chart so you can see what
your distribution looks like after you make
changes in the X’s or Prob(x)’s
Modified GRKS Distribution
GRKS in Simetar provides the F(x) and Sorted values for the distribution so
they can be edited to better fit your expectations for the random variable.
The bold F(x) and Sx values can be changed to develop “your own” dist.
Simulate it as EMP( Sx, F(x)). I changed the Bold values below.
GRKS Distribution With the Following Parameters:
Minimum   Mode   Maximum
20        50     100

             Interval   Prob(Xi)   Xi
Pseudo Min   1          0.0000     20.00
             2          0.0030     20.00
             3          0.0062     20.00
             4          0.0122     20.00
Minimum      5          0.0228     20.00
             6          0.0401     20.00
             7          0.0668     20.00
             8          0.1000     20.00
             9          0.1587     35.00
             10         0.2266     38.40
             11         0.3085     42.50
             12         0.4013     46.13
Mode         13         0.5000     50.00
             14         0.5987     56.44
             15         0.6915     62.50
             16         0.7734     69.33
             17         0.8413     75.00
             18         0.9000     100.00
             19         0.9332     100.00
             20         0.9599     100.00
Maximum      21         0.9772     100.00
             22         0.9878     100.00
             23         0.9938     100.00
             24         0.9970     100.00
Pseudo Max   25         1.0000     100.00

[Figure: GRKS Distribution (20,50,100), modified to force 10% chance of Min and Max. Horizontal axis: 0.00 to 100.00; vertical axis: 0.0000 to 1.0000.]
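To see the effect of the modification, the table of Prob(Xi) and Xi values can be simulated by inverse transform. This is a Python sketch, not Simetar's EMP function: the helper emp, the seed, and the number of draws are illustrative assumptions, while the 25 probability and value pairs are the ones in the modified table.

```python
import random
from bisect import bisect_right

def emp(sorted_values, cum_probs, u):
    """Inverse transform: linearly interpolate the sorted values at probability u."""
    if u <= cum_probs[0]:
        return sorted_values[0]
    if u >= cum_probs[-1]:
        return sorted_values[-1]
    j = bisect_right(cum_probs, u)
    p0, p1 = cum_probs[j - 1], cum_probs[j]
    x0, x1 = sorted_values[j - 1], sorted_values[j]
    return x0 + (x1 - x0) * (u - p0) / (p1 - p0)

# Prob(Xi) and Xi for intervals 1..25 of the modified GRKS table.
f_x = [0.0000, 0.0030, 0.0062, 0.0122, 0.0228, 0.0401, 0.0668, 0.1000,
       0.1587, 0.2266, 0.3085, 0.4013, 0.5000, 0.5987, 0.6915, 0.7734,
       0.8413, 0.9000, 0.9332, 0.9599, 0.9772, 0.9878, 0.9938, 0.9970, 1.0000]
s_x = ([20.00] * 8
       + [35.00, 38.40, 42.50, 46.13, 50.00, 56.44, 62.50, 69.33, 75.00]
       + [100.00] * 8)

rng = random.Random(2016)  # fixed seed, for repeatability only
draws = [emp(s_x, f_x, rng.random()) for _ in range(20_000)]

share_at_min = sum(d == 20.0 for d in draws) / len(draws)
share_at_max = sum(d == 100.0 for d in draws) / len(draws)
```

Because every interval up to Prob(Xi) = 0.10 maps to 20 and every interval from 0.90 upward maps to 100, roughly 10% of the draws land on each end point, which is what the modification was meant to force.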