University of Florida, Department of Agricultural & Biological Engineering

 

Morris SU (Sampling Uniformity) code

 

Download the Matlab code, sample inputs and documentation:

Description

EE_Sampler_Mapper Package is a set of MATLAB functions that generate parameter samples for the method of Elementary Effects, also known as the Morris method (Morris, 1991). The function ‘Fac_Sampler.m’ is the main function that the user needs to run. It generates parameter samples in unit hyperspace and then transforms them according to the specified probability distributions. The code currently offers four sampling strategies: (a) the method of Optimized Trajectories [OT] (Campolongo et al., 2007), (b) Modified Optimized Trajectories [MOT] (Ruano et al., 2012), (c) Sampling for Uniformity [SU] (Khare et al., 2015), and (d) Enhanced Sampling for Uniformity [eSU] (Khare et al., in preparation).
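
To make the idea of a trajectory in unit hyperspace concrete, the following is a minimal sketch of a single, plain random Morris-type trajectory. It is not the package's code and does not implement OT, MOT, SU or eSU. The variable names, the fixed seed, and the step size delta = NumLev/(2*(NumLev-1)) (a common choice in the EE literature for an even number of levels) are assumptions made for illustration.

```matlab
% Illustrative sketch only; not part of EE_Sampler_Mapper.
NumFact = 4;                            % number of factors (assumed value)
NumLev  = 4;                            % number of levels (even)
delta   = NumLev / (2*(NumLev - 1));    % assumed step size, here 2/3
levels  = (0:NumLev-1) / (NumLev - 1);  % grid levels in [0, 1]

rng(1);                                 % fixed seed for repeatability
base  = levels(randi(NumLev, 1, NumFact));  % random starting point on the grid
order = randperm(NumFact);                  % order in which factors are moved

% A trajectory has NumFact+1 points; each step changes exactly one factor.
traj = repmat(base, NumFact + 1, 1);
for step = 1:NumFact
    k = order(step);
    if traj(step, k) + delta <= 1 + 1e-12
        newVal = traj(step, k) + delta;     % step up if it stays inside [0, 1]
    else
        newVal = traj(step, k) - delta;     % otherwise step down
    end
    traj(step+1:end, k) = newVal;           % the change persists for later points
end
disp(traj)
```

Each row of traj is one point in unit space and consecutive rows differ in a single column, which is what allows one elementary effect per factor to be computed from each trajectory.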

Program Usage & Output

Syntax
Fac_Sampler(facfile,SS,OvrSamSiz,NumLev,NumTraj)

Inputs:
(1) facfile: a '*.fac' file that must contain the following information
      (a) number of parameters (NumFact)
      (b) default distribution truncation values
      (c) distribution type and distribution characteristics for each parameter
      A '*.fac' file can be generated from SimLab v2.2. For the exact file format and distribution characteristics, please refer to the SimLab v2.2 manual, App. C. Note: the name of this file is used to create the output file names (please see the 'Outputs' section below).
(2) SS: Sampling Strategy. Currently we provide four options (OT/MOT/SU/eSU)
      (a) Campolongo et al. (2007) - Method of Optimized Trajectories (OT)
      (b) Ruano et al. (2012) - Method of Modified Optimized Trajectories (MOT)
      (c) Khare et al. (2015) - Sampling for Uniformity (SU)
      (d) Khare et al. (in preparation) - Enhanced Sampling for Uniformity (eSU)
(3) OvrSamSiz: Oversampling size. For OT and MOT the recommended oversampling size is 500-1000. For SU the recommended oversampling size is 300. If the dimensionality of the model is relatively low (say NumFact < 10), then for SU and eSU we recommend OvrSamSiz ≥ 1000 for a better sample.
(4) NumLev: Number of parameter levels. Various values for the number of levels have been suggested in the EE literature. However, standard practice is to use an even number of levels, usually 4, 6 or 8. For SU and eSU the default is NumLev = 4.
(5) NumTraj: Number of trajectories to be generated. In the EE literature the recommended number of trajectories varies from as few as 2 to as many as 100. However, a number of studies have reported that 10-20 trajectories are sufficient. Note: our study based on eSU (Enhanced Sampling for Uniformity) indicates that the number of trajectories should ideally be a multiple of the number of levels to get better results. For example, if 4 levels are chosen, 12 or 16 trajectories would be needed.
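
The recommendations above can be checked before calling the sampler. The following is a minimal sketch of such pre-flight checks, using the input values of Example 1. The checks and messages are hypothetical and not part of the package; the call itself is guarded so that the script also runs when the package or the factor file is not on the MATLAB path.

```matlab
% Hypothetical pre-flight checks; not part of EE_Sampler_Mapper.
facfile   = 'try.fac';   % factor file used in Example 1
SS        = 'OT';        % sampling strategy
OvrSamSiz = 1000;        % oversampling size
NumLev    = 4;           % number of levels
NumTraj   = 10;          % number of trajectories

% The page lists four strategies.
validSS = {'OT', 'MOT', 'SU', 'eSU'};
assert(any(strcmp(SS, validSS)), 'SS must be OT, MOT, SU or eSU.');

% Standard practice described above: an even number of levels.
if mod(NumLev, 2) ~= 0
    warning('NumLev = %d is odd; 4, 6 or 8 levels are standard practice.', NumLev);
end

% eSU-based recommendation: NumTraj should ideally be a multiple of NumLev.
if mod(NumTraj, NumLev) ~= 0
    fprintf('Note: NumTraj = %d is not a multiple of NumLev = %d.\n', NumTraj, NumLev);
end

% Run the sampler only if both the function and the factor file are found.
if exist('Fac_Sampler', 'file') == 2 && exist(facfile, 'file') == 2
    Fac_Sampler(facfile, SS, OvrSamSiz, NumLev, NumTraj);
end
```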

Outputs:
(1) Factor_Sample: Parameter sample matrix [nrows, ncol]
      nrows = NumTraj*(NumFact+1)
      ncol = NumFact
      Each column corresponds to a parameter
(2) name_FacSample.xlsx: Factor samples are written to an Excel file ‘name_FacSample.xlsx’, where "name" is the first part of the ‘.fac’ file name (e.g. the factor sample for the input file ‘Example.fac’ will be saved in ‘Example_FacSample.xlsx’).
      Each column corresponds to a factor. The factor name is given in the first row.
(3) name_FacSamChar.txt: Characteristics used for generating the sample are written to a text file ‘name_FacSamChar.txt’, where "name" is the first part of the ‘.fac’ file name (e.g. the FacSamChar file for the input file ‘Example.fac’ will be saved as ‘Example_FacSamChar.txt’).
      First Line: Sampling Strategy – OT, MOT, SU or eSU
      Second Line: Oversampling Size
      Third Line: Number of levels
      Fourth Line: Number of trajectories
      Fifth Line: Number of factors
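
As a quick check of the sample dimensions given under output (1), the short sketch below evaluates nrows = NumTraj*(NumFact+1) for the two examples on this page. It is illustrative arithmetic only and does not call the package. Since each row is one parameter set, nrows is also the number of model runs the sample implies.

```matlab
% Illustrative arithmetic only; not part of EE_Sampler_Mapper.
NumFact = [4 10];    % number of factors in Example 1 and Example 2
NumTraj = [10 8];    % number of trajectories in Example 1 and Example 2

% One row per trajectory point, one column per factor.
nrows = NumTraj .* (NumFact + 1);

for i = 1:numel(nrows)
    fprintf('Example %d: %d rows x %d columns\n', i, nrows(i), NumFact(i));
end
```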

Folder Structure:
This package (EE_Sampler_Mapper) includes 6 Matlab functions (i.e. m files) and four folders (Campolongo, Ruano, Khare and Enhanced_Khare). The generated factor/parameter sample file ‘name_FacSample.xlsx’ and the characteristics file ‘name_FacSamChar.txt’ are stored in the same folder as ‘Fac_Sampler.m’.

Probability Distributions:
Currently the following parameter distributions can be generated using this package. For details about the distribution characteristics, please refer to the SimLab v2.2 user manual, App. A.
      (1) Uniform
      (2) LogUniform
      (3) Normal
      (4) LogNormal
      (5) Discrete
      (6) Constant
      (7) Triangular
      (8) Weibull
      (9) Beta
      (10) Gamma
      (11) Exponential
      (12) Log10Uniform

Probability distribution truncation

(A) When parameter distributions have long tails (Normal, LogNormal, Weibull, Gamma, Exponential), experience has shown that truncated distributions give more accurate results, i.e. parameter rankings more consistent with variance-based SA methods (e.g. Sobol’). SimLab v2.2 truncates distributions at 12.5% and 87.5%, i.e. 25% truncation overall. Although the ideal truncation may vary from model to model, we recommend 2.5% to 5% truncation on either side.
(B) If the user wants different truncations for different factors/parameters, the lower and upper percentiles (expressed as fractions) should be edited in the ‘.fac’ file (please refer to App. C of the SimLab v2.2 manual). By default SimLab v2.2 sets these values at 0.001 and 0.999. This will become clear in Example 2 in the following sections.

Input/Output Examples

Example 1                                                 

This example generates 10 trajectories for a model with 4 factors/parameters defined in ‘try.fac’, using the OT method with 4 levels, OvrSamSiz = 1000 and no truncation. Details on the SimLab v2.2.1 ‘.fac’ file format are given in App. C of the SimLab manual, and the probability distributions are described in detail in App. A.

Type the following in the Matlab command window (make sure that EE_Sampler_Mapper is the active folder):

Fac_Sampler('try.fac','OT',1000,4,10)

Example 1 – Figure 1: Factor file ‘try.fac’

 

Example 1 – Figure 2: Function execution in Matlab command window

 

Example 1 – Figure 3: Factor sample displayed in Matlab command window

Example 1 – Figure 4: Excel file ‘try_FacSample.xlsx’ generated by EE_Sampler_Mapper

 Example 1 – Figure 5: ‘try_FacSamChar.txt’ file generated by the Fac_Sampler.m

 

Example 2                                                 

This example generates 8 trajectories for a model with 10 factors/parameters defined in ‘try_eSU.fac’, using the eSU method with 4 levels and OvrSamSiz = 300. Parameter details are given in Table 1 below.

Table 1: Parameter/Factor PDF and truncation details

Parameter Name   Distribution                        Lower Truncation   Upper Truncation
P1               Normal (mu = 0, sigma = 1)          0.001              0.999
P2               Normal (mu = 0, sigma = 1)          0.1                0.9
P3               Normal (mu = 0, sigma = 1)          0.05               0.8
P4               LogNormal (mu = 0, sigma = 1)       0.001              0.999
P5               LogNormal (mu = 0, sigma = 1)       0.01               0.9
P6               Exponential (lambda = 1, b = 0)     --                 0.999
P7               Exponential (lambda = 1, b = 0)     --                 0.8
P8               Gamma (r = 1, lambda = 1, b = 0)    --                 0.999
P9               Gamma (r = 1, lambda = 1, b = 0)    --                 0.9
P10              Gamma (r = 1, lambda = 1, b = 0)    --                 0.8

 

Type the following in the Matlab command window (make sure that EE_Sampler_Mapper is the active folder):

Fac_Sampler('try_eSU.fac','eSU',300,4,8)

Example 2 – Figure 1: Factor file ‘try_eSU.fac’

 

Example 2 – Figure 2: Function execution in Matlab Command window

Example 2 – Figure 3: Matlab command window showing generated factor sample

Example 2 – Figure 4: Excel file ‘try_eSU_FacSample.xlsx’ generated by EE_Sampler_Mapper

Notice the difference between the parameter values for P1, P2 and P3, even though they have identical distribution characteristics. These differences are due to the different truncation levels (see Table 1).
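
The effect of truncation on P1, P2 and P3 can be illustrated with a small sketch. It assumes one common way of applying truncation: each unit-space level u in [0, 1] is rescaled into the quantile range [lower, upper] and then passed through the inverse CDF of the distribution. Whether Fac_Sampler.m applies truncation in exactly this way, and which unit-space levels eSU uses, are assumptions; only the truncation values are taken from Table 1. The inverse normal CDF is written with erfinv so that no toolbox is required.

```matlab
% Illustrative sketch only; the package's actual mapping may differ.
NumLev = 4;
u = (0:NumLev-1)' / (NumLev - 1);   % assumed unit-space levels 0, 1/3, 2/3, 1

% Lower and upper truncation (as fractions) for P1, P2, P3 from Table 1.
trunc = [0.001 0.999;
         0.1   0.9;
         0.05  0.8];
mu = 0;  sigma = 1;                 % Normal(mu, sigma) for all three parameters

x = zeros(NumLev, 3);
for j = 1:3
    % Rescale the unit levels into the truncated quantile range.
    p = trunc(j, 1) + u * (trunc(j, 2) - trunc(j, 1));
    % Inverse normal CDF expressed through erfinv.
    x(:, j) = mu + sigma * sqrt(2) * erfinv(2*p - 1);
end

disp(x)   % columns: P1, P2, P3; rows: the four levels
```

Under these assumptions the same unit-space level maps to a different parameter value in each column, and the asymmetric truncation of P3 (0.05 and 0.8) yields a range that is not symmetric about the mean.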

Example 2 – Figure 5: ‘try_eSU_FacSamChar.txt’ file generated by Fac_Sampler.m

Program License

This program is distributed as Freeware/Public Domain under the terms of the GNU License. If the program is found useful, the authors ask that its use be acknowledged in any resulting publication and that the authors be notified. The source code is available from the authors upon request.



References

  • Khare, Y.P., Muñoz-Carpena, R., Rooney, R.W., Martinez, C.J. 2015. A multi-criteria trajectory-based parameter sampling strategy for the screening method of elementary effects. Environmental Modelling & Software 64:230-239. doi:10.1016/j.envsoft.2014.11.013.


This page was last updated on February 07, 2017.