University of Florida, Department of Agricultural & Biological Engineering

 

FITEVAL

Program for objective assessment of model goodness-of-fit with statistical significance, based on Ritter and Muñoz-Carpena (2013).

Program description

FITEVAL was developed as a software tool for standardized model evaluation that incorporates data and model uncertainty following the procedures presented in this paper. The tool is implemented in MATLAB® and is available free of charge as a computer application (MS-Windows® and Apple OSX®) or as a MATLAB function. FITEVAL quantifies the model prediction error (in units of the output) as the root mean squared error (RMSE) and computes the Nash and Sutcliffe (1970) coefficient of efficiency (NSE) as a dimensionless goodness-of-fit indicator. As described in Ritter and Muñoz-Carpena (2013), FITEVAL uses the general formulation of the coefficient of efficiency Ej, which allows modelers to compute modified versions of this indicator,

$$E_j = 1 - \frac{\sum_{i=1}^{N} |O_i - P_i|^j}{\sum_{i=1}^{N} |O_i - B_i|^j}$$

where Bi is a benchmark series, which may be a single number (such as the mean of observations), seasonally varying values (such as seasonal means), or benchmark values predicted using a function of other variables. For j=2 and Bi=Ō, Ej yields E2=NSE. The code can also be used with model efficiency threshold values other than those proposed, or to calculate the Legates and McCabe (1999) modified form of the coefficient of efficiency (E1) instead of NSE. Notice that by using transformed series in the corresponding ASCII text input file, such as {√Oi, √Pi}, {ln(Oi + ε), ln(Pi + ε)} or {1/(Oi + ε), 1/(Pi + ε)}, the program computes NSE (and RMSE) on square-root, log and inverse transformed values, respectively (Le Moine, 2008; Oudin et al., 2006).
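The indicators described above can be sketched in a few lines of Python. This is only an illustration, not FITEVAL's MATLAB code: it assumes the general formulation reads E_j = 1 - Σ|O_i - P_i|^j / Σ|O_i - B_i|^j, and the function names `rmse` and `efficiency` are hypothetical.

```python
import math


def rmse(obs, pred):
    """Root mean squared error, in the units of the model output."""
    n = len(obs)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / n)


def efficiency(obs, pred, j=2, benchmark=None):
    """Generalized coefficient of efficiency E_j (assumed formulation).

    benchmark may be None (mean of the observations), a single number,
    or a sequence B_i with the same length as obs.
    """
    n = len(obs)
    if benchmark is None:
        benchmark = sum(obs) / n
    if isinstance(benchmark, (int, float)):
        benchmark = [benchmark] * n
    num = sum(abs(o - p) ** j for o, p in zip(obs, pred))
    den = sum(abs(o - b) ** j for o, b in zip(obs, benchmark))
    return 1.0 - num / den


obs = [1.0, 2.0, 3.0, 4.0, 5.0]
pred = [1.1, 1.9, 3.2, 3.8, 5.0]

nse = efficiency(obs, pred)        # j=2 and B_i = mean(obs) gives NSE
e1 = efficiency(obs, pred, j=1)    # Legates and McCabe style E1
# NSE on square-root transformed series, as mentioned in the text
nse_sqrt = efficiency([math.sqrt(o) for o in obs],
                      [math.sqrt(p) for p in pred])
print(round(rmse(obs, pred), 4), round(nse, 4), round(e1, 4))
```

Passing a list as `benchmark` covers the seasonally varying case, and transforming both series before the call mirrors what the text describes for the input file.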

Hypothesis testing of NSE exceeding threshold values is based on approximated probability density functions (PDFs), which are obtained by bootstrapping (Efron and Tibshirani, 1993) or, in the case of time series data (non-independent, autocorrelated values), by block bootstrapping (Politis and Romano, 1994). Proposed threshold values are used for delimiting model efficiency classes denoted as Unsatisfactory (NSE<0.65), Acceptable (0.65≤NSE<0.80), Good (0.80≤NSE<0.90) and Very good (NSE≥0.90). Other model efficiency thresholds for particular applications could be justified and used without affecting the value of the methods proposed herein. The goodness-of-fit is statistically accepted (i.e. H0: NSE<0.65 is rejected) when the one-tailed p-value is less than a significance level α. The p-value represents here the probability of wrongly accepting the fit, i.e. of rejecting H0 when NSE is actually below NSEthreshold=0.65. Common significance level values are α=0.1, 0.05 or 0.01, but the choice of α should be based on the research context, i.e. how strong the evidence needs to be for accepting or rejecting H0. Thus, for p-value>α, the goodness-of-fit is considered Unsatisfactory. Ritter and Muñoz-Carpena (2013) suggested adopting the least restrictive significance level of 0.10 as a starting point, and Harmel et al. (2014) provide guidance on model interpretation and evaluation considering intended model uses (i.e., Exploratory, Planning, and Regulatory or Legal).
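The testing idea can be illustrated with a short Python sketch. It is a simplification under stated assumptions: it resamples (observed, predicted) pairs with an ordinary bootstrap and reads the p-value as the fraction of resampled NSE values below the threshold. It does not implement the block bootstrap or the bias-corrected intervals that FITEVAL reports, and all function names are hypothetical.

```python
import random


def nse(obs, pred):
    """Nash-Sutcliffe efficiency with the mean of obs as benchmark."""
    mean_o = sum(obs) / len(obs)
    num = sum((o - p) ** 2 for o, p in zip(obs, pred))
    den = sum((o - mean_o) ** 2 for o in obs)
    return 1.0 - num / den


def classify(value, thresholds=(0.65, 0.80, 0.90)):
    """Map an NSE value to the efficiency classes named in the text."""
    acceptable, good, very_good = thresholds
    if value >= very_good:
        return "Very good"
    if value >= good:
        return "Good"
    if value >= acceptable:
        return "Acceptable"
    return "Unsatisfactory"


def bootstrap_nse(obs, pred, n_boot=2000, seed=0):
    """Ordinary bootstrap of paired values (not the block bootstrap)."""
    rng = random.Random(seed)
    n = len(obs)
    samples = []
    while len(samples) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        o = [obs[i] for i in idx]
        p = [pred[i] for i in idx]
        if max(o) > min(o):  # skip resamples with zero variance
            samples.append(nse(o, p))
    return samples


def p_value(samples, threshold=0.65):
    """One-tailed p-value: share of bootstrap NSE values below threshold."""
    return sum(s < threshold for s in samples) / len(samples)


obs = [float(i) for i in range(1, 21)]
pred = [o + (0.2 if i % 2 else -0.2) for i, o in enumerate(obs)]
samples = bootstrap_nse(obs, pred)
p = p_value(samples)
accepted = p < 0.10  # significance level of 0.10 as a starting point
print(classify(nse(obs, pred)), p, accepted)
```

The share of samples falling in each class can be obtained the same way, by counting `classify(s)` over `samples`.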

For realistic model evaluation, consideration of the uncertainty associated with the observations and the model predictions is critical. Two methods to account for measurement and/or model uncertainties are implemented within FITEVAL: a) Probable Error Range (PER), which has the advantage of being non-parametric but can yield excessive increases in model performance; and b) Correction Factor (CF), which provides a more realistic modification of the deviation term used in the goodness-of-fit indicators and allows for model uncertainty, but requires assumptions about the probability distribution of the uncertainty in the data and/or the model predictions. FITEVAL provides modelers with an easy-to-use tool to evaluate model performance while accounting for data and prediction uncertainty in a simple and quick procedure.
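To make the PER idea concrete, the sketch below assumes a modification in the style of Harmel and Smith (2007): each observation gets a band of ±PER·|O_i|, and the deviation is zero when the prediction falls inside the band, otherwise the distance to the nearest bound. The exact rule applied by FITEVAL may differ, and the function names are hypothetical.

```python
def per_deviations(obs, pred, per):
    """Deviations after allowing a +/- per*|O_i| band around each observation.

    per is a fraction between 0 and 1 (an assumed convention).
    """
    devs = []
    for o, p in zip(obs, pred):
        half_width = abs(o) * per
        lower, upper = o - half_width, o + half_width
        if p < lower:
            devs.append(lower - p)   # prediction below the band
        elif p > upper:
            devs.append(upper - p)   # prediction above the band
        else:
            devs.append(0.0)         # prediction inside the band
    return devs


def nse_from_deviations(obs, devs):
    """NSE computed from an arbitrary deviation series."""
    mean_o = sum(obs) / len(obs)
    return 1.0 - sum(d * d for d in devs) / sum((o - mean_o) ** 2 for o in obs)


obs = [10.0, 20.0, 30.0, 40.0]
pred = [12.0, 19.0, 32.0, 40.0]

plain = nse_from_deviations(obs, [o - p for o, p in zip(obs, pred)])
devs = per_deviations(obs, pred, per=0.10)
modified = nse_from_deviations(obs, devs)
print(round(plain, 3), round(modified, 3))
```

Because deviations inside the band are set to zero, the modified indicator can only stay equal or increase, which is the behaviour behind the caution about excessive performance increases.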

Program Use & Output

FITEVAL can be executed as a standalone application or as a MATLAB function.

A user-supplied ASCII input data file is required, located in the same directory as the application. By default this file is named "fiteval.in", but the user can pass an arbitrary file name on the command line, i.e.,

fiteval
fiteval data_ex1.in

The first line will execute with the data contained in the fiteval.in file located in the distribution directory. Notice that this is the same behaviour obtained by clicking on the fiteval executable in Windows. The second command will read the contents of the user-specified file ("data_ex1.in" in the example). See additional details about program execution in the Examples section below.

Input data file

The input data file (fiteval.in by default, or a file name given on the command line as shown above) must contain at least two paired vectors or columns,

{observations, predictions [...]}
The input file may contain missing values, which must be denoted as "nan". The program can handle many additional options (comparisons with benchmark data, uncertainty in observed values, uncertainty in the simulated results, and combinations of these) by specifying additional columns in the input data file and setting line 9 of the fitevalconfig.txt file (described below). See the help file (fitevaloptions_help.pdf) and the corresponding input file examples provided in the distribution directory.
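As an illustration of the two-column layout, the following Python sketch reads observation/prediction pairs from text and skips rows containing "nan". It assumes whitespace-separated columns and that a pair with a missing value is simply dropped; how FITEVAL itself treats missing values is not specified here, and `read_pairs` is a hypothetical helper.

```python
import math


def read_pairs(text):
    """Parse whitespace-separated {observation, prediction} rows.

    Extra columns are ignored and rows with a missing value are skipped.
    """
    obs, pred = [], []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue  # blank or incomplete line
        o, p = float(fields[0]), float(fields[1])  # float("nan") is accepted
        if math.isnan(o) or math.isnan(p):
            continue
        obs.append(o)
        pred.append(p)
    return obs, pred


sample = """\
0.12 0.10
0.25 nan
0.31 0.33
nan  0.40
0.47 0.45
"""
obs, pred = read_pairs(sample)
print(obs, pred)
```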

Outputs

During execution the program presents a summary of statistics in the command window and a figure with the graphical assessment. The program closes after the user closes the figure window. For batch execution (for example, for a set of different data sets at once), a "silent mode" (with no graphical output or pause after it) is available by setting the option in the "fitevalconfig.txt" file described below.

After execution, FITEVAL reports the goodness-of-fit evaluation in a PDF file with the same name as the input data file, written to the directory from which the application is executed, containing:
a) a plot of observed vs. computed values illustrating the match on the 1:1 line;
b) the calculation of NSE and RMSE and their corresponding confidence intervals of 95%;
c) the qualitative goodness-of-fit interpretation based on the established classes;
d) a verification of the presence of bias or the possible presence of outliers;
e) the plot of the Ceff cumulative probability function superimposed on the Ceff class regions;
f) a plot illustrating the evolution of the observed and computed values.

The latter plot helps to visually inspect how similar the two series are and to detect which observations are best or worst predicted by the model. When uncertainty is incorporated in the evaluation, the resulting plots show the uncertainty boundaries as vertical (measurement uncertainty) or horizontal (model uncertainty) bars. Additionally, the numerical output is stored in an ASCII text file with the same name as the input data file and the extension '.out'.

The plots can also be obtained as separate files in the graphic format specified as an argument ('eps', 'pdf', 'jpg', 'tiff', or 'png'). A plot label can be supplied as a "text string" argument after the file name. Examples of use are provided in the section below.

Repeated cases in the observed and calculated values can be removed by passing NOREP as the second argument. The program reports the number of removals only if repeated cases are present (see examples below).
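A minimal Python sketch of the idea behind NOREP follows. It assumes that a "repeated case" is an (observed, calculated) pair identical to one already seen; the criterion FITEVAL actually applies is not stated in this document, and `remove_repeated` is a hypothetical name.

```python
def remove_repeated(obs, pred):
    """Drop repeated (observed, calculated) pairs, keeping first occurrences."""
    seen = set()
    kept_obs, kept_pred = [], []
    for pair in zip(obs, pred):
        if pair in seen:
            continue
        seen.add(pair)
        kept_obs.append(pair[0])
        kept_pred.append(pair[1])
    removed = len(obs) - len(kept_obs)
    return kept_obs, kept_pred, removed


obs = [0.0, 0.0, 0.1, 0.2, 0.0, 0.2]
pred = [0.0, 0.0, 0.1, 0.3, 0.0, 0.3]
kept_obs, kept_pred, removed = remove_repeated(obs, pred)
if removed:  # report only when something was removed
    print(f"{removed} repeated cases were removed")
```

Removing repeats matters for series with many identical pairs (for example long runs of zeros), which can otherwise dominate the resampled statistics.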

Configuration file

The fitevalconfig.txt file is required when you want to run FITEVAL with other threshold values or to calculate the Legates and McCabe (1999) modified form of the coefficient of efficiency (E1) instead of NSE. This file contains one setting per line, specifying: the Acceptable, Good and Very good NSE thresholds, the relative bias threshold value (%), the option for computing E1, the bootstrap method, the figures' font size, the option for canceling the on-screen display of the graphical output (useful when using FITEVAL for multiple series calculations within MATLAB code or other scripting environments), the option for taking uncertainty into account, and the marker color/type and size used in the plots. The default values of the first eight settings are 0.65, 0.80, 0.90, 5, 0, 0, 10, and 0, respectively. FITEVAL can apply either the Efron and Tibshirani (1993) bootstrap or the Politis and Romano (1994) block bootstrap. The latter is recommended for autocorrelated values, as is typical in time series, and is set as the default option. An example of the fitevalconfig.txt file is provided below.

0.65 %Acceptable NSE threshold value
0.80 %Good NSE threshold value
0.90 %Very good NSE threshold value
5 %BiasValue
0 %Compute Legates and McCabe modified Ceff (1=yes,0=no)
0 % Bootstrap method (1= Efron's bootstrapping, 0= block bootstrapping)
10 % FontSizeValue
0 % Do not display graphical output (1=accept)
0 % Take into account observations uncertainty (>0,1,2,3 or 4= yes)
ro % Color and type of marker
4 % Sizes of the series marker
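The layout of the file (a value followed by a % comment, one setting per line) can be read with a few lines of Python. This is only a sketch for inspecting or generating configuration files from a script; `parse_config` is a hypothetical helper and not part of FITEVAL.

```python
def parse_config(text):
    """Return a list of (value, comment) tuples, one per non-blank line."""
    entries = []
    for line in text.splitlines():
        if not line.strip():
            continue  # tolerate blank lines between settings
        value, _, comment = line.partition("%")
        entries.append((value.strip(), comment.strip()))
    return entries


example = """\
0.65 %Acceptable NSE threshold value
0.80 %Good NSE threshold value
0.90 %Very good NSE threshold value
5 %BiasValue
0 %Compute Legates and McCabe modified Ceff (1=yes,0=no)
0 % Bootstrap method (1= Efron's bootstrapping, 0= block bootstrapping)
10 % FontSizeValue
0 % Do not display graphical output (1=accept)
0 % Take into account observations uncertainty (>0,1,2,3 or 4= yes)
ro % Color and type of marker
4 % Sizes of the series marker
"""
entries = parse_config(example)
thresholds = [float(value) for value, _ in entries[:3]]
uncertainty_option = entries[8][0]  # line 9 of the file
print(thresholds, uncertainty_option, entries[9][0])
```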

To find the different combinations of markers and colors that can be used in MATLAB, please visit http://www.mathworks.com/help/matlab/ref/linespec.html for a full description of the options. For example, "ro" in the example above means a "red circle" marker.

Running FITEVAL without data uncertainty

Several example files are included in this package (fiteval_1.in through fiteval_14s.in), in addition to the default fiteval.in. The program input file must be written in ASCII or text format (be sure to select this option when saving the file with the editor of your choice). The program is executed from the DOS command prompt in Windows or from the Unix terminal shell in OS X. Some examples are presented below.

A) Under Windows (or inside Matlab in both Windows and OS X)
fiteval <argument_list>

fiteval
fiteval fiteval.in norep
fiteval data_ex1.in
fiteval data_ex2.in
fiteval data_ex3.in
fiteval data_ex3.in norep
fiteval data_ex1.in "Example Data Set" jpg

Example of FITEVAL numerical and graphical output

>>> fiteval data_ex1.in

B) Under Mac OS X

run_fiteval.sh <directory> <argument_list>

run_fiteval.sh <directory>
run_fiteval.sh <directory> fiteval.in norep
run_fiteval.sh <directory> data_ex1.in
run_fiteval.sh <directory> data_ex2.in
run_fiteval.sh <directory> data_ex3.in
run_fiteval.sh <directory> data_ex3.in norep
run_fiteval.sh <directory> data_ex1.in "Example Data Set" jpg

<directory> is the directory where the MCR (or MATLAB) is installed (see Library Requirements below). For example, if MATLAB is installed on your computer in the directory "/Applications/MATLAB_R2011a.app", the terminal command would be:

run_fiteval.sh /Applications/MATLAB_R2011a.app data_ex1.in

If instead the runtime libraries are installed, for example in "/Applications/MATLAB/MATLAB_Compiler_Runtime/v715", then the command would be:

run_fiteval.sh /Applications/MATLAB/MATLAB_Compiler_Runtime/v715 data_ex1.in

C) Outliers and model bias

Outliers (Dixon's test) and bias (with the threshold set in the configuration file above) are automatically checked when running the program, to guide the user to potential sources of differences between observed and simulated values. Information for both tests is included in the graphical (filled symbols and annotated coordinates) and text outputs. An example is provided below for the file fiteval.in (included in the examples with the distribution),

fiteval fiteval.in

Notice that the specific outlier coordinates are also provided in the output text file.
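For readers unfamiliar with these two checks, the Python sketch below computes Dixon's Q statistic and a relative bias. It rests on assumptions not confirmed by this document: that the Q-test is applied to the residuals (predicted minus observed) and that relative bias is the difference of means as a percentage of the observed mean. The function names are hypothetical, and the critical values needed to decide whether Q indicates an outlier are not included.

```python
def dixon_q(values):
    """Dixon's Q statistic for the most extreme value of a small sample:
    gap between the suspect value and its nearest neighbour, over the range."""
    ordered = sorted(values)
    spread = ordered[-1] - ordered[0]
    if len(ordered) < 3 or spread == 0:
        raise ValueError("need at least 3 values that are not all equal")
    gap_low = ordered[1] - ordered[0]
    gap_high = ordered[-1] - ordered[-2]
    return max(gap_low, gap_high) / spread


def relative_bias_percent(obs, pred):
    """Difference between mean prediction and mean observation, in percent
    of the mean observation (an assumed definition)."""
    mean_o = sum(obs) / len(obs)
    mean_p = sum(pred) / len(pred)
    return 100.0 * (mean_p - mean_o) / mean_o


obs = [10.0, 20.0, 30.0, 40.0, 50.0]
pred = [11.0, 21.0, 31.0, 41.0, 60.0]
residuals = [p - o for o, p in zip(obs, pred)]

q = dixon_q(residuals)                   # compared against tabulated critical values
bias = relative_bias_percent(obs, pred)
biased = abs(bias) > 5.0                 # 5 is the default bias threshold (%)
print(round(q, 3), round(bias, 2), biased)
```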

D) Checking the effect of repeated values

fiteval data_ex3.in NOREP

=============== GOODNESS-OF-FIT EVALUATION ================
RMSE= 0.008 [0.007 - 0.010]*
NSE = 0.917 [0.862 - 0.965]*

Evaluation of NSE: From GOOD to VERY GOOD
Probability of fit being:
- Very good (NSE = 0.900 - 1.000): 68.6%
- Good (NSE = 0.800 - 0.899): 31.4%
- Acceptable (NSE = 0.650 - 0.799): 0.0%
- Unsatisfactory (NSE < 0.650): 0.0% (p-value: 0.000)

Presence of outliers (Q-test): NO
Model bias: NO

27 repeated cases were removed
__________________________________________________________________
*: 95% Confidence interval obtained from Bca bootstrapping
using Politis and Romano (1994) block bootstrap method
for stationary dependent data.

E) Batch processing of data files

Notice that a batch file can be used for executing many FITEVAL examples automatically (to avoid stopping the execution after each on-screen graphical output set the appropriate flag in the fitevalconfig.txt file).

An example batch processing file (run_all_examples.m) is included in the distribution package for running the example data files. This can be extended for any number of files and names.
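Outside MATLAB, a batch run can also be scripted. The Python sketch below only builds the command lines shown in the examples above; it assumes the standalone `fiteval` executable (or `run_fiteval.sh` plus the runtime directory on OS X) is reachable from the working directory. `fiteval_command` is a hypothetical helper, and since the document does not show NOREP combined with a label, the sketch does not allow it.

```python
import shlex


def fiteval_command(data_file, norep=False, label=None, fmt=None,
                    launcher=("fiteval",)):
    """Build the argument list for one FITEVAL run.

    launcher is ("fiteval",) on Windows, or for example
    ("run_fiteval.sh", "/Applications/MATLAB_R2011a.app") on OS X.
    """
    if norep and (label is not None or fmt is not None):
        raise ValueError("combining norep with a label/format is not shown "
                         "in the documentation")
    cmd = list(launcher) + [data_file]
    if norep:
        cmd.append("norep")
    if label is not None:
        cmd.append(label)
    if fmt is not None:
        cmd.append(fmt)
    return cmd


data_files = ["data_ex1.in", "data_ex2.in", "data_ex3.in"]
commands = [fiteval_command(name) for name in data_files]
commands.append(fiteval_command("data_ex3.in", norep=True))
commands.append(fiteval_command("data_ex1.in", label="Example Data Set", fmt="jpg"))

for cmd in commands:
    # To actually run each case: subprocess.run(cmd, check=True)
    print(shlex.join(cmd))
```

For unattended runs, the on-screen display flag in fitevalconfig.txt should be set as described above so that execution does not pause at each figure.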

Running FITEVAL accounting for observations and model uncertainty

Several options are provided in FITEVAL to consider uncertainty in observations and/or model outputs, following Ritter and Muñoz-Carpena (2017), based on Harmel and Smith (2007) and Harmel et al. (2010). The uncertainty options are controlled from the fitevalconfig.txt file (see the help file, fitevaloptions_help.pdf, for details). Typically, running these cases requires a combination of an option value (>0 to 4) in line 9 of fitevalconfig.txt and a matching number of columns in the input data file.

IMPORTANT NOTES:

  1. Be careful to match the option value with the number of columns required (see the help file, fitevaloptions_help.pdf, for details) to produce the desired results.
  2. When running uncertainty cases the program will run twice, the first time with uncertainty and the second without uncertainty. Please compare both sets of results (PDF files generated in the working directory, with and without "U" in the file name).

A) Running FITEVAL accounting for uncertainty of observed data

To include uncertainty in observations when evaluating the model, the user must provide the proper number of columns in the input data file and assign the case number in the configuration file: 0<PER<1, 1, 2, or 3 (see the help file, fitevaloptions_help.pdf, for details).

For example, if the user desires to specify the error distribution from factor CF, UB_Omin, UB_Omax for every individual data point, line 9 in the fitevalconfig.txt file now becomes,

3 % Take into account observations/model uncertainty (>0,1,2,3 or 4= yes)

and the input data file now requires 5 columns (Yobs, Ypred, CF, UB_Omin, UB_Omax; see the example file fiteval_9s.in, described in Case 9 of fitevaloptions_help.pdf).

Note that FITEVAL can also automatically compute the correction factors (CFi) and the uncertainty bounds based on the selected probability distribution, according to the table in fitevaloptions_help.pdf. The inputs can thus be simplified when a common error distribution is used for all data points. For example, to run FITEVAL accounting for observation uncertainty (Case 9 in fitevaloptions_help.pdf) described by a normal distribution with a coefficient of variation of 17% common to all data points, the following fitevalconfig.txt file is required.

0.65 %Acceptable NSE threshold value
0.80 %Good NSE threshold value
0.90 %Very good NSE threshold value
5 %BiasValue
0 %Compute Legates and McCabe modified Ceff (1=yes,0=no)
0 % Bootstrap method (1= Efron's bootstrapping, 0= block bootstrapping)
10 % FontSizeValue
0 % Do not display graphical output (1=accept)
3 9 N 17 % Take into account observations/model uncertainty (>0,1,2,3 or 4= yes)
ro % Color and type of marker
4 % Sizes of the series marker

The input data file now requires only 2 columns (observed and predicted values). In this case FITEVAL will automatically compute the correction factor (CF) and the upper and lower error bounds (UB_Omin, UB_Omax) for each observed value, which are used to modify the goodness-of-fit indicators based on Harmel et al. (2010). For example, executing "fiteval fiteval_1.in" with the configuration above will produce the following result,

 

B) Running FITEVAL accounting for uncertainty of model results

Model errors, such as those generated from a Monte Carlo uncertainty analysis, can also be considered (Case 13). For example, using the simplification of a common error distribution for all predictions, such as a triangular distribution with bounds 10% below and 20% above the predicted value, line 9 in the fitevalconfig.txt file becomes,

4 13 T 10 20 % Take into account observations/model uncertainty (>0,1,2,3 or 4= yes)

and the fiteval.in file now requires only 2 columns (observed and predicted values). For example, with this model error and using the example file fiteval_1.in we obtain,

However, if individual distributions are desired for each predicted point, the default 5 columns for Case 13 will be used (Yobs, Ypred, CF, UB_Pmin, UB_Pmax; see the example file fiteval_13s.in) and line 9 in the fitevalconfig.txt file becomes,

4 % Take into account observations/model uncertainty (>0,1,2,3 or 4= yes)
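The forms of line 9 used in this section can be unpacked as in the Python sketch below. It assumes the fields are, in order, the option value, the case number, a distribution letter (N for normal, T for triangular) and the distribution parameters, as the two examples in the text suggest; the full syntax is defined in fitevaloptions_help.pdf. The bounds function simply applies the "10% below, 20% above" reading of the triangular example. Both function names are hypothetical.

```python
def parse_uncertainty(value):
    """Split the value part of line 9 into its components (assumed layout)."""
    fields = value.split()
    parsed = {"option": float(fields[0])}  # may be a PER fraction such as 0.2
    if len(fields) > 1:
        parsed["case"] = int(fields[1])
        parsed["distribution"] = fields[2]
        parsed["params"] = [float(x) for x in fields[3:]]
    return parsed


def percent_bounds(values, pct_below, pct_above):
    """Lower and upper bounds set as percentages below/above each value."""
    return [(v * (1 - pct_below / 100.0), v * (1 + pct_above / 100.0))
            for v in values]


obs_setting = parse_uncertainty("3 9 N 17")        # normal, CV of 17%
model_setting = parse_uncertainty("4 13 T 10 20")  # triangular, -10% / +20%
simple_setting = parse_uncertainty("4")            # bounds given per point in the data file

low_pct, high_pct = model_setting["params"]
bounds = percent_bounds([100.0, 50.0], low_pct, high_pct)
print(model_setting, bounds)
```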

 

C) Running FITEVAL accounting for both uncertainty of observed data and model results

An example of evaluation with both observed and simulated errors with fiteval_13s.in in this case is presented below,

Library Requirements

If MATLAB is already installed on your computer, no additional libraries are needed to execute this program. If you do not currently have MATLAB R2011a (MCR version 7.15) or above installed on your computer, download and install the MCRInstaller before running the application. This is a self-extracting utility, which depends on the end user's platform:

Once downloaded follow these instructions:

  1. Download and open the file for your operating system (Windows or OS X only).
  2. A command window opens and begins preparation for the installation.
  3. When the MCRInstaller wizard appears, follow its prompts to continue.
  4. You may specify where you want to install the MCR or accept the default folder.
  5. The installation begins and can take a few minutes to complete.

Once the installation is completed, stand-alone applications developed with MATLAB can be executed.

Program License

This program is distributed as Freeware/Public Domain under the terms of GNU-License. If the program is found useful the authors ask that acknowledgment is given to its use in any resulting publication and the authors notified. The source code is available from the authors upon request:



References

  • Harmel, R.D., Smith, P.K., 2007. Consideration of measurement uncertainty in the evaluation of goodness-of-fit in hydrologic and water quality modeling. J. Hydrol. 337, 326–336.
  • Harmel, R.D., Smith, P.K., Migliaccio, K.W., 2010. Modifying goodness-of-fit indicators to incorporate both measurement and model uncertainty in model calibration and validation. Trans. ASABE 53, 55–63.
  • Ritter, A., Muñoz-Carpena, R., 2017. Effect of data and model uncertainties on the objective evaluation of model goodness-of-fit with statistical significance testing (under review).
  • Ritter, A., Muñoz-Carpena, R., 2013. Predictive ability of hydrological models: objective assessment of goodness-of-fit with statistical significance. J. Hydrol. 480, 33–45. doi:10.1016/j.jhydrol.2012.12.004


This page was last updated on November 22, 2017.