FITEVAL
Program for objective assessment of model goodness-of-fit with statistical significance based on Ritter and Muñoz-Carpena (2013).
- Download FITEVAL for Windows [1.8Mb]
- Download FITEVAL for OS X [1.8Mb]
- Library requirements (see below)
Program description
FITEVAL was developed as a software tool for standardized model evaluation that incorporates data and model uncertainty, following the procedures presented in Ritter and Muñoz-Carpena (2013). The tool is implemented in MATLAB® and is available free of charge as a computer application (MS-Windows® and Apple OS X®) or as a MATLAB function. FITEVAL quantifies the model prediction error (in units of the output) as the root mean squared error (RMSE) and computes the Nash and Sutcliffe (1970) coefficient of efficiency (NSE) as a dimensionless goodness-of-fit indicator. As described in Ritter and Muñoz-Carpena (2013), FITEVAL uses the general formulation of the coefficient of efficiency E_{j}, which allows modelers to compute modified versions of this indicator. For N pairs of observed (O_{i}) and predicted (P_{i}) values,
E_{j} = 1 - Σ_{i=1}^{N} |O_{i} - P_{i}|^{j} / Σ_{i=1}^{N} |O_{i} - B_{i}|^{j}
where B_{i} is a benchmark series, which may be a single number (such as the mean of the observations), seasonally varying values (such as seasonal means), or benchmark values predicted as a function of other variables. For j=2 and B_{i}=Ō, E_{j} yields E_{2}=NSE. The code can be used with model efficiency threshold values other than those proposed, or to calculate the Legates and McCabe (1999) modified form of the coefficient of efficiency (E_{1}) instead of NSE. Note that by supplying transformed series in the ASCII text input file, such as {√O_{i}, √P_{i}}, {ln(O_{i} + ε), ln(P_{i} + ε)} or {1/(O_{i} + ε), 1/(P_{i} + ε)}, the program computes NSE (and RMSE) on square-root, log and inverse transformed values, respectively (Le Moine, 2008; Oudin et al., 2006).
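As an illustration of the formulation above, the following minimal Python sketch computes E_{j} and RMSE for paired series. It is not FITEVAL's MATLAB code: the function names (efficiency, rmse), the sample data and the value of ε are assumptions made for this example only.

```python
import math


def efficiency(obs, pred, j=2, benchmark=None):
    """General coefficient of efficiency E_j for paired series.

    With benchmark=None the mean of the observations is used as B_i,
    so j=2 gives NSE and j=1 gives the modified form E_1.
    """
    if benchmark is None:
        mean_obs = sum(obs) / len(obs)
        benchmark = [mean_obs] * len(obs)
    num = sum(abs(o - p) ** j for o, p in zip(obs, pred))
    den = sum(abs(o - b) ** j for o, b in zip(obs, benchmark))
    return 1.0 - num / den


def rmse(obs, pred):
    """Root mean squared error, in the units of the model output."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))


# Hypothetical sample data.
obs = [1.0, 2.0, 3.0, 4.0, 5.0]
pred = [1.1, 1.9, 3.2, 3.8, 5.0]

nse = efficiency(obs, pred)       # j=2, B_i = mean of observations
e1 = efficiency(obs, pred, j=1)   # modified form E_1

# NSE on log-transformed values; eps is an assumed small constant.
eps = 0.01
log_nse = efficiency([math.log(o + eps) for o in obs],
                     [math.log(p + eps) for p in pred])

print(round(nse, 4), round(e1, 4), round(rmse(obs, pred), 4), round(log_nse, 4))
```

Passing a list as benchmark gives the seasonal or model-based benchmark variants described above.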
Hypothesis testing of NSE exceeding threshold values is based on approximated probability density functions (PDFs), which are obtained by bootstrapping (Efron and Tibshirani, 1993) or, in the case of time series data (non-independent, autocorrelated values), by block bootstrapping (Politis and Romano, 1994). The proposed threshold values delimit model efficiency classes denoted as Unsatisfactory (NSE<0.65), Acceptable (0.65≤NSE<0.80), Good (0.80≤NSE<0.90) and Very good (NSE≥0.90). Other model efficiency thresholds for particular applications could be justified and used without affecting the value of the methods proposed herein. The goodness-of-fit is statistically accepted (i.e. H0: NSE<0.65 is rejected) when the one-tailed p-value is less than a significance level α. Here the p-value represents the probability of wrongly accepting the fit (NSE≥NSE_{threshold}=0.65). Common significance levels are α=0.1, 0.05 or 0.01, but the choice of α should be based on the research context, i.e. how strong the evidence needs to be for accepting or rejecting H0. Thus, for p-value>α, the goodness-of-fit is considered Unsatisfactory. Ritter and Muñoz-Carpena (2013) suggested adopting the least restrictive significance level of α=0.10 as a starting point, and Harmel et al. (2014) provide guidance on model interpretation and evaluation considering intended model uses (i.e., Exploratory, Planning, and Regulatory or Legal).
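The testing procedure can be sketched in Python as follows. This is a simplified illustration, not FITEVAL's implementation: it uses only the ordinary bootstrap on (O_{i}, P_{i}) pairs (the block bootstrap for autocorrelated series is not shown), and it assumes the p-value is estimated as the fraction of bootstrap NSE values below the threshold. The function names, sample data and number of resamples are hypothetical.

```python
import random


def nse(obs, pred):
    """Nash-Sutcliffe efficiency for paired series."""
    mean_obs = sum(obs) / len(obs)
    num = sum((o - p) ** 2 for o, p in zip(obs, pred))
    den = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - num / den


def bootstrap_nse(obs, pred, n_boot=500, seed=0):
    """Resample (O_i, P_i) pairs with replacement and recompute NSE."""
    rng = random.Random(seed)
    pairs = list(zip(obs, pred))
    samples = []
    while len(samples) < n_boot:
        resample = [rng.choice(pairs) for _ in pairs]
        o, p = zip(*resample)
        if max(o) > min(o):  # skip resamples whose observations are all equal
            samples.append(nse(o, p))
    return samples


def p_value(samples, threshold=0.65):
    """Empirical probability that NSE falls below the threshold."""
    return sum(s < threshold for s in samples) / len(samples)


def efficiency_class(value):
    """Map an NSE value to the classes proposed in the text."""
    if value >= 0.90:
        return "Very good"
    if value >= 0.80:
        return "Good"
    if value >= 0.65:
        return "Acceptable"
    return "Unsatisfactory"


# Hypothetical sample data: predictions within 0.2 of the observations.
obs = [float(i) for i in range(1, 11)]
pred = [o + (0.2 if i % 2 else -0.2) for i, o in enumerate(obs)]

samples = bootstrap_nse(obs, pred)
alpha = 0.10
accepted = p_value(samples) < alpha  # H0: NSE < 0.65 rejected when True
print(efficiency_class(nse(obs, pred)), p_value(samples), accepted)
```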
For realistic model evaluation, consideration of the uncertainty associated with the observations and the model predictions is critical. Two methods to account for measurement and/or model uncertainties are implemented within FITEVAL: a) the Probable Error Range (PER), which has the advantage of being non-parametric but can yield excessive model performance increases; and b) the Correction Factor (CF), which provides a more realistic modification of the deviation term used in the goodness-of-fit indicators and allows for model uncertainty, but requires assumptions about the probability distribution of the uncertainty in the data and/or the model predictions. FITEVAL provides modelers with an easy-to-use tool for evaluating model performance while accounting for data and prediction uncertainty in a simple and quick procedure.
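To illustrate why a range-based correction tends to raise the indicator, the sketch below assumes the common reading of the probable error range idea: a deviation counts as zero when the prediction falls inside the uncertainty bounds of the observation, and otherwise as the distance to the nearest bound. This reading, the function names and the bounds are assumptions of the example and are not taken from FITEVAL's code.

```python
def bounded_deviations(pred, lower, upper):
    """Deviation of each prediction from the nearest uncertainty bound.

    Predictions inside [lower, upper] get a deviation of zero.
    """
    devs = []
    for p, lo, hi in zip(pred, lower, upper):
        if p < lo:
            devs.append(lo - p)
        elif p > hi:
            devs.append(hi - p)
        else:
            devs.append(0.0)
    return devs


def nse_from_deviations(obs, devs):
    """NSE computed from a supplied list of deviations."""
    mean_obs = sum(obs) / len(obs)
    den = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - sum(d * d for d in devs) / den


# Hypothetical data with bounds of +/-10% around each observation.
obs = [10.0, 20.0, 30.0, 40.0]
pred = [10.5, 23.0, 26.0, 40.0]
lower = [9.0, 18.0, 27.0, 36.0]
upper = [11.0, 22.0, 33.0, 44.0]

plain = nse_from_deviations(obs, [o - p for o, p in zip(obs, pred)])
adjusted = nse_from_deviations(obs, bounded_deviations(pred, lower, upper))
print(round(plain, 4), round(adjusted, 4))
```

Because every deviation can only shrink, the adjusted value is never lower than the plain one, which is consistent with the caution above about excessive performance increases.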
Program Use & Output
FITEVAL can be executed as a standalone application or as a MATLAB function.
A user-supplied ASCII input data file is required, located in the same directory as the application. By default this file is named "fiteval.in", but the user can pass an arbitrary file name on the command line, i.e.,
fiteval
fiteval data_ex1.in
The first command runs FITEVAL with the data in the fiteval.in file located in the distribution directory. This is the same behaviour obtained by clicking on the fiteval executable in Windows. The second command reads the contents of the user-specified file ("data_ex1.in" in the example). See additional details about program execution in the Examples section below. The input file must contain at least two paired vectors or columns: the first with the observations and the second with the model-calculated values to be evaluated. Missing values may be present and must be denoted as nan. The program can handle many additional options (comparisons with benchmark data, uncertainty in observed values, uncertainty in the simulated results, and combinations of these) by specifying additional columns in the data input file and line 9 of the fitevalconfig.txt file (described below). See the help file (fitevaloptions_help.pdf) and the corresponding input file examples provided in the distribution directory.
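A two-column input file of this kind could be prepared with a short Python script such as the one below. The whitespace delimiter, the number formatting and the function names are assumptions of this sketch; check the example files in the distribution directory for the exact layout FITEVAL expects.

```python
import math


def format_rows(obs, pred):
    """Format paired values as 'observed predicted' text rows.

    Missing values (None or NaN) are written as 'nan'.
    """
    def fmt(value):
        if value is None or math.isnan(value):
            return "nan"
        return f"{value:.6g}"

    return [f"{fmt(o)} {fmt(p)}" for o, p in zip(obs, pred)]


def write_input(path, obs, pred):
    """Write the rows to a plain ASCII text file."""
    with open(path, "w", encoding="ascii") as handle:
        handle.write("\n".join(format_rows(obs, pred)) + "\n")


# Hypothetical data with one missing observation and one missing prediction.
rows = format_rows([1.2, None, 3.4], [1.1, 2.0, float("nan")])
print(rows)
```

Calling write_input("data_ex1.in", obs, pred) would then produce a file that can be passed to fiteval on the command line.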
During execution, the program presents a summary of statistics in the command window and a Figure with a graphical assessment. The program closes after the user closes the Figure window. For batch execution (for example, for a set of different data sets at once), a "silent mode" (with no graphical output or pause after it) is available by setting the option in the "fitevalconfig.txt" file described below.
After execution, FITEVAL provides the goodness-of-fit evaluation in a portable document format (PDF) file containing:
a) a plot of observed vs. computed values illustrating the match on the 1:1 line;
b) the calculation of NSE and RMSE and their corresponding confidence intervals of 95%;
c) the qualitative goodness-of-fit interpretation based on the established classes;
d) a verification of the presence of bias or the possible presence of outliers;
e) the plot of the Ceff cumulative probability function superimposed on the Ceff class regions;
f) a plot illustrating the evolution of the observed and computed values.
The latter plot helps to visually inspect the degree of similarity between the two series and to detect which observations are best or worst predicted by the model. When uncertainty is incorporated in the evaluation, the resulting plots show the uncertainty boundaries as vertical (measurement uncertainty) or horizontal (model uncertainty) bars. Additionally, the numerical output is stored in an ASCII text file. The plots can also be obtained as separate files in the graphic format specified as an argument ('eps', 'pdf', 'jpg', 'tiff', or 'png'). A plot label can be supplied as a "text string" argument after the filename. Examples of use are provided in the section below.
Repeated cases in the observed and calculated values can be removed by passing NOREP as the second argument. For example: fiteval data_ex3.in NOREP. The program reports the number of removals only if repeated cases are present.
The fitevalconfig.txt file is required when you want to run FITEVAL with other threshold values or to calculate the Legates and McCabe (1999) modified form of the coefficient of efficiency (E1) instead of NSE. This file contains one option per line, specifying: the Acceptable NSE threshold, the Good NSE threshold, the Very good NSE threshold, the relative bias threshold value (%), the option for computing E1, the figures' font size, and the option for canceling the on-screen display of the graphical output (useful when using FITEVAL for multiple series calculations within MATLAB code or other scripting environments). The corresponding default values are 0.65, 0.80, 0.90, 5, 0, 10, and 0, respectively. FITEVAL can also apply the Efron and Tibshirani (1993) bootstrap or the Politis and Romano (1994) block bootstrap. The latter is recommended for autocorrelated values, as is typical in time series, and is set as the default option. The file also includes lines selecting the bootstrap method, the treatment of uncertainty, and the marker style and size. An example of the fitevalconfig.txt file is provided below.
0.65 %Acceptable NSE threshold value
0.80 %Good NSE threshold value
0.90 %Very good NSE threshold value
5 %BiasValue
0 %Compute Legates and McCabe modified Ceff (1=yes,0=no)
0 % Bootstrap method (1= Efron's bootstrapping, 0= block bootstrapping)
10 % FontSizeValue
0 % Do not display graphical output (1=accept)
0 % Take into account observations uncertainty (>0,1,2,3 or 4= yes)
ro % Color and type of marker size
4 % Sizes of the series marker
To find the different combinations of markers and colors that can be used in MATLAB, please visit http://www.mathworks.com/help/matlab/ref/linespec.html for a full description of the options. For example, "ro" in the example above means a red circle marker.
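When many runs need different settings, the configuration file can be generated from a script. The Python sketch below assumes one option per line, in the order of the example above, with everything after the % sign treated as a comment. The helper name config_text and the override mechanism are hypothetical conveniences, not part of FITEVAL.

```python
# Default values and comments, in the line order of the example file.
DEFAULT_LINES = [
    ("0.65", "Acceptable NSE threshold value"),
    ("0.80", "Good NSE threshold value"),
    ("0.90", "Very good NSE threshold value"),
    ("5", "BiasValue"),
    ("0", "Compute Legates and McCabe modified Ceff (1=yes,0=no)"),
    ("0", "Bootstrap method (1= Efron's bootstrapping, 0= block bootstrapping)"),
    ("10", "FontSizeValue"),
    ("0", "Do not display graphical output (1=accept)"),
    ("0", "Take into account observations uncertainty (>0,1,2,3 or 4= yes)"),
    ("ro", "Color and type of marker size"),
    ("4", "Sizes of the series marker"),
]


def config_text(overrides=None):
    """Return the file contents; overrides maps 1-based line numbers to values."""
    overrides = overrides or {}
    unknown = [n for n in overrides if not 1 <= n <= len(DEFAULT_LINES)]
    if unknown:
        raise ValueError(f"no such config line(s): {unknown}")
    lines = []
    for number, (value, comment) in enumerate(DEFAULT_LINES, start=1):
        lines.append(f"{overrides.get(number, value)} % {comment}")
    return "\n".join(lines) + "\n"


# Example: switch on silent mode (line 8) and use blue squares (line 10).
text = config_text({8: "1", 10: "bs"})
print(text)
```

Writing the returned text to fitevalconfig.txt next to the executable would then apply these settings on the next run.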
Examples
Several example files are included in this package (fiteval_1.in through fiteval_14s.in, and the default fiteval.in). The program input file must be written in ASCII or text format (be sure to select this option when saving the file with the editor of your choice). The program is executed from the DOS command prompt in Windows or from the Unix terminal shell in OS X. Some examples are presented below.
A) Under Windows (or inside Matlab in both Windows and OS X)
fiteval <argument_list>
fiteval
fiteval fiteval.in norep
fiteval data_ex1.in
fiteval data_ex2.in
fiteval data_ex3.in
fiteval data_ex3.in norep
fiteval data_ex1.in "Example Data Set" jpg
Example of FITEVAL numerical output
Example of FITEVAL graphical output
B) Under Mac OS X
run_fiteval.sh <directory> <argument_list>
run_fiteval.sh <directory>
run_fiteval.sh <directory> fiteval.in norep
run_fiteval.sh <directory> data_ex1.in
run_fiteval.sh <directory> data_ex2.in
run_fiteval.sh <directory> data_ex3.in
run_fiteval.sh <directory> data_ex3.in norep
run_fiteval.sh <directory> data_ex1.in "Example Data Set" jpg
<directory> is the directory where the MCR (or MATLAB) is installed (see Library Requirements below). For example, if MATLAB is installed on your computer in the directory "/Applications/MATLAB_R2011a.app", the terminal command would be:
run_fiteval.sh /Applications/MATLAB_R2011a.app data_ex1.in
If instead the runtime libraries are installed, for example in "/Applications/MATLAB/MATLAB_Compiler_Runtime/v715", the command would be:
run_fiteval.sh /Applications/MATLAB/MATLAB_Compiler_Runtime/v715 data_ex1.in
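The command lines listed in A) and B) can also be assembled from a script. The following Python sketch only builds the argument list; the function name fiteval_command and its parameters are hypothetical. Since the examples above never combine norep with a plot label or graphic format, the sketch refuses that combination rather than guessing an argument order.

```python
import shlex


def fiteval_command(input_file=None, norep=False, label=None,
                    graphic_format=None, runtime_dir=None):
    """Build the argument list for one FITEVAL run.

    runtime_dir selects the OS X wrapper (run_fiteval.sh <directory> ...);
    without it the plain 'fiteval' command is used.
    """
    if norep and (label or graphic_format):
        raise ValueError("combining norep with a label or format is not covered")
    if input_file is None and (norep or label or graphic_format):
        raise ValueError("extra arguments require an input file name")
    command = ["run_fiteval.sh", runtime_dir] if runtime_dir else ["fiteval"]
    if input_file:
        command.append(input_file)
    if norep:
        command.append("norep")
    if label:
        command.append(label)
    if graphic_format:
        command.append(graphic_format)
    return command


windows = fiteval_command("data_ex1.in", label="Example Data Set",
                          graphic_format="jpg")
osx = fiteval_command("data_ex3.in", norep=True,
                      runtime_dir="/Applications/MATLAB_R2011a.app")

# shlex.join shows the command as it would be typed in a Unix shell.
print(shlex.join(windows))
print(shlex.join(osx))
```

The resulting list could be passed to subprocess.run, assuming the executable or wrapper script can be found from the working directory or the PATH.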
C) Running FITEVAL accounting for uncertainty of observed data or model predictions
When accounting for uncertainty, several options are available, controlled from the fitevalconfig.txt file (see the help file, fitevaloptions_help.pdf, for details). Typically, running these cases requires a combination of an option value (>0, up to 4) in line 9 of the fitevalconfig.txt file and a matching number of columns in the data input file (i.e. fiteval.in). It is important to match the option value with the number of columns to produce the desired results.
The inputs can be simplified when a common error distribution is used for all data points. For example, to run FITEVAL accounting for observation uncertainty (Case 9 in fitevaloptions_help.pdf) described by a normal distribution with a coefficient of variation of 17% common to all data points, the following fitevalconfig.txt file is required.
0.65 %Acceptable NSE threshold value
0.80 %Good NSE threshold value
0.90 %Very good NSE threshold value
5 %BiasValue
0 %Compute Legates and McCabe modified Ceff (1=yes,0=no)
0 % Bootstrap method (1= Efron's bootstrapping, 0= block bootstrapping)
10 % FontSizeValue
0 % Do not display graphical output (1=accept)
3 9 N 17 % Take into account observations uncertainty (>0,1,2,3 or 4= yes)
ro % Color and type of marker size
4 % Sizes of the series marker
The input file (i.e. fiteval.in) now requires only 2 columns (observed vs. predicted values; see example fiteval_1.in). In this case FITEVAL will automatically compute the correction factor (CF) and the upper and lower error bounds (UB_Omin, UB_Omax) for each observed value, which are used to modify the goodness-of-fit indicators based on Harmel et al. (2010).
If the user desires to specify the CF, UB_Omin, UB_Omax for every individual data point, line 9 in the fitevalconfig.txt file now becomes,
3 % Take into account observations uncertainty (>0,1,2,3 or 4= yes)
and the fiteval.in file requires 5 columns (Yobs, Ypred, CF, UB_Omin, UB_Omax - see file fiteval_9s.in examples), as described in Case 9 of fitevaloptions_help.pdf. Note that FITEVAL can also compute automatically the correction factors (CF_{i}) and the uncertainty bounds based on the selected probability distribution according to the table in fitevaloptions_help.pdf.
Model errors, like those generated from a Monte Carlo uncertainty analysis, can also be considered (Case 13). For example, using the simplification of a common error distribution for all predictions, such as a triangular distribution with bounds 10% below and 20% above the predicted value, line 9 in the fitevalconfig.txt file becomes,
4 13 T 10 20 % Take into account observations uncertainty (>0,1,2,3 or 4= yes)
and the fiteval.in file now requires only 2 columns (observed vs. predicted values; see example fiteval_1.in). However, if individual distributions are desired for each predicted point, the default 5 columns for Case 13 will be used (Yobs, Ypred, CF, UB_Pmin, UB_Pmax; see the fiteval_13s.in example file) and line 9 in the fitevalconfig.txt file becomes,
4 % Take into account observations uncertainty (>0,1,2,3 or 4= yes)
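Since only line 9 changes between these cases, it can be convenient to rewrite that single line in an existing configuration. The Python sketch below does so while keeping the trailing comment. It assumes one option per line with % starting the comment; the function name set_uncertainty_option is hypothetical.

```python
def set_uncertainty_option(config, option):
    """Return the config text with the value part of line 9 replaced."""
    lines = config.splitlines()
    if len(lines) < 9:
        raise ValueError("expected at least 9 lines in the configuration")
    _, marker, comment = lines[8].partition("%")
    lines[8] = f"{option} {marker}{comment}".rstrip()
    return "\n".join(lines) + "\n"


# Default configuration, copied from the example in this page.
DEFAULT_CONFIG = """\
0.65 %Acceptable NSE threshold value
0.80 %Good NSE threshold value
0.90 %Very good NSE threshold value
5 %BiasValue
0 %Compute Legates and McCabe modified Ceff (1=yes,0=no)
0 % Bootstrap method (1= Efron's bootstrapping, 0= block bootstrapping)
10 % FontSizeValue
0 % Do not display graphical output (1=accept)
0 % Take into account observations uncertainty (>0,1,2,3 or 4= yes)
ro % Color and type of marker size
4 % Sizes of the series marker
"""

# Case 9 with a common normal distribution (CV of 17%).
case9 = set_uncertainty_option(DEFAULT_CONFIG, "3 9 N 17")
# Case 13 with a common triangular distribution (10% below, 20% above).
case13 = set_uncertainty_option(DEFAULT_CONFIG, "4 13 T 10 20")

print(case9.splitlines()[8])
print(case13.splitlines()[8])
```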
An example of evaluation with both observed and simulated errors is presented below,
D) Batch processing of data files
Notice that a batch file can be used for executing many FITEVAL examples automatically (to avoid stopping the execution after each on-screen graphical output set the appropriate flag in the fitevalconfig.txt file).
An example batch processing file (test_loop.m) is included in the distribution package for two data files (Example_1.in and Example_2.in). This can be extended for any number of files and names.
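The same loop can be driven from other scripting environments. The Python sketch below is not a translation of test_loop.m, whose contents are not shown here. It assumes the standalone fiteval executable can be launched from the working directory and that silent mode has been enabled in fitevalconfig.txt; the function names are hypothetical.

```python
import subprocess


def batch_commands(input_files, executable="fiteval"):
    """One command (argument list) per input file."""
    return [[executable, name] for name in input_files]


def run_batch(input_files, workdir=".", executable="fiteval"):
    """Run the commands one after another and collect their exit codes.

    Raises FileNotFoundError if the executable cannot be found.
    """
    exit_codes = {}
    for command in batch_commands(input_files, executable):
        completed = subprocess.run(command, cwd=workdir)
        exit_codes[command[1]] = completed.returncode
    return exit_codes


# Building the commands does not require FITEVAL to be installed.
commands = batch_commands(["Example_1.in", "Example_2.in"])
for command in commands:
    print(" ".join(command))
```

Calling run_batch(["Example_1.in", "Example_2.in"], workdir=...) from the distribution directory would then process both files in sequence.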
Library Requirements
If MATLAB is already installed on your computer, no additional libraries are needed to execute this program. If you do not currently have MATLAB R2011a (MCR version 7.15) or above installed on your computer, download and install the MCRInstaller before running the application. This is a self-extracting utility, which depends on the end user's platform:
- Download MCRInstaller for Windows [198Mb]
- Download MCRInstaller for OS X [16.5Mb]
Once downloaded follow these instructions:
- Download and open the file.
- A command window opens and begins preparation for the installation.
- When the MCRInstaller wizard appears, you may specify where you want to install the MCR or accept the default folder.
- The installation begins and can take a few minutes to complete.
Once the installation is completed, stand-alone applications developed with MATLAB can be executed.
Program License
This program is distributed as Freeware/Public Domain under the terms of GNU-License. If the program is found useful, the authors ask that its use be acknowledged in any resulting publication and that the authors be notified. The source code is available from the authors upon request:
- Dr. Axel Ritter
Profesor Titular
Área de Ingeniería Agroforestal. Dep. Ingeniería, Producción y Economía Agrarias
Universidad de La Laguna
Ctra. Geneto, 2; 38200 La Laguna (Spain)
Phone: +34 922 318 548
http://webpages.ull.es/users/aritter
aritter@ull.es
- Dr. Rafael Muñoz-Carpena
Professor, Hydrology & Water Quality
Department of Agricultural & Biological Engineering
University of Florida
P.O. Box 110570
287 Frazier Rogers Hall
Gainesville, FL 32611-0570 (USA)
carpena@ufl.edu
© Copyright 2012 Axel Ritter & Rafael Muñoz-Carpena
References
- Ritter, A. and R. Muñoz-Carpena. 2013. Predictive ability of hydrological models: objective assessment of goodness-of-fit with statistical significance. J. of Hydrology 480(1):33-45. doi:10.1016/j.jhydrol.2012.12.004
This page was last updated on July 12, 2017.