The tidyhydro package provides a set of commonly used hydrological metrics (such as NSE, KGE, and pBIAS) for use within the tidymodels infrastructure. Originally inspired by the yardstick and hydroGOF packages, the library is written mainly in C++ and computes the desired goodness-of-fit criteria quickly.
Additionally, the package includes C++ implementations of lesser-known yet powerful metrics used in reports from the United States Geological Survey (USGS) and in the National Environmental Monitoring Standards (NEMS) guidelines. Examples include PRESS (Prediction Error Sum of Squares), SFE (Standard Factorial Error), and MSPE (Model Standard Percentage Error). These are based on the equations from Helsel et al. (2020), Rasmussen et al. (2008), Hicks et al. (2020), and others (see the documentation for details).
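To give a feel for one of these less familiar metrics, here is a minimal base R sketch of PRESS in its textbook regression form: the sum of squared leave-one-out prediction errors of an ordinary least-squares fit. The helper name press_sketch and the use of the built-in cars data set are assumptions made for illustration only; this is not tidyhydro's implementation or interface, which may define its inputs differently.

```r
# Illustrative sketch only: textbook PRESS for an ordinary least-squares fit.
# press_sketch is a hypothetical helper, not a tidyhydro function.
press_sketch <- function(fit) {
  # The leave-one-out residual for observation i equals e_i / (1 - h_ii),
  # where h_ii is the leverage (diagonal element of the hat matrix).
  loo_resid <- residuals(fit) / (1 - hatvalues(fit))
  sum(loo_resid^2)
}

# The built-in 'cars' data set stands in for a real regression problem
fit <- lm(dist ~ speed, data = cars)
press_value <- press_sketch(fit)

# Cross-check against an explicit leave-one-out refit of the model
press_loop <- sum(vapply(seq_len(nrow(cars)), function(i) {
  fit_i <- lm(dist ~ speed, data = cars[-i, ])
  (cars$dist[i] - predict(fit_i, newdata = cars[i, ]))^2
}, numeric(1)))

isTRUE(all.equal(press_value, press_loop))
```

The leverage shortcut avoids refitting the model once per observation, which is why PRESS is cheap to compute for linear regressions.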
Example
The tidyhydro package follows the philosophy of yardstick and provides S3 class methods for vectors and data frames. For example, one can estimate KGE, NSE, or pBIAS for a data frame like this:
library(tidyhydro)
str(avacha)
#> Classes 'tbl_df', 'tbl' and 'data.frame': 365 obs. of 3 variables:
#> $ date: Date, format: "2022-01-01" "2022-01-02" ...
#> $ obs : num 76.2 76.2 76.3 76.3 76.4 76.4 76.5 76.5 76.6 76.6 ...
#> $ sim : num 84.8 84.3 84 83.7 83.4 ...
kge(avacha, obs, sim)
#> # A tibble: 1 × 3
#> .metric .estimator .estimate
#> <chr> <chr> <dbl>
#> 1 kge standard 0.947
Alternatively, create a metric_set and estimate several metrics at once:
hydro_metrics <- yardstick::metric_set(nse, pbias)
hydro_metrics(avacha, obs, sim)
#> # A tibble: 2 × 3
#> .metric .estimator .estimate
#> <chr> <chr> <dbl>
#> 1 nse standard 0.895
#> 2 pbias standard 0.0540
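For readers who want to see what such metrics compute, the following base R sketch spells out KGE and percent bias on a small made-up series. It assumes the Gupta et al. (2009) form of KGE and a percent-bias convention in which positive values mean overestimation; the variant, scaling, and sign convention used by tidyhydro may differ, so check the package documentation. The names kge_sketch and pbias_sketch, and the toy data, are hypothetical.

```r
# Illustrative base R sketches; these are not tidyhydro functions and the
# exact variants implemented in the package may differ.

# KGE in the Gupta et al. (2009) form (assumed variant)
kge_sketch <- function(obs, sim) {
  r <- cor(obs, sim)             # linear correlation
  alpha <- sd(sim) / sd(obs)     # variability ratio
  beta <- mean(sim) / mean(obs)  # bias ratio
  1 - sqrt((r - 1)^2 + (alpha - 1)^2 + (beta - 1)^2)
}

# Percent bias; scaling and sign convention are assumptions:
# positive values indicate that the simulation overestimates on average
pbias_sketch <- function(obs, sim) {
  100 * sum(sim - obs) / sum(obs)
}

# Hypothetical toy series, not the avacha data
obs <- c(10, 12, 15, 11, 9, 14)
sim <- c(11, 12, 14, 12, 10, 15)

kge_sketch(obs, sim)
pbias_sketch(obs, sim)
```

A perfect simulation gives a KGE of 1 and a percent bias of 0, which makes both functions easy to sanity-check by passing the same vector as obs and sim.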
Sometimes a qualitative interpretation of model performance is needed, so some functions accept a performance argument. When performance = TRUE, the metric interpretation is returned according to Moriasi et al. (2015).
hydro_metrics(avacha, obs, sim, performance = TRUE)
#> # A tibble: 2 × 3
#> .metric .estimator .estimate
#> <chr> <chr> <chr>
#> 1 nse standard Excellent
#> 2 pbias standard Excellent
Installation
You can install the development version of tidyhydro from GitHub with:
# install.packages("pak")
pak::pak("atsyplenkov/tidyhydro")
Benchmarking
Since the package uses Rcpp
in the background, it performs slightly faster than base R and other R packages (see benchmarks). This is particularly noticeable with large datasets:
set.seed(12234)
x <- runif(10^6)
y <- runif(10^6)
nse <- function(truth, estimate, na_rm = TRUE) {
  #fmt: skip
  1 - (sum((truth - estimate)^2, na.rm = na_rm) /
    sum((truth - mean(truth, na.rm = na_rm))^2, na.rm = na_rm))
}
bench::mark(
  tidyhydro = tidyhydro::nse_vec(truth = x, estimate = y),
  hydroGOF = hydroGOF::NSE(sim = y, obs = x),
  baseR = nse(truth = x, estimate = y),
  check = TRUE,
  relative = TRUE,
  filter_gc = FALSE,
  iterations = 50L
)
#> # A tibble: 3 × 6
#> expression min median `itr/sec` mem_alloc `gc/sec`
#> <bch:expr> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 tidyhydro 1 1 22.7 NaN NaN
#> 2 hydroGOF 15.2 19.1 1 Inf Inf
#> 3 baseR 8.66 10.6 2.44 Inf Inf
Code of Conduct
Please note that the tidyhydro project is released with a Contributor Code of Conduct. By contributing to this project, you agree to abide by its terms.