Prepare data for ebase

Usage

ebase_prep(dat, Z, interval, ndays = 1)

Arguments

dat: input data frame
Z: numeric as single value for water column depth (m) or vector equal in length to number of rows in dat
interval: timestep interval in seconds
ndays: numeric for number of days in dat for optimizing the metabolic equation, see details
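
If the time step of a series is not known in advance, the value for interval can be worked out from the timestamps. The following is a minimal base R sketch with hypothetical timestamps (not taken from exdat); it assumes a POSIXct column and a single regular step.

```r
# Hypothetical timestamps at a 15-minute step (illustration only)
tms <- as.POSIXct(
  c("2012-02-23 00:00:00", "2012-02-23 00:15:00", "2012-02-23 00:30:00"),
  tz = "UTC"
)

# Differences between consecutive records, expressed in seconds
steps <- as.numeric(diff(tms), units = "secs")

# A regular series has exactly one unique step; this is the interval value
interval <- unique(steps)
interval
```

More than one unique value in steps would indicate gaps or an irregular series.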

Value

A data frame with additional columns required for ebase. Dissolved oxygen supplied in dat as a volumetric concentration (mg/L) is returned in mmol/m3. If multiple time steps are identified, the number of rows in the data frame is expanded to the time step defined by interval. Numeric values in the expanded rows will be interpolated if interp = TRUE, otherwise they will remain as NA values.

Details

Checks that all required columns are present by matching those in exdat and that DateTimeStamp is in ascending order. The function then converts dissolved oxygen from mg/L to mmol/m3, calculates the Schmidt number (unitless) from water temperature (C) and salinity (psu), and calculates the dissolved oxygen equilibrium concentration (mmol/m3) from salinity and temperature.
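
The unit conversion for dissolved oxygen can be sketched in a few lines of base R. This is an illustration of the arithmetic only, not the package's internal code; the molar mass of 32 g/mol for O2 and the input values are assumptions made for the example.

```r
# Molar mass of O2 in g/mol (assumed value for this sketch)
o2_molar_mass <- 32

# Hypothetical dissolved oxygen observations in mg/L
do_mgl <- c(8.8, 8.7, 8.5)

# mg/L -> mmol/L (divide by molar mass) -> mmol/m3 (1000 L per m3)
do_mmolm3 <- do_mgl / o2_molar_mass * 1000
do_mmolm3
```

With these inputs the conversion factor is 31.25, giving 275, 271.875, and 265.625 mmol/m3.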

The ndays argument defines the number of days that are used for optimizing the metabolic mass balance equation fit by ebase. By default, this is done each day, i.e., ndays = 1, such that a loop applies the model equation to observations within each day, evaluated iteratively from the first observation in a day to the last. Individual parameter estimates for a, R, and b are then returned for each day. However, more days can be used to estimate the unknown parameters, such that the loop is evaluated for every ndays specified by the argument. The ndays argument separates the input data into groups of consecutive days, where each group has a total number of days equal to ndays. The final group may not include the complete number of days specified by ndays if the number of unique dates in the input data leaves a remainder when divided by ndays, e.g., if seven days are in the input data and ndays = 5, there will be two groups where the first has five days and the second has two days. The output data from ebase includes a column that specifies the grouping that was used based on ndays.
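
The grouping rule described above can be reproduced with integer division. The sketch below is one way to assign consecutive days to groups in base R and is not necessarily how the package does it; the dates are hypothetical and the name grp is borrowed from the column shown in the Examples output.

```r
# Seven hypothetical consecutive dates
dates <- seq(as.Date("2012-02-23"), by = "day", length.out = 7)
ndays <- 5

# Zero-based day index divided by ndays, shifted to start groups at 1
grp <- (seq_along(dates) - 1) %/% ndays + 1

# One row per date with its group; the last group holds the remainder
data.frame(Date = dates, grp = grp)

# Number of days in each group
table(grp)
```

For seven days and ndays = 5, the first group has five days and the second has two, matching the case described in the text.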

Missing values are interpolated at the time step specified by the interval argument for conformance with the core model equation. Records at the start or end of the input time series that do not make up a full day are removed. A warning is returned to the console if gaps or dangling records are found.
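
To illustrate what filling a gap at a fixed time step involves, the sketch below expands a toy series to a regular grid and interpolates linearly with base R. The data are hypothetical, linear interpolation via approx is an assumption made for the example rather than a statement of the package's method, and isinterp mirrors the column name shown in the Examples output.

```r
# Hypothetical series at a 900 s step with the 00:30 record missing
interval <- 900
tms <- as.POSIXct("2012-02-23 00:00:00", tz = "UTC") + c(0, 900, 2700, 3600)
toy <- data.frame(DateTimeStamp = tms, Temp = c(16.4, 16.5, 16.7, 16.8))

# Regular grid of time steps spanning the observed range
grid <- data.frame(
  DateTimeStamp = seq(min(toy$DateTimeStamp), max(toy$DateTimeStamp), by = interval)
)

# Join the observations onto the grid; missing time steps become NA rows
full <- merge(grid, toy, by = "DateTimeStamp", all.x = TRUE)

# Flag the rows that will be filled by interpolation
full$isinterp <- is.na(full$Temp)

# Linear interpolation of the flagged rows from the observed values
full$Temp <- approx(
  x = as.numeric(toy$DateTimeStamp),
  y = toy$Temp,
  xout = as.numeric(full$DateTimeStamp)
)$y

full
```

The expanded data frame has five rows, with the third row flagged as interpolated and filled with the midpoint of its neighbors.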

Examples

dat <- ebase_prep(exdat, Z = 1.85, interval = 900)
#> Warning: More than one time step or missing values will be interpolated
head(dat)
#>         Date       DateTimeStamp isinterp  DO_obs   DO_sat    Z Temp  Sal PAR
#> 1 2012-02-23 2012-02-23 00:00:00    FALSE 275.000 265.8630 1.85 16.4 23.0   0
#> 2 2012-02-23 2012-02-23 00:15:00    FALSE 275.000 266.1861 1.85 16.4 22.8   0
#> 3 2012-02-23 2012-02-23 00:30:00    FALSE 275.000 266.3479 1.85 16.4 22.7   0
#> 4 2012-02-23 2012-02-23 00:45:00    FALSE 275.000 266.0245 1.85 16.4 22.9   0
#> 5 2012-02-23 2012-02-23 01:00:00    FALSE 271.875 266.3479 1.85 16.4 22.7   0
#> 6 2012-02-23 2012-02-23 01:15:00    FALSE 265.625 265.2179 1.85 16.4 23.4   0
#>   WSpd       sc grp
#> 1  3.6 660.2062   1
#> 2  3.5 659.8137   1
#> 3  3.6 659.6174   1
#> 4  4.2 660.0099   1
#> 5  3.6 659.6174   1
#> 6  4.1 660.9913   1