Sample a daily water quality time series at a set monthly frequency

samp_sim(
  dat_in,
  unit = "month",
  irregular = TRUE,
  missper = 0,
  blck = 1,
  blckper = FALSE
)

Arguments

dat_in

input data.frame that is returned from lnres_sim or all_sims

unit

character string indicating the sampling unit; must be one of year, quarter, month, week, or yday, each matching the lubridate function of the same name

irregular

logical indicating if sampling is done randomly within each unit; otherwise the first value in each unit is returned

missper

numeric from 0-1 indicating the proportion of observations used for the test dataset

blck

numeric indicating block size for resampling test dataset, see details

blckper

logical indicating if the value passed to blck is a proportion of missper, i.e., blocks are to be sized as a percentage of the total size of the missing data

Value

Original data frame with rows subset based on number of desired monthly samples. If missper > 0, a list is returned where the first element is the index values for the test dataset and the second is the complete subsampled dataset.

Details

This function is intended for sampling a simulated daily time series of water quality that is returned by lnres_sim or all_sims.
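
The effect of the unit and irregular arguments can be pictured with a small base R sketch. This is only an illustration under stated assumptions, not the package's implementation: the toy data frame and its column names (date, value) are hypothetical, and grouping is done with format() rather than lubridate.

```r
# Toy daily series; the column names are illustrative assumptions
daily <- data.frame(
  date = seq(as.Date("2020-01-01"), as.Date("2020-12-31"), by = "day")
)
daily$value <- seq_len(nrow(daily))

# Group row indices by calendar month (analogue of unit = "month")
grp <- format(daily$date, "%Y-%m")
rows <- split(seq_len(nrow(daily)), grp)

# Analogue of irregular = TRUE: one randomly chosen row per month
set.seed(42)
pick_random <- vapply(
  rows,
  function(i) i[sample.int(length(i), 1)],
  integer(1)
)

# Analogue of irregular = FALSE: the first row of each month
pick_first <- vapply(rows, function(i) i[1], integer(1))

monthly_random <- daily[pick_random, ]
monthly_first <- daily[pick_first, ]
```

Both results hold one row per month; they differ only in which day of the month is kept.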

The missper argument is used to create a test dataset as a proportion of all observations in the sub-sampled output dataset. The test dataset is created with random block sampling, which is appropriate for time series. Blocks are sampled from the output dataset until the number of unique observations equals the proportion defined by missper. Observations in overlapping blocks are counted only once towards missper, so runs of continuous observations longer than blck can be returned when blocks overlap. Setting blck = 1 and blckper = FALSE gives completely random sampling for the missing data. Values for blck must be 1 or greater if blckper = FALSE and 1 or less if blckper = TRUE. If blck = 1 and blckper = TRUE, the missing data will be one continuous block.
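
The block sampling described above can be sketched as follows. This is a minimal illustration, not the code used by samp_sim: block_sample is a hypothetical helper, and trimming the final block so that the count matches the target exactly is an assumption made for the sketch.

```r
# Hypothetical helper (not part of the package): draw random blocks of
# row indices until the target number of unique indices is reached.
block_sample <- function(n, missper, blck = 1, blckper = FALSE) {
  target <- round(n * missper)  # number of unique indices wanted
  if (target == 0) return(integer(0))

  # With blckper = TRUE, blck is a proportion of the test dataset size
  size <- if (blckper) max(1, round(blck * target)) else blck
  size <- min(size, n)

  out <- integer(0)
  while (length(out) < target) {
    start <- sample.int(n - size + 1, 1)   # random block start
    blk <- seq(start, length.out = size)   # contiguous block of indices
    new <- setdiff(blk, out)               # overlap is counted only once
    # Assumption: trim the last block so the total equals the target
    out <- c(out, head(new, target - length(out)))
  }
  sort(out)
}

set.seed(1)
idx_block <- block_sample(120, missper = 0.3, blck = 4)
idx_single <- block_sample(120, missper = 0.3, blck = 1, blckper = TRUE)
```

Here idx_block holds indices drawn in blocks of four that may merge into longer runs where blocks overlap, while idx_single is one continuous run, matching the blck = 1 and blckper = TRUE case.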

Examples

if (FALSE) {
## example data
data(daydat)

## simulate
tosamp <- all_sims(daydat)

## sample
samp_sim(tosamp)

## sample and create test dataset
# test dataset is 30% of the monthly subsample, using block sampling with blocks of size 4
samp_sim(tosamp, missper = 0.3, blck = 4)
}