Sample a daily water quality time series at a set monthly frequency
samp_sim(
  dat_in,
  unit = "month",
  irregular = TRUE,
  missper = 0,
  blck = 1,
  blckper = FALSE
)
dat_in: input data.frame returned by lnres_sim or all_sims
unit: character string indicating the sampling unit; must be "year", "quarter", "month", "week", or "yday", matching the equivalent lubridate function
irregular: logical indicating if the sample is drawn at random within each unit; otherwise the first value in each unit is returned
missper: numeric from 0 to 1 indicating the proportion of observations used for the test dataset
blck: numeric indicating the block size for resampling the test dataset, see details
blckper: logical indicating if the value passed to blck is a proportion of missper, i.e., blocks are sized as a proportion of the total size of the missing data
Original data frame with rows subset based on the number of desired monthly samples. If missper > 0, a list is returned where the first element is the index values for the test dataset and the second is the complete subsampled dataset.
This function is intended for sampling a simulated daily time series of water quality that is returned by lnres_sim or all_sims.
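The subsampling step can be sketched in base R as follows. This is a minimal illustration, not the package's implementation: the helper name sample_monthly, the column names date and value, and the simulated input are all assumptions made for the example. It keeps one row per calendar month, drawn at random when irregular = TRUE and taken as the first row of the month otherwise.

```r
# Hypothetical helper (not part of the package): keep one observation per
# calendar month from a daily data frame with a Date column named `date`.
sample_monthly <- function(dat, irregular = TRUE) {
  # label each row with its year-month, e.g. "2000-03"
  grp <- format(dat$date, "%Y-%m")
  # row indices belonging to each month
  idx_by_month <- split(seq_len(nrow(dat)), grp)
  keep <- vapply(idx_by_month, function(i) {
    # random row within the month, or the first row of the month
    if (irregular) i[sample.int(length(i), 1)] else i[1]
  }, integer(1))
  dat[sort(unname(keep)), , drop = FALSE]
}

# Assumed example input: two years of simulated daily values
set.seed(1)
dates <- seq(as.Date("2000-01-01"), as.Date("2001-12-31"), by = "day")
daily <- data.frame(date = dates, value = rnorm(length(dates)))

monthly <- sample_monthly(daily, irregular = TRUE)
first_of_month <- sample_monthly(daily, irregular = FALSE)
```

Both results contain one row for each of the 24 months; only the day chosen within each month differs.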
The missper argument is used to create a test dataset as a proportion of all observations in the sub-sampled output dataset. The test dataset is created with random block sampling appropriate for time series. Block sampling of the output dataset continues until the number of unique observations equals the proportion defined by missper. Overlapping blocks are not counted twice towards the number of observations needed to satisfy missper, i.e., sets of continuous observations longer than blck can be returned because of sampling overlap. Setting blck = 1 and blckper = FALSE is completely random sampling for missing data. Values for blck must be 1 or greater if blckper = FALSE and 1 or less if blckper = TRUE. If blck = 1 and blckper = TRUE, the missing data will be one continuous block.
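The block sampling described above can be sketched in base R as follows. This is an illustration under stated assumptions, not the package's code: the helper name block_test_index is hypothetical, the target count is assumed to be round(missper * n), and trimming the final block so the count matches the target exactly is a choice made for this example.

```r
# Hypothetical helper (not part of the package): draw random blocks of
# consecutive row indices until the target number of unique rows is reached.
block_test_index <- function(n, missper, blck = 1, blckper = FALSE) {
  n_miss <- round(missper * n)          # assumed target size of the test set
  if (n_miss == 0) return(integer(0))
  # with blckper = TRUE, blck is a proportion of the missing-data count
  size <- if (blckper) max(1, round(blck * n_miss)) else blck
  size <- min(size, n)
  picked <- integer(0)
  while (length(picked) < n_miss) {
    start <- sample.int(n - size + 1, 1)
    block <- start + seq_len(size) - 1L
    # overlapping rows are counted once only
    picked <- unique(c(picked, block))
  }
  # assumption: drop the tail of the last block to hit the target exactly
  sort(picked[seq_len(n_miss)])
}

set.seed(42)
# blck = 1, blckper = FALSE: rows drawn one at a time
idx_random <- block_test_index(n = 120, missper = 0.2)
# blck = 1, blckper = TRUE: a single continuous block
idx_single <- block_test_index(n = 120, missper = 0.2, blck = 1, blckper = TRUE)
# fixed blocks of six rows; overlap can produce longer runs
idx_blocks <- block_test_index(n = 120, missper = 0.2, blck = 6)
```

Each call returns 24 row indices (20 percent of 120). In the single-block case the indices are consecutive, which matches the behavior described for blck = 1 with blckper = TRUE.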