R/sperrorest_resampling.R
partition_factor_cv.Rd
partition_factor_cv creates a represampling object, i.e. a set of sample
indices defining cross-validation test and training sets, where partitions
are obtained by resampling at the level of groups of observations as defined
by a given factor variable. This can be used, for example, to resample
agricultural data that is grouped by field at the field level, so that all
observations from one field stay in the same partition and spatial
autocorrelation within fields is preserved.
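The idea of resampling at the factor level can be sketched in base R. This is an illustrative sketch only, not the package's implementation: the toy data frame d, the column name field, and the helper variables fold_of_level and fold_of_row are assumptions made for this example.

```r
# Toy data: 12 observations grouped into 6 hypothetical fields
set.seed(42)
d <- data.frame(
  x = runif(12),
  y = runif(12),
  field = factor(rep(paste0("field", 1:6), each = 2))
)

nfold <- 3

# Assign each factor level (field) to a fold, not each observation
lev <- levels(d$field)
fold_of_level <- sample(rep(seq_len(nfold), length.out = length(lev)))
names(fold_of_level) <- lev

# Map the level-wise assignment back to the rows of the data
fold_of_row <- unname(fold_of_level[as.character(d$field)])

# Test and training indices for each fold
folds <- lapply(seq_len(nfold), function(k) {
  list(train = which(fold_of_row != k), test = which(fold_of_row == k))
})

# Every field ends up in exactly one fold
stopifnot(all(tapply(fold_of_row, d$field, function(f) length(unique(f))) == 1))
```

Because folds are assigned per field, both observations of a field always land in the same test set.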
partition_factor_cv(
  data,
  coords = c("x", "y"),
  fac,
  nfold = 10,
  repetition = 1,
  seed1 = NULL,
  return_factor = FALSE
)
data: data.frame containing at least the columns specified by coords.
coords: vector of length 2 defining the variables in data that contain the
x and y coordinates of sample locations.
fac: either the name of a variable (column) in data, or a vector of type
factor and length nrow(data) that defines groups or clusters of
observations.
nfold: number of partitions (folds) in nfold-fold cross-validation
partitioning.
repetition: numeric vector: cross-validation repetitions to be generated.
Note that this is not the number of repetitions, but the indices of these
repetitions. E.g., use repetition = c(1:100) to obtain (the 'first') 100
repetitions, and repetition = c(101:200) to obtain a different set of 100
repetitions.
seed1: seed1 + i is the random seed that will be used by set.seed in
repetition i (i in repetition) to initialize the random number generator
before sampling from the data set.
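A small base R sketch may help show why repetition holds indices and not a count, and how a seed of seed1 + i makes each repetition reproducible on its own. The helper draw_repetition and the values of seed1, n_groups and nfold are hypothetical and chosen only for illustration; this is not the package's code.

```r
seed1 <- 100        # assumed base seed
repetition <- 1:3   # indices of repetitions, not a count

# Hypothetical helper: one random fold assignment for repetition i,
# seeded with seed1 + i as described above
draw_repetition <- function(i, n_groups = 6, nfold = 3) {
  set.seed(seed1 + i)
  sample(rep(seq_len(nfold), length.out = n_groups))
}

run1 <- lapply(repetition, draw_repetition)
run2 <- lapply(repetition, draw_repetition)

# The same indices reproduce the same partitions
identical(run1, run2)

# Repetition 2 is the same whether requested alone or as part of 1:3
identical(draw_repetition(2), run1[[2]])
```

Both comparisons evaluate to TRUE, since each repetition depends only on its own seed.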
return_factor: if FALSE (default), return a represampling object; if TRUE
(used internally by other sperrorest functions), return a list containing
factor vectors (see Value).
A represampling object; see also partition_cv for details. If return_factor
is TRUE, a list containing factor vectors is returned instead.
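A call following the usage signature might look like the sketch below. The toy data frame d and its field column are assumptions made for this example, and the call is only run if the sperrorest package is installed.

```r
# Toy data: 40 observations in 20 hypothetical fields (assumed for illustration)
set.seed(1)
d <- data.frame(
  x = runif(40),
  y = runif(40),
  field = factor(rep(seq_len(20), each = 2))
)

if (requireNamespace("sperrorest", quietly = TRUE)) {
  # Two repetitions of 5-fold cross-validation, partitioned by field
  parti <- sperrorest::partition_factor_cv(
    d,
    coords = c("x", "y"),
    fac = "field",
    nfold = 5,
    repetition = 1:2,
    seed1 = 123
  )
}
```

Here fac is given as a column name; a factor vector of length nrow(d) could be passed instead.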
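In this partitioning approach, the number of factor levels in fac must be
large enough for this factor-level resampling to make sense: each fold is
built from whole groups, so with only a few levels the folds become few or
very unequal in size.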