represampling_factor_bootstrap resamples partitions defined by a factor variable, drawing whole groups rather than individual observations. It can be used for non-overlapping block bootstraps and similar designs.

represampling_factor_bootstrap(
  data,
  fac,
  repetition = 1,
  nboot = -1,
  seed1 = NULL,
  oob = FALSE
)

Arguments

data

data.frame containing the samples to be resampled; if fac is given as a variable name, data must contain that column

fac

defines a grouping or partitioning of the samples in data. Three types are accepted: (1) the name of a variable in data (coerced to factor if it is not one already); (2) a factor variable (or a vector that can be coerced to factor); (3) a list of factor variables (or vectors that can be coerced to factor). Such a list must be of length length(repetition), and if it is named, the names must be equal to as.character(repetition); it will typically be generated by a partition_* function such as partition_tiles with return_factor = TRUE (see Examples below)

repetition

numeric vector: resampling repetitions to be generated. Note that this gives the indices of the repetitions, not their number. E.g., use repetition = c(1:100) to obtain the first 100 repetitions, and repetition = c(101:200) to obtain a different set of 100 repetitions.

nboot

number of bootstrap replications, i.e. groups drawn with replacement (see Details), used for generating the bootstrap training sample (nboot[1]) and the test sample (nboot[2]); nboot[2] is ignored (with a warning) if oob = TRUE. A value of -1 (the default) will be substituted with the number of levels of the factor variable, corresponding to an n-out-of-n bootstrap at the grouping level defined by fac.

seed1

seed1+i is the random seed that will be used by set.seed in repetition i (i in repetition) to initialize the random number generator before sampling from the data set.

oob

if TRUE, the test sample will be the out-of-bag sample; if FALSE (default), the test sample is an independently drawn bootstrap sample of size nboot[2].
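
The constraints on a list-valued fac (one element per repetition, and names equal to as.character(repetition) if the list is named) can be checked before calling the function. The following is a minimal sketch only: check_fac_list is a hypothetical helper written for illustration, not a sperrorest function, and the toy factors are made up.

```r
# Hypothetical helper (not part of sperrorest): verify that a list of
# factors matches the requested repetition indices.
check_fac_list <- function(fac, repetition) {
  stopifnot(is.list(fac), length(fac) == length(repetition))
  # Names are optional, but if present they must match the indices.
  if (!is.null(names(fac))) {
    stopifnot(all(names(fac) == as.character(repetition)))
  }
  invisible(TRUE)
}

# Toy input: three repetitions indexed 101 to 103, one factor each.
repetition <- 101:103
fac_list <- list(
  "101" = factor(c("a", "a", "b")),
  "102" = factor(c("a", "b", "b")),
  "103" = factor(c("b", "a", "a"))
)
check_fac_list(fac_list, repetition)
```

The check stops with an error if the list length or the names do not line up with repetition.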

Details

nboot refers to the number of groups (as defined by the factors) to be drawn with replacement from the set of groups. That is, if fac is a factor variable, nboot would normally not exceed nlevels(fac), which is the value substituted under the default nboot = -1.
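
The group-level resampling described above can be sketched in base R. This is an illustration under stated assumptions, not the package's implementation: sketch_factor_bootstrap and the toy grouping are hypothetical, and only the out-of-bag case (oob = TRUE) is shown.

```r
# Hypothetical helper (not part of sperrorest): draw `nboot` groups with
# replacement and return row indices for the training sample and the
# out-of-bag test sample.
sketch_factor_bootstrap <- function(fac, nboot = -1, seed = NULL) {
  fac <- factor(fac)
  if (nboot == -1) nboot <- nlevels(fac) # n-out-of-n at the group level
  if (!is.null(seed)) set.seed(seed)
  drawn <- sample(levels(fac), size = nboot, replace = TRUE)
  # A group drawn k times contributes its rows k times.
  train <- unlist(lapply(drawn, function(g) which(fac == g)))
  # Out-of-bag: rows whose group was never drawn.
  test <- which(!(fac %in% drawn))
  list(drawn = drawn, train = train, test = test)
}

# Toy grouping: four groups of unequal size.
toy_fac <- rep(c("a", "b", "c", "d"), times = c(3, 2, 4, 1))
res <- sketch_factor_bootstrap(toy_fac, seed = 1)
```

Because whole groups are drawn, the training and out-of-bag index sets never share a group, which is what makes this usable as a block bootstrap.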

Examples

library(sperrorest)
data(ecuador)
# a dummy example for demonstration, performing bootstrap
# at the level of an arbitrary factor variable:
parti <- represampling_factor_bootstrap(ecuador,
  factor(floor(ecuador$dem / 100)),
  oob = TRUE
)
# plot(parti,ecuador)
# using the factor bootstrap for a non-overlapping block bootstrap
# (see also represampling_tile_bootstrap):
fac <- partition_tiles(ecuador,
  return_factor = TRUE, repetition = c(1:3),
  dsplit = 500, min_n = 200, rotation = "random",
  offset = "random"
)
parti <- represampling_factor_bootstrap(ecuador, fac,
  oob = TRUE,
  repetition = c(1:3)
)
# plot(parti, ecuador)