partition_disc partitions the sample into training and test sets by selecting circular test areas (optionally surrounded by an exclusion buffer) and using the remaining samples as training samples (leave-one-disc-out cross-validation). partition_loo creates training and test sets for leave-one-out cross-validation with an optional buffer.

partition_disc(
  data,
  coords = c("x", "y"),
  radius,
  buffer = 0,
  ndisc = nrow(data),
  seed1 = NULL,
  return_train = TRUE,
  prob = NULL,
  replace = FALSE,
  repetition = 1
)

partition_loo(data, ndisc = nrow(data), replace = FALSE, ...)

Arguments

data

data.frame containing at least the columns specified by coords

coords

vector of length 2 defining the variables in data that contain the x and y coordinates of sample locations.

radius

radius of test area discs; performs leave-one-out resampling if radius < 0.

buffer

radius of additional 'neutral area' around test area discs that is excluded from training and test sets; defaults to 0, i.e. all samples are either in the test area or in the training area.

ndisc

Number of discs to be randomly selected; each disc constitutes a separate test set. Defaults to nrow(data), i.e. one disc around each sample.

seed1

seed1+i is the random seed that will be used by set.seed in repetition i (i in repetition) to initialize the random number generator before sampling from the data set.

return_train

If FALSE, returns only the test samples; if TRUE, also returns the training samples.

prob

optional argument to sample: a vector of probability weights used when drawing the disc center points from the samples.

replace

optional argument to sample: sampling with or without replacement?

repetition

see partition_cv; however, see Note below: repetition should normally be 1 in this function.

...

arguments to be passed to partition_disc
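
How radius and buffer interact can be sketched in base R for a single disc. This is a minimal illustration on made-up coordinates, not the package's implementation: the helper name one_disc, the toy data, and the treatment of samples lying exactly on a boundary are assumptions.

```r
# Hypothetical helper (not part of the package): split the samples
# around one disc centered at row `center` of `data`.
one_disc <- function(data, center, radius, buffer = 0, coords = c("x", "y")) {
  # Euclidean distance of every sample to the disc center
  d <- sqrt((data[[coords[1]]] - data[[coords[1]]][center])^2 +
            (data[[coords[2]]] - data[[coords[2]]][center])^2)
  list(
    test  = which(d <= radius),          # inside the disc (assumed inclusive)
    train = which(d > radius + buffer)   # beyond disc and buffer
  )
}

# Toy data: five samples on a line, 100 units apart
toy <- data.frame(x = c(0, 100, 200, 300, 400), y = 0)
res <- one_disc(toy, center = 1, radius = 100, buffer = 100)
res
```

In this toy case the third sample lies in the buffer zone and therefore ends up in neither the test set nor the training set.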

Value

A represampling object. Contains length(repetition) resampling objects. Each of these contains ndisc lists with indices of test and (if return_train = TRUE) training sets.
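
The nesting described above can be inspected with ordinary list indexing. The following sketch assumes that the sperrorest package and its ecuador data set are installed, and that each disc is stored as a list with elements named train and test; the seed value 123 is an arbitrary choice for illustration.

```r
library(sperrorest)
data(ecuador)

# Two repetitions with five discs each; seed1 = 123 is an arbitrary choice
parti <- partition_disc(ecuador,
  radius = 200, buffer = 200,
  ndisc = 5, repetition = 1:2, seed1 = 123
)

length(parti)         # number of repetitions
length(parti[[1]])    # number of discs in the first repetition
str(parti[[1]][[1]])  # index vectors of the first disc (assumed: train, test)
```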

Note

Test area discs are centered at (random) samples, not at general random locations. Test area discs may (and likely will) overlap independently of the value of replace. replace only controls the replacement of the center point of discs when drawing center points from the samples.
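
That discs overlap even when their center points are distinct can be seen in a small base R sketch. The toy coordinates, the radius, and the two center rows are made up for illustration, and the inclusive distance comparison is an assumption.

```r
# Toy data: five samples on a line, 100 units apart
toy <- data.frame(x = c(0, 100, 200, 300, 400), y = 0)
radius <- 150
centers <- c(2, 3)  # two distinct rows, as when drawn without replacement

# Test set of each disc: all samples within `radius` of the center sample
discs <- lapply(centers, function(i) {
  d <- sqrt((toy$x - toy$x[i])^2 + (toy$y - toy$y[i])^2)
  which(d <= radius)
})
discs

# Samples that belong to both test sets
shared <- intersect(discs[[1]], discs[[2]])
shared
```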

radius < 0 does leave-one-out resampling with an optional buffer. radius = 0 is similar except that samples with identical coordinates would fall within the test area disc.
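
The difference between radius = 0 and radius < 0 only shows up when several samples share a location. A base R sketch with made-up data, assuming an inclusive distance comparison for the disc:

```r
# Toy data with a duplicated location (rows 1 and 2)
toy <- data.frame(x = c(0, 0, 100), y = c(0, 0, 0))
i <- 1  # row used as the disc center

# Distance of every sample to the center sample
d <- sqrt((toy$x - toy$x[i])^2 + (toy$y - toy$y[i])^2)

test_radius0 <- which(d <= 0)  # radius = 0: all samples at that location
test_loo     <- i              # radius < 0: only the center sample itself

test_radius0
test_loo
```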

References

Brenning, A. 2005. Spatial prediction models for landslide hazards: review, comparison and evaluation. Natural Hazards and Earth System Sciences, 5(6): 853-862.

Examples

data(ecuador)
parti <- partition_disc(ecuador,
  radius = 200, buffer = 200,
  ndisc = 5, repetition = 1:2
)
# plot(parti,ecuador)
summary(parti)
#> $`1`
#>     n.train n.test
#> 545     729      8
#> 602     734      5
#> 409     712      9
#> 127     714     13
#> 338     688     19
#> 
#> $`2`
#>     n.train n.test
#> 256     723     12
#> 14      693     19
#> 242     718     10
#> 534     715      6
#> 138     721      7
#> 

# leave-one-out with buffer:
parti.loo <- partition_loo(ecuador, buffer = 200)
summary(parti.loo)