partition_kmeans divides the study area into irregularly shaped spatial partitions based on k-means (kmeans) clustering of spatial coordinates.

partition_kmeans(
  data,
  coords = c("x", "y"),
  nfold = 10,
  repetition = 1,
  seed1 = NULL,
  return_factor = FALSE,
  balancing_steps = 1,
  order_clusters = TRUE,
  ...
)

Arguments

data

data.frame containing at least the columns specified by coords

coords

vector of length 2 defining the variables in data that contain the x and y coordinates of sample locations.

nfold

number of cross-validation folds, i.e. parameter k in k-means clustering.

repetition

numeric vector: cross-validation repetitions to be generated. Note that this is not the number of repetitions, but the indices of these repetitions. E.g., use repetition = c(1:100) to obtain (the 'first') 100 repetitions, and repetition = c(101:200) to obtain a different set of 100 repetitions.

seed1

seed1+i is the random seed that will be used by set.seed in repetition i (i in repetition) to initialize the random number generator before sampling from the data set.

return_factor

if FALSE (default), return a represampling object; if TRUE (used internally by other sperrorest functions), return a list containing factor vectors (see Value)

balancing_steps

if > 1, perform nfold-means clustering balancing_steps times, and pick the clustering that minimizes the Gini index of the sample size distribution among the partitions. The idea is that 'degenerate' partitions will be avoided, but this also has the side effect of reducing variation among partitioning repetitions. More meaningful constraints (e.g., minimum number of positive and negative samples within each partition should be added in the future.

order_clusters

if TRUE, clusters are ordered by increasing x coordinate of center point.

...

additional arguments to kmeans.

Value

A represampling object, see also partition_cv for details.

Note

Default parameter settings may change in future releases.

References

Brenning, A., Long, S., & Fieguth, P. (2012). Detecting rock glacier flow structures using Gabor filters and IKONOS imagery. Remote Sensing of Environment, 125, 227-237. doi:10.1016/j.rse.2012.07.005

Russ, G. & A. Brenning. 2010a. Data mining in precision agriculture: Management of spatial information. In 13th International Conference on Information Processing and Management of Uncertainty, IPMU 2010; Dortmund; 28 June - 2 July 2010. Lecture Notes in Computer Science, 6178 LNAI: 350-359.

Examples

data(ecuador)
resamp <- partition_kmeans(ecuador, nfold = 5, repetition = 2)
# plot(resamp, ecuador)