R/sperrorest_resampling.R
partition_kmeans.Rd
partition_kmeans divides the study area into irregularly shaped spatial partitions based on k-means (kmeans) clustering of spatial coordinates.
partition_kmeans(
  data,
  coords = c("x", "y"),
  nfold = 10,
  repetition = 1,
  seed1 = NULL,
  return_factor = FALSE,
  balancing_steps = 1,
  order_clusters = TRUE,
  ...
)
data: data.frame containing at least the columns specified by coords.
coords: vector of length 2 defining the variables in data that contain the x and y coordinates of sample locations.
nfold: number of cross-validation folds, i.e. parameter k in k-means clustering.
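As a rough illustration of how k-means clusters of coordinates can act as cross-validation folds, the following base R sketch clusters the coordinate columns of a made-up data set and turns each cluster into a test set. The data frame `toy` and the list `folds` are hypothetical names chosen for this example; this is not the package's internal code.

```r
# Hypothetical toy data: 200 random sample locations (not from sperrorest).
set.seed(42)
toy <- data.frame(x = runif(200), y = runif(200))

nfold <- 5
# k-means on the coordinate columns only; centers = nfold gives one
# cluster per fold.
km <- kmeans(toy[, c("x", "y")], centers = nfold)

# Each cluster serves once as the test set; all other samples form the
# training set.
folds <- lapply(seq_len(nfold), function(k) {
  list(train = which(km$cluster != k), test = which(km$cluster == k))
})

# Fold sizes are generally unequal because the clusters are irregular.
table(km$cluster)
```

Because the folds follow the cluster boundaries, test samples are spatially separated from most training samples, unlike in a purely random partition.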
repetition: numeric vector: cross-validation repetitions to be generated. Note that this is not the number of repetitions, but the indices of these repetitions. E.g., use repetition = c(1:100) to obtain (the 'first') 100 repetitions, and repetition = c(101:200) to obtain a different set of 100 repetitions.
seed1: seed1+i is the random seed that will be used by set.seed in repetition i (i in repetition) to initialize the random number generator before sampling from the data set.
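The link between repetition indices and seeds can be sketched in base R as follows. `draw_repetitions` is a hypothetical helper written for this illustration (it is not a sperrorest function), and a random permutation stands in for the actual partitioning step.

```r
# Hypothetical helper, not part of sperrorest: draws one permutation of
# n indices per repetition index i, seeding the RNG with seed1 + i.
draw_repetitions <- function(n, repetition, seed1 = NULL) {
  out <- lapply(repetition, function(i) {
    if (!is.null(seed1)) set.seed(seed1 + i)
    sample(n)
  })
  names(out) <- as.character(repetition)
  out
}

a <- draw_repetitions(20, repetition = 1:3, seed1 = 100)
b <- draw_repetitions(20, repetition = 2:5, seed1 = 100)

# Repetition 2 uses seed 102 in both calls, so the draws are identical.
identical(a[["2"]], b[["2"]])
```

Under this scheme a given repetition index always maps to the same seed, which is why requesting indices 101:200 yields a different set of repetitions than 1:100.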
return_factor: if FALSE (default), return a represampling object; if TRUE (used internally by other sperrorest functions), return a list containing factor vectors (see Value).
balancing_steps: if > 1, perform nfold-means clustering balancing_steps times, and pick the clustering that minimizes the Gini index of the sample size distribution among the partitions. The idea is that 'degenerate' partitions will be avoided, but this also has the side effect of reducing variation among partitioning repetitions. More meaningful constraints (e.g., a minimum number of positive and negative samples within each partition) should be added in the future.
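The balancing idea can be sketched in base R: run k-means several times and keep the run whose cluster sizes are most equal. The `gini` function below uses the mean-absolute-difference form of the Gini index, which is an assumption made for this example; the package may compute the index differently. The data frame `toy` is made up.

```r
# Gini index of a vector of non-negative counts (0 = perfectly equal sizes).
# Assumption: mean-absolute-difference form; sperrorest may use a variant.
gini <- function(n) {
  sum(abs(outer(n, n, "-"))) / (2 * length(n) * sum(n))
}

set.seed(1)
toy <- data.frame(x = runif(300), y = runif(300))  # hypothetical data
nfold <- 5
balancing_steps <- 10

# Run k-means balancing_steps times, each from a different random start.
runs <- lapply(seq_len(balancing_steps), function(s) {
  kmeans(toy[, c("x", "y")], centers = nfold)
})

# Keep the clustering with the most evenly sized partitions.
ginis <- vapply(runs, function(km) gini(km$size), numeric(1))
best <- runs[[which.min(ginis)]]

best$size
```

Since every repetition is pushed toward the same balanced solution, the selected partitions tend to resemble each other more than single k-means runs would.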
order_clusters: if TRUE, clusters are ordered by increasing x coordinate of center point.
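One way to relabel clusters by the x coordinate of their centers is shown below in base R. This is a sketch of the idea only, with hypothetical variable names (`toy`, `new_label`, `ordered_cluster`), and not necessarily how the package implements it.

```r
set.seed(7)
toy <- data.frame(x = runif(100), y = runif(100))  # hypothetical data
km <- kmeans(toy[, c("x", "y")], centers = 4)

# rank() maps each original cluster label to its position when the
# centers are sorted by increasing x coordinate.
new_label <- rank(km$centers[, "x"], ties.method = "first")
ordered_cluster <- unname(new_label[km$cluster])

# Mean x per relabelled cluster now increases with the label.
center_x <- tapply(toy$x, ordered_cluster, mean)
center_x
```

Ordering makes fold numbers comparable across repetitions, since fold 1 is always the westernmost cluster (smallest x) and the last fold the easternmost.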
...: additional arguments to kmeans.
A represampling object; see also partition_cv for details.
Default parameter settings may change in future releases.
Brenning, A., Long, S., & Fieguth, P. (2012). Detecting rock glacier flow structures using Gabor filters and IKONOS imagery. Remote Sensing of Environment, 125, 227-237. doi:10.1016/j.rse.2012.07.005
Russ, G., & Brenning, A. (2010a). Data mining in precision agriculture: Management of spatial information. In 13th International Conference on Information Processing and Management of Uncertainty, IPMU 2010, Dortmund, 28 June - 2 July 2010. Lecture Notes in Computer Science, 6178 LNAI, 350-359.
library(sperrorest)
data(ecuador)
# Two repetitions (indices 1 and 2) of 5-fold k-means partitioning
resamp <- partition_kmeans(ecuador, nfold = 5, repetition = 1:2)
# plot(resamp, ecuador)