R/sperrorest_resampling.R
partition_tiles.Rd
partition_tiles
divides the study area into a specified number
of rectangular tiles. Optionally small partitions can be merged with
adjacent tiles to achieve a minimum number or percentage of samples in each
tile.
partition_tiles(
  data,
  coords = c("x", "y"),
  dsplit = NULL,
  nsplit = NULL,
  rotation = c("none", "random", "user"),
  user_rotation,
  offset = c("none", "random", "user"),
  user_offset,
  reassign = TRUE,
  min_frac = 0.025,
  min_n = 5,
  iterate = 1,
  return_factor = FALSE,
  repetition = 1,
  seed1 = NULL
)
data 


coords  vector of length 2 defining the variables in 
dsplit  optional vector of length 2: equidistance of splits in
(possibly rotated) x direction ( 
nsplit  optional vector of length 2: number of splits in (possibly
rotated) x direction ( 
rotation  indicates whether and how the rectangular grid should be
rotated; random rotation is only between 
user_rotation  if 
offset  indicates whether and how the rectangular grid should be shifted by an offset. 
user_offset  if 
reassign  logical (default 
min_frac  numeric >=0, <1: minimum relative size of a partition as a
fraction of the sample; argument passed to get_small_tiles. Will be ignored
if 
min_n  integer >=0: minimum number of samples per partition; argument
passed to get_small_tiles. Will be ignored if 
iterate  argument to be passed to tile_neighbors 
return_factor  if 
repetition  numeric vector: cross-validation repetitions to be
generated. Note that this is not the number of repetitions, but the indices
of these repetitions. E.g., use 
seed1 
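To make the role of nsplit more concrete, the following base R sketch assigns points to an unrotated, unshifted rectangular grid. It is only an illustration under stated assumptions: the coordinates are made up, the tile labels merely imitate the X<i>:Y<j> pattern seen in the summary output, and this is not the code used inside partition_tiles.

```r
# Illustrative sketch only; not the implementation of partition_tiles().
set.seed(1)
# Made-up coordinates (assumption): 200 points in a 400 x 300 extent
pts <- data.frame(x = runif(200, 0, 400), y = runif(200, 0, 300))
nsplit <- c(4, 3)  # 4 columns in x direction, 3 rows in y direction

# Equal-width breaks spanning the observed coordinate ranges
x_breaks <- seq(min(pts$x), max(pts$x), length.out = nsplit[1] + 1)
y_breaks <- seq(min(pts$y), max(pts$y), length.out = nsplit[2] + 1)

# Column and row index of each point; include.lowest keeps the minimum in bin 1
ix <- cut(pts$x, breaks = x_breaks, labels = FALSE, include.lowest = TRUE)
iy <- cut(pts$y, breaks = y_breaks, labels = FALSE, include.lowest = TRUE)

# One tile label per point, e.g. "X2:Y3"
tile <- factor(paste0("X", ix, ":Y", iy))
table(tile)
```

Without rotation or offset, such a grid can never produce more than nsplit[1] * nsplit[2] occupied tiles, and it may produce fewer if some tiles contain no points.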

A represampling object. Contains length(repetition) resampling
objects as repetitions. The exact number of folds / test-set tiles within
each resampling object depends on the spatial configuration of the data
set and possible cleaning steps (see min_frac, min_n).
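The cleaning thresholds can be illustrated with a small base R sketch. The tile counts are hypothetical, and the rule that a tile is flagged when it falls below either threshold is an assumption made for this illustration; consult get_small_tiles for the actual behavior.

```r
# Illustrative sketch only; tile counts are made up, not taken from ecuador.
n_tile <- c("X1:Y1" = 12, "X1:Y2" = 240, "X2:Y1" = 3, "X2:Y2" = 245)
min_n <- 5        # minimum number of samples per tile
min_frac <- 0.025 # minimum share of all samples per tile

# Assumption: a tile counts as "small" if it violates either threshold
frac <- n_tile / sum(n_tile)
small <- n_tile < min_n | frac < min_frac

# Tiles that would be candidates for merging with a neighbor
names(n_tile)[small]
```

In this made-up case, X2:Y1 falls below min_n, while X1:Y1 passes min_n but holds only 2.4% of the samples and so falls below min_frac.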
Default parameter settings may change in future releases. The rotation
and shifting steps and the algorithm for cleaning up small tiles are
still experimental. Use with caution.
For nonzero offsets (offset != 'none'), the number of tiles may actually
be greater than nsplit[1]*nsplit[2] because fractional tiles extend
into the study region. reassign = TRUE with suitable thresholds is
therefore recommended for nonzero (including random) offsets.
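A one-dimensional base R sketch shows why shifting the grid can add tiles. The coordinates, tile width, and the helper tiles_for_offset are hypothetical and serve only as an illustration; they do not reproduce how partition_tiles applies its offset.

```r
# Illustrative sketch only; one dimension, made-up coordinates.
x <- seq(0.5, 11.5, by = 1)  # 12 points spanning a 12-unit extent
width <- 3                   # tile width, giving 4 tiles without offset

# Hypothetical helper: number of occupied tiles for a given grid shift
tiles_for_offset <- function(offset) {
  length(unique(floor((x - offset) / width)))
}

n_none <- tiles_for_offset(0)   # grid aligned with the extent
n_shift <- tiles_for_offset(1)  # grid shifted by one unit
c(none = n_none, shifted = n_shift)
```

The shifted grid needs one extra, partially filled tile at the edge to cover the same points. Such fractional tiles tend to hold few samples, which is why merging them is advisable.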
data(ecuador)
set.seed(42)
parti <- partition_tiles(ecuador, nsplit = c(4, 3), reassign = FALSE)
# plot(parti, ecuador)
# tile A4 has only 55 samples
# same partitioning, but now merge tiles with less than 100 samples to
# adjacent tiles:
parti2 <- partition_tiles(ecuador,
  nsplit = c(4, 3), reassign = TRUE,
  min_n = 100
)
# plot(parti2, ecuador)
summary(parti2)
#> $`1`
#>       n.train n.test
#> X1:Y3     600    151
#> X2:Y2     626    125
#> X3:Y1     584    167
#> X3:Y2     574    177
#> X3:Y3     620    131
#>
# tile B4 (in 'parti') was smaller than A3, therefore A4 was merged with B4,
# not with A3
# now with random rotation and offset, and tiles of 2000 m length:
parti3 <- partition_tiles(ecuador,
  dsplit = 2000, offset = "random", rotation = "random",
  reassign = TRUE, min_n = 100
)
# plot(parti3, ecuador)
summary(parti3)
#> $`1`
#>       n.train n.test
#> X2:Y2     452    299
#> X3:Y1     562    189
#> X3:Y2     488    263
#>