R/sperrorest_resampling.R
partition_cv_strat.Rd
partition_cv_strat creates a set of sample indices corresponding to crossvalidation test and training sets.
partition_cv_strat(
  data,
  coords = c("x", "y"),
  nfold = 10,
  return_factor = FALSE,
  repetition = 1,
  seed1 = NULL,
  strat
)
data 


coords  vector of length 2 defining the variables in 
nfold  number of partitions (folds) in 
return_factor  if 
repetition  numeric vector: crossvalidation repetitions to be
generated. Note that this is not the number of repetitions, but the indices
of these repetitions. E.g., use 
seed1 

strat  character: column in 
A represampling object, see also partition_cv().
partition_cv_strat, however, is stratified with respect to the variable data[,strat]; i.e., crossvalidation partitioning is done within each set data[data[,strat]==i,] (i in levels(data[, strat])), and the i-th folds of all levels are combined into one crossvalidation fold.
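The stratification idea described above can be sketched in base R. This is only an illustration under stated assumptions: the helper stratified_folds() and the toy factor y are hypothetical and not part of sperrorest, and the package's internal implementation may differ. The sketch assigns fold labels separately within each level of the stratification variable, so that fold i collects the i-th fold of every level.

```r
# Hypothetical helper (not part of sperrorest): assign each observation to one
# of `nfold` folds, separately within each level of the factor `strat`.
stratified_folds <- function(strat, nfold = 5, seed = NULL) {
  if (!is.null(seed)) set.seed(seed)
  strat <- as.factor(strat)
  fold <- integer(length(strat))
  for (lev in levels(strat)) {
    rows <- which(strat == lev)
    # Balanced fold labels 1..nfold for this level, then randomly permuted
    labels <- rep(seq_len(nfold), length.out = length(rows))
    fold[rows] <- labels[sample.int(length(rows))]
  }
  fold
}

# Toy data (assumed for illustration): 30 "TRUE" and 70 "FALSE" observations
y <- factor(rep(c("TRUE", "FALSE"), times = c(30, 70)))
fold <- stratified_folds(y, nfold = 5, seed = 1)

# Cross-tabulate folds against the stratification variable
fold_table <- table(fold, y)
print(fold_table)

# Training set for fold 1: all observations not assigned to fold 1
train <- which(fold != 1)
ratio <- mean(y[train] == "TRUE") / mean(y == "TRUE")
print(ratio)
```

In this toy case both level sizes are divisible by nfold, so every fold holds the same share of "TRUE" observations and the ratio is 1. With level sizes that are not divisible by nfold, the ratio would only be close to 1.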
library(sperrorest)
data(ecuador)
parti <- partition_cv_strat(ecuador,
  strat = "slides", nfold = 5,
  repetition = 1
)
# Training indices of fold 1 in repetition "1"
idx <- parti[["1"]][[1]]$train
mean(ecuador$slides[idx] == "TRUE") / mean(ecuador$slides == "TRUE")
#> [1] 0.9996672
# always (approximately) 1
# Nonstratified crossvalidation:
parti <- partition_cv(ecuador, nfold = 5, repetition = 1)
idx <- parti[["1"]][[1]]$train
mean(ecuador$slides[idx] == "TRUE") / mean(ecuador$slides == "TRUE")
#> [1] 1.009664
# close to 1 because of large sample size, but with some random variation