(blockCV) "Environmental blocking" resampling
Source: R/ResamplingSpCVEnv.R
mlr_resamplings_spcv_env.Rd
Splits data by clustering in the feature space.
See the upstream implementation at blockCV::cv_cluster() and Valavi et al. (2018) for further information.
Details
Useful when the dataset should be split according to environmental information that is present in the features. Multiple features can be combined for clustering.
Passing raster images directly as input, as in blockCV::cv_cluster(), is not supported. See mlr3spatial and its raster DataBackends for such support in mlr3.
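To make the idea of clustering in the feature space concrete, the following is a minimal base R sketch, not the package's implementation. It assumes k-means on standardized features with one cluster per fold. The helper `env_block_folds()`, its `seed` argument, and the toy columns `elevation`, `rainfall`, and `noise` are hypothetical names introduced for illustration only.

```r
# Hypothetical helper, not part of mlr3 or blockCV.
# Assumption: folds are formed by k-means on the standardized features.
env_block_folds = function(data, features, folds, seed = 1L) {
  stopifnot(all(features %in% names(data)), folds >= 2L)
  # Standardize so that no single feature dominates the Euclidean distance
  x = scale(as.matrix(data[, features, drop = FALSE]))
  # Fix the seed so that the fold assignment is reproducible
  set.seed(seed)
  # One cluster per fold; several random starts reduce poor local optima
  km = stats::kmeans(x, centers = folds, nstart = 10L)
  data.frame(row_id = seq_len(nrow(data)), fold = km$cluster)
}

# Toy data with three distinct "environments" (made-up values)
set.seed(42)
toy = data.frame(
  elevation = c(rnorm(30, 200, 20), rnorm(30, 1500, 50), rnorm(30, 3000, 80)),
  rainfall = c(rnorm(30, 2500, 100), rnorm(30, 1200, 80), rnorm(30, 400, 50)),
  noise = runif(90)
)

# Only the selected features take part in the clustering
assignment = env_block_folds(toy, features = c("elevation", "rainfall"), folds = 3L)
table(assignment$fold)
```

Observations with similar feature values end up in the same fold, so each test set covers an environmental regime that is largely absent from the corresponding training set.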
Parameters
folds
(integer(1))
Number of folds.
features
(character())
The features to use for clustering.
References
Valavi R, Elith J, Lahoz-Monfort JJ, Guillera-Arroita G (2018). “blockCV: an R package for generating spatially or environmentally separated folds for k-fold cross-validation of species distribution models.” bioRxiv. doi:10.1101/357798.
Super class
mlr3::Resampling -> ResamplingSpCVEnv
Active bindings
iters
integer(1)
Returns the number of resampling iterations, depending on the values stored in the param_set.
Methods
Method new()
Create an "Environmental Block" resampling instance.
For a list of available arguments, please see blockCV::cv_cluster().
Usage
ResamplingSpCVEnv$new(id = "spcv_env")
Method instantiate()
Materializes fixed training and test splits for a given task.
Arguments
task
Task
A task to instantiate.
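As a rough illustration of what fixed training and test splits mean here, the base R sketch below derives both sets from a fold assignment table with `row_id` and `fold` columns. The table contents and the standalone helpers `test_set()` and `train_set()` are hypothetical. They mimic the behavior of the methods of the same name, under the assumption that fold `i` serves as the test set of iteration `i`, and are not the package's code.

```r
# Hypothetical fold assignment: one row per observation
instance = data.frame(
  row_id = 1:8,
  fold = c(1L, 1L, 2L, 2L, 2L, 3L, 3L, 3L)
)

# Test set of iteration i: all rows assigned to fold i
test_set = function(instance, i) instance$row_id[instance$fold == i]

# Training set of iteration i: all remaining rows
train_set = function(instance, i) instance$row_id[instance$fold != i]

test_set(instance, 2)
train_set(instance, 2)

# The two sets never overlap
length(intersect(train_set(instance, 2), test_set(instance, 2)))
```

Because the assignment is stored once, repeated calls for the same iteration return identical sets.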
Examples
# \donttest{
if (mlr3misc::require_namespaces(c("sf", "blockCV"), quietly = TRUE)) {
  library(mlr3)
  # Registers the "ecuador" task and the "spcv_env" resampling
  library(mlr3spatiotempcv)
  task = tsk("ecuador")
  # Instantiate Resampling
  rcv = rsmp("spcv_env", folds = 4)
  rcv$instantiate(task)
  # Individual sets:
  rcv$train_set(1)
  rcv$test_set(1)
  # Training and test sets are disjoint, so the intersection is empty
  intersect(rcv$train_set(1), rcv$test_set(1))
  # Internal storage:
  rcv$instance
}
#> Key: <fold>
#> row_id fold
#> <int> <int>
#> 1: 682 1
#> 2: 464 2
#> 3: 3 3
#> 4: 135 3
#> 5: 192 3
#> ---
#> 747: 747 4
#> 748: 748 4
#> 749: 749 4
#> 750: 750 4
#> 751: 751 4
# }