(CAST) Repeated spatiotemporal "leave-location-and-time-out" resampling
Splits data using Leave-Location-Out (LLO), Leave-Time-Out (LTO) and
Leave-Location-and-Time-Out (LLTO) partitioning.
See the upstream implementation in package CAST and Meyer et al. (2018)
for further information.
LLO predicts on unknown locations, i.e. complete locations are left out of the
training sets. The "space" role in
Task$col_roles identifies spatial units.
If stratify is TRUE, the target distribution is similar in each fold.
This is useful for land cover classification when the observations are polygons.
In this case, LLO with stratification should be used to hold back complete
polygons and have a similar target distribution in each fold.
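The stratified LLO idea can be illustrated with a minimal base R sketch. It is not the CAST implementation: the column names `polygon_id` and `class`, the toy data, and the deal-out strategy are assumptions made only for illustration.

```r
# Toy data: 12 polygons (spatial units) with 5 pixels each.
# Every pixel in a polygon shares the polygon's land cover class.
set.seed(42)
polygons = data.frame(
  polygon_id = 1:12,
  class = rep(c("forest", "water", "urban"), each = 4)
)
pixels = polygons[rep(seq_len(nrow(polygons)), times = 5), ]

k = 4  # number of folds (hypothetical choice)

# Stratified assignment: within each class, shuffle the polygons and
# deal them out to folds 1..k in turn, so each fold gets a similar class mix.
polygons$fold = NA_integer_
for (cl in unique(polygons$class)) {
  idx = which(polygons$class == cl)
  idx = idx[sample.int(length(idx))]
  polygons$fold[idx] = rep_len(seq_len(k), length(idx))
}

# Propagate the polygon-level fold to the pixel level, so a polygon is
# never split across folds.
pixels$fold = polygons$fold[match(pixels$polygon_id, polygons$polygon_id)]

# Cross-tabulate folds against classes to inspect the class balance.
table(pixels$fold, pixels$class)
```

Because folds are assigned per polygon rather than per pixel, holding back one fold always holds back complete polygons.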
LTO leaves out complete temporal units which are identified by the
"time" role in Task$col_roles.
LLTO leaves out spatial and temporal units.
See the examples.
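The difference between the three partitioning schemes can be sketched in base R as follows. This is a simplified illustration under stated assumptions, not the CAST code: the helper `assign_folds()`, the toy data, and the choice of `k` are hypothetical, and the LLTO rule shown (training rows share neither a location nor a time step with the test rows) is one plausible reading of "leaves out spatial and temporal units".

```r
# Toy data: 6 locations observed at 4 time points (24 rows).
d = expand.grid(location = paste0("loc", 1:6), time = 1:4,
  stringsAsFactors = FALSE)

k = 2  # number of folds (hypothetical choice)
set.seed(1)

# Assign each unique unit (not each row) to one of k folds.
assign_folds = function(units, k) {
  u = unique(units)
  fold = sample(rep_len(seq_len(k), length(u)))
  setNames(fold, u)
}
space_fold = assign_folds(d$location, k)[d$location]
time_fold = assign_folds(d$time, k)[as.character(d$time)]

i = 1  # fold to hold out

# LLO: test on all rows of the held-out locations.
llo_test = which(space_fold == i)
llo_train = which(space_fold != i)

# LTO: test on all rows of the held-out time steps.
lto_test = which(time_fold == i)
lto_train = which(time_fold != i)

# LLTO: test rows are held out in space AND time; training rows share
# neither a location nor a time step with the test rows.
llto_test = which(space_fold == i & time_fold == i)
llto_train = which(space_fold != i & time_fold != i)
```

In this sketch, LLTO rows that share only the location or only the time step with the test set end up in neither set, so the LLTO training sets are smaller than the LLO or LTO ones.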
folds: Number of folds.
stratify: If TRUE, stratify on the target column.
repeats: Number of repeats.
Zhao Y, Karypis G (2002). “Evaluation of Hierarchical Clustering Algorithms for Document Datasets.” 11th Conference of Information and Knowledge Management (CIKM), 515-524. doi:10.1145/584792.584877.
Returns the number of resampling iterations, depending on the values stored in the param_set.
Create a "Spacetime Folds" resampling instance.
ResamplingRepeatedSptCVCstf$new(id = "repeated_sptcv_cstf")
Translates iteration numbers to fold numbers.
Translates iteration numbers to repetition numbers.
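The translation from iteration numbers to fold and repetition numbers can be sketched in base R. The layout is an assumption (iterations are numbered consecutively within each repetition, giving folds times repeats iterations in total), and the helpers `iter_to_fold()` and `iter_to_rep()` are hypothetical names, not methods of this class.

```r
folds = 5    # same values as in the example on this page
repeats = 2

# Assumed layout: iterations 1..folds belong to repetition 1,
# the next block of `folds` iterations to repetition 2, and so on.
n_iters = folds * repeats
iters = seq_len(n_iters)

# Position within the repetition (1..folds).
iter_to_fold = function(iters, folds) ((iters - 1L) %% folds) + 1L
# Index of the repetition the iteration belongs to (1..repeats).
iter_to_rep = function(iters, folds) ((iters - 1L) %/% folds) + 1L

data.frame(
  iter = iters,
  fold = iter_to_fold(iters, folds),
  rep = iter_to_rep(iters, folds)
)
```

Under this layout, iteration 7 with 5 folds would be fold 2 of repetition 2.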
Materializes fixed training and test splits for a given task.
A task to instantiate.
library(mlr3)
library(mlr3spatiotempcv) # provides the task and the resampling used below
task = tsk("cookfarm_mlr3")
task$set_col_roles("SOURCEID", roles = "space")
task$set_col_roles("Date", roles = "time")

# Instantiate Resampling
rcv = rsmp("repeated_sptcv_cstf", folds = 5, repeats = 2)
rcv$instantiate(task)

### Individual sets:
# rcv$train_set(1)
# rcv$test_set(1)
# check that no obs are in both sets
intersect(rcv$train_set(1), rcv$test_set(1)) # good!
#> integer(0)

# Internal storage:
# rcv$instance # table