(skmeans) Spatiotemporal clustering resampling via CLUTO
Source: R/ResamplingSptCVCluto.R
mlr_resamplings_sptcv_cluto.Rd
Spatiotemporal cluster partitioning via the vcluster executable of the CLUTO clustering application.
This partitioning method relies on the external CLUTO library. To use it, CLUTO's executables need to be downloaded and installed into this package.
See https://gist.github.com/pat-s/6430470cf817050e27d26c43c0e9be72 for an installation approach that should work on Windows and Linux. macOS is not supported by CLUTO.
Before using this method, please check the restrictive copyright shown below.
Details
By default, -clmethod='direct' is passed to the vcluster executable, in contrast to the upstream default -clmethod='rb'.
There is no evidence or research that this method is the best among the available ones ("rb", "rbr", "direct", "agglo", "graph", "bagglo").
Also, various other parameters can be set via argument cluto_parameters to achieve different clustering results.
Parameter -clusterfile is handled by skmeans and cannot be changed.
Copyright
CLUTO's copyright is as follows:
The CLUTO package is copyrighted by the Regents of the University of Minnesota. It can be freely used for educational and research purposes by non-profit institutions and US government agencies only. Other organizations are allowed to use CLUTO only for evaluation purposes, and any further uses will require prior approval. The software may not be sold or redistributed without prior approval. One may make copies of the software for their use provided that the copies, are not sold or distributed, are used under the same terms and conditions. As unestablished research software, this code is provided on an “as is” basis without warranty of any kind, either expressed or implied. The downloading, or executing any part of this software constitutes an implicit agreement to these terms. These terms and conditions are subject to change at any time without prior notice.
References
Zhao Y, Karypis G (2002). “Evaluation of Hierarchical Clustering Algorithms for Document Datasets.” 11th Conference of Information and Knowledge Management (CIKM), 515-524. doi:10.1145/584792.584877.
Super class
mlr3::Resampling -> ResamplingSptCVCluto
Public fields
clmethod
character
Name of the clustering method to use within vcluster. See Details for more information.
cluto_parameters
character
Additional parameters to pass to vcluster. Must be given as a single character string, e.g. "param1='value1'param2='value2'". See the CLUTO documentation for a full list of supported parameters.
verbose
logical
Whether to show vcluster progress and summary output.
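The single-string format described for cluto_parameters can be assembled from a named list. The following is a minimal sketch in base R; the helper build_cluto_parameters() is hypothetical and not part of this package, and param1/param2 are the placeholder names from the description above, not real CLUTO options.

```r
# Hypothetical helper (an assumption, not part of mlr3spatiotempcv):
# collapse a named list into one character string of name='value' pairs,
# mirroring the format shown in the field description.
build_cluto_parameters = function(params) {
  stopifnot(is.list(params), !is.null(names(params)), all(names(params) != ""))
  paste0(names(params), "='", unlist(params), "'", collapse = "")
}

# Placeholder names and values only; consult the CLUTO documentation
# for the parameters that vcluster actually supports.
cluto_string = build_cluto_parameters(list(param1 = "value1", param2 = "value2"))
print(cluto_string)
```

The resulting string could then be supplied as the cluto_parameters argument when creating the resampling.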
Active bindings
iters
integer(1)
Returns the number of resampling iterations, depending on the values stored in the param_set.
Methods
Method new()
Create a repeated resampling instance using the CLUTO algorithm.
Usage
ResamplingSptCVCluto$new(
id = "sptcv_cluto",
clmethod = "direct",
cluto_parameters = NULL,
verbose = TRUE
)
Arguments
id
character(1)
Identifier for the resampling strategy.
clmethod
character
Name of the clustering method to use within vcluster. See Details for more information.
cluto_parameters
character
Additional parameters to pass to vcluster. Must be given as a single character string, e.g. "param1='value1'param2='value2'". See the CLUTO documentation for a full list of supported parameters.
verbose
logical
Whether to show vcluster progress and summary output.
time_var
character
The name of the variable which represents the time dimension. Must be of type numeric.
Method instantiate()
Materializes fixed training and test splits for a given task.
Arguments
task
Task
A task to instantiate.
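To illustrate what materializing cluster-based splits means, here is a minimal base R sketch. It is not this package's implementation: stats::kmeans is used as a stand-in for CLUTO's vcluster, the toy data are invented, and the sketch assumes that each cluster serves as the test set of exactly one fold.

```r
# Toy observations (invented): two spatial coordinates and a numeric time variable.
obs = data.frame(
  x    = c(0, 1, 2, 10, 11, 12, 20, 21, 22),
  y    = c(0, 1, 0, 10, 11, 10, 20, 21, 20),
  time = c(1, 2, 3, 11, 12, 13, 21, 22, 23)
)

# Stand-in clustering (assumption): stats::kmeans instead of vcluster.
# Fixed initial centers (rows 1, 4, 7) avoid random initialization.
km = stats::kmeans(as.matrix(obs), centers = as.matrix(obs[c(1, 4, 7), ]))
fold = unname(km$cluster)

# Each cluster becomes the test set of one fold;
# all remaining rows form the corresponding training set.
splits = lapply(sort(unique(fold)), function(i) {
  list(train = which(fold != i), test = which(fold == i))
})

# Within a fold, train and test sets do not overlap and together cover all rows.
for (s in splits) {
  stopifnot(length(intersect(s$train, s$test)) == 0)
  stopifnot(setequal(union(s$train, s$test), seq_len(nrow(obs))))
}
```

Because the splits are computed once and stored, repeated calls that read them return the same row indices, which is the sense in which the splits are fixed.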
Examples
if (FALSE) {
  if (mlr3misc::require_namespaces("skmeans", quietly = TRUE)) {
    library(mlr3)
    library(mlr3spatiotempcv)
    task = tsk("cookfarm_mlr3")
    task$set_col_roles("Date", "time")

    # Instantiate Resampling
    rcv = rsmp("sptcv_cluto", folds = 5)
    rcv$instantiate(task)

    # Individual sets:
    rcv$train_set(1)
    rcv$test_set(1)
    # check that no obs are in both sets
    intersect(rcv$train_set(1), rcv$test_set(1)) # good!

    # Internal storage:
    rcv$instance # table
  }
}