TimeSliceCrossValidator.run

TimeSliceCrossValidator.run(X, y, sampler_config=None, yaml_path=None, mmm=None, model_names=None)

Run the complete time-slice cross-validation loop.

Executes cross-validation by iterating through all folds, fitting a model for each training set, and generating predictions on the combined train+test data.

Parameters:
X : pd.DataFrame

Feature matrix containing the date column and predictor variables.

y : pd.Series

Target variable.

sampler_config : dict, optional

Sampler configuration to override the validator-level configuration for all folds in this run. If provided, takes precedence over the configuration passed at construction time.

yaml_path : str, optional

Path to a YAML configuration file for building the MMM model per fold. Mutually exclusive with mmm.

mmm : object, optional

An object with a build_model(X, y) method that returns a fitted MMM instance. Mutually exclusive with yaml_path.

model_names : list of str, optional

Names to assign to each CV fold in the combined InferenceData. If provided, length must match the number of splits. If not provided, names are generated from each model’s _model_name attribute or as 'Iteration {i}'.
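The mmm parameter above is described only by its interface: an object exposing build_model(X, y). As a minimal sketch of that duck-typed contract, the block below defines a protocol and a stub builder. The names ModelBuilder, StubModel and StubBuilder are hypothetical and not part of the library; a real builder would return an MMM instance rather than a stub.

```python
from typing import Any, Protocol, runtime_checkable


@runtime_checkable
class ModelBuilder(Protocol):
    """Hypothetical protocol: any object exposing build_model(X, y)."""

    def build_model(self, X: Any, y: Any) -> Any: ...


class StubModel:
    """Stand-in for the MMM instance a real builder would return."""

    def __init__(self, n_rows: int) -> None:
        self.n_rows = n_rows


class StubBuilder:
    """Illustrative builder that satisfies the protocol above."""

    def build_model(self, X: Any, y: Any) -> StubModel:
        # A real implementation would construct the MMM from X and y here.
        return StubModel(n_rows=len(X))


builder = StubBuilder()
# Structural check: passes because StubBuilder defines build_model.
print(isinstance(builder, ModelBuilder))
```

An object shaped like this could then be passed as cv.run(X, y, mmm=builder), provided build_model returns what the validator expects.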

Returns:
arviz.InferenceData

Combined InferenceData where each fold is concatenated along a new coordinate named 'cv'. Includes a 'cv_metadata' group with per-fold train/test data.

Raises:
ValueError

If neither yaml_path nor mmm is provided, if the length of model_names does not match the number of splits, or if no InferenceData objects are produced during cross-validation.
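The argument-related error cases can be checked on the caller side before starting an expensive run. The helper below is a hypothetical sketch, not library code: check_run_arguments and its n_splits argument are assumptions for illustration. Rejecting the case where both yaml_path and mmm are given is this sketch's reading of "mutually exclusive", not confirmed behaviour of run.

```python
from typing import Any, Optional, Sequence


def check_run_arguments(
    n_splits: int,
    yaml_path: Optional[str] = None,
    mmm: Optional[Any] = None,
    model_names: Optional[Sequence[str]] = None,
) -> None:
    """Hypothetical pre-flight check mirroring the documented ValueError cases."""
    if yaml_path is None and mmm is None:
        raise ValueError("Provide either yaml_path or mmm.")
    if yaml_path is not None and mmm is not None:
        # Assumption: the docs call these mutually exclusive, so reject both.
        raise ValueError("Pass only one of yaml_path or mmm.")
    if model_names is not None and len(model_names) != n_splits:
        raise ValueError(
            f"model_names has {len(model_names)} entries, expected {n_splits}."
        )


# Valid call: one model source and one name per split.
check_run_arguments(
    n_splits=3,
    yaml_path="model_config.yml",
    model_names=["fold_a", "fold_b", "fold_c"],
)
```

The number of splits would come from the validator itself (see get_n_splits); it is passed in as a plain integer here to keep the sketch independent of the library.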

See also

split

Generate train/test indices for cross-validation.

get_n_splits

Return the number of splits.

Notes

Per-fold results are also stored in self._cv_results after calling this method.

Examples

Using a YAML configuration:

>>> cv = TimeSliceCrossValidator(
...     n_init=100, forecast_horizon=10, date_column="date"
... )
>>> combined_idata = cv.run(X, y, yaml_path="model_config.yml")

Using a model builder object:

>>> cv = TimeSliceCrossValidator(
...     n_init=100, forecast_horizon=10, date_column="date"
... )
>>> combined_idata = cv.run(X, y, mmm=mmm_builder)