TimeSliceCrossValidator.run

TimeSliceCrossValidator.run(X, y, sampler_config=None, yaml_path=None, mmm=None, model_names=None)

Run the complete time-slice cross-validation loop.

Executes cross-validation by iterating through all folds, fitting a model for each training set, and generating predictions on the combined train+test data.
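The loop described above can be sketched in plain Python. This is an illustration only, not the library's implementation: the expanding-window split rule, the `step` parameter, and the `MeanModel` stand-in are all assumptions made for the example.

```python
from statistics import mean


class MeanModel:
    """Hypothetical stand-in for a fitted model: predicts the training mean."""

    def fit(self, y_train):
        self.mu = mean(y_train)
        return self

    def predict(self, n):
        return [self.mu] * n


def expanding_window_folds(n_obs, n_init, forecast_horizon, step=None):
    """Yield (train_idx, test_idx) pairs; the split rule here is an assumption."""
    step = forecast_horizon if step is None else step
    end_train = n_init
    while end_train + forecast_horizon <= n_obs:
        train = list(range(end_train))
        test = list(range(end_train, end_train + forecast_horizon))
        yield train, test
        end_train += step


def run_cv(y, n_init, forecast_horizon):
    """Fit one model per fold and predict on train+test together."""
    results = []
    for train, test in expanding_window_folds(len(y), n_init, forecast_horizon):
        model = MeanModel().fit([y[i] for i in train])
        # Predictions cover the combined train+test span of the fold.
        preds = model.predict(len(train) + len(test))
        results.append({"train": train, "test": test, "predictions": preds})
    return results


folds = run_cv(list(range(30)), n_init=10, forecast_horizon=5)
```

Each entry in `folds` holds the indices used for training and testing plus the predictions over both, which mirrors the per-fold structure the method description implies.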

Parameters:

X (pd.DataFrame): Feature matrix containing the date column and predictor variables.
y (pd.Series): Target variable.
sampler_config (dict, optional): Sampler configuration that overrides the validator-level configuration for all folds in this run. If provided, it takes precedence over the configuration passed at construction time.
yaml_path (str, optional): Path to a YAML configuration file used to build the MMM model for each fold. Mutually exclusive with mmm.
mmm (object, optional): An object with a build_model(X, y) method that returns a fitted MMM instance. Mutually exclusive with yaml_path.
model_names (list of str, optional): Names to assign to each CV fold in the combined InferenceData. If provided, the length must match the number of splits. If not provided, names are taken from each model's _model_name attribute or generated as 'Iteration {i}'.
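The naming rule for folds can be illustrated with a small helper. This is a sketch under stated assumptions, not the library's code: `resolve_fold_names` is a hypothetical function, and whether the generated 'Iteration {i}' index starts at 0 is assumed here rather than confirmed by the documentation.

```python
def resolve_fold_names(models, model_names=None):
    """Return one name per fold (illustrative helper, not part of the library)."""
    if model_names is not None:
        # Explicit names must match the number of folds one-to-one.
        if len(model_names) != len(models):
            raise ValueError(
                f"Expected {len(models)} model names, got {len(model_names)}."
            )
        return list(model_names)
    # Fall back to each model's _model_name, then to a generated label.
    return [
        getattr(model, "_model_name", None) or f"Iteration {i}"
        for i, model in enumerate(models)
    ]


class Named:
    """Hypothetical model that carries a _model_name attribute."""

    _model_name = "baseline"


class Unnamed:
    """Hypothetical model without a _model_name attribute."""


names = resolve_fold_names([Named(), Unnamed()])
explicit = resolve_fold_names([Named(), Unnamed()], model_names=["a", "b"])
```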

Returns:

arviz.InferenceData: Combined InferenceData in which the folds are concatenated along a new coordinate named 'cv'. Includes a 'cv_metadata' group with per-fold train/test data.

Raises:

ValueError: If neither yaml_path nor mmm is provided, if the length of model_names does not match the number of splits, or if no InferenceData objects are produced during CV.
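A caller-side check of the model source can surface the first error condition before any sampling starts. The helper below is a hypothetical sketch, not part of the library; in particular, the documentation only states that yaml_path and mmm are mutually exclusive, so raising when both are given is an assumption.

```python
def check_model_source(yaml_path=None, mmm=None):
    """Return which model source will be used, or raise ValueError."""
    if yaml_path is None and mmm is None:
        raise ValueError("Provide either yaml_path or mmm.")
    if yaml_path is not None and mmm is not None:
        # Assumption: the documented behavior when both are passed is not stated.
        raise ValueError("yaml_path and mmm are mutually exclusive.")
    return "yaml" if yaml_path is not None else "builder"


source = check_model_source(yaml_path="model_config.yml")
```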

See also

split: Generate train/test indices for cross-validation.
get_n_splits: Return the number of splits.

Notes

Per-fold results are also stored in self._cv_results after calling this method.

Examples

Using a YAML configuration:

>>> cv = TimeSliceCrossValidator(
...     n_init=100, forecast_horizon=10, date_column="date"
... )
>>> combined_idata = cv.run(X, y, yaml_path="model_config.yml")

Using a model builder object:

>>> cv = TimeSliceCrossValidator(
...     n_init=100, forecast_horizon=10, date_column="date"
... )
>>> combined_idata = cv.run(X, y, mmm=mmm_builder)
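The second example leaves `mmm_builder` undefined. The only documented requirement is a build_model(X, y) method, so a minimal duck-typed builder might look like the sketch below. `SimpleBuilder` and `placeholder_factory` are hypothetical names introduced for illustration; in real use the factory would construct the MMM for the fold instead of returning a dictionary.

```python
class SimpleBuilder:
    """Hypothetical builder exposing the build_model(X, y) hook."""

    def __init__(self, model_factory):
        # model_factory is any callable taking (X, y) and returning a model.
        self.model_factory = model_factory

    def build_model(self, X, y):
        # Called once per fold with that fold's training data.
        return self.model_factory(X, y)


def placeholder_factory(X, y):
    """Stand-in factory that only records the size of the training data."""
    return {"n_rows": len(X), "n_targets": len(y)}


mmm_builder = SimpleBuilder(placeholder_factory)
model = mmm_builder.build_model([[1, 2], [3, 4]], [10, 20])
```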