Forecastability

tseda.forecastability — Forecast readiness scoring and leakage detection.

Classes

ForecastabilityReport

Immutable result of ForecastabilityScorer.score().

ForecastabilityScorer

Composite 0–100 forecastability scorer.

LeakageReport

Immutable result of LeakageDetector.check().

LeakageDetector

Temporal and target leakage detector for feature sets.

class tseda.forecastability.ForecastabilityReport(score, sub_scores, recommended_model, recommended_diff, recommended_period, n_obs, pct_missing, pct_outlier, is_stationary, dominant_period)[source]

Bases: object

Immutable forecastability assessment.

Parameters:
score

Overall forecastability score in [0, 100]. Higher is better.

Type:

float

sub_scores

Individual sub-scores (0–100 each) keyed by sub-score name.

Type:

dict of str → float

recommended_model

Suggested modelling approach: "ARIMA", "SARIMA", "ETS", "Prophet", or "ML".

Type:

str

recommended_diff

Recommended differencing order: 0 (already stationary) or 1.

Type:

int

recommended_period

Dominant seasonal period detected, or None if no seasonality found.

Type:

int or None

n_obs

Number of observations in the series.

Type:

int

pct_missing

Percentage of NaN values.

Type:

float

pct_outlier

Percentage of IQR-flagged outliers.

Type:

float

is_stationary

True when the augmented Dickey-Fuller (ADF) test rejects the unit-root null hypothesis at the significance level alpha passed to score().

Type:

bool

dominant_period

Same value as recommended_period: the dominant seasonal period, or None if no seasonality was found.

Type:

int or None

score: float
sub_scores: Dict[str, float]
recommended_model: str
recommended_diff: int
recommended_period: int | None
n_obs: int
pct_missing: float
pct_outlier: float
is_stationary: bool
dominant_period: int | None
__repr__()[source]

Return repr(self).

Return type:

str

__init__(score, sub_scores, recommended_model, recommended_diff, recommended_period, n_obs, pct_missing, pct_outlier, is_stationary, dominant_period)
Return type:

None
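
The pct_missing and pct_outlier fields can be approximated by hand. The sketch below is an assumption about the computation, not the library's code: it uses Tukey's 1.5 * IQR fences, the default quartile method of statistics.quantiles, and divides the outlier count by the number of non-NaN values. The helper names are hypothetical.

```python
import math
import statistics


def pct_missing(values):
    """Percentage of entries that are NaN."""
    return 100.0 * sum(1 for v in values if math.isnan(v)) / len(values)


def pct_outlier(values, k=1.5):
    """Percentage of non-NaN entries outside the Tukey fences (assumed rule)."""
    clean = [v for v in values if not math.isnan(v)]
    q1, _, q3 = statistics.quantiles(clean, n=4)  # quartiles of the clean data
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    flagged = sum(1 for v in clean if v < low or v > high)
    return 100.0 * flagged / len(clean)


# 1..20 plus one extreme value and one missing value
values = [float(v) for v in range(1, 21)] + [100.0, math.nan]
missing = pct_missing(values)   # one NaN out of 22 entries
outliers = pct_outlier(values)  # one flagged value out of 21 clean entries
```

Whether tseda counts outliers against all observations or only the non-missing ones is not stated here, so treat the denominator as a choice made for this sketch.
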

class tseda.forecastability.ForecastabilityScorer[source]

Bases: object

Assess how forecastable a TimeSeries is.

The scorer is stateless — calling score() multiple times is safe.

score(ts, period)[source]

Return a ForecastabilityReport with an overall 0–100 score.

Return type:

ForecastabilityReport

Examples

>>> import numpy as np, pandas as pd
>>> from tseda import TimeSeries
>>> from tseda.forecastability.scorer import ForecastabilityScorer
>>> rng = np.random.default_rng(1)
>>> idx = pd.date_range("2020", periods=200, freq="D")
>>> ts  = TimeSeries(rng.standard_normal(200), index=idx)
>>> r   = ForecastabilityScorer().score(ts)
>>> isinstance(r.score, float)
True
score(ts, *, period=None, alpha=0.05)[source]

Compute the forecastability score for ts.

Parameters:
  • ts (TimeSeries) – Input series.

  • period (int, optional) – Seasonal period. When None, the period is detected automatically from the FFT periodogram.

  • alpha (float, optional) – Significance level used for stationarity and ACF tests. Default 0.05.

Return type:

ForecastabilityReport

Examples

>>> import numpy as np, pandas as pd
>>> from tseda import TimeSeries
>>> from tseda.forecastability.scorer import ForecastabilityScorer
>>> rng = np.random.default_rng(2)
>>> idx = pd.date_range("2020", periods=365, freq="D")
>>> n   = 365
>>> seas = np.sin(2 * np.pi * np.arange(n) / 7) * 3
>>> ts  = TimeSeries(seas + rng.standard_normal(n) * 0.5, index=idx)
>>> r   = ForecastabilityScorer().score(ts, period=7)
>>> r.recommended_period
7
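
When period is None, the period is taken from the FFT periodogram. A minimal sketch of that idea follows. It is not tseda's implementation: a direct O(n^2) discrete Fourier transform stands in for the FFT so that the block needs only the standard library, and the function name find_dominant_period and the 1e-12 power floor are assumptions.

```python
import cmath
import math


def find_dominant_period(values, power_floor=1e-12):
    """Return the period (in samples) of the strongest periodogram peak.

    Returns None when no frequency carries more than `power_floor` power.
    """
    n = len(values)
    mean = sum(values) / n
    centered = [v - mean for v in values]  # drop the zero-frequency component
    best_k, best_power = 0, 0.0
    for k in range(1, n // 2 + 1):  # positive frequencies only
        coeff = sum(
            c * cmath.exp(-2j * math.pi * k * t / n)
            for t, c in enumerate(centered)
        )
        power = abs(coeff) ** 2
        if power > best_power:
            best_k, best_power = k, power
    if best_k == 0 or best_power < power_floor:
        return None
    return round(n / best_k)


# Weekly sine wave sampled for 20 full cycles
series = [3 * math.sin(2 * math.pi * t / 7) for t in range(140)]
period = find_dominant_period(series)
flat = find_dominant_period([5.0] * 10)
```

On noisy data the raw peak can land on a neighbouring frequency bin, so a production detector would typically smooth the periodogram or test the peak for significance before reporting a period.
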
class tseda.forecastability.LeakageReport(has_temporal_leakage, has_target_leakage, temporal_leakage_columns, target_leakage_columns, target_leakage_correlations, temporal_peak_lags, horizon, n_features, n_obs, warnings)[source]

Bases: object

Immutable leakage detection result.

Parameters:
has_temporal_leakage

True if any feature correlates more strongly with future values of the target than with its current or past values.

Type:

bool

has_target_leakage

True if the lag-0 Pearson correlation between any feature and the target exceeds target_corr_threshold.

Type:

bool

temporal_leakage_columns

Names of feature columns flagged for temporal leakage.

Type:

list of str

target_leakage_columns

Names of feature columns flagged for target leakage.

Type:

list of str

target_leakage_correlations

Lag-0 Pearson correlation for each column in target_leakage_columns.

Type:

dict of str → float

temporal_peak_lags

For each feature column, the lag at which the cross-correlation with the target is maximised. A positive lag means the feature correlates with future values of the target.

Type:

dict of str → int

horizon

Forecast horizon passed to check().

Type:

int

n_features

Number of feature columns examined.

Type:

int

n_obs

Number of observations in the target series.

Type:

int

warnings

Human-readable diagnostic messages.

Type:

list of str

has_temporal_leakage: bool
has_target_leakage: bool
temporal_leakage_columns: List[str]
target_leakage_columns: List[str]
target_leakage_correlations: Dict[str, float]
temporal_peak_lags: Dict[str, int]
horizon: int
n_features: int
n_obs: int
warnings: List[str]
__repr__()[source]

Return repr(self).

Return type:

str

__init__(has_temporal_leakage, has_target_leakage, temporal_leakage_columns, target_leakage_columns, target_leakage_correlations, temporal_peak_lags, horizon, n_features, n_obs, warnings)
Return type:

None

class tseda.forecastability.LeakageDetector[source]

Bases: object

Detect temporal and target leakage in a feature set.

The detector is stateless.

check(ts, horizon, features_df, target_corr_threshold)[source]

Return a LeakageReport.

Return type:

LeakageReport

Examples

>>> import numpy as np, pandas as pd
>>> from tseda import TimeSeries
>>> from tseda.forecastability.leakage import LeakageDetector

Target leakage, where a feature is an exact copy of the target:

>>> rng = np.random.default_rng(0)
>>> n   = 80
>>> idx = pd.date_range("2020", periods=n, freq="D")
>>> y   = rng.standard_normal(n)
>>> ts  = TimeSeries(y, index=idx)
>>> feat = pd.DataFrame({"target_copy": y}, index=idx)
>>> r = LeakageDetector().check(ts, horizon=1, features_df=feat)
>>> r.has_target_leakage
True
>>> "target_copy" in r.target_leakage_columns
True
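
The target-leakage rule can be sketched without the library: compute the lag-0 Pearson correlation between each feature and the target, and flag columns above the threshold. This is an illustration under stated assumptions, not the detector's code. In particular, comparing the absolute value of r to the threshold and dropping NaN pairs are assumptions, and the function names are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson r over pairs where neither value is NaN (0.0 if a side is constant)."""
    pairs = [(a, b) for a, b in zip(x, y) if not (math.isnan(a) or math.isnan(b))]
    n = len(pairs)
    mx = sum(a for a, _ in pairs) / n
    my = sum(b for _, b in pairs) / n
    cov = sum((a - mx) * (b - my) for a, b in pairs)
    sx = math.sqrt(sum((a - mx) ** 2 for a, _ in pairs))
    sy = math.sqrt(sum((b - my) ** 2 for _, b in pairs))
    return cov / (sx * sy) if sx > 0 and sy > 0 else 0.0


def target_leaking_columns(target, features, threshold=0.95):
    """Map each flagged column name to its lag-0 correlation with the target."""
    flagged = {}
    for name, column in features.items():
        r = pearson(column, target)
        if abs(r) > threshold:  # assumption: sign of r is ignored
            flagged[name] = r
    return flagged


target = [math.sin(t) for t in range(50)]
features = {
    "target_copy": list(target),                  # identical to the target
    "scaled_copy": [2 * v + 1 for v in target],   # affine transform of the target
    "trend": [float(t) for t in range(50)],       # unrelated linear trend
}
flagged = target_leaking_columns(target, features)
```

The affine copy is flagged as well, because Pearson correlation is unchanged by scaling and shifting a feature.
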
check(ts, horizon, *, features_df=None, target_corr_threshold=0.95)[source]

Check features_df for leakage against target ts.

Parameters:
  • ts (TimeSeries) – Target time series.

  • horizon (int) – Forecast horizon in time steps. Must be >= 1.

  • features_df (pandas.DataFrame, optional) – Feature matrix with the same DatetimeIndex as ts, one column per feature. When None, the report has empty leakage sets and a warning that no features were provided.

  • target_corr_threshold (float, optional) – Pearson r threshold above which a feature is flagged as target-leaking. Default 0.95.

Return type:

LeakageReport

Raises:
  • TypeError – If ts is not a TimeSeries.

  • ValueError – If horizon < 1, target_corr_threshold ∉ (0, 1], or features_df has a different number of rows from ts.

Examples

>>> import numpy as np, pandas as pd
>>> from tseda import TimeSeries
>>> from tseda.forecastability.leakage import LeakageDetector
>>> rng = np.random.default_rng(1)
>>> n   = 60
>>> idx = pd.date_range("2020", periods=n, freq="D")
>>> ts  = TimeSeries(rng.standard_normal(n), index=idx)
>>> r   = LeakageDetector().check(ts, horizon=3)
>>> r.n_features
0

Forecastability scoring for time series.

Computes a composite 0–100 readiness score from six diagnostic sub-scores and recommends a modelling strategy.

Sub-scores and weights
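
As an illustration only, six sub-scores in [0, 100] could be combined into one composite with a weighted mean. The aggregation rule, the placeholder names s1 to s6, and the equal weights are assumptions; the library's actual sub-scores and weights may differ.

```python
def composite_score(sub_scores, weights):
    """Weighted mean of sub-scores, clipped to [0, 100] (assumed aggregation)."""
    total_weight = sum(weights[name] for name in sub_scores)
    weighted = sum(sub_scores[name] * weights[name] for name in sub_scores)
    return max(0.0, min(100.0, weighted / total_weight))


# Placeholder names and values, not tseda's real sub-scores
sub_scores = {"s1": 80.0, "s2": 60.0, "s3": 100.0, "s4": 40.0, "s5": 90.0, "s6": 50.0}
weights = {name: 1.0 for name in sub_scores}  # equal weights, purely illustrative
score = composite_score(sub_scores, weights)
```

Dividing by the total weight keeps the composite on the same 0 to 100 scale as the sub-scores even when the weights do not sum to one.
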

Classes

ForecastabilityReport

Frozen dataclass returned by ForecastabilityScorer.score().

ForecastabilityScorer

Stateless scorer.
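
Because the report is a frozen dataclass, its fields cannot be reassigned after construction. The sketch below shows that behaviour with a hypothetical two-field stand-in, MiniReport, rather than the real ForecastabilityReport.

```python
from dataclasses import FrozenInstanceError, dataclass
from typing import Dict


@dataclass(frozen=True)
class MiniReport:
    """Hypothetical stand-in with two of the report's fields."""
    score: float
    sub_scores: Dict[str, float]


report = MiniReport(score=72.5, sub_scores={"s1": 72.5})

try:
    report.score = 10.0  # reassignment is rejected on a frozen dataclass
    mutated = True
except FrozenInstanceError:
    mutated = False
```

Note that frozen=True only blocks attribute reassignment. A mutable field such as the sub_scores dict can still be modified in place, so callers should treat it as read-only by convention.
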

Examples

>>> import numpy as np, pandas as pd
>>> from tseda import TimeSeries
>>> from tseda.forecastability.scorer import ForecastabilityScorer

Simple AR(1) process — moderate forecastability:

>>> rng = np.random.default_rng(0)
>>> n   = 300
>>> idx = pd.date_range("2020-01-01", periods=n, freq="D")
>>> eps = rng.standard_normal(n)
>>> x   = np.zeros(n)
>>> for i in range(1, n): x[i] = 0.7 * x[i-1] + eps[i]
>>> ts  = TimeSeries(x, index=idx)
>>> r   = ForecastabilityScorer().score(ts)
>>> 0 <= r.score <= 100
True
>>> r.recommended_model in ("ARIMA", "SARIMA", "ETS", "Prophet", "ML")
True
class tseda.forecastability.scorer.ForecastabilityReport(score, sub_scores, recommended_model, recommended_diff, recommended_period, n_obs, pct_missing, pct_outlier, is_stationary, dominant_period)[source]

Bases: object

Immutable forecastability assessment.

Parameters:
score

Overall forecastability score in [0, 100]. Higher is better.

Type:

float

sub_scores

Individual sub-scores (0–100 each) keyed by sub-score name.

Type:

dict of str → float

recommended_model

Suggested modelling approach: "ARIMA", "SARIMA", "ETS", "Prophet", or "ML".

Type:

str

recommended_diff

Recommended differencing order: 0 (already stationary) or 1.

Type:

int

recommended_period

Dominant seasonal period detected, or None if no seasonality found.

Type:

int or None

n_obs

Number of observations in the series.

Type:

int

pct_missing

Percentage of NaN values.

Type:

float

pct_outlier

Percentage of IQR-flagged outliers.

Type:

float

is_stationary

True when the augmented Dickey-Fuller (ADF) test rejects the unit-root null hypothesis at the significance level alpha passed to score().

Type:

bool

dominant_period

Same value as recommended_period: the dominant seasonal period, or None if no seasonality was found.

Type:

int or None

score: float
sub_scores: Dict[str, float]
recommended_model: str
recommended_diff: int
recommended_period: int | None
n_obs: int
pct_missing: float
pct_outlier: float
is_stationary: bool
dominant_period: int | None
__repr__()[source]

Return repr(self).

Return type:

str

__init__(score, sub_scores, recommended_model, recommended_diff, recommended_period, n_obs, pct_missing, pct_outlier, is_stationary, dominant_period)
Return type:

None

class tseda.forecastability.scorer.ForecastabilityScorer[source]

Bases: object

Assess how forecastable a TimeSeries is.

The scorer is stateless — calling score() multiple times is safe.

score(ts, period)[source]

Return a ForecastabilityReport with an overall 0–100 score.

Return type:

ForecastabilityReport

Examples

>>> import numpy as np, pandas as pd
>>> from tseda import TimeSeries
>>> from tseda.forecastability.scorer import ForecastabilityScorer
>>> rng = np.random.default_rng(1)
>>> idx = pd.date_range("2020", periods=200, freq="D")
>>> ts  = TimeSeries(rng.standard_normal(200), index=idx)
>>> r   = ForecastabilityScorer().score(ts)
>>> isinstance(r.score, float)
True
score(ts, *, period=None, alpha=0.05)[source]

Compute the forecastability score for ts.

Parameters:
  • ts (TimeSeries) – Input series.

  • period (int, optional) – Seasonal period. When None, the period is detected automatically from the FFT periodogram.

  • alpha (float, optional) – Significance level used for stationarity and ACF tests. Default 0.05.

Return type:

ForecastabilityReport

Examples

>>> import numpy as np, pandas as pd
>>> from tseda import TimeSeries
>>> from tseda.forecastability.scorer import ForecastabilityScorer
>>> rng = np.random.default_rng(2)
>>> idx = pd.date_range("2020", periods=365, freq="D")
>>> n   = 365
>>> seas = np.sin(2 * np.pi * np.arange(n) / 7) * 3
>>> ts  = TimeSeries(seas + rng.standard_normal(n) * 0.5, index=idx)
>>> r   = ForecastabilityScorer().score(ts, period=7)
>>> r.recommended_period
7

Leakage detection for time series feature sets.

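Two classes of leakage are detected:

  • Temporal leakage – a feature correlates more strongly with future values of the target than with its current or past values.

  • Target leakage – a feature's lag-0 Pearson correlation with the target exceeds target_corr_threshold.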

When features_df is None the report is returned with empty leakage sets and a warning that no features were provided.

Classes

LeakageReport

Frozen dataclass returned by LeakageDetector.check().

LeakageDetector

Stateless detector.

Examples

>>> import numpy as np, pandas as pd
>>> from tseda import TimeSeries
>>> from tseda.forecastability.leakage import LeakageDetector

No leakage — lagged features only:

>>> rng  = np.random.default_rng(0)
>>> n    = 100
>>> idx  = pd.date_range("2020", periods=n, freq="D")
>>> y    = rng.standard_normal(n)
>>> ts   = TimeSeries(y, index=idx)
>>> feat = pd.DataFrame({"lag1": np.roll(y, 1), "lag2": np.roll(y, 2)}, index=idx)
>>> feat.iloc[:2] = np.nan
>>> r    = LeakageDetector().check(ts, horizon=5, features_df=feat)
>>> r.has_target_leakage
False
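
Temporal leakage depends on where the cross-correlation between a feature and the target peaks. The following sketch scans a window of lags and reports the peak for each feature, using the convention that a positive lag pairs the feature at time t with the target at time t + lag. It is a simplified stand-in for the detector: the function names, the max_lag window, the use of absolute correlation, and the rule "flag when the peak lag is positive" are assumptions.

```python
import math
import random


def pearson(x, y):
    """Pearson r of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx > 0 and sy > 0 else 0.0


def peak_lag(feature, target, max_lag):
    """Lag in [-max_lag, max_lag] with the largest absolute correlation."""
    n = len(target)
    best_lag, best_r = 0, 0.0
    for lag in range(-max_lag, max_lag + 1):
        if lag >= 0:   # feature[t] against target[t + lag]
            f, y = feature[: n - lag], target[lag:]
        else:          # feature[t] against target[t - |lag|]
            f, y = feature[-lag:], target[: n + lag]
        r = abs(pearson(f, y))
        if r > best_r:
            best_lag, best_r = lag, r
    return best_lag


rng = random.Random(0)
n = 200
y = [rng.gauss(0.0, 1.0) for _ in range(n)]
features = {
    "future_peek": y[3:] + [0.0] * 3,  # value of the target 3 steps ahead
    "lag2": [0.0] * 2 + y[:-2],        # value of the target 2 steps back
}
peak_lags = {name: peak_lag(col, y, max_lag=10) for name, col in features.items()}
leaking = [name for name, lag in peak_lags.items() if lag > 0]
```

The properly lagged feature peaks at a negative lag and is left alone, while the feature built from future target values peaks at a positive lag and is flagged.
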
class tseda.forecastability.leakage.LeakageReport(has_temporal_leakage, has_target_leakage, temporal_leakage_columns, target_leakage_columns, target_leakage_correlations, temporal_peak_lags, horizon, n_features, n_obs, warnings)[source]

Bases: object

Immutable leakage detection result.

Parameters:
has_temporal_leakage

True if any feature correlates more strongly with future values of the target than with its current or past values.

Type:

bool

has_target_leakage

True if the lag-0 Pearson correlation between any feature and the target exceeds target_corr_threshold.

Type:

bool

temporal_leakage_columns

Names of feature columns flagged for temporal leakage.

Type:

list of str

target_leakage_columns

Names of feature columns flagged for target leakage.

Type:

list of str

target_leakage_correlations

Lag-0 Pearson correlation for each column in target_leakage_columns.

Type:

dict of str → float

temporal_peak_lags

For each feature column, the lag at which the cross-correlation with the target is maximised. A positive lag means the feature correlates with future values of the target.

Type:

dict of str → int

horizon

Forecast horizon passed to check().

Type:

int

n_features

Number of feature columns examined.

Type:

int

n_obs

Number of observations in the target series.

Type:

int

warnings

Human-readable diagnostic messages.

Type:

list of str

has_temporal_leakage: bool
has_target_leakage: bool
temporal_leakage_columns: List[str]
target_leakage_columns: List[str]
target_leakage_correlations: Dict[str, float]
temporal_peak_lags: Dict[str, int]
horizon: int
n_features: int
n_obs: int
warnings: List[str]
__repr__()[source]

Return repr(self).

Return type:

str

__init__(has_temporal_leakage, has_target_leakage, temporal_leakage_columns, target_leakage_columns, target_leakage_correlations, temporal_peak_lags, horizon, n_features, n_obs, warnings)
Return type:

None

class tseda.forecastability.leakage.LeakageDetector[source]

Bases: object

Detect temporal and target leakage in a feature set.

The detector is stateless.

check(ts, horizon, features_df, target_corr_threshold)[source]

Return a LeakageReport.

Return type:

LeakageReport

Examples

>>> import numpy as np, pandas as pd
>>> from tseda import TimeSeries
>>> from tseda.forecastability.leakage import LeakageDetector

Target leakage, where a feature is an exact copy of the target:

>>> rng = np.random.default_rng(0)
>>> n   = 80
>>> idx = pd.date_range("2020", periods=n, freq="D")
>>> y   = rng.standard_normal(n)
>>> ts  = TimeSeries(y, index=idx)
>>> feat = pd.DataFrame({"target_copy": y}, index=idx)
>>> r = LeakageDetector().check(ts, horizon=1, features_df=feat)
>>> r.has_target_leakage
True
>>> "target_copy" in r.target_leakage_columns
True
check(ts, horizon, *, features_df=None, target_corr_threshold=0.95)[source]

Check features_df for leakage against target ts.

Parameters:
  • ts (TimeSeries) – Target time series.

  • horizon (int) – Forecast horizon in time steps. Must be >= 1.

  • features_df (pandas.DataFrame, optional) – Feature matrix with the same DatetimeIndex as ts, one column per feature. When None, the report has empty leakage sets and a warning that no features were provided.

  • target_corr_threshold (float, optional) – Pearson r threshold above which a feature is flagged as target-leaking. Default 0.95.

Return type:

LeakageReport

Raises:
  • TypeError – If ts is not a TimeSeries.

  • ValueError – If horizon < 1, target_corr_threshold ∉ (0, 1], or features_df has a different number of rows from ts.

Examples

>>> import numpy as np, pandas as pd
>>> from tseda import TimeSeries
>>> from tseda.forecastability.leakage import LeakageDetector
>>> rng = np.random.default_rng(1)
>>> n   = 60
>>> idx = pd.date_range("2020", periods=n, freq="D")
>>> ts  = TimeSeries(rng.standard_normal(n), index=idx)
>>> r   = LeakageDetector().check(ts, horizon=3)
>>> r.n_features
0