tseda.changepoint
Structural break (changepoint) detection for time series.
Public API
- ChangepointReport
Frozen dataclass with changepoint positions, timestamps, scores, and a segment-label helper.
- ChangepointDetector
Stateless detector: CUSUM, binary segmentation, variance ratio.
- class tseda.changepoint.ChangepointReport(changepoints, timestamps, n_changepoints, scores, method)
Bases: object
Immutable changepoint detection result.
- Parameters:
- changepoints
0-based integer positions of detected changepoints, sorted ascending. A changepoint at position k means the break occurs between observations k-1 and k.
- timestamps
Timestamps corresponding to each changepoint position.
- Type:
DatetimeIndex
- scores
Continuous changepoint score in [0, 1] for each observation. Higher values indicate stronger evidence of a structural break at or near that position.
- segment_labels(n)
Return a 0-indexed integer segment label for each of n observations.
Segment 0 spans [0, changepoints[0]), segment 1 spans [changepoints[0], changepoints[1]), and so on.
- Parameters:
n (int) – Total number of observations.
- Returns:
Shape (n,).
- Return type:
numpy.ndarray of int
Examples
>>> from tseda.changepoint.detector import ChangepointReport
>>> import numpy as np, pandas as pd
>>> r = ChangepointReport(
...     changepoints=[3, 7],
...     timestamps=pd.DatetimeIndex([]),
...     n_changepoints=2,
...     scores=np.zeros(10),
...     method="test",
... )
>>> r.segment_labels(10).tolist()
[0, 0, 0, 1, 1, 1, 1, 2, 2, 2]
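The half-open spans described for segment_labels amount to counting how many changepoints lie at or before each position. The following is a minimal standard-library sketch of that rule. `segment_labels_sketch` is a hypothetical helper written for illustration, not the library's implementation, and it returns a plain list rather than a numpy.ndarray.

```python
from bisect import bisect_right


def segment_labels_sketch(changepoints, n):
    """Label each of n positions with the index of the segment it falls in."""
    cps = sorted(changepoints)
    # bisect_right counts the changepoints <= i, so position k itself
    # already belongs to the segment that starts at changepoint k.
    return [bisect_right(cps, i) for i in range(n)]


print(segment_labels_sketch([3, 7], 10))
# [0, 0, 0, 1, 1, 1, 1, 2, 2, 2]
```

With no changepoints the sketch labels every position 0, which is the single-segment case.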
- class tseda.changepoint.ChangepointDetector
Bases: object
Detect structural breaks in a TimeSeries.
This class is stateless: one instance can be reused across many series.
Examples
>>> import numpy as np, pandas as pd
>>> from tseda import TimeSeries
>>> from tseda.changepoint.detector import ChangepointDetector
Single level shift:
>>> rng = np.random.default_rng(0)
>>> idx = pd.date_range("2020", periods=200, freq="D")
>>> vals = np.concatenate([rng.standard_normal(100),
...                        rng.standard_normal(100) + 4.0])
>>> ts = TimeSeries(vals, index=idx)
>>> det = ChangepointDetector()
>>> r = det.binary_segmentation(ts)
>>> abs(r.changepoints[0] - 100) <= 5  # within 5 obs of true break
True
- cusum(ts, *, threshold=5.0, drift=0.5, target=None)
Two-sided CUSUM (Cumulative Sum) control chart for mean shift.
CUSUM accumulates deviations from a target mean. When the cumulative sum exceeds a threshold (expressed in units of σ), a changepoint is signalled.
- Parameters:
ts (TimeSeries) – Input series.
threshold (float, optional) – Decision interval in multiples of σ (default 5.0). Higher values make the chart less sensitive and produce fewer false alarms.
drift (float, optional) – Allowance parameter k (default 0.5). Typically set to half the magnitude of the smallest shift to detect, in units of σ.
target (float, optional) – Reference (in-control) mean. Defaults to the series mean.
- Return type:
ChangepointReport
- Raises:
TypeError – If ts is not a TimeSeries.
ValueError – If ts has fewer than 10 non-NaN observations, or if threshold or drift is ≤ 0.
Notes
The CUSUM chart for detecting upward shifts:
\[S_t^+ = \max\bigl(0,\; S_{t-1}^+ + (x_t - \mu_0) - k\sigma\bigr)\]
Here \(\mu_0\) is target, \(k\) is drift, and \(h\) is threshold. A changepoint is signalled when \(S_t^+ > h\sigma\) (or similarly for \(S_t^-\)). After each signal the accumulator is reset to zero.
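The recursion in the Notes can be traced with a short standard-library sketch. It is an illustration under stated assumptions, not the library's code: `cusum_sketch` is a hypothetical name, σ is taken to be the population standard deviation of the whole input, the downward chart mirrors the upward one, and both accumulators are reset together after a signal.

```python
from statistics import fmean, pstdev


def cusum_sketch(x, threshold=5.0, drift=0.5, target=None):
    """Two-sided CUSUM; returns the 0-based positions where a chart signals."""
    mu0 = fmean(x) if target is None else target
    sigma = pstdev(x)  # assumption: sigma is estimated from the full series
    s_pos = s_neg = 0.0
    signals = []
    for t, xt in enumerate(x):
        # Upward chart: accumulate deviations above mu0, minus the allowance.
        s_pos = max(0.0, s_pos + (xt - mu0) - drift * sigma)
        # Downward chart: the mirror image, accumulating deviations below mu0.
        s_neg = max(0.0, s_neg - (xt - mu0) - drift * sigma)
        if s_pos > threshold * sigma or s_neg > threshold * sigma:
            signals.append(t)
            s_pos = s_neg = 0.0  # assumption: reset both charts after a signal
    return signals


# Level shift from 0 to 10 at position 20, with a known in-control mean of 0.
x = [0.0] * 20 + [10.0] * 20
print(cusum_sketch(x, target=0.0))
# [23, 27, 31, 35, 39]
```

Because the target stays at the old mean in this sketch, a persistent shift keeps re-triggering after every reset. The first signal, at position 23, is the one that locates the break.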
Examples
>>> import numpy as np, pandas as pd
>>> from tseda import TimeSeries
>>> from tseda.changepoint.detector import ChangepointDetector
>>> rng = np.random.default_rng(1)
>>> idx = pd.date_range("2020", periods=300, freq="D")
>>> vals = np.concatenate([rng.standard_normal(150),
...                        rng.standard_normal(150) + 3.0])
>>> ts = TimeSeries(vals, index=idx)
>>> r = ChangepointDetector().cusum(ts, threshold=5.0, drift=0.5)
>>> r.n_changepoints >= 1
True
- binary_segmentation(ts, *, min_size=10, penalty=None)
Recursive binary segmentation for mean-shift changepoints.
Iteratively finds the position that maximises the reduction in within-segment sum-of-squares error. A split is accepted when the gain exceeds penalty; recursion continues on each sub-segment.
- Parameters:
ts (TimeSeries) – Input series.
min_size (int, optional) – Minimum number of observations per segment (default 10). Prevents detecting breaks on very small sub-sequences.
penalty (float, optional) – Minimum SSE gain required to accept a split. Defaults to n × σ², where σ is estimated from first-differences. A higher penalty yields fewer changepoints.
- Return type:
ChangepointReport
Notes
The algorithm has O(n²) time complexity per level of recursion. For very long series (n > 5000) consider using a larger min_size or restricting the search.
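The split rule can be sketched in plain Python as follows. This is an illustration only: `best_split` and `binary_segmentation_sketch` are hypothetical names, the penalty is passed explicitly rather than derived from first-differences, and the SSE terms are computed from prefix sums, which is a convenience of this sketch and not necessarily how the library evaluates them.

```python
from itertools import accumulate


def best_split(x, min_size):
    """Return (position, gain) of the single split that most reduces the SSE."""
    n = len(x)
    s1 = [0.0, *accumulate(x)]                 # prefix sums of x
    s2 = [0.0, *accumulate(v * v for v in x)]  # prefix sums of x squared

    def sse(i, j):
        # Sum of squared deviations of x[i:j] from its own mean.
        total = s1[j] - s1[i]
        return (s2[j] - s2[i]) - total * total / (j - i)

    full = sse(0, n)
    best_k, best_gain = None, 0.0
    for k in range(min_size, n - min_size + 1):
        gain = full - sse(0, k) - sse(k, n)
        if gain > best_gain:
            best_k, best_gain = k, gain
    return best_k, best_gain


def binary_segmentation_sketch(x, min_size=2, penalty=1.0, offset=0):
    """Recursively split x wherever the SSE gain exceeds the penalty."""
    if len(x) < 2 * min_size:
        return []
    k, gain = best_split(x, min_size)
    if k is None or gain <= penalty:
        return []
    left = binary_segmentation_sketch(x[:k], min_size, penalty, offset)
    right = binary_segmentation_sketch(x[k:], min_size, penalty, offset + k)
    return left + [offset + k] + right


x = [0.0] * 10 + [5.0] * 10 + [0.0] * 10
print(binary_segmentation_sketch(x, min_size=3, penalty=1.0))
# [10, 20]
```

Raising the penalty above the largest available gain makes the sketch return an empty list, which mirrors the "higher penalty, fewer changepoints" behaviour described for the penalty parameter.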
Examples
>>> import numpy as np, pandas as pd
>>> from tseda import TimeSeries
>>> from tseda.changepoint.detector import ChangepointDetector
>>> rng = np.random.default_rng(0)
>>> idx = pd.date_range("2020", periods=300, freq="D")
>>> vals = np.concatenate([rng.standard_normal(100),
...                        rng.standard_normal(100) + 5.0,
...                        rng.standard_normal(100)])
>>> ts = TimeSeries(vals, index=idx)
>>> r = ChangepointDetector().binary_segmentation(ts)
>>> r.n_changepoints
2
- variance_ratio(ts, *, window=30, alpha=0.05)
Detect variance shifts via a sliding two-sample F-test.
Two adjacent windows of width window are compared at each position. A significant difference in variance (p < alpha) signals a variance-change changepoint.
- Parameters:
ts (TimeSeries) – Input series.
window (int, optional) – Half-window width for each sample. Default 30.
alpha (float, optional) – Significance level for the F-test. Default 0.05.
- Returns:
changepoints are positions with a significant variance shift; runs of consecutive significant positions are merged into the single position with the maximum score.
- Return type:
ChangepointReport
- Raises:
TypeError – If ts is not a TimeSeries.
ValueError – If window < 3 or alpha outside (0, 1).
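The statistic behind the sliding comparison can be illustrated with the standard library alone. The sketch below computes only the variance ratio of the two adjacent windows at each position; `variance_ratios` is a hypothetical helper, and putting the larger variance in the numerator is an assumption of this sketch. Turning the ratio into a p-value requires the F distribution with (window - 1, window - 1) degrees of freedom, for example via `scipy.stats.f.sf`, which is left out here to keep the block dependency-free.

```python
from statistics import variance


def variance_ratios(x, window):
    """Map each position t to the variance ratio of x[t-window:t] and x[t:t+window]."""
    ratios = {}
    for t in range(window, len(x) - window + 1):
        v_left = variance(x[t - window:t])
        v_right = variance(x[t:t + window])
        lo, hi = sorted((v_left, v_right))
        if hi == 0:
            ratios[t] = 1.0           # both windows constant: no evidence of a shift
        elif lo == 0:
            ratios[t] = float("inf")  # one window constant, the other not
        else:
            ratios[t] = hi / lo       # larger variance over smaller, so ratio >= 1
    return ratios


# Alternating series whose amplitude jumps from 1 to 4 at position 20.
x = [(-1) ** i * 1.0 for i in range(20)] + [(-1) ** i * 4.0 for i in range(20)]
ratios = variance_ratios(x, window=10)
print(max(ratios, key=ratios.get))
# 20
```

The ratio peaks where the two windows sit entirely on opposite sides of the break, which is why merging a run of significant positions into its maximum-score position is a natural way to localise the shift.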
Examples
>>> import numpy as np, pandas as pd
>>> from tseda import TimeSeries
>>> from tseda.changepoint.detector import ChangepointDetector
>>> rng = np.random.default_rng(2)
>>> idx = pd.date_range("2020", periods=200, freq="D")
>>> vals = np.concatenate([rng.standard_normal(100) * 0.5,
...                        rng.standard_normal(100) * 3.0])
>>> ts = TimeSeries(vals, index=idx)
>>> r = ChangepointDetector().variance_ratio(ts, window=20)
>>> r.n_changepoints >= 1
True
- segment(ts, report)
Return per-segment statistics for a changepoint report.
- Parameters:
ts (TimeSeries) – The original series.
report (ChangepointReport) – Output of any detection method.
- Returns:
One row per segment with columns: segment, start, end, n_obs, mean, std, min, max.
- Raises:
TypeError – If either argument has the wrong type.
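As a rough picture of what one row per segment contains, the sketch below derives the listed columns from a plain list of values and a list of changepoint positions. It is not the library's implementation: `segment_stats_sketch` is a hypothetical helper, rows are plain dictionaries, `start` and `end` are taken to be integer positions with `end` exclusive, and `std` is the sample standard deviation; the library may define any of these differently.

```python
from statistics import fmean, stdev


def segment_stats_sketch(values, changepoints):
    """Return one dict per segment with the columns named in the Returns entry."""
    bounds = [0, *changepoints, len(values)]
    rows = []
    for seg, (start, end) in enumerate(zip(bounds, bounds[1:])):
        chunk = values[start:end]
        rows.append({
            "segment": seg,
            "start": start,  # assumption: 0-based position, inclusive
            "end": end,      # assumption: 0-based position, exclusive
            "n_obs": len(chunk),
            "mean": fmean(chunk),
            "std": stdev(chunk) if len(chunk) > 1 else 0.0,
            "min": min(chunk),
            "max": max(chunk),
        })
    return rows


rows = segment_stats_sketch([1.0, 1.0, 1.0, 5.0, 5.0, 5.0, 5.0], [3])
for row in rows:
    print(row["segment"], row["n_obs"], row["mean"])
```

A report with m changepoints yields m + 1 rows, which is the relationship the doctest for segment checks.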
Examples
>>> import numpy as np, pandas as pd
>>> from tseda import TimeSeries
>>> from tseda.changepoint.detector import ChangepointDetector
>>> rng = np.random.default_rng(0)
>>> idx = pd.date_range("2020", periods=200, freq="D")
>>> vals = np.concatenate([rng.standard_normal(100),
...                        rng.standard_normal(100) + 4.0])
>>> ts = TimeSeries(vals, index=idx)
>>> det = ChangepointDetector()
>>> r = det.binary_segmentation(ts)
>>> df = det.segment(ts, r)
>>> len(df) == r.n_changepoints + 1
True
Report
- class tseda.changepoint.detector.ChangepointReport(changepoints, timestamps, n_changepoints, scores, method)
Documented identically to tseda.changepoint.ChangepointReport above.
Detector
- class tseda.changepoint.detector.ChangepointDetector
Documented identically to tseda.changepoint.ChangepointDetector above.