Statistics

tseda.statistics

Statistical analysis for time series.

Public API

DescriptiveStats

Frozen result of DescriptiveAnalyzer.

DescriptiveAnalyzer

Comprehensive descriptive statistics (mean, std, MAD, skew, kurtosis, quantiles, …).

StationarityResult

Frozen result of StationarityTester.

StationarityTester

ADF, KPSS, and Phillips-Perron stationarity tests with combined summary.

AutocorrelationResult

Frozen result of AutocorrelationAnalyzer.

AutocorrelationAnalyzer

ACF, PACF, Ljung-Box test, and significant-lag detection.

class tseda.statistics.DescriptiveStats(n_total, n_valid, n_nan, pct_nan, mean, median, trimmed_mean, std, var, mad, cv, min, max, range, first, last, skewness, kurtosis, quantiles, n_zeros, n_positive, n_negative)[source]

Bases: object

Comprehensive descriptive statistics for a TimeSeries.

All statistics are computed on the non-NaN subset unless otherwise noted.

Parameters:
n_total

Total number of observations (including NaN).

Type:

int

n_valid

Number of non-NaN observations.

Type:

int

n_nan

Number of NaN observations.

Type:

int

pct_nan

Percentage of NaN observations (0–100).

Type:

float

mean

Arithmetic mean.

Type:

float

median

50th percentile.

Type:

float

std

Sample standard deviation (ddof=1).

Type:

float

var

Sample variance (ddof=1).

Type:

float

mad

Median absolute deviation: median(|x - median(x)|).

Type:

float

trimmed_mean

Mean with the top and bottom 5 % of values removed.

Type:

float

min

Minimum value.

Type:

float

max

Maximum value.

Type:

float

range

max - min.

Type:

float

first

First (earliest) non-NaN value.

Type:

float

last

Last (most recent) non-NaN value.

Type:

float

cv

Coefficient of variation: std / |mean|. nan when mean == 0.

Type:

float

skewness

Fisher’s moment coefficient of skewness (bias-corrected).

Type:

float

kurtosis

Excess kurtosis (Fisher definition, bias-corrected). 0 for a normal distribution.

Type:

float

quantiles

Mapping from probability level to quantile value. Keys: [0.01, 0.05, 0.10, 0.25, 0.50, 0.75, 0.90, 0.95, 0.99].

Type:

dict of float → float

n_zeros

Number of exact zeros.

Type:

int

n_positive

Number of strictly positive values.

Type:

int

n_negative

Number of strictly negative values.

Type:

int

n_total: int
n_valid: int
n_nan: int
pct_nan: float
mean: float
median: float
trimmed_mean: float
std: float
var: float
mad: float
cv: float
min: float
max: float
range: float
first: float
last: float
skewness: float
kurtosis: float
quantiles: Dict[float, float]
n_zeros: int
n_positive: int
n_negative: int
__repr__()[source]

Return repr(self).

Return type:

str

__init__(n_total, n_valid, n_nan, pct_nan, mean, median, trimmed_mean, std, var, mad, cv, min, max, range, first, last, skewness, kurtosis, quantiles, n_zeros, n_positive, n_negative)
Return type:

None
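
The bias-corrected shape statistics can be illustrated with the adjusted Fisher-Pearson estimators (often written G1 and G2). This is a minimal sketch that assumes those are the corrections meant by "bias-corrected"; the helper names are hypothetical and are not part of tseda.

```python
import math


def sample_skewness(x):
    """Adjusted Fisher-Pearson skewness G1 (needs at least 3 values)."""
    n = len(x)
    mean = sum(x) / n
    m2 = sum((v - mean) ** 2 for v in x) / n  # biased 2nd central moment
    m3 = sum((v - mean) ** 3 for v in x) / n  # biased 3rd central moment
    g1 = m3 / m2 ** 1.5
    return math.sqrt(n * (n - 1)) / (n - 2) * g1


def sample_excess_kurtosis(x):
    """Bias-corrected excess kurtosis G2 (needs at least 4 values)."""
    n = len(x)
    mean = sum(x) / n
    m2 = sum((v - mean) ** 2 for v in x) / n
    m4 = sum((v - mean) ** 4 for v in x) / n
    g2 = m4 / m2 ** 2 - 3.0  # biased excess kurtosis
    return (n - 1) / ((n - 2) * (n - 3)) * ((n + 1) * g2 + 6.0)
```

A symmetric sample gives a skewness of zero, and both estimators are undefined when all values are equal (the second moment is zero).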

class tseda.statistics.DescriptiveAnalyzer[source]

Bases: object

Compute comprehensive descriptive statistics for a TimeSeries.

This class is stateless — one instance, many series.

analyze(ts)[source]

Return a DescriptiveStats for ts.

Parameters:

ts (TimeSeries)

Return type:

DescriptiveStats

Examples

>>> import numpy as np, pandas as pd
>>> from tseda import TimeSeries
>>> from tseda.statistics.descriptive import DescriptiveAnalyzer
>>> idx = pd.date_range("2020", periods=5, freq="D")
>>> ts  = TimeSeries([2.0, 4.0, 4.0, 4.0, 5.0], index=idx)
>>> r   = DescriptiveAnalyzer().analyze(ts)
>>> r.mean
3.8
>>> r.std
1.09...
analyze(ts)[source]

Compute descriptive statistics for ts.

Parameters:

ts (TimeSeries) – Input series.

Return type:

DescriptiveStats

Examples

>>> import numpy as np, pandas as pd
>>> from tseda import TimeSeries
>>> from tseda.statistics.descriptive import DescriptiveAnalyzer
>>> idx = pd.date_range("2020", periods=4, freq="D")
>>> ts  = TimeSeries([1.0, 2.0, 3.0, 4.0], index=idx)
>>> r   = DescriptiveAnalyzer().analyze(ts)
>>> r.median
2.5
>>> r.n_positive
4
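
The count fields of DescriptiveStats follow directly from their definitions. The sketch below shows one way to derive them from a plain list of floats; count_summary is a hypothetical helper for illustration, not a tseda function.

```python
import math


def count_summary(values):
    """Derive the count-style fields from a list of floats (NaN allowed)."""
    n_total = len(values)
    valid = [v for v in values if not math.isnan(v)]  # non-NaN subset
    n_valid = len(valid)
    n_nan = n_total - n_valid
    return {
        "n_total": n_total,
        "n_valid": n_valid,
        "n_nan": n_nan,
        # Percentage on a 0-100 scale; undefined for an empty input.
        "pct_nan": 100.0 * n_nan / n_total if n_total else float("nan"),
        "n_zeros": sum(1 for v in valid if v == 0),
        "n_positive": sum(1 for v in valid if v > 0),
        "n_negative": sum(1 for v in valid if v < 0),
    }
```
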
class tseda.statistics.StationarityResult(test_name, statistic, p_value, critical_values, n_lags, regression, is_stationary, alpha, interpretation)[source]

Bases: object

Immutable result of a stationarity test.

Parameters:
test_name

Name of the test (e.g., "ADF").

Type:

str

statistic

Test statistic value.

Type:

float

p_value

Approximate p-value.

Type:

float

critical_values

Critical values at standard significance levels ("1%", "5%", "10%").

Type:

dict of str → float

n_lags

Number of lags used (None for tests that do not select lags).

Type:

int or None

regression

Regression type used ("nc", "c", or "ct").

Type:

str

is_stationary

Convenience flag. For ADF / PP: p_value < alpha (reject unit root → evidence of stationarity). For KPSS: p_value > alpha (fail to reject stationarity null).

Type:

bool

alpha

Significance level used to set is_stationary.

Type:

float

interpretation

One-sentence plain-English summary of the result.

Type:

str

test_name: str
statistic: float
p_value: float
critical_values: dict
n_lags: int | None
regression: str
is_stationary: bool
alpha: float
interpretation: str
__repr__()[source]

Return repr(self).

Return type:

str

__init__(test_name, statistic, p_value, critical_values, n_lags, regression, is_stationary, alpha, interpretation)
Return type:

None
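
The rule behind is_stationary depends on which null hypothesis the test uses. A minimal sketch of that rule is given here; the function name is hypothetical, and the test-name strings "KPSS" and "PP" are assumed by analogy with the documented "ADF".

```python
def stationarity_flag(test_name, p_value, alpha=0.05):
    """Apply the documented decision rule for the convenience flag."""
    name = test_name.upper()
    if name in ("ADF", "PP"):
        # Null: unit root. A small p-value rejects it.
        return p_value < alpha
    if name == "KPSS":
        # Null: stationarity. A large p-value fails to reject it.
        return p_value > alpha
    raise ValueError(f"unknown test name: {test_name!r}")
```

The same p-value therefore leads to opposite flags for ADF and KPSS.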

class tseda.statistics.StationarityTester[source]

Bases: object

Test a TimeSeries for stationarity.

All methods return a StationarityResult and are stateless.

adf(ts, maxlag, regression, alpha)[source]

Augmented Dickey-Fuller test.

Return type:

StationarityResult

kpss(ts, regression, alpha)[source]

KPSS test.

Return type:

StationarityResult

pp(ts, regression, alpha)[source]

Phillips-Perron test (requires statsmodels).

Return type:

StationarityResult

summary(ts, regression, alpha)[source]

Run ADF + KPSS and return a combined verdict string.

Return type:

str

Notes

When statsmodels is installed the adf and kpss methods automatically use its implementations, which have more accurate critical-value tables. Install with pip install statsmodels.

Examples

>>> import numpy as np, pandas as pd
>>> from tseda import TimeSeries
>>> from tseda.statistics.stationarity import StationarityTester
>>> rng = np.random.default_rng(0)
>>> idx = pd.date_range("2020", periods=200, freq="D")
>>> ts  = TimeSeries(rng.standard_normal(200), index=idx)
>>> r   = StationarityTester().adf(ts)
>>> r.is_stationary
True
adf(ts, *, maxlag=None, regression='c', alpha=0.05)[source]

Augmented Dickey-Fuller unit-root test.

H₀: The series has a unit root (is non-stationary). H₁: The series is stationary.

Reject H₀ (small p-value) → evidence of stationarity.

Parameters:
  • ts (TimeSeries) – Input series. NaN values are dropped before testing.

  • maxlag (int, optional) – Maximum lag to consider for AIC-based lag selection. Defaults to int(12 * (n / 100) ** 0.25) (Schwert 1989).

  • regression (str, optional) –

    Deterministic terms to include in the test equation.

    • "nc" — no constant, no trend.

    • "c" — constant only (default).

    • "ct" — constant + linear trend.

  • alpha (float, optional) – Significance level for is_stationary. Default 0.05.

Return type:

StationarityResult

Examples

>>> import numpy as np, pandas as pd
>>> from tseda import TimeSeries
>>> from tseda.statistics.stationarity import StationarityTester
>>> rng = np.random.default_rng(1)
>>> idx = pd.date_range("2020", periods=150, freq="D")
>>> ts  = TimeSeries(rng.standard_normal(150), index=idx)
>>> StationarityTester().adf(ts).is_stationary
True
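
The default maxlag documented for adf can be reproduced in a line of Python. The helper name below is hypothetical; the formula is the one quoted in the parameter description.

```python
def schwert_maxlag(n):
    """Default maximum lag for a series of n observations (Schwert rule)."""
    return int(12 * (n / 100) ** 0.25)
```

The bound grows slowly with sample size: 100 observations give 12 lags, 150 give 13, and 200 give 14.
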
kpss(ts, *, regression='c', alpha=0.05)[source]

Kwiatkowski-Phillips-Schmidt-Shin (KPSS) stationarity test.

H₀: The series is level (or trend) stationary. H₁: The series has a unit root.

Fail to reject H₀ (large p-value) → evidence of stationarity. This is the opposite null from ADF.

Parameters:
  • ts (TimeSeries) – Input series.

  • regression (str, optional) – "c" — test for level stationarity (default). "ct" — test for trend stationarity.

  • alpha (float, optional) – Significance level. Default 0.05.

Return type:

StationarityResult

Examples

>>> import numpy as np, pandas as pd
>>> from tseda import TimeSeries
>>> from tseda.statistics.stationarity import StationarityTester
>>> rng = np.random.default_rng(2)
>>> idx = pd.date_range("2020", periods=150, freq="D")
>>> ts  = TimeSeries(rng.standard_normal(150), index=idx)
>>> StationarityTester().kpss(ts).is_stationary
True
pp(ts, *, regression='c', alpha=0.05)[source]

Phillips-Perron unit-root test.

Like ADF but uses a non-parametric correction for serial correlation (no lag selection required). Requires statsmodels.

H₀: The series has a unit root. H₁: The series is stationary.

Parameters:
  • ts (TimeSeries) – Input series.

  • regression (str, optional) – "c" (default) or "ct".

  • alpha (float, optional) – Significance level. Default 0.05.

Return type:

StationarityResult

Raises:

ImportError – If statsmodels is not installed.

Examples

>>> import numpy as np, pandas as pd
>>> from tseda import TimeSeries
>>> from tseda.statistics.stationarity import StationarityTester
>>> rng = np.random.default_rng(3)
>>> idx = pd.date_range("2020", periods=200, freq="D")
>>> ts  = TimeSeries(rng.standard_normal(200), index=idx)
>>> StationarityTester().pp(ts).is_stationary
True
summary(ts, *, regression='c', alpha=0.05)[source]

Run ADF + KPSS and return a human-readable combined verdict.

The two tests have opposite nulls, so their results can be reconciled:

ADF      KPSS     Verdict
stat.    stat.    Strong evidence of stationarity
stat.    non-s.   Trend stationary; consider detrending
non-s.   stat.    Difference stationary; try differencing
non-s.   non-s.   Strong evidence of non-stationarity

Parameters:
  • ts (TimeSeries) – Input series.

  • regression (str, optional) – Passed to both ADF and KPSS. Default "c".

  • alpha (float, optional) – Significance level. Default 0.05.

Returns:

Multi-line plain-English summary.

Return type:

str

Examples

>>> import numpy as np, pandas as pd
>>> from tseda import TimeSeries
>>> from tseda.statistics.stationarity import StationarityTester
>>> rng = np.random.default_rng(0)
>>> idx = pd.date_range("2020", periods=200, freq="D")
>>> ts  = TimeSeries(rng.standard_normal(200), index=idx)
>>> print(StationarityTester().summary(ts))
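
The reconciliation table documented for summary amounts to a lookup on two booleans. This is a sketch under that reading; combined_verdict and VERDICTS are hypothetical names, and the strings paraphrase the table rather than reproduce the library's exact output.

```python
# Keys are (ADF says stationary, KPSS says stationary).
VERDICTS = {
    (True, True): "Strong evidence of stationarity",
    (True, False): "Trend stationary; consider detrending",
    (False, True): "Difference stationary; try differencing",
    (False, False): "Strong evidence of non-stationarity",
}


def combined_verdict(adf_p_value, kpss_p_value, alpha=0.05):
    """Combine ADF and KPSS p-values into a single verdict string."""
    adf_stationary = adf_p_value < alpha    # reject the unit-root null
    kpss_stationary = kpss_p_value > alpha  # fail to reject stationarity
    return VERDICTS[(adf_stationary, kpss_stationary)]
```
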
class tseda.statistics.AutocorrelationResult(acf, pacf, lags, conf_lower, conf_upper, lb_statistic, lb_pvalue, n_lags, n_obs, is_white_noise, alpha)[source]

Bases: object

Immutable autocorrelation analysis result.

Parameters:
acf

Autocorrelation function values at lags 0, 1, …, n_lags. acf[0] is always 1.0 (lag-0 autocorrelation).

Type:

numpy.ndarray

pacf

Partial autocorrelation function values at lags 0, 1, …, n_lags. pacf[0] is always 1.0 by convention.

Type:

numpy.ndarray

lags

Integer array [0, 1, …, n_lags].

Type:

numpy.ndarray

conf_lower

Lower confidence bound at each lag (Bartlett’s approximation); 95 % at the default alpha = 0.05.

Type:

numpy.ndarray

conf_upper

Upper confidence bound at each lag; 95 % at the default alpha = 0.05.

Type:

numpy.ndarray

lb_statistic

Ljung-Box Q-statistic at each lag from 1 to n_lags.

Type:

numpy.ndarray

lb_pvalue

P-value of the Ljung-Box test at each lag.

Type:

numpy.ndarray

n_lags

Number of lags requested (excluding lag 0).

Type:

int

n_obs

Number of non-NaN observations used.

Type:

int

is_white_noise

True when the Ljung-Box p-value at lag min(n_lags, 20) exceeds alpha.

Type:

bool

alpha

Significance level used for is_white_noise and confidence bounds.

Type:

float

acf: ndarray
pacf: ndarray
lags: ndarray
conf_lower: ndarray
conf_upper: ndarray
lb_statistic: ndarray
lb_pvalue: ndarray
n_lags: int
n_obs: int
is_white_noise: bool
alpha: float
__repr__()[source]

Return repr(self).

Return type:

str

__init__(acf, pacf, lags, conf_lower, conf_upper, lb_statistic, lb_pvalue, n_lags, n_obs, is_white_noise, alpha)
Return type:

None

class tseda.statistics.AutocorrelationAnalyzer[source]

Bases: object

Compute ACF, PACF, and Ljung-Box statistics for a TimeSeries.

This class is stateless.

analyze(ts, lags, alpha)[source]

Return an AutocorrelationResult.

Return type:

AutocorrelationResult

significant_lags(result, which)[source]

Return the lag numbers where the ACF or PACF falls outside the confidence interval.

Return type:

ndarray

Examples

>>> import numpy as np, pandas as pd
>>> from tseda import TimeSeries
>>> from tseda.statistics.autocorrelation import AutocorrelationAnalyzer

AR(1) process:

>>> rng = np.random.default_rng(7)
>>> n   = 300
>>> idx = pd.date_range("2020", periods=n, freq="D")
>>> eps = rng.standard_normal(n)
>>> x   = np.zeros(n)
>>> for i in range(1, n): x[i] = 0.7 * x[i-1] + eps[i]
>>> ts  = TimeSeries(x, index=idx)
>>> r   = AutocorrelationAnalyzer().analyze(ts, lags=10)
>>> r.acf[1] > 0.5          # strong lag-1 autocorrelation
True
>>> r.is_white_noise         # definitely not white noise
False
analyze(ts, lags=40, *, alpha=0.05)[source]

Compute ACF, PACF, and Ljung-Box statistics.

Parameters:
  • ts (TimeSeries) – Input series. NaN values are dropped before analysis.

  • lags (int, optional) – Number of lags to compute (lag 0 is always included). Capped at n // 2. Default 40.

  • alpha (float, optional) – Significance level for confidence bounds and is_white_noise. Default 0.05.

Return type:

AutocorrelationResult

Examples

>>> import numpy as np, pandas as pd
>>> from tseda import TimeSeries
>>> from tseda.statistics.autocorrelation import AutocorrelationAnalyzer
>>> idx = pd.date_range("2020", periods=50, freq="D")
>>> ts  = TimeSeries(np.ones(50), index=idx)
>>> r   = AutocorrelationAnalyzer().analyze(ts, lags=5)
>>> r.acf[0]
1.0
significant_lags(result, *, which='acf')[source]

Return lag numbers (> 0) where the function selected by which lies outside the confidence bounds.

Returns:

Integer array of significant lag numbers.

Return type:

numpy.ndarray

Examples

>>> import numpy as np, pandas as pd
>>> from tseda import TimeSeries
>>> from tseda.statistics.autocorrelation import AutocorrelationAnalyzer
>>> rng = np.random.default_rng(7)
>>> n   = 300
>>> idx = pd.date_range("2020", periods=n, freq="D")
>>> eps = rng.standard_normal(n)
>>> x   = np.zeros(n)
>>> for i in range(1, n): x[i] = 0.7 * x[i-1] + eps[i]
>>> ts  = TimeSeries(x, index=idx)
>>> r   = AutocorrelationAnalyzer().analyze(ts, lags=10)
>>> len(AutocorrelationAnalyzer().significant_lags(r)) > 0
True
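
The selection that significant_lags performs can be pictured as a comparison against the stored bounds. The sketch below works on plain lists and is only an illustration of that idea; find_significant_lags is a hypothetical name and the inputs stand in for the acf (or pacf), conf_lower, and conf_upper arrays.

```python
def find_significant_lags(values, lower, upper):
    """Return lags k > 0 whose value lies outside [lower[k], upper[k]]."""
    return [
        k
        for k in range(1, len(values))  # lag 0 is skipped: it is always 1.0
        if values[k] < lower[k] or values[k] > upper[k]
    ]
```

For example, with a constant bound of ±0.2 and values [1.0, 0.6, 0.05, -0.3], lags 1 and 3 are flagged.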

Descriptive statistics for time series.

Provides a single DescriptiveStats result object and a stateless DescriptiveAnalyzer that computes it. All arithmetic uses numpy so there are no extra dependencies beyond the core stack.

The statistics reported go beyond what pandas.Series.describe() offers:

  • Robust location / spread (median, MAD, trimmed mean).

  • Shape (skewness, excess kurtosis).

  • Quantiles at multiple probability levels.

  • First/last value, range, coefficient of variation.

  • Counts of exact zeros, strictly positive values, and strictly negative values.
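
The robust statistics in the list above are short to write out. The following sketch uses only the standard library; the function names are hypothetical, and the trimming convention (dropping int(0.05 * n) values from each end) is an assumption, since the exact rounding rule is not stated here.

```python
import statistics


def mad(x):
    """Median absolute deviation: median(|x - median(x)|)."""
    med = statistics.median(x)
    return statistics.median([abs(v - med) for v in x])


def trimmed_mean(x, proportion=0.05):
    """Mean after removing the lowest and highest share of values."""
    s = sorted(x)
    cut = int(proportion * len(s))  # values dropped from each tail
    kept = s[cut:len(s) - cut]
    return sum(kept) / len(kept)


def coefficient_of_variation(x):
    """Sample std (ddof=1) divided by |mean|; nan when the mean is zero."""
    mean = statistics.fmean(x)
    if mean == 0:
        return float("nan")
    return statistics.stdev(x) / abs(mean)
```

A single outlier barely moves the MAD: for [1, 2, 3, 4, 100] the median is 3 and the MAD is 1.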

Classes

DescriptiveStats

Frozen dataclass containing every computed statistic.

DescriptiveAnalyzer

Stateless analyzer that produces DescriptiveStats.

Examples

>>> import numpy as np, pandas as pd
>>> from tseda import TimeSeries
>>> from tseda.statistics.descriptive import DescriptiveAnalyzer
>>> rng = np.random.default_rng(0)
>>> idx = pd.date_range("2020-01-01", periods=200, freq="D")
>>> ts  = TimeSeries(rng.standard_normal(200), index=idx, name="returns")
>>> r   = DescriptiveAnalyzer().analyze(ts)
>>> round(r.mean, 3)
0.024
class tseda.statistics.descriptive.DescriptiveStats(n_total, n_valid, n_nan, pct_nan, mean, median, trimmed_mean, std, var, mad, cv, min, max, range, first, last, skewness, kurtosis, quantiles, n_zeros, n_positive, n_negative)[source]

Bases: object

Comprehensive descriptive statistics for a TimeSeries.

All statistics are computed on the non-NaN subset unless otherwise noted.

Parameters:
n_total

Total number of observations (including NaN).

Type:

int

n_valid

Number of non-NaN observations.

Type:

int

n_nan

Number of NaN observations.

Type:

int

pct_nan

Percentage of NaN observations (0–100).

Type:

float

mean

Arithmetic mean.

Type:

float

median

50th percentile.

Type:

float

std

Sample standard deviation (ddof=1).

Type:

float

var

Sample variance (ddof=1).

Type:

float

mad

Median absolute deviation: median(|x - median(x)|).

Type:

float

trimmed_mean

Mean with the top and bottom 5 % of values removed.

Type:

float

min

Minimum value.

Type:

float

max

Maximum value.

Type:

float

range

max - min.

Type:

float

first

First (earliest) non-NaN value.

Type:

float

last

Last (most recent) non-NaN value.

Type:

float

cv

Coefficient of variation: std / |mean|. nan when mean == 0.

Type:

float

skewness

Fisher’s moment coefficient of skewness (bias-corrected).

Type:

float

kurtosis

Excess kurtosis (Fisher definition, bias-corrected). 0 for a normal distribution.

Type:

float

quantiles

Mapping from probability level to quantile value. Keys: [0.01, 0.05, 0.10, 0.25, 0.50, 0.75, 0.90, 0.95, 0.99].

Type:

dict of float → float

n_zeros

Number of exact zeros.

Type:

int

n_positive

Number of strictly positive values.

Type:

int

n_negative

Number of strictly negative values.

Type:

int

n_total: int
n_valid: int
n_nan: int
pct_nan: float
mean: float
median: float
trimmed_mean: float
std: float
var: float
mad: float
cv: float
min: float
max: float
range: float
first: float
last: float
skewness: float
kurtosis: float
quantiles: Dict[float, float]
n_zeros: int
n_positive: int
n_negative: int
__repr__()[source]

Return repr(self).

Return type:

str

__init__(n_total, n_valid, n_nan, pct_nan, mean, median, trimmed_mean, std, var, mad, cv, min, max, range, first, last, skewness, kurtosis, quantiles, n_zeros, n_positive, n_negative)
Return type:

None

class tseda.statistics.descriptive.DescriptiveAnalyzer[source]

Bases: object

Compute comprehensive descriptive statistics for a TimeSeries.

This class is stateless — one instance, many series.

analyze(ts)[source]

Return a DescriptiveStats for ts.

Parameters:

ts (TimeSeries)

Return type:

DescriptiveStats

Examples

>>> import numpy as np, pandas as pd
>>> from tseda import TimeSeries
>>> from tseda.statistics.descriptive import DescriptiveAnalyzer
>>> idx = pd.date_range("2020", periods=5, freq="D")
>>> ts  = TimeSeries([2.0, 4.0, 4.0, 4.0, 5.0], index=idx)
>>> r   = DescriptiveAnalyzer().analyze(ts)
>>> r.mean
3.8
>>> r.std
1.09...
analyze(ts)[source]

Compute descriptive statistics for ts.

Parameters:

ts (TimeSeries) – Input series.

Return type:

DescriptiveStats

Examples

>>> import numpy as np, pandas as pd
>>> from tseda import TimeSeries
>>> from tseda.statistics.descriptive import DescriptiveAnalyzer
>>> idx = pd.date_range("2020", periods=4, freq="D")
>>> ts  = TimeSeries([1.0, 2.0, 3.0, 4.0], index=idx)
>>> r   = DescriptiveAnalyzer().analyze(ts)
>>> r.median
2.5
>>> r.n_positive
4

Stationarity testing for time series.

Three widely used tests are provided. ADF and KPSS follow a dual-path strategy (PP always requires statsmodels):

  1. Primary path — pure numpy / scipy implementation so the package works without statsmodels.

  2. Fast path — if statsmodels is installed the well-tested statsmodels.tsa.stattools implementations are used instead, which have more reliable critical-value tables.

Test   H₀                                      Detects
ADF    Unit root exists                        Evidence against unit root
KPSS   Series is level (or trend) stationary   Evidence of non-stationarity
PP     Unit root exists                        Robust to serial correlation without requiring lag selection

The combined StationarityTester.summary() method reconciles the ADF and KPSS results and returns a human-readable verdict with a recommended action.

Classes

StationarityResult

Frozen dataclass for a single test’s output.

StationarityTester

Stateless tester.

Examples

>>> import numpy as np, pandas as pd
>>> from tseda import TimeSeries
>>> from tseda.statistics.stationarity import StationarityTester

Stationary white noise:

>>> rng = np.random.default_rng(42)
>>> idx = pd.date_range("2020", periods=300, freq="D")
>>> ts  = TimeSeries(rng.standard_normal(300), index=idx)
>>> r   = StationarityTester().adf(ts)
>>> r.is_stationary   # p < 0.05
True

Random walk (non-stationary):

>>> rw  = TimeSeries(np.cumsum(rng.standard_normal(300)), index=idx)
>>> r2  = StationarityTester().adf(rw)
>>> r2.is_stationary
False
class tseda.statistics.stationarity.StationarityResult(test_name, statistic, p_value, critical_values, n_lags, regression, is_stationary, alpha, interpretation)[source]

Bases: object

Immutable result of a stationarity test.

Parameters:
test_name

Name of the test (e.g., "ADF").

Type:

str

statistic

Test statistic value.

Type:

float

p_value

Approximate p-value.

Type:

float

critical_values

Critical values at standard significance levels ("1%", "5%", "10%").

Type:

dict of str → float

n_lags

Number of lags used (None for tests that do not select lags).

Type:

int or None

regression

Regression type used ("nc", "c", or "ct").

Type:

str

is_stationary

Convenience flag. For ADF / PP: p_value < alpha (reject unit root → evidence of stationarity). For KPSS: p_value > alpha (fail to reject stationarity null).

Type:

bool

alpha

Significance level used to set is_stationary.

Type:

float

interpretation

One-sentence plain-English summary of the result.

Type:

str

test_name: str
statistic: float
p_value: float
critical_values: dict
n_lags: int | None
regression: str
is_stationary: bool
alpha: float
interpretation: str
__repr__()[source]

Return repr(self).

Return type:

str

__init__(test_name, statistic, p_value, critical_values, n_lags, regression, is_stationary, alpha, interpretation)
Return type:

None

class tseda.statistics.stationarity.StationarityTester[source]

Bases: object

Test a TimeSeries for stationarity.

All methods return a StationarityResult and are stateless.

adf(ts, maxlag, regression, alpha)[source]

Augmented Dickey-Fuller test.

Return type:

StationarityResult

kpss(ts, regression, alpha)[source]

KPSS test.

Return type:

StationarityResult

pp(ts, regression, alpha)[source]

Phillips-Perron test (requires statsmodels).

Return type:

StationarityResult

summary(ts, regression, alpha)[source]

Run ADF + KPSS and return a combined verdict string.

Return type:

str

Notes

When statsmodels is installed the adf and kpss methods automatically use its implementations, which have more accurate critical-value tables. Install with pip install statsmodels.

Examples

>>> import numpy as np, pandas as pd
>>> from tseda import TimeSeries
>>> from tseda.statistics.stationarity import StationarityTester
>>> rng = np.random.default_rng(0)
>>> idx = pd.date_range("2020", periods=200, freq="D")
>>> ts  = TimeSeries(rng.standard_normal(200), index=idx)
>>> r   = StationarityTester().adf(ts)
>>> r.is_stationary
True
adf(ts, *, maxlag=None, regression='c', alpha=0.05)[source]

Augmented Dickey-Fuller unit-root test.

H₀: The series has a unit root (is non-stationary). H₁: The series is stationary.

Reject H₀ (small p-value) → evidence of stationarity.

Parameters:
  • ts (TimeSeries) – Input series. NaN values are dropped before testing.

  • maxlag (int, optional) – Maximum lag to consider for AIC-based lag selection. Defaults to int(12 * (n / 100) ** 0.25) (Schwert 1989).

  • regression (str, optional) –

    Deterministic terms to include in the test equation.

    • "nc" — no constant, no trend.

    • "c" — constant only (default).

    • "ct" — constant + linear trend.

  • alpha (float, optional) – Significance level for is_stationary. Default 0.05.

Return type:

StationarityResult

Examples

>>> import numpy as np, pandas as pd
>>> from tseda import TimeSeries
>>> from tseda.statistics.stationarity import StationarityTester
>>> rng = np.random.default_rng(1)
>>> idx = pd.date_range("2020", periods=150, freq="D")
>>> ts  = TimeSeries(rng.standard_normal(150), index=idx)
>>> StationarityTester().adf(ts).is_stationary
True
kpss(ts, *, regression='c', alpha=0.05)[source]

Kwiatkowski-Phillips-Schmidt-Shin (KPSS) stationarity test.

H₀: The series is level (or trend) stationary. H₁: The series has a unit root.

Fail to reject H₀ (large p-value) → evidence of stationarity. This is the opposite null from ADF.

Parameters:
  • ts (TimeSeries) – Input series.

  • regression (str, optional) – "c" — test for level stationarity (default). "ct" — test for trend stationarity.

  • alpha (float, optional) – Significance level. Default 0.05.

Return type:

StationarityResult

Examples

>>> import numpy as np, pandas as pd
>>> from tseda import TimeSeries
>>> from tseda.statistics.stationarity import StationarityTester
>>> rng = np.random.default_rng(2)
>>> idx = pd.date_range("2020", periods=150, freq="D")
>>> ts  = TimeSeries(rng.standard_normal(150), index=idx)
>>> StationarityTester().kpss(ts).is_stationary
True
pp(ts, *, regression='c', alpha=0.05)[source]

Phillips-Perron unit-root test.

Like ADF but uses a non-parametric correction for serial correlation (no lag selection required). Requires statsmodels.

H₀: The series has a unit root. H₁: The series is stationary.

Parameters:
  • ts (TimeSeries) – Input series.

  • regression (str, optional) – "c" (default) or "ct".

  • alpha (float, optional) – Significance level. Default 0.05.

Return type:

StationarityResult

Raises:

ImportError – If statsmodels is not installed.

Examples

>>> import numpy as np, pandas as pd
>>> from tseda import TimeSeries
>>> from tseda.statistics.stationarity import StationarityTester
>>> rng = np.random.default_rng(3)
>>> idx = pd.date_range("2020", periods=200, freq="D")
>>> ts  = TimeSeries(rng.standard_normal(200), index=idx)
>>> StationarityTester().pp(ts).is_stationary
True
summary(ts, *, regression='c', alpha=0.05)[source]

Run ADF + KPSS and return a human-readable combined verdict.

The two tests have opposite nulls, so their results can be reconciled:

ADF      KPSS     Verdict
stat.    stat.    Strong evidence of stationarity
stat.    non-s.   Trend stationary; consider detrending
non-s.   stat.    Difference stationary; try differencing
non-s.   non-s.   Strong evidence of non-stationarity

Parameters:
  • ts (TimeSeries) – Input series.

  • regression (str, optional) – Passed to both ADF and KPSS. Default "c".

  • alpha (float, optional) – Significance level. Default 0.05.

Returns:

Multi-line plain-English summary.

Return type:

str

Examples

>>> import numpy as np, pandas as pd
>>> from tseda import TimeSeries
>>> from tseda.statistics.stationarity import StationarityTester
>>> rng = np.random.default_rng(0)
>>> idx = pd.date_range("2020", periods=200, freq="D")
>>> ts  = TimeSeries(rng.standard_normal(200), index=idx)
>>> print(StationarityTester().summary(ts))

Autocorrelation analysis for time series.

Implements ACF, PACF, and the Ljung-Box portmanteau test entirely in numpy / scipy — no statsmodels dependency.

Classes

AutocorrelationResult

Frozen dataclass containing ACF, PACF, confidence bounds, and Ljung-Box statistics.

AutocorrelationAnalyzer

Stateless analyzer.

Theory

ACF at lag k:

\[\hat{\rho}(k) = \frac{\sum_{t=k+1}^{n}(x_t - \bar{x})(x_{t-k} - \bar{x})} {\sum_{t=1}^{n}(x_t - \bar{x})^2}\]
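
The ACF estimator above translates almost term by term into code. This is a plain-Python sketch for illustration (the function name is hypothetical, and it is not tseda's numpy implementation); indices are shifted to 0-based.

```python
def sample_acf(x, n_lags):
    """Sample autocorrelation at lags 0..n_lags for a list of floats."""
    n = len(x)
    mean = sum(x) / n
    dev = [v - mean for v in x]
    denom = sum(d * d for d in dev)  # sum of squared deviations
    return [
        # Lagged cross-products over t = k..n-1 (0-based), same denominator.
        sum(dev[t] * dev[t - k] for t in range(k, n)) / denom
        for k in range(n_lags + 1)
    ]
```

For x = [1, 2, 3, 4, 5] the deviations are [-2, -1, 0, 1, 2], so the lag-1 value is 4 / 10 = 0.4 and the lag-2 value is -1 / 10 = -0.1. A constant series has a zero denominator and needs separate handling.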

PACF via Durbin-Levinson recursion on the ACF values:

\[\phi_{k,k} = \frac{\hat{\rho}(k) - \sum_{j=1}^{k-1} \phi_{k-1,j} \hat{\rho}(k-j)} {1 - \sum_{j=1}^{k-1} \phi_{k-1,j} \hat{\rho}(j)}\]
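
The recursion above needs one companion step to carry the lower-order coefficients forward: φ(k, j) = φ(k-1, j) - φ(k, k) · φ(k-1, k-j) for j = 1, …, k-1. A minimal plain-Python sketch of the full Durbin-Levinson loop follows; the function name is hypothetical.

```python
def pacf_durbin_levinson(rho):
    """PACF at lags 0..m from ACF values rho[0..m], with rho[0] == 1."""
    m = len(rho) - 1
    pacf = [1.0] + [0.0] * m
    prev = []  # prev[j - 1] holds phi_{k-1, j}
    for k in range(1, m + 1):
        num = rho[k] - sum(prev[j - 1] * rho[k - j] for j in range(1, k))
        den = 1.0 - sum(prev[j - 1] * rho[j] for j in range(1, k))
        phi_kk = num / den
        # Update phi_{k, j} for j < k, then append phi_{k, k}.
        cur = [prev[j - 1] - phi_kk * prev[k - j - 1] for j in range(1, k)]
        cur.append(phi_kk)
        pacf[k] = phi_kk
        prev = cur
    return pacf
```

Feeding in the theoretical AR(1) autocorrelations ρ(k) = 0.7^k returns 0.7 at lag 1 and values that vanish at higher lags, which is the expected cut-off pattern.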

95 % confidence interval for ACF (Bartlett’s formula assuming white noise): ±1.96 / √n.

Ljung-Box test statistic:

\[Q = n(n+2) \sum_{k=1}^{m} \frac{\hat{\rho}(k)^2}{n - k}, \qquad Q \sim \chi^2(m) \text{ under } H_0 \text{ (white noise)}\]
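
Both the white-noise band and the Ljung-Box statistic can be computed from the ACF values alone. The sketch below uses only the standard library; the function names are hypothetical, and deriving the band width from alpha through the normal quantile is an assumption that reduces to ±1.96 / √n at alpha = 0.05. The p-value, which needs the chi-square survival function, is left out.

```python
import math
from statistics import NormalDist


def white_noise_bound(n, alpha=0.05):
    """Half-width of the ACF confidence band under a white-noise null."""
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)  # about 1.96 for alpha=0.05
    return z / math.sqrt(n)


def ljung_box_q(rho, n):
    """Q statistic for m = 1..len(rho)-1, given ACF values rho[0..m]."""
    q_values = []
    running = 0.0
    for k in range(1, len(rho)):
        running += rho[k] ** 2 / (n - k)  # accumulate the sum over lags
        q_values.append(n * (n + 2) * running)
    return q_values
```

With n = 5 and ACF values [1, 0.4, -0.1], the statistic is 1.4 at m = 1 and about 1.517 at m = 2.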

Examples

>>> import numpy as np, pandas as pd
>>> from tseda import TimeSeries
>>> from tseda.statistics.autocorrelation import AutocorrelationAnalyzer
>>> rng = np.random.default_rng(0)
>>> idx = pd.date_range("2020", periods=200, freq="D")
>>> ts  = TimeSeries(rng.standard_normal(200), index=idx)
>>> r   = AutocorrelationAnalyzer().analyze(ts, lags=20)
>>> len(r.acf)     # lag 0 … 20
21
>>> r.is_white_noise   # white noise → True
True
class tseda.statistics.autocorrelation.AutocorrelationResult(acf, pacf, lags, conf_lower, conf_upper, lb_statistic, lb_pvalue, n_lags, n_obs, is_white_noise, alpha)[source]

Bases: object

Immutable autocorrelation analysis result.

Parameters:
acf

Autocorrelation function values at lags 0, 1, …, n_lags. acf[0] is always 1.0 (lag-0 autocorrelation).

Type:

numpy.ndarray

pacf

Partial autocorrelation function values at lags 0, 1, …, n_lags. pacf[0] is always 1.0 by convention.

Type:

numpy.ndarray

lags

Integer array [0, 1, …, n_lags].

Type:

numpy.ndarray

conf_lower

Lower confidence bound at each lag (Bartlett’s approximation); 95 % at the default alpha = 0.05.

Type:

numpy.ndarray

conf_upper

Upper confidence bound at each lag; 95 % at the default alpha = 0.05.

Type:

numpy.ndarray

lb_statistic

Ljung-Box Q-statistic at each lag from 1 to n_lags.

Type:

numpy.ndarray

lb_pvalue

P-value of the Ljung-Box test at each lag.

Type:

numpy.ndarray

n_lags

Number of lags requested (excluding lag 0).

Type:

int

n_obs

Number of non-NaN observations used.

Type:

int

is_white_noise

True when the Ljung-Box p-value at lag min(n_lags, 20) exceeds alpha.

Type:

bool

alpha

Significance level used for is_white_noise and confidence bounds.

Type:

float

acf: ndarray
pacf: ndarray
lags: ndarray
conf_lower: ndarray
conf_upper: ndarray
lb_statistic: ndarray
lb_pvalue: ndarray
n_lags: int
n_obs: int
is_white_noise: bool
alpha: float
__repr__()[source]

Return repr(self).

Return type:

str

__init__(acf, pacf, lags, conf_lower, conf_upper, lb_statistic, lb_pvalue, n_lags, n_obs, is_white_noise, alpha)
Return type:

None

class tseda.statistics.autocorrelation.AutocorrelationAnalyzer[source]

Bases: object

Compute ACF, PACF, and Ljung-Box statistics for a TimeSeries.

This class is stateless.

analyze(ts, lags, alpha)[source]

Return an AutocorrelationResult.

Return type:

AutocorrelationResult

significant_lags(result, which)[source]

Return the lag numbers where the ACF or PACF falls outside the confidence interval.

Return type:

ndarray

Examples

>>> import numpy as np, pandas as pd
>>> from tseda import TimeSeries
>>> from tseda.statistics.autocorrelation import AutocorrelationAnalyzer

AR(1) process:

>>> rng = np.random.default_rng(7)
>>> n   = 300
>>> idx = pd.date_range("2020", periods=n, freq="D")
>>> eps = rng.standard_normal(n)
>>> x   = np.zeros(n)
>>> for i in range(1, n): x[i] = 0.7 * x[i-1] + eps[i]
>>> ts  = TimeSeries(x, index=idx)
>>> r   = AutocorrelationAnalyzer().analyze(ts, lags=10)
>>> r.acf[1] > 0.5          # strong lag-1 autocorrelation
True
>>> r.is_white_noise         # definitely not white noise
False
analyze(ts, lags=40, *, alpha=0.05)[source]

Compute ACF, PACF, and Ljung-Box statistics.

Parameters:
  • ts (TimeSeries) – Input series. NaN values are dropped before analysis.

  • lags (int, optional) – Number of lags to compute (lag 0 is always included). Capped at n // 2. Default 40.

  • alpha (float, optional) – Significance level for confidence bounds and is_white_noise. Default 0.05.

Return type:

AutocorrelationResult

Examples

>>> import numpy as np, pandas as pd
>>> from tseda import TimeSeries
>>> from tseda.statistics.autocorrelation import AutocorrelationAnalyzer
>>> idx = pd.date_range("2020", periods=50, freq="D")
>>> ts  = TimeSeries(np.ones(50), index=idx)
>>> r   = AutocorrelationAnalyzer().analyze(ts, lags=5)
>>> r.acf[0]
1.0
significant_lags(result, *, which='acf')[source]

Return lag numbers (> 0) where the function selected by which lies outside the confidence bounds.

Returns:

Integer array of significant lag numbers.

Return type:

numpy.ndarray

Examples

>>> import numpy as np, pandas as pd
>>> from tseda import TimeSeries
>>> from tseda.statistics.autocorrelation import AutocorrelationAnalyzer
>>> rng = np.random.default_rng(7)
>>> n   = 300
>>> idx = pd.date_range("2020", periods=n, freq="D")
>>> eps = rng.standard_normal(n)
>>> x   = np.zeros(n)
>>> for i in range(1, n): x[i] = 0.7 * x[i-1] + eps[i]
>>> ts  = TimeSeries(x, index=idx)
>>> r   = AutocorrelationAnalyzer().analyze(ts, lags=10)
>>> len(AutocorrelationAnalyzer().significant_lags(r)) > 0
True