tseda.core

Core data structures and validation utilities.

Public API

TimeSeries

Univariate time series with a DatetimeIndex.

ArrayLike

Type alias for 1-D numeric inputs.

DatetimeLike

Type alias for datetime-index inputs.

Frequency

Enum of recognised pandas offset aliases.

AggMethod

Enum of aggregation methods for resampling / rolling.

DiffMethod

Enum of differencing strategies.

class tseda.core.TimeSeries(data, *, index=None, name='value', freq=None, unit=None, description=None)[source]

Bases: object

Univariate time series with a pandas.DatetimeIndex.

Parameters:
  • data (Union[ArrayLike, pd.Series]) –

    Numeric values. Accepted types:

  • index (Optional[DatetimeLike]) –

    Datetime timestamps aligned with data. When data is a pandas.Series with a pandas.DatetimeIndex this argument may be omitted. Accepted types:

  • name (str) – Short identifier for the series (used in plots and reports). Default "value".

  • freq (Optional[str]) – Pandas offset alias (e.g., "D", "h", "MS"). When None (default) the frequency is inferred automatically.

  • unit (Optional[str]) – Physical unit of the values (e.g., "USD", "°C"). Purely informational — used in axis labels.

  • description (Optional[str]) – Free-text description stored in metadata.

Raises:
  • TypeError – If data or index have an unsupported type.

  • ValueError – If data and index have different lengths, if index is not monotonically increasing, or if index contains duplicates.

Examples

From a numpy array:

>>> import numpy as np, pandas as pd
>>> from tseda import TimeSeries
>>> idx = pd.date_range("2020-01-01", periods=5, freq="D")
>>> ts = TimeSeries([10.0, 11.5, 9.8, 12.0, 11.0], index=idx)
>>> ts.n
5

From a pandas Series:

>>> s = pd.Series([1, 2, 3], index=pd.date_range("2020", periods=3, freq="D"))
>>> ts = TimeSeries.from_series(s)
__init__(data, *, index=None, name='value', freq=None, unit=None, description=None)[source]
Return type:

None

classmethod from_series(series, *, name=None, freq=None, unit=None, description=None)[source]

Construct a TimeSeries from a pandas.Series.

Return type:

TimeSeries

Examples

>>> s = pd.Series([1.0, 2.0], index=pd.date_range("2020", periods=2, freq="D"))
>>> TimeSeries.from_series(s, name="x").name
'x'
classmethod from_arrays(values, index, *, name='value', freq=None, unit=None, description=None)[source]

Construct a TimeSeries from parallel arrays.

Return type:

TimeSeries

Examples

>>> import numpy as np, pandas as pd
>>> vals = np.array([1.0, 2.0, 3.0])
>>> idx  = pd.date_range("2021-01-01", periods=3, freq="D")
>>> TimeSeries.from_arrays(vals, idx).n
3
classmethod from_dataframe(df, column, *, name=None, freq=None, unit=None, description=None)[source]

Extract one column from a pandas.DataFrame.

Return type:

TimeSeries

Raises:

KeyError – If column is not in df.

Examples

>>> import pandas as pd
>>> df = pd.DataFrame({"temp": [20.0, 21.0, 19.5]},
...                    index=pd.date_range("2020", periods=3, freq="D"))
>>> TimeSeries.from_dataframe(df, "temp").name
'temp'
property values: ndarray

1-D float64 array of observed values.

Returns:

A copy to protect the internal state.

Return type:

numpy.ndarray

property index: DatetimeIndex

Datetime index of the series.

Return type:

pandas.DatetimeIndex

property n: int

Number of observations.

Return type:

int

property start: Timestamp

Timestamp of the first observation.

Return type:

pandas.Timestamp

property end: Timestamp

Timestamp of the last observation.

Return type:

pandas.Timestamp

property duration: Timedelta

Wall-clock span from the first to the last observation.

Return type:

pandas.Timedelta

property name: str

Short identifier for the series.

Return type:

str

property unit: str | None

Physical unit of the values, or None if unspecified.

Return type:

str or None

property description: str | None

Free-text description, or None if unspecified.

Return type:

str or None

property freq: str | None

Pandas offset alias (e.g., "D"), or None for irregular data.

Return type:

str or None

property freq_label: str

Human-readable frequency label (e.g., "Daily").

Return type:

str

property has_nan: bool

True when at least one value is NaN.

Return type:

bool

property n_nan: int

Number of NaN values.

Return type:

int

property is_regular: bool

True when all consecutive time gaps are identical.

A regular series has no missing timestamps (assuming a fixed sampling interval). An irregular series may be the result of market holidays, sensor outages, or event-driven sampling.

Return type:

bool
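
The regularity test described above can be sketched with the standard library alone. The helper name is_regular_sketch and the treatment of series with fewer than two gaps are assumptions of this illustration, not the library's implementation.

```python
from datetime import datetime, timedelta


def is_regular_sketch(timestamps):
    """Return True when every consecutive gap equals the first gap."""
    if len(timestamps) < 3:
        # Fewer than two gaps: nothing to compare (assumption of this sketch).
        return True
    gaps = [later - earlier for earlier, later in zip(timestamps, timestamps[1:])]
    return all(gap == gaps[0] for gap in gaps)


start = datetime(2020, 1, 1)
daily = [start + timedelta(days=i) for i in range(5)]
with_gap = daily[:2] + daily[3:]  # drop 2020-01-03 to simulate an outage

print(is_regular_sketch(daily))     # True
print(is_regular_sketch(with_gap))  # False
```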

to_series()[source]

Return the data as a pandas.Series.

The returned Series shares this object's DatetimeIndex and takes name as its Series name.

Return type:

pandas.Series

to_frame()[source]

Return the data as a single-column pandas.DataFrame.

Returns:

A DataFrame whose single column is named after name.

Return type:

pandas.DataFrame

to_numpy()[source]

Return a copy of the raw values as a 1-D numpy array.

Return type:

numpy.ndarray

copy()[source]

Return a deep copy of this TimeSeries.

Return type:

TimeSeries

slice(start=None, end=None)[source]

Return a time-bounded subset of the series.

Both start and end are inclusive. Either may be None to leave that boundary open.

Parameters:
  • start (str | Timestamp | None) – Start timestamp (inclusive). Accepts any value parseable by pandas.Timestamp() (e.g., "2020-01-01").

  • end (str | Timestamp | None) – End timestamp (inclusive).

Return type:

TimeSeries

Raises:

ValueError – If the resulting slice is empty.

Examples

>>> import pandas as pd, numpy as np
>>> idx = pd.date_range("2020-01-01", periods=365, freq="D")
>>> ts = TimeSeries(np.arange(365.0), index=idx)
>>> q1 = ts.slice("2020-01-01", "2020-03-31")
>>> q1.n
91
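
The inclusive-bounds rule can be checked without pandas. The following is a minimal sketch using plain dates; inclusive_slice is a hypothetical helper, not part of tseda.

```python
from datetime import date, timedelta


def inclusive_slice(dates, start=None, end=None):
    """Keep dates d with start <= d <= end; None leaves that bound open."""
    kept = [
        d for d in dates
        if (start is None or d >= start) and (end is None or d <= end)
    ]
    if not kept:
        raise ValueError("the resulting slice is empty")
    return kept


days = [date(2020, 1, 1) + timedelta(days=i) for i in range(365)]
q1 = inclusive_slice(days, date(2020, 1, 1), date(2020, 3, 31))
print(len(q1))  # 91 = 31 + 29 + 31, because 2020 is a leap year
```

Both endpoints are counted, which is why the first quarter of 2020 has 91 observations rather than 90.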
resample(freq, *, agg=AggMethod.MEAN)[source]

Resample the series to a new frequency.

Parameters:
  • freq (str) – Target pandas offset alias (e.g., "W", "MS").

  • agg (str | AggMethod) – Aggregation method. Either an AggMethod member or its string value. Default "mean".

Return type:

TimeSeries

Examples

>>> import pandas as pd, numpy as np
>>> idx = pd.date_range("2020-01-01", periods=365, freq="D")
>>> ts = TimeSeries(np.ones(365), index=idx)
>>> ts.resample("MS").n    # 12 monthly values
12
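
Conceptually, resampling to "MS" with the "mean" aggregation groups observations by calendar month and averages each group. The sketch below reproduces that idea with the standard library; monthly_mean is a hypothetical helper and is only an approximation of what pandas-based resampling does.

```python
from collections import defaultdict
from datetime import date, timedelta
from statistics import mean


def monthly_mean(dates, values):
    """Group observations by calendar month and average each group."""
    buckets = defaultdict(list)
    for d, v in zip(dates, values):
        # Label each bucket with the first day of its month ("month start").
        buckets[date(d.year, d.month, 1)].append(v)
    return {label: mean(vs) for label, vs in sorted(buckets.items())}


days = [date(2020, 1, 1) + timedelta(days=i) for i in range(365)]
monthly = monthly_mean(days, [1.0] * 365)
print(len(monthly))  # 12
```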
diff(periods=1, *, method=DiffMethod.SIMPLE)[source]

Difference the series.

Parameters:
  • periods (int) – Number of periods to lag. Default 1 (first difference).

  • method (str | DiffMethod) –

    One of the following, where k is periods:

    • "simple" – y[t] - y[t-k]

    • "log" – log(y[t]) - log(y[t-k])

    • "percent" – (y[t] - y[t-k]) / y[t-k]

Returns:

The leading NaN rows introduced by differencing are dropped.

Return type:

TimeSeries

Raises:

ValueError – If method is "log" or "percent" and the series contains non-positive values.

Examples

>>> import pandas as pd, numpy as np
>>> idx = pd.date_range("2020", periods=5, freq="D")
>>> ts = TimeSeries([10.0, 11.0, 12.0, 11.0, 13.0], index=idx)
>>> ts.diff().values
array([ 1.,  1., -1.,  2.])
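
The three differencing formulas can be compared side by side. This is a minimal pure-Python sketch; diff_sketch is a hypothetical helper that follows the documented formulas and the documented non-positive check, not the library's code.

```python
import math


def diff_sketch(values, periods=1, method="simple"):
    """Difference a list of floats; the first `periods` entries are dropped."""
    if periods < 1:
        raise ValueError("periods must be at least 1")
    if method in ("log", "percent") and any(v <= 0 for v in values):
        raise ValueError("log/percent differencing needs positive values")
    # Pair each y[t] with y[t-k], where k = periods.
    pairs = list(zip(values[periods:], values[:-periods]))
    if method == "simple":
        return [cur - prev for cur, prev in pairs]
    if method == "log":
        return [math.log(cur) - math.log(prev) for cur, prev in pairs]
    if method == "percent":
        return [(cur - prev) / prev for cur, prev in pairs]
    raise ValueError(f"unknown method: {method!r}")


y = [10.0, 11.0, 12.0, 11.0, 13.0]
print(diff_sketch(y))                    # [1.0, 1.0, -1.0, 2.0]
print(diff_sketch(y, method="percent"))  # first entry is 0.1
print(diff_sketch(y, periods=2))         # [2.0, 0.0, 1.0]
```

For small changes the "log" and "percent" results are close, since log(1 + r) is approximately r.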
log()[source]

Apply the natural logarithm element-wise.

Return type:

TimeSeries

Raises:

ValueError – If the series contains non-positive values.

Examples

>>> import pandas as pd, numpy as np
>>> idx = pd.date_range("2020", periods=3, freq="D")
>>> TimeSeries([1.0, np.e, np.e**2], index=idx).log().values
array([0., 1., 2.])
standardize()[source]

Standardise to zero mean and unit variance (z-score).

The transform is (x - mean) / std. NaN values are ignored when computing statistics but preserved in position.

Return type:

TimeSeries

Raises:

ValueError – If the standard deviation is zero (constant series).

Examples

>>> import pandas as pd, numpy as np
>>> idx = pd.date_range("2020", periods=4, freq="D")
>>> ts = TimeSeries([2.0, 4.0, 6.0, 8.0], index=idx)
>>> z = ts.standardize()
>>> round(float(z.values.mean()), 10)
0.0
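
The NaN handling described above can be sketched as follows. The helper zscore is hypothetical, and the use of the population standard deviation (ddof=0) is an assumption; the source does not say which convention the library uses.

```python
import math


def zscore(values):
    """Z-score a list of floats, skipping NaN when computing mean and std."""
    finite = [v for v in values if not math.isnan(v)]
    if not finite:
        raise ValueError("no finite values")
    mean = sum(finite) / len(finite)
    # Population standard deviation (ddof=0) is an assumption of this sketch.
    std = math.sqrt(sum((v - mean) ** 2 for v in finite) / len(finite))
    if std == 0:
        raise ValueError("constant series: standard deviation is zero")
    # NaN minus the mean is still NaN, so missing values keep their positions.
    return [(v - mean) / std for v in values]


z = zscore([2.0, 4.0, float("nan"), 6.0, 8.0])
print(math.isnan(z[2]))  # True
```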
normalize(*, lower=0.0, upper=1.0)[source]

Min-max normalise the series to [lower, upper].

Parameters:
  • lower (float) – Target minimum value. Default 0.0.

  • upper (float) – Target maximum value. Default 1.0.

Return type:

TimeSeries

Raises:

ValueError – If the series has zero range (max == min) or lower >= upper.

Examples

>>> import pandas as pd, numpy as np
>>> idx = pd.date_range("2020", periods=3, freq="D")
>>> ts = TimeSeries([0.0, 5.0, 10.0], index=idx)
>>> ts.normalize().values
array([0. , 0.5, 1. ])
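
Min-max scaling to an arbitrary target range is a linear map. The sketch below shows the arithmetic under the documented error conditions; minmax is a hypothetical helper and does not handle NaN values.

```python
def minmax(values, lower=0.0, upper=1.0):
    """Linearly map values so that min -> lower and max -> upper."""
    if lower >= upper:
        raise ValueError("lower must be strictly less than upper")
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("series has zero range")
    scale = (upper - lower) / (hi - lo)
    return [lower + (v - lo) * scale for v in values]


scaled = minmax([0.0, 5.0, 10.0])
symmetric = minmax([0.0, 5.0, 10.0], lower=-1.0, upper=1.0)
print(scaled)
print(symmetric)
```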
rolling(window, *, agg=AggMethod.MEAN, center=False, min_periods=None)[source]

Apply a rolling-window aggregation.

Parameters:
  • window (int) – Size of the rolling window in number of observations.

  • agg (str | AggMethod) – Aggregation method (default "mean").

  • center (bool) – Whether to set the window labels as the centre of the window (default False — trailing window).

  • min_periods (int | None) – Minimum number of non-NaN observations required to produce a value. Defaults to window.

Returns:

Leading/trailing NaNs introduced by the window are dropped.

Return type:

TimeSeries

Examples

>>> import pandas as pd, numpy as np
>>> idx = pd.date_range("2020", periods=6, freq="D")
>>> ts = TimeSeries([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], index=idx)
>>> ts.rolling(3).values
array([2., 3., 4., 5.])
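
A trailing window of size 3 over six observations yields four complete windows, which is why the example returns four values. A minimal sketch of that behaviour, assuming the default min_periods equal to window (rolling_mean is a hypothetical helper):

```python
def rolling_mean(values, window):
    """Trailing rolling mean; positions without a full window are dropped."""
    if window < 1 or window > len(values):
        raise ValueError("window must be between 1 and len(values)")
    return [
        sum(values[i - window + 1 : i + 1]) / window
        for i in range(window - 1, len(values))
    ]


smoothed = rolling_mean([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], 3)
print(smoothed)  # [2.0, 3.0, 4.0, 5.0]
```

In general a series of n observations produces n - window + 1 values under these assumptions.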
apply(func, *, name=None)[source]

Apply an arbitrary length-preserving function to the values.

Parameters:
  • func (Callable[[ndarray], ndarray]) – Callable that takes a 1-D numpy.ndarray and returns a 1-D array of the same length.

  • name (str | None) – Name for the resulting series. Defaults to "f({self.name})".

Return type:

TimeSeries

Raises:

ValueError – If func changes the length of the array.

Examples

>>> import pandas as pd, numpy as np
>>> idx = pd.date_range("2020", periods=3, freq="D")
>>> ts = TimeSeries([1.0, 4.0, 9.0], index=idx)
>>> ts.apply(np.sqrt).values
array([1., 2., 3.])
__len__()[source]
Return type:

int

__contains__(timestamp)[source]

Check whether a timestamp exists in the index.

Parameters:

timestamp (object)

Return type:

bool

Examples

>>> import pandas as pd, numpy as np
>>> idx = pd.date_range("2020", periods=3, freq="D")
>>> ts = TimeSeries([1.0, 2.0, 3.0], index=idx)
>>> pd.Timestamp("2020-01-02") in ts
True

__getitem__(key)[source]

Positional indexing by integer or slice.

Parameters:

key (int | slice) –

  • int — return the scalar value at that position.

  • slice — return a new TimeSeries for that range.

Return type:

float | TimeSeries

Examples

>>> import pandas as pd, numpy as np
>>> idx = pd.date_range("2020", periods=5, freq="D")
>>> ts = TimeSeries([10.0, 20.0, 30.0, 40.0, 50.0], index=idx)
>>> ts[0]
10.0
>>> ts[-1]
50.0
>>> ts[1:3].values
array([20., 30.])
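
The int-versus-slice dispatch can be illustrated with a small stand-in class. MiniSeries is hypothetical and stores only values; it is a sketch of the indexing contract, not of TimeSeries itself.

```python
class MiniSeries:
    """Hypothetical stand-in that mimics the positional indexing contract."""

    def __init__(self, values):
        self._values = list(values)

    def __len__(self):
        return len(self._values)

    def __getitem__(self, key):
        if isinstance(key, slice):
            # A slice returns a new container of the same type.
            return MiniSeries(self._values[key])
        if isinstance(key, int):
            # An integer returns the scalar at that position.
            return self._values[key]
        raise TypeError(f"unsupported key type: {type(key).__name__}")


ms = MiniSeries([10.0, 20.0, 30.0, 40.0, 50.0])
print(ms[0], ms[-1], len(ms[1:3]))  # 10.0 50.0 2
```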
__repr__()[source]

Return repr(self).

Return type:

str

class tseda.core.Frequency(*values)[source]

Bases: str, Enum

Canonical pandas offset aliases recognised by tseda.

The string value of each member is a valid freq argument to pandas.date_range() and pandas.Series.resample().

Examples

>>> Frequency.DAILY.value
'D'
>>> Frequency.DAILY == "D"
True
SECONDLY = 'S'
MINUTELY = 'min'
HOURLY = 'h'
DAILY = 'D'
BUSINESS_DAILY = 'B'
WEEKLY = 'W'
MONTHLY_START = 'MS'
MONTHLY_END = 'ME'
QUARTERLY_START = 'QS'
QUARTERLY_END = 'QE'
ANNUAL_START = 'YS'
ANNUAL_END = 'YE'
__repr__()

Return repr(self).
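
Because the enum derives from both str and Enum, its members compare equal to their string values and can be used where a plain string is expected. The sketch below uses a hypothetical two-member enum, Freq, to show the general Python behaviour.

```python
from enum import Enum


class Freq(str, Enum):
    """Hypothetical two-member enum with the same str/Enum bases."""

    DAILY = "D"
    HOURLY = "h"


print(Freq.DAILY == "D")         # True: members compare equal to their value
print(Freq("h") is Freq.HOURLY)  # True: lookup by value returns the member
print("2" + Freq.DAILY)          # 2D: string concatenation uses the value
```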

class tseda.core.AggMethod(*values)[source]

Bases: str, Enum

Aggregation functions available when resampling a TimeSeries or applying a rolling window to it.

The string value matches the pandas.core.resample.Resampler method name.

Examples

>>> AggMethod.MEAN.value
'mean'
MEAN = 'mean'
SUM = 'sum'
MIN = 'min'
MAX = 'max'
MEDIAN = 'median'
FIRST = 'first'
LAST = 'last'
STD = 'std'
VAR = 'var'
COUNT = 'count'
__repr__()

Return repr(self).

class tseda.core.DiffMethod(*values)[source]

Bases: str, Enum

Differencing mode for diff().

SIMPLE

y[t] - y[t-k] (standard first/kth difference).

LOG

log(y[t]) - log(y[t-k]) (log return, i.e. relative change measured on a log scale).

PERCENT

(y[t] - y[t-k]) / y[t-k] (relative change).

SIMPLE = 'simple'
LOG = 'log'
PERCENT = 'percent'
__repr__()

Return repr(self).

tseda.core.validate_data_array(data, *, name='data')[source]

Coerce data to a 1-D float64 numpy.ndarray.

Parameters:
  • data (Any) –

    Numeric input. Accepted types:

  • name (str) – Variable name used in error messages (default "data").

Returns:

1-D array of dtype float64. NaN values are preserved.

Return type:

numpy.ndarray

Raises:
  • TypeError – If data is not a recognised type.

  • ValueError – If data is not 1-D or contains non-numeric elements.

Examples

>>> validate_data_array([1.0, 2.0, 3.0])
array([1., 2., 3.])
>>> validate_data_array(pd.Series([1, 2, 3]))
array([1., 2., 3.])
tseda.core.validate_datetime_index(index, *, name='index')[source]

Coerce index to a pandas.DatetimeIndex and verify that it is sorted and duplicate-free.

Returns:

Validated, monotonically increasing, duplicate-free index.

Return type:

pandas.DatetimeIndex

Raises:
  • TypeError – If index is not a recognised type.

  • ValueError – If index is not monotonically increasing or contains duplicates.

Examples

>>> idx = pd.date_range("2020-01-01", periods=5, freq="D")
>>> validate_datetime_index(idx)
DatetimeIndex(['2020-01-01', ..., '2020-01-05'], dtype='datetime64[ns]', freq='D')
tseda.core.validate_freq_string(freq, *, name='freq')[source]

Assert that freq is a non-empty string accepted by pandas.tseries.frequencies.to_offset().

Parameters:
  • freq (Any) – Candidate frequency string (e.g., "D", "h", "MS").

  • name (str) – Variable name used in error messages.

Returns:

The validated frequency string.

Return type:

str

Examples

>>> validate_freq_string("D")
'D'
>>> validate_freq_string("15min")
'15min'
tseda.core.validate_lags(lags, n, *, name='lags')[source]

Assert that lags is a sensible lag count for a series of length n.

The upper bound is n // 2 because computing autocorrelations at lags approaching n produces unreliable estimates.

Parameters:
  • lags (int) – Requested number of lags.

  • n (int) – Length of the time series.

  • name (str) – Variable name used in error messages.

Returns:

The validated lag count.

Return type:

int

Raises:

ValueError – If lags is not in [1, n // 2].

Examples

>>> validate_lags(40, 100)
40
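
The bound [1, n // 2] can be sketched as a plain range check. The helper check_lags is hypothetical; the type check and the error message wording are assumptions of this illustration.

```python
def check_lags(lags, n, *, name="lags"):
    """Return lags unchanged if 1 <= lags <= n // 2, else raise."""
    # Rejecting bool (a subclass of int) is an assumption of this sketch.
    if isinstance(lags, bool) or not isinstance(lags, int):
        raise TypeError(f"{name} must be an int, got {type(lags).__name__}")
    upper = n // 2
    if not 1 <= lags <= upper:
        raise ValueError(f"{name} must be in [1, {upper}], got {lags}")
    return lags


print(check_lags(40, 100))  # 40
print(check_lags(50, 100))  # 50, the largest value allowed for n = 100
```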
tseda.core.validate_positive_int(value, *, name='value')[source]

Assert that value is a positive integer.

Parameters:
  • value (Any) – The candidate value.

  • name (str) – Variable name used in error messages.

Returns:

The validated integer.

Return type:

int

Examples

>>> validate_positive_int(5)
5
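
A positive-integer check of this kind might look like the sketch below. The helper check_positive_int is hypothetical, and the choice of TypeError for non-integers (including bool) and ValueError for non-positive integers is an assumption; the source does not list the exceptions for this function.

```python
def check_positive_int(value, *, name="value"):
    """Return value unchanged if it is an int greater than zero."""
    # bool is a subclass of int; rejecting it is an assumption of this sketch.
    if isinstance(value, bool) or not isinstance(value, int):
        raise TypeError(f"{name} must be an int, got {type(value).__name__}")
    if value <= 0:
        raise ValueError(f"{name} must be positive, got {value}")
    return value


print(check_positive_int(5))  # 5
```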

TimeSeries

class tseda.core.timeseries.TimeSeries(data, *, index=None, name='value', freq=None, unit=None, description=None)[source]

Bases: object

Univariate time series with a pandas.DatetimeIndex.

Parameters:
  • data (Union[ArrayLike, pd.Series]) –

    Numeric values. Accepted types:

  • index (Optional[DatetimeLike]) –

    Datetime timestamps aligned with data. When data is a pandas.Series with a pandas.DatetimeIndex this argument may be omitted. Accepted types:

  • name (str) – Short identifier for the series (used in plots and reports). Default "value".

  • freq (Optional[str]) – Pandas offset alias (e.g., "D", "h", "MS"). When None (default) the frequency is inferred automatically.

  • unit (Optional[str]) – Physical unit of the values (e.g., "USD", "°C"). Purely informational — used in axis labels.

  • description (Optional[str]) – Free-text description stored in metadata.

Raises:
  • TypeError – If data or index have an unsupported type.

  • ValueError – If data and index have different lengths, if index is not monotonically increasing, or if index contains duplicates.

Examples

From a numpy array:

>>> import numpy as np, pandas as pd
>>> from tseda import TimeSeries
>>> idx = pd.date_range("2020-01-01", periods=5, freq="D")
>>> ts = TimeSeries([10.0, 11.5, 9.8, 12.0, 11.0], index=idx)
>>> ts.n
5

From a pandas Series:

>>> s = pd.Series([1, 2, 3], index=pd.date_range("2020", periods=3, freq="D"))
>>> ts = TimeSeries.from_series(s)
__init__(data, *, index=None, name='value', freq=None, unit=None, description=None)[source]
Return type:

None

classmethod from_series(series, *, name=None, freq=None, unit=None, description=None)[source]

Construct a TimeSeries from a pandas.Series.

Return type:

TimeSeries

Examples

>>> s = pd.Series([1.0, 2.0], index=pd.date_range("2020", periods=2, freq="D"))
>>> TimeSeries.from_series(s, name="x").name
'x'
classmethod from_arrays(values, index, *, name='value', freq=None, unit=None, description=None)[source]

Construct a TimeSeries from parallel arrays.

Return type:

TimeSeries

Examples

>>> import numpy as np, pandas as pd
>>> vals = np.array([1.0, 2.0, 3.0])
>>> idx  = pd.date_range("2021-01-01", periods=3, freq="D")
>>> TimeSeries.from_arrays(vals, idx).n
3
classmethod from_dataframe(df, column, *, name=None, freq=None, unit=None, description=None)[source]

Extract one column from a pandas.DataFrame.

Return type:

TimeSeries

Raises:

KeyError – If column is not in df.

Examples

>>> import pandas as pd
>>> df = pd.DataFrame({"temp": [20.0, 21.0, 19.5]},
...                    index=pd.date_range("2020", periods=3, freq="D"))
>>> TimeSeries.from_dataframe(df, "temp").name
'temp'
property values: ndarray

1-D float64 array of observed values.

Returns:

A copy to protect the internal state.

Return type:

numpy.ndarray

property index: DatetimeIndex

Datetime index of the series.

Return type:

pandas.DatetimeIndex

property n: int

Number of observations.

Return type:

int

property start: Timestamp

Timestamp of the first observation.

Return type:

pandas.Timestamp

property end: Timestamp

Timestamp of the last observation.

Return type:

pandas.Timestamp

property duration: Timedelta

Wall-clock span from the first to the last observation.

Return type:

pandas.Timedelta

property name: str

Short identifier for the series.

Return type:

str

property unit: str | None

Physical unit of the values, or None if unspecified.

Return type:

str or None

property description: str | None

Free-text description, or None if unspecified.

Return type:

str or None

property freq: str | None

Pandas offset alias (e.g., "D"), or None for irregular data.

Return type:

str or None

property freq_label: str

Human-readable frequency label (e.g., "Daily").

Return type:

str

property has_nan: bool

True when at least one value is NaN.

Return type:

bool

property n_nan: int

Number of NaN values.

Return type:

int

property is_regular: bool

True when all consecutive time gaps are identical.

A regular series has no missing timestamps (assuming a fixed sampling interval). An irregular series may be the result of market holidays, sensor outages, or event-driven sampling.

Return type:

bool

to_series()[source]

Return the data as a pandas.Series.

The returned Series shares this object's DatetimeIndex and takes name as its Series name.

Return type:

pandas.Series

to_frame()[source]

Return the data as a single-column pandas.DataFrame.

Returns:

A DataFrame whose single column is named after name.

Return type:

pandas.DataFrame

to_numpy()[source]

Return a copy of the raw values as a 1-D numpy array.

Return type:

numpy.ndarray

copy()[source]

Return a deep copy of this TimeSeries.

Return type:

TimeSeries

slice(start=None, end=None)[source]

Return a time-bounded subset of the series.

Both start and end are inclusive. Either may be None to leave that boundary open.

Parameters:
  • start (str | Timestamp | None) – Start timestamp (inclusive). Accepts any value parseable by pandas.Timestamp() (e.g., "2020-01-01").

  • end (str | Timestamp | None) – End timestamp (inclusive).

Return type:

TimeSeries

Raises:

ValueError – If the resulting slice is empty.

Examples

>>> import pandas as pd, numpy as np
>>> idx = pd.date_range("2020-01-01", periods=365, freq="D")
>>> ts = TimeSeries(np.arange(365.0), index=idx)
>>> q1 = ts.slice("2020-01-01", "2020-03-31")
>>> q1.n
91
resample(freq, *, agg=AggMethod.MEAN)[source]

Resample the series to a new frequency.

Parameters:
  • freq (str) – Target pandas offset alias (e.g., "W", "MS").

  • agg (str | AggMethod) – Aggregation method. Either an AggMethod member or its string value. Default "mean".

Return type:

TimeSeries

Examples

>>> import pandas as pd, numpy as np
>>> idx = pd.date_range("2020-01-01", periods=365, freq="D")
>>> ts = TimeSeries(np.ones(365), index=idx)
>>> ts.resample("MS").n    # 12 monthly values
12
diff(periods=1, *, method=DiffMethod.SIMPLE)[source]

Difference the series.

Parameters:
  • periods (int) – Number of periods to lag. Default 1 (first difference).

  • method (str | DiffMethod) –

    One of the following, where k is periods:

    • "simple" – y[t] - y[t-k]

    • "log" – log(y[t]) - log(y[t-k])

    • "percent" – (y[t] - y[t-k]) / y[t-k]

Returns:

The leading NaN rows introduced by differencing are dropped.

Return type:

TimeSeries

Raises:

ValueError – If method is "log" or "percent" and the series contains non-positive values.

Examples

>>> import pandas as pd, numpy as np
>>> idx = pd.date_range("2020", periods=5, freq="D")
>>> ts = TimeSeries([10.0, 11.0, 12.0, 11.0, 13.0], index=idx)
>>> ts.diff().values
array([ 1.,  1., -1.,  2.])
log()[source]

Apply the natural logarithm element-wise.

Return type:

TimeSeries

Raises:

ValueError – If the series contains non-positive values.

Examples

>>> import pandas as pd, numpy as np
>>> idx = pd.date_range("2020", periods=3, freq="D")
>>> TimeSeries([1.0, np.e, np.e**2], index=idx).log().values
array([0., 1., 2.])
standardize()[source]

Standardise to zero mean and unit variance (z-score).

The transform is (x - mean) / std. NaN values are ignored when computing statistics but preserved in position.

Return type:

TimeSeries

Raises:

ValueError – If the standard deviation is zero (constant series).

Examples

>>> import pandas as pd, numpy as np
>>> idx = pd.date_range("2020", periods=4, freq="D")
>>> ts = TimeSeries([2.0, 4.0, 6.0, 8.0], index=idx)
>>> z = ts.standardize()
>>> round(float(z.values.mean()), 10)
0.0
normalize(*, lower=0.0, upper=1.0)[source]

Min-max normalise the series to [lower, upper].

Parameters:
  • lower (float) – Target minimum value. Default 0.0.

  • upper (float) – Target maximum value. Default 1.0.

Return type:

TimeSeries

Raises:

ValueError – If the series has zero range (max == min) or lower >= upper.

Examples

>>> import pandas as pd, numpy as np
>>> idx = pd.date_range("2020", periods=3, freq="D")
>>> ts = TimeSeries([0.0, 5.0, 10.0], index=idx)
>>> ts.normalize().values
array([0. , 0.5, 1. ])
rolling(window, *, agg=AggMethod.MEAN, center=False, min_periods=None)[source]

Apply a rolling-window aggregation.

Parameters:
  • window (int) – Size of the rolling window in number of observations.

  • agg (str | AggMethod) – Aggregation method (default "mean").

  • center (bool) – Whether to set the window labels as the centre of the window (default False — trailing window).

  • min_periods (int | None) – Minimum number of non-NaN observations required to produce a value. Defaults to window.

Returns:

Leading/trailing NaNs introduced by the window are dropped.

Return type:

TimeSeries

Examples

>>> import pandas as pd, numpy as np
>>> idx = pd.date_range("2020", periods=6, freq="D")
>>> ts = TimeSeries([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], index=idx)
>>> ts.rolling(3).values
array([2., 3., 4., 5.])
apply(func, *, name=None)[source]

Apply an arbitrary length-preserving function to the values.

Parameters:
  • func (Callable[[ndarray], ndarray]) – Callable that takes a 1-D numpy.ndarray and returns a 1-D array of the same length.

  • name (str | None) – Name for the resulting series. Defaults to "f({self.name})".

Return type:

TimeSeries

Raises:

ValueError – If func changes the length of the array.

Examples

>>> import pandas as pd, numpy as np
>>> idx = pd.date_range("2020", periods=3, freq="D")
>>> ts = TimeSeries([1.0, 4.0, 9.0], index=idx)
>>> ts.apply(np.sqrt).values
array([1., 2., 3.])
__len__()[source]
Return type:

int

__contains__(timestamp)[source]

Check whether a timestamp exists in the index.

Parameters:

timestamp (object)

Return type:

bool

Examples

>>> import pandas as pd, numpy as np
>>> idx = pd.date_range("2020", periods=3, freq="D")
>>> ts = TimeSeries([1.0, 2.0, 3.0], index=idx)
>>> pd.Timestamp("2020-01-02") in ts
True

__getitem__(key)[source]

Positional indexing by integer or slice.

Parameters:

key (int | slice) –

  • int — return the scalar value at that position.

  • slice — return a new TimeSeries for that range.

Return type:

float | TimeSeries

Examples

>>> import pandas as pd, numpy as np
>>> idx = pd.date_range("2020", periods=5, freq="D")
>>> ts = TimeSeries([10.0, 20.0, 30.0, 40.0, 50.0], index=idx)
>>> ts[0]
10.0
>>> ts[-1]
50.0
>>> ts[1:3].values
array([20., 30.])
__repr__()[source]

Return repr(self).

Return type:

str

Types & Enumerations

class tseda.core.types.Frequency(*values)[source]

Bases: str, Enum

Canonical pandas offset aliases recognised by tseda.

The string value of each member is a valid freq argument to pandas.date_range() and pandas.Series.resample().

Examples

>>> Frequency.DAILY.value
'D'
>>> Frequency.DAILY == "D"
True
SECONDLY = 'S'
MINUTELY = 'min'
HOURLY = 'h'
DAILY = 'D'
BUSINESS_DAILY = 'B'
WEEKLY = 'W'
MONTHLY_START = 'MS'
MONTHLY_END = 'ME'
QUARTERLY_START = 'QS'
QUARTERLY_END = 'QE'
ANNUAL_START = 'YS'
ANNUAL_END = 'YE'
__repr__()

Return repr(self).

class tseda.core.types.AggMethod(*values)[source]

Bases: str, Enum

Aggregation functions available when resampling a TimeSeries or applying a rolling window to it.

The string value matches the pandas.core.resample.Resampler method name.

Examples

>>> AggMethod.MEAN.value
'mean'
MEAN = 'mean'
SUM = 'sum'
MIN = 'min'
MAX = 'max'
MEDIAN = 'median'
FIRST = 'first'
LAST = 'last'
STD = 'std'
VAR = 'var'
COUNT = 'count'
__repr__()

Return repr(self).

class tseda.core.types.DiffMethod(*values)[source]

Bases: str, Enum

Differencing mode for diff().

SIMPLE

y[t] - y[t-k] (standard first/kth difference).

LOG

log(y[t]) - log(y[t-k]) (log return, i.e. relative change measured on a log scale).

PERCENT

(y[t] - y[t-k]) / y[t-k] (relative change).

SIMPLE = 'simple'
LOG = 'log'
PERCENT = 'percent'
__repr__()

Return repr(self).

Validators

Input validation utilities for tseda.

Every public function in this module raises a descriptive TypeError or ValueError on bad input and returns the canonicalised value on success. Data coercion is centralised here so that TimeSeries and the analysis modules do not repeat it.

Functions

validate_data_array

Coerce arbitrary numeric input to a 1-D float64 numpy.ndarray.

validate_datetime_index

Coerce arbitrary input to a pandas.DatetimeIndex and verify that it is sorted and duplicate-free.

validate_positive_int

Assert that a value is a positive integer.

validate_lags

Assert that the requested lag count is sensible relative to series length.

validate_freq_string

Assert that a string is a recognised pandas offset alias.

tseda.core.validator.validate_data_array(data, *, name='data')[source]

Coerce data to a 1-D float64 numpy.ndarray.

Parameters:
  • data (Any) –

    Numeric input. Accepted types:

  • name (str) – Variable name used in error messages (default "data").

Returns:

1-D array of dtype float64. NaN values are preserved.

Return type:

numpy.ndarray

Raises:
  • TypeError – If data is not a recognised type.

  • ValueError – If data is not 1-D or contains non-numeric elements.

Examples

>>> validate_data_array([1.0, 2.0, 3.0])
array([1., 2., 3.])
>>> validate_data_array(pd.Series([1, 2, 3]))
array([1., 2., 3.])
tseda.core.validator.validate_datetime_index(index, *, name='index')[source]

Coerce index to a pandas.DatetimeIndex and verify that it is sorted and duplicate-free.

Returns:

Validated, monotonically increasing, duplicate-free index.

Return type:

pandas.DatetimeIndex

Raises:
  • TypeError – If index is not a recognised type.

  • ValueError – If index is not monotonically increasing or contains duplicates.

Examples

>>> idx = pd.date_range("2020-01-01", periods=5, freq="D")
>>> validate_datetime_index(idx)
DatetimeIndex(['2020-01-01', ..., '2020-01-05'], dtype='datetime64[ns]', freq='D')
tseda.core.validator.validate_positive_int(value, *, name='value')[source]

Assert that value is a positive integer.

Parameters:
  • value (Any) – The candidate value.

  • name (str) – Variable name used in error messages.

Returns:

The validated integer.

Return type:

int

Examples

>>> validate_positive_int(5)
5
tseda.core.validator.validate_lags(lags, n, *, name='lags')[source]

Assert that lags is a sensible lag count for a series of length n.

The upper bound is n // 2 because computing autocorrelations at lags approaching n produces unreliable estimates.

Parameters:
  • lags (int) – Requested number of lags.

  • n (int) – Length of the time series.

  • name (str) – Variable name used in error messages.

Returns:

The validated lag count.

Return type:

int

Raises:

ValueError – If lags is not in [1, n // 2].

Examples

>>> validate_lags(40, 100)
40
tseda.core.validator.validate_freq_string(freq, *, name='freq')[source]

Assert that freq is a non-empty string accepted by pandas.tseries.frequencies.to_offset().

Parameters:
  • freq (Any) – Candidate frequency string (e.g., "D", "h", "MS").

  • name (str) – Variable name used in error messages.

Returns:

The validated frequency string.

Return type:

str

Examples

>>> validate_freq_string("D")
'D'
>>> validate_freq_string("15min")
'15min'