tseda.core
Core data structures and validation utilities.
Public API
- TimeSeries
Univariate time series with a DatetimeIndex.
- ArrayLike
Type alias for 1-D numeric inputs.
- DatetimeLike
Type alias for datetime-index inputs.
- Frequency
Enum of recognised pandas offset aliases.
- AggMethod
Enum of aggregation methods for resampling / rolling.
- DiffMethod
Enum of differencing strategies.
- class tseda.core.TimeSeries(data, *, index=None, name='value', freq=None, unit=None, description=None)[source]
Bases: object
Univariate time series with a pandas.DatetimeIndex.
- Parameters:
data (Union[ArrayLike, pd.Series]) – Numeric values. Accepted types: 1-D numpy.ndarray; pandas.Series (values are extracted; the Series index is used unless index is also provided).
index (Optional[DatetimeLike]) – Datetime timestamps aligned with data. When data is a pandas.Series with a pandas.DatetimeIndex this argument may be omitted. Accepted types: list / numpy.ndarray of datetime-like strings or numpy.datetime64 objects.
name (str) – Short identifier for the series (used in plots and reports). Default "value".
freq (Optional[str]) – Pandas offset alias (e.g., "D", "h", "MS"). When None (default) the frequency is inferred automatically.
unit (Optional[str]) – Physical unit of the values (e.g., "USD", "°C"). Purely informational; used in axis labels.
description (Optional[str]) – Free-text description stored in metadata.
- Raises:
TypeError – If data or index have an unsupported type.
ValueError – If data and index have different lengths, if index is not monotonically increasing, or if index contains duplicates.
Examples
From a numpy array:
>>> import numpy as np, pandas as pd
>>> from tseda import TimeSeries
>>> idx = pd.date_range("2020-01-01", periods=5, freq="D")
>>> ts = TimeSeries(np.array([10.0, 11.5, 9.8, 12.0, 11.0]), index=idx)
>>> ts.n
5
From a pandas Series:
>>> s = pd.Series([1, 2, 3], index=pd.date_range("2020", periods=3, freq="D"))
>>> ts = TimeSeries.from_series(s)
- classmethod from_series(series, *, name=None, freq=None, unit=None, description=None)[source]
Construct a TimeSeries from a pandas.Series.
- Parameters:
series (Series) – Must have a pandas.DatetimeIndex.
name (str | None) – Override the Series’ .name attribute. When None the Series name (if any) is used, falling back to "value".
freq (str | None) – Forwarded to TimeSeries.__init__.
unit (str | None) – Forwarded to TimeSeries.__init__.
description (str | None) – Forwarded to TimeSeries.__init__.
- Return type:
TimeSeries
Examples
>>> s = pd.Series([1.0, 2.0], index=pd.date_range("2020", periods=2, freq="D"))
>>> TimeSeries.from_series(s, name="x").name
'x'
- classmethod from_arrays(values, index, *, name='value', freq=None, unit=None, description=None)[source]
Construct a TimeSeries from parallel arrays.
- Parameters:
values (ndarray | list | tuple | Series) – 1-D numeric array.
index (DatetimeIndex | Series | list | ndarray) – Datetime-like array of the same length.
name (str) – Forwarded to TimeSeries.__init__.
freq (str | None) – Forwarded to TimeSeries.__init__.
unit (str | None) – Forwarded to TimeSeries.__init__.
description (str | None) – Forwarded to TimeSeries.__init__.
- Return type:
TimeSeries
Examples
>>> import numpy as np, pandas as pd
>>> vals = np.array([1.0, 2.0, 3.0])
>>> idx = pd.date_range("2021-01-01", periods=3, freq="D")
>>> TimeSeries.from_arrays(vals, idx).n
3
- classmethod from_dataframe(df, column, *, name=None, freq=None, unit=None, description=None)[source]
Extract one column from a pandas.DataFrame.
- Parameters:
df (DataFrame) – Source DataFrame. Must have a pandas.DatetimeIndex.
column (str) – Column name to extract.
name (str | None) – Series name to use instead of the column name.
freq (str | None) – Forwarded to TimeSeries.__init__.
unit (str | None) – Forwarded to TimeSeries.__init__.
description (str | None) – Forwarded to TimeSeries.__init__.
- Return type:
TimeSeries
- Raises:
KeyError – If column is not in df.
Examples
>>> import pandas as pd
>>> df = pd.DataFrame({"temp": [20.0, 21.0, 19.5]},
...                   index=pd.date_range("2020", periods=3, freq="D"))
>>> TimeSeries.from_dataframe(df, "temp").name
'temp'
- property values: ndarray
1-D float64 array of observed values.
- Returns:
A copy to protect the internal state.
- Return type:
ndarray
- property index: DatetimeIndex
Datetime index of the series.
- Return type:
DatetimeIndex
- property unit: str | None
Physical unit of the values, or None if unspecified.
- Return type:
str or None
- property description: str | None
Free-text description, or None if unspecified.
- Return type:
str or None
- property freq: str | None
Pandas offset alias (e.g., "D"), or None for irregular data.
- Return type:
str or None
- property is_regular: bool
True when all consecutive time gaps are identical.
A regular series has no missing timestamps (assuming a fixed sampling interval). An irregular series may be the result of market holidays, sensor outages, or event-driven sampling.
- Return type:
bool
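The regularity condition above can be illustrated without pandas. The following is a minimal standalone sketch, not the tseda implementation: the helper name `gaps_are_identical` is hypothetical, and treating a series with fewer than two gaps as regular is an assumption.

```python
from datetime import datetime, timedelta


def gaps_are_identical(timestamps):
    """Return True when every consecutive gap equals the first gap."""
    if len(timestamps) < 3:
        # Assumption: with fewer than two gaps there is nothing to compare.
        return True
    first_gap = timestamps[1] - timestamps[0]
    return all(
        later - earlier == first_gap
        for earlier, later in zip(timestamps[1:], timestamps[2:])
    )


start = datetime(2020, 1, 1)
daily = [start + timedelta(days=i) for i in range(5)]
with_gap = daily[:2] + daily[3:]  # 2020-01-03 is missing

print(gaps_are_identical(daily))     # True
print(gaps_are_identical(with_gap))  # False
```

The second series models a sensor outage: one missing day makes the gaps 1, 2, and 1 days, so the check fails.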
- to_series()[source]
Return the data as a pandas.Series.
The returned Series keeps the same DatetimeIndex and uses the name attribute as its Series name.
- Return type:
Series
- to_frame()[source]
Return the data as a single-column pandas.DataFrame.
- Returns:
Column name equals name.
- Return type:
DataFrame
- copy()[source]
Return a deep copy of this TimeSeries.
- Return type:
TimeSeries
- slice(start=None, end=None)[source]
Return a time-bounded subset of the series.
Both start and end are inclusive. Either may be None to leave that boundary open.
- Parameters:
start – Inclusive lower bound, or None for an open start.
end – Inclusive upper bound, or None for an open end.
- Return type:
TimeSeries
- Raises:
ValueError – If the resulting slice is empty.
Examples
>>> import pandas as pd, numpy as np
>>> idx = pd.date_range("2020-01-01", periods=365, freq="D")
>>> ts = TimeSeries(np.arange(365.0), index=idx)
>>> q1 = ts.slice("2020-01-01", "2020-03-31")
>>> q1.n
91
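Inclusive bounds on a sorted index can be sketched with binary search. This is an illustration of the documented semantics only; the function `inclusive_slice` and its list-based inputs are hypothetical and say nothing about how tseda implements slice().

```python
from bisect import bisect_left, bisect_right
from datetime import date, timedelta


def inclusive_slice(index, values, start=None, end=None):
    """Slice parallel lists by a sorted index; both bounds are inclusive."""
    lo = 0 if start is None else bisect_left(index, start)
    hi = len(index) if end is None else bisect_right(index, end)
    if lo >= hi:
        raise ValueError("resulting slice is empty")
    return index[lo:hi], values[lo:hi]


index = [date(2020, 1, 1) + timedelta(days=i) for i in range(365)]
values = list(range(365))

q1_index, q1_values = inclusive_slice(
    index, values, date(2020, 1, 1), date(2020, 3, 31)
)
print(len(q1_values))  # 91 (2020 is a leap year: 31 + 29 + 31)
```

Using `bisect_right` for the upper bound is what makes the end timestamp itself part of the result.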
- resample(freq, *, agg=AggMethod.MEAN)[source]
Resample the series to a new frequency.
- Parameters:
freq – Target frequency as a pandas offset alias.
agg – Aggregation method applied within each resampling bin (default AggMethod.MEAN).
- Return type:
TimeSeries
- Raises:
ValueError – If freq is not recognised by pandas.
AttributeError – If agg is not a valid resampler method.
Examples
>>> import pandas as pd, numpy as np
>>> idx = pd.date_range("2020-01-01", periods=365, freq="D")
>>> ts = TimeSeries(np.ones(365), index=idx)
>>> ts.resample("MS").n  # 12 monthly values
12
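To make the month-start ("MS") case concrete, the sketch below bins daily observations by the first day of their month and aggregates each bin. It is a pure-Python illustration under stated assumptions: `resample_month_start` is a hypothetical helper, and the real method presumably delegates to pandas.

```python
from collections import defaultdict
from datetime import date, timedelta
from statistics import mean


def resample_month_start(index, values, agg=mean):
    """Group values by the first day of their month and aggregate each group."""
    bins = defaultdict(list)
    for timestamp, value in zip(index, values):
        bins[date(timestamp.year, timestamp.month, 1)].append(value)
    labels = sorted(bins)
    return labels, [agg(bins[label]) for label in labels]


index = [date(2020, 1, 1) + timedelta(days=i) for i in range(365)]
values = [1.0] * 365

labels, monthly = resample_month_start(index, values)
print(len(monthly))  # 12
```

Passing `agg=sum` instead of the default gives the number of days observed in each month for this all-ones series.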
- diff(periods=1, *, method=DiffMethod.SIMPLE)[source]
Difference the series.
- Parameters:
periods (int) – Number of periods to lag (k in the formulas below). Default 1 (first difference).
method (str | DiffMethod) – One of:
"simple": y[t] - y[t-k]
"log": log(y[t]) - log(y[t-k])
"percent": (y[t] - y[t-k]) / y[t-k]
- Returns:
The leading NaN rows introduced by differencing are dropped.
- Return type:
TimeSeries
- Raises:
ValueError – If method is "log" or "percent" and the series contains non-positive values.
Examples
>>> import pandas as pd, numpy as np
>>> idx = pd.date_range("2020", periods=5, freq="D")
>>> ts = TimeSeries([10.0, 11.0, 12.0, 11.0, 13.0], index=idx)
>>> ts.diff().values
array([ 1.,  1., -1.,  2.])
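The three formulas can be compared side by side on plain lists. This is a minimal sketch of the arithmetic, not the library code; the function name `difference` is hypothetical, and the positivity check mirrors the Raises entry above.

```python
import math


def difference(y, periods=1, method="simple"):
    """Apply one of the three differencing formulas with lag k = periods."""
    if method in ("log", "percent") and any(v <= 0 for v in y):
        raise ValueError(f"method={method!r} requires strictly positive values")
    # zip stops at the shorter sequence, so each y[t] is paired with y[t-k].
    pairs = list(zip(y[periods:], y))
    if method == "simple":
        return [cur - prev for cur, prev in pairs]
    if method == "log":
        return [math.log(cur) - math.log(prev) for cur, prev in pairs]
    if method == "percent":
        return [(cur - prev) / prev for cur, prev in pairs]
    raise ValueError(f"unknown method: {method!r}")


y = [10.0, 11.0, 12.0, 11.0, 13.0]
print(difference(y))  # [1.0, 1.0, -1.0, 2.0]
print(difference(y, periods=2))  # [2.0, 0.0, 1.0]
```

The output has `len(y) - periods` elements, which corresponds to dropping the leading NaN rows.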
- log()[source]
Apply the natural logarithm element-wise.
- Return type:
TimeSeries
- Raises:
ValueError – If the series contains non-positive values.
Examples
>>> import pandas as pd, numpy as np
>>> idx = pd.date_range("2020", periods=3, freq="D")
>>> TimeSeries([1.0, np.e, np.e**2], index=idx).log().values
array([0., 1., 2.])
- standardize()[source]
Standardise to zero mean and unit variance (z-score).
The transform is (x - mean) / std. NaN values are ignored when computing statistics but preserved in position.
- Return type:
TimeSeries
- Raises:
ValueError – If the standard deviation is zero (constant series).
Examples
>>> import pandas as pd, numpy as np
>>> idx = pd.date_range("2020", periods=4, freq="D")
>>> ts = TimeSeries([2.0, 4.0, 6.0, 8.0], index=idx)
>>> z = ts.standardize()
>>> round(float(z.values.mean()), 10)
0.0
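The NaN handling described above can be sketched as follows. The helper `zscore` is hypothetical, and the use of the population standard deviation (ddof = 0) is an assumption: this page does not say which estimator tseda uses.

```python
import math


def zscore(values):
    """Z-score a list, skipping NaNs in the statistics but keeping their positions."""
    observed = [v for v in values if not math.isnan(v)]
    mean = sum(observed) / len(observed)
    # Assumption: population standard deviation (divide by N, not N - 1).
    std = math.sqrt(sum((v - mean) ** 2 for v in observed) / len(observed))
    if std == 0:
        raise ValueError("standard deviation is zero (constant series)")
    return [v if math.isnan(v) else (v - mean) / std for v in values]


z = zscore([2.0, 4.0, float("nan"), 6.0, 8.0])
print(math.isnan(z[2]))  # True: the NaN stays where it was
```

The non-NaN entries of `z` sum to zero up to floating-point error, matching the zero-mean property in the example above.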
- normalize(*, lower=0.0, upper=1.0)[source]
Min-max normalise the series to [lower, upper].
- Parameters:
lower – Lower bound of the target range (default 0.0).
upper – Upper bound of the target range (default 1.0).
- Return type:
TimeSeries
- Raises:
ValueError – If the series has zero range (max == min) or lower >= upper.
Examples
>>> import pandas as pd, numpy as np
>>> idx = pd.date_range("2020", periods=3, freq="D")
>>> ts = TimeSeries([0.0, 5.0, 10.0], index=idx)
>>> ts.normalize().values
array([0. , 0.5, 1. ])
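Min-max scaling to an arbitrary range is a single linear map. The sketch below shows the arithmetic on a plain list; `min_max_scale` is a hypothetical name, and the two error checks follow the Raises entry above.

```python
def min_max_scale(values, lower=0.0, upper=1.0):
    """Linearly map values so that min -> lower and max -> upper."""
    if lower >= upper:
        raise ValueError("lower must be strictly less than upper")
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("series has zero range (max == min)")
    scale = (upper - lower) / (hi - lo)
    return [lower + (v - lo) * scale for v in values]


print(min_max_scale([0.0, 5.0, 10.0]))  # [0.0, 0.5, 1.0]
print(min_max_scale([0.0, 5.0, 10.0], lower=-1.0, upper=1.0))  # [-1.0, 0.0, 1.0]
```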
- rolling(window, *, agg=AggMethod.MEAN, center=False, min_periods=None)[source]
Apply a rolling-window aggregation.
- Parameters:
window (int) – Size of the rolling window in number of observations.
agg (str | AggMethod) – Aggregation method (default "mean").
center (bool) – Whether to label each window at its centre (default False, i.e. a trailing window).
min_periods (int | None) – Minimum number of non-NaN observations required to produce a value. Defaults to window.
- Returns:
Leading/trailing NaNs introduced by the window are dropped.
- Return type:
TimeSeries
Examples
>>> import pandas as pd, numpy as np
>>> idx = pd.date_range("2020", periods=6, freq="D")
>>> ts = TimeSeries([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], index=idx)
>>> ts.rolling(3).values
array([2., 3., 4., 5.])
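For the default case (trailing window, mean aggregation, min_periods equal to window), the result can be reproduced with a short loop. This is an illustrative sketch only; `trailing_rolling_mean` is a hypothetical helper that covers none of the other options.

```python
def trailing_rolling_mean(values, window):
    """Mean over each full trailing window; incomplete leading windows are skipped."""
    return [
        sum(values[end - window + 1 : end + 1]) / window
        for end in range(window - 1, len(values))
    ]


print(trailing_rolling_mean([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], 3))
# [2.0, 3.0, 4.0, 5.0]
```

Six observations with a window of three yield four values, which is why the example above returns an array two elements shorter than its input.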
- apply(func, *, name=None)[source]
Apply an arbitrary element-wise function to the values.
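- Parameters:
func – Function applied to the values. It must return an array of the same length.
name – Optional name for the resulting series (default None).
- Return type:
TimeSeries
- Raises:
ValueError – If func changes the length of the array.
Examples
>>> import pandas as pd, numpy as np
>>> idx = pd.date_range("2020", periods=3, freq="D")
>>> ts = TimeSeries([1.0, 4.0, 9.0], index=idx)
>>> ts.apply(np.sqrt).values
array([1., 2., 3.])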
- __contains__(timestamp)[source]
Check whether a timestamp exists in the index.
Examples
>>> import pandas as pd, numpy as np
>>> idx = pd.date_range("2020", periods=3, freq="D")
>>> ts = TimeSeries([1.0, 2.0, 3.0], index=idx)
>>> pd.Timestamp("2020-01-02") in ts
True
- __getitem__(key)[source]
Positional indexing by integer or slice.
- Parameters:
key (int | slice) – An int returns the scalar value at that position. A slice returns a new TimeSeries for that range.
- Return type:
float or TimeSeries
Examples
>>> import pandas as pd, numpy as np
>>> idx = pd.date_range("2020", periods=5, freq="D")
>>> ts = TimeSeries([10.0, 20.0, 30.0, 40.0, 50.0], index=idx)
>>> ts[0]
10.0
>>> ts[-1]
50.0
>>> ts[1:3].values
array([20., 30.])
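The int-versus-slice dispatch described above is a common pattern for container classes. The sketch below uses a hypothetical `MiniSeries` class built on plain lists; it is not the tseda class and only demonstrates the dispatch.

```python
class MiniSeries:
    """Hypothetical stand-in holding parallel index and value lists."""

    def __init__(self, index, values):
        if len(index) != len(values):
            raise ValueError("index and values must have the same length")
        self.index = list(index)
        self.values = list(values)

    def __getitem__(self, key):
        if isinstance(key, slice):
            # A slice keeps index and values aligned in a new container.
            return MiniSeries(self.index[key], self.values[key])
        if isinstance(key, int):
            # An integer returns the scalar at that position.
            return self.values[key]
        raise TypeError(f"unsupported key type: {type(key).__name__}")


ms = MiniSeries(["d1", "d2", "d3", "d4", "d5"], [10.0, 20.0, 30.0, 40.0, 50.0])
print(ms[0], ms[-1], ms[1:3].values)  # 10.0 50.0 [20.0, 30.0]
```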
- class tseda.core.Frequency(*values)[source]
Canonical pandas offset aliases recognised by tseda.
The string value of each member is a valid freq argument to pandas.date_range() and pandas.Series.resample().
Examples
>>> Frequency.DAILY.value
'D'
>>> Frequency.DAILY == "D"
True
- SECONDLY = 'S'
- MINUTELY = 'min'
- HOURLY = 'h'
- DAILY = 'D'
- BUSINESS_DAILY = 'B'
- WEEKLY = 'W'
- MONTHLY_START = 'MS'
- MONTHLY_END = 'ME'
- QUARTERLY_START = 'QS'
- QUARTERLY_END = 'QE'
- ANNUAL_START = 'YS'
- ANNUAL_END = 'YE'
- __repr__()
Return repr(self).
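The comparison `Frequency.DAILY == "D"` in the example above only holds for an enum whose members are also strings. One way to get that behaviour is to mix `str` into the enum, as in this sketch. The class `Freq` is hypothetical, and whether tseda uses a `str` mixin or another mechanism is an assumption.

```python
from enum import Enum


class Freq(str, Enum):
    """Hypothetical two-member enum whose members compare equal to plain strings."""

    DAILY = "D"
    MONTHLY_START = "MS"


print(Freq.DAILY.value)                  # D
print(Freq.DAILY == "D")                 # True
print(Freq("MS") is Freq.MONTHLY_START)  # True
```

Because each member is a `str` instance, it can be passed wherever a plain alias string is expected, and a raw string can be converted back to a member by calling the enum.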
- class tseda.core.AggMethod(*values)[source]
Aggregation functions available when resampling a TimeSeries.
The string value matches the pandas.core.resample.Resampler method name.
Examples
>>> AggMethod.MEAN.value
'mean'
- MEAN = 'mean'
- SUM = 'sum'
- MIN = 'min'
- MAX = 'max'
- MEDIAN = 'median'
- FIRST = 'first'
- LAST = 'last'
- STD = 'std'
- VAR = 'var'
- COUNT = 'count'
- __repr__()
Return repr(self).
- class tseda.core.DiffMethod(*values)[source]
Differencing mode for diff().
- SIMPLE
y[t] - y[t-k] (standard first or k-th difference).
- LOG
log(y[t]) - log(y[t-k]) (log return, i.e. the change measured on a log scale).
- PERCENT
(y[t] - y[t-k]) / y[t-k] (relative change).
- SIMPLE = 'simple'
- LOG = 'log'
- PERCENT = 'percent'
- __repr__()
Return repr(self).
- tseda.core.validate_data_array(data, *, name='data')[source]
Coerce data to a 1-D float64 numpy.ndarray.
- Parameters:
data (Any) – Numeric input. Accepted types: numpy.ndarray (must be 1-D); pandas.Series (values extracted; index ignored).
name (str) – Variable name used in error messages (default "data").
- Returns:
1-D array of dtype float64. NaN values are preserved.
- Return type:
ndarray
- Raises:
TypeError – If data is not a recognised type.
ValueError – If data is not 1-D or contains non-numeric elements.
Examples
>>> validate_data_array([1.0, 2.0, 3.0])
array([1., 2., 3.])
>>> validate_data_array(pd.Series([1, 2, 3]))
array([1., 2., 3.])
- tseda.core.validate_datetime_index(index, *, name='index')[source]
Coerce index to a pandas.DatetimeIndex and verify that it is sorted and duplicate-free.
- Parameters:
index (Any) – Datetime-like input. Accepted types: pandas.Series with datetime dtype; list or numpy.ndarray of datetime-like strings or numpy.datetime64 values.
name (str) – Variable name used in error messages (default "index").
- Returns:
Validated, monotonically increasing, duplicate-free index.
- Return type:
DatetimeIndex
- Raises:
TypeError – If index is not a recognised type.
ValueError – If index is not monotonically increasing or contains duplicates.
Examples
>>> idx = pd.date_range("2020-01-01", periods=5, freq="D")
>>> validate_datetime_index(idx)
DatetimeIndex(['2020-01-01', ..., '2020-01-05'], dtype='datetime64[ns]', freq='D')
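The two ValueError conditions amount to requiring a strictly increasing sequence. A minimal sketch of that check on plain `datetime` objects follows; `check_strictly_increasing` is a hypothetical helper, and the coercion step that the real validator performs is left out.

```python
from datetime import datetime


def check_strictly_increasing(index, *, name="index"):
    """Raise ValueError on duplicates or out-of-order timestamps; return a list copy."""
    for previous, current in zip(index, index[1:]):
        if current == previous:
            raise ValueError(f"{name} contains duplicates: {current}")
        if current < previous:
            raise ValueError(f"{name} is not monotonically increasing at {current}")
    return list(index)


ok = [datetime(2020, 1, 1), datetime(2020, 1, 2), datetime(2020, 1, 3)]
print(len(check_strictly_increasing(ok)))  # 3
```

Comparing only adjacent pairs is enough to detect duplicates here, because any out-of-order pair raises before a non-adjacent duplicate could be missed.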
- tseda.core.validate_freq_string(freq, *, name='freq')[source]
Assert that freq is a non-empty string accepted by pandas.tseries.frequencies.to_offset().
- Parameters:
freq – Frequency string to validate.
name – Variable name used in error messages (default "freq").
- Returns:
The validated frequency string.
- Return type:
str
- Raises:
TypeError – If freq is not a string.
ValueError – If freq is not recognised by pandas.
Examples
>>> validate_freq_string("D")
'D'
>>> validate_freq_string("15min")
'15min'
- tseda.core.validate_lags(lags, n, *, name='lags')[source]
Assert that lags is a sensible lag count for a series of length n.
The upper bound is n // 2 because computing autocorrelations at lags approaching n produces unreliable estimates.
- Parameters:
lags – Requested number of lags.
n – Length of the series.
name – Variable name used in error messages (default "lags").
- Returns:
The validated lag count.
- Return type:
int
- Raises:
ValueError – If lags is not in [1, n // 2].
Examples
>>> validate_lags(40, 100)
40
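The bound check itself is a single comparison. The sketch below restates it in plain Python; `check_lags` is a hypothetical name, and any type checking the real validator performs is not shown.

```python
def check_lags(lags, n, *, name="lags"):
    """Return lags unchanged when 1 <= lags <= n // 2, else raise ValueError."""
    upper = n // 2
    if not 1 <= lags <= upper:
        raise ValueError(f"{name} must be in [1, {upper}], got {lags}")
    return lags


print(check_lags(40, 100))  # 40
print(check_lags(50, 100))  # 50 (the upper bound is inclusive)
```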
- tseda.core.validate_positive_int(value, *, name='value')[source]
Assert that value is a positive integer.
- Parameters:
value – Value to validate.
name – Variable name used in error messages (default "value").
- Returns:
The validated integer.
- Return type:
int
- Raises:
TypeError – If value is not an integer type.
ValueError – If value is less than 1.
Examples
>>> validate_positive_int(5)
5
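The split between TypeError and ValueError can be sketched as a type check followed by a range check. The helper `check_positive_int` is hypothetical. Rejecting `bool` is an assumption made here because `bool` is a subclass of `int` in Python, and this stdlib-only version does not handle numpy integer types.

```python
def check_positive_int(value, *, name="value"):
    """Return value when it is an int >= 1; otherwise raise TypeError or ValueError."""
    # Assumption: True/False are rejected even though bool subclasses int.
    if isinstance(value, bool) or not isinstance(value, int):
        raise TypeError(f"{name} must be an integer, got {type(value).__name__}")
    if value < 1:
        raise ValueError(f"{name} must be >= 1, got {value}")
    return value


print(check_positive_int(5))  # 5
```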
TimeSeries
- class tseda.core.timeseries.TimeSeries(data, *, index=None, name='value', freq=None, unit=None, description=None)[source]
Bases: object
Univariate time series with a pandas.DatetimeIndex.
- Parameters:
data (Union[ArrayLike, pd.Series]) – Numeric values. Accepted types: 1-D numpy.ndarray; pandas.Series (values are extracted; the Series index is used unless index is also provided).
index (Optional[DatetimeLike]) – Datetime timestamps aligned with data. When data is a pandas.Series with a pandas.DatetimeIndex this argument may be omitted. Accepted types: list / numpy.ndarray of datetime-like strings or numpy.datetime64 objects.
name (str) – Short identifier for the series (used in plots and reports). Default "value".
freq (Optional[str]) – Pandas offset alias (e.g., "D", "h", "MS"). When None (default) the frequency is inferred automatically.
unit (Optional[str]) – Physical unit of the values (e.g., "USD", "°C"). Purely informational; used in axis labels.
description (Optional[str]) – Free-text description stored in metadata.
- Raises:
TypeError – If data or index have an unsupported type.
ValueError – If data and index have different lengths, if index is not monotonically increasing, or if index contains duplicates.
Examples
From a numpy array:
>>> import numpy as np, pandas as pd
>>> from tseda import TimeSeries
>>> idx = pd.date_range("2020-01-01", periods=5, freq="D")
>>> ts = TimeSeries(np.array([10.0, 11.5, 9.8, 12.0, 11.0]), index=idx)
>>> ts.n
5
From a pandas Series:
>>> s = pd.Series([1, 2, 3], index=pd.date_range("2020", periods=3, freq="D"))
>>> ts = TimeSeries.from_series(s)
- classmethod from_series(series, *, name=None, freq=None, unit=None, description=None)[source]
Construct a TimeSeries from a pandas.Series.
- Parameters:
series (Series) – Must have a pandas.DatetimeIndex.
name (str | None) – Override the Series’ .name attribute. When None the Series name (if any) is used, falling back to "value".
freq (str | None) – Forwarded to TimeSeries.__init__.
unit (str | None) – Forwarded to TimeSeries.__init__.
description (str | None) – Forwarded to TimeSeries.__init__.
- Return type:
TimeSeries
Examples
>>> s = pd.Series([1.0, 2.0], index=pd.date_range("2020", periods=2, freq="D"))
>>> TimeSeries.from_series(s, name="x").name
'x'
- classmethod from_arrays(values, index, *, name='value', freq=None, unit=None, description=None)[source]
Construct a TimeSeries from parallel arrays.
- Parameters:
values (ndarray | list | tuple | Series) – 1-D numeric array.
index (DatetimeIndex | Series | list | ndarray) – Datetime-like array of the same length.
name (str) – Forwarded to TimeSeries.__init__.
freq (str | None) – Forwarded to TimeSeries.__init__.
unit (str | None) – Forwarded to TimeSeries.__init__.
description (str | None) – Forwarded to TimeSeries.__init__.
- Return type:
TimeSeries
Examples
>>> import numpy as np, pandas as pd
>>> vals = np.array([1.0, 2.0, 3.0])
>>> idx = pd.date_range("2021-01-01", periods=3, freq="D")
>>> TimeSeries.from_arrays(vals, idx).n
3
- classmethod from_dataframe(df, column, *, name=None, freq=None, unit=None, description=None)[source]
Extract one column from a pandas.DataFrame.
- Parameters:
df (DataFrame) – Source DataFrame. Must have a pandas.DatetimeIndex.
column (str) – Column name to extract.
name (str | None) – Series name to use instead of the column name.
freq (str | None) – Forwarded to TimeSeries.__init__.
unit (str | None) – Forwarded to TimeSeries.__init__.
description (str | None) – Forwarded to TimeSeries.__init__.
- Return type:
TimeSeries
- Raises:
KeyError – If column is not in df.
Examples
>>> import pandas as pd
>>> df = pd.DataFrame({"temp": [20.0, 21.0, 19.5]},
...                   index=pd.date_range("2020", periods=3, freq="D"))
>>> TimeSeries.from_dataframe(df, "temp").name
'temp'
- property values: ndarray
1-D float64 array of observed values.
- Returns:
A copy to protect the internal state.
- Return type:
ndarray
- property index: DatetimeIndex
Datetime index of the series.
- Return type:
DatetimeIndex
- property unit: str | None
Physical unit of the values, or None if unspecified.
- Return type:
str or None
- property description: str | None
Free-text description, or None if unspecified.
- Return type:
str or None
- property freq: str | None
Pandas offset alias (e.g., "D"), or None for irregular data.
- Return type:
str or None
- property is_regular: bool
True when all consecutive time gaps are identical.
A regular series has no missing timestamps (assuming a fixed sampling interval). An irregular series may be the result of market holidays, sensor outages, or event-driven sampling.
- Return type:
bool
- to_series()[source]
Return the data as a pandas.Series.
The returned Series keeps the same DatetimeIndex and uses the name attribute as its Series name.
- Return type:
Series
- to_frame()[source]
Return the data as a single-column pandas.DataFrame.
- Returns:
Column name equals name.
- Return type:
DataFrame
- copy()[source]
Return a deep copy of this TimeSeries.
- Return type:
TimeSeries
- slice(start=None, end=None)[source]
Return a time-bounded subset of the series.
Both start and end are inclusive. Either may be None to leave that boundary open.
- Parameters:
start – Inclusive lower bound, or None for an open start.
end – Inclusive upper bound, or None for an open end.
- Return type:
TimeSeries
- Raises:
ValueError – If the resulting slice is empty.
Examples
>>> import pandas as pd, numpy as np
>>> idx = pd.date_range("2020-01-01", periods=365, freq="D")
>>> ts = TimeSeries(np.arange(365.0), index=idx)
>>> q1 = ts.slice("2020-01-01", "2020-03-31")
>>> q1.n
91
- resample(freq, *, agg=AggMethod.MEAN)[source]
Resample the series to a new frequency.
- Parameters:
freq – Target frequency as a pandas offset alias.
agg – Aggregation method applied within each resampling bin (default AggMethod.MEAN).
- Return type:
TimeSeries
- Raises:
ValueError – If freq is not recognised by pandas.
AttributeError – If agg is not a valid resampler method.
Examples
>>> import pandas as pd, numpy as np
>>> idx = pd.date_range("2020-01-01", periods=365, freq="D")
>>> ts = TimeSeries(np.ones(365), index=idx)
>>> ts.resample("MS").n  # 12 monthly values
12
- diff(periods=1, *, method=DiffMethod.SIMPLE)[source]
Difference the series.
- Parameters:
periods (int) – Number of periods to lag (k in the formulas below). Default 1 (first difference).
method (str | DiffMethod) – One of:
"simple": y[t] - y[t-k]
"log": log(y[t]) - log(y[t-k])
"percent": (y[t] - y[t-k]) / y[t-k]
- Returns:
The leading NaN rows introduced by differencing are dropped.
- Return type:
TimeSeries
- Raises:
ValueError – If method is "log" or "percent" and the series contains non-positive values.
Examples
>>> import pandas as pd, numpy as np
>>> idx = pd.date_range("2020", periods=5, freq="D")
>>> ts = TimeSeries([10.0, 11.0, 12.0, 11.0, 13.0], index=idx)
>>> ts.diff().values
array([ 1.,  1., -1.,  2.])
- log()[source]
Apply the natural logarithm element-wise.
- Return type:
TimeSeries
- Raises:
ValueError – If the series contains non-positive values.
Examples
>>> import pandas as pd, numpy as np
>>> idx = pd.date_range("2020", periods=3, freq="D")
>>> TimeSeries([1.0, np.e, np.e**2], index=idx).log().values
array([0., 1., 2.])
- standardize()[source]
Standardise to zero mean and unit variance (z-score).
The transform is (x - mean) / std. NaN values are ignored when computing statistics but preserved in position.
- Return type:
TimeSeries
- Raises:
ValueError – If the standard deviation is zero (constant series).
Examples
>>> import pandas as pd, numpy as np
>>> idx = pd.date_range("2020", periods=4, freq="D")
>>> ts = TimeSeries([2.0, 4.0, 6.0, 8.0], index=idx)
>>> z = ts.standardize()
>>> round(float(z.values.mean()), 10)
0.0
- normalize(*, lower=0.0, upper=1.0)[source]
Min-max normalise the series to [lower, upper].
- Parameters:
lower – Lower bound of the target range (default 0.0).
upper – Upper bound of the target range (default 1.0).
- Return type:
TimeSeries
- Raises:
ValueError – If the series has zero range (max == min) or lower >= upper.
Examples
>>> import pandas as pd, numpy as np
>>> idx = pd.date_range("2020", periods=3, freq="D")
>>> ts = TimeSeries([0.0, 5.0, 10.0], index=idx)
>>> ts.normalize().values
array([0. , 0.5, 1. ])
- rolling(window, *, agg=AggMethod.MEAN, center=False, min_periods=None)[source]
Apply a rolling-window aggregation.
- Parameters:
window (int) – Size of the rolling window in number of observations.
agg (str | AggMethod) – Aggregation method (default "mean").
center (bool) – Whether to label each window at its centre (default False, i.e. a trailing window).
min_periods (int | None) – Minimum number of non-NaN observations required to produce a value. Defaults to window.
- Returns:
Leading/trailing NaNs introduced by the window are dropped.
- Return type:
TimeSeries
Examples
>>> import pandas as pd, numpy as np
>>> idx = pd.date_range("2020", periods=6, freq="D")
>>> ts = TimeSeries([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], index=idx)
>>> ts.rolling(3).values
array([2., 3., 4., 5.])
- apply(func, *, name=None)[source]
Apply an arbitrary element-wise function to the values.
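- Parameters:
func – Function applied to the values. It must return an array of the same length.
name – Optional name for the resulting series (default None).
- Return type:
TimeSeries
- Raises:
ValueError – If func changes the length of the array.
Examples
>>> import pandas as pd, numpy as np
>>> idx = pd.date_range("2020", periods=3, freq="D")
>>> ts = TimeSeries([1.0, 4.0, 9.0], index=idx)
>>> ts.apply(np.sqrt).values
array([1., 2., 3.])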
- __contains__(timestamp)[source]
Check whether a timestamp exists in the index.
Examples
>>> import pandas as pd, numpy as np
>>> idx = pd.date_range("2020", periods=3, freq="D")
>>> ts = TimeSeries([1.0, 2.0, 3.0], index=idx)
>>> pd.Timestamp("2020-01-02") in ts
True
- __getitem__(key)[source]
Positional indexing by integer or slice.
- Parameters:
key (int | slice) – An int returns the scalar value at that position. A slice returns a new TimeSeries for that range.
- Return type:
float or TimeSeries
Examples
>>> import pandas as pd, numpy as np
>>> idx = pd.date_range("2020", periods=5, freq="D")
>>> ts = TimeSeries([10.0, 20.0, 30.0, 40.0, 50.0], index=idx)
>>> ts[0]
10.0
>>> ts[-1]
50.0
>>> ts[1:3].values
array([20., 30.])
Types & Enumerations
- class tseda.core.types.Frequency(*values)[source]
Canonical pandas offset aliases recognised by tseda.
The string value of each member is a valid freq argument to pandas.date_range() and pandas.Series.resample().
Examples
>>> Frequency.DAILY.value
'D'
>>> Frequency.DAILY == "D"
True
- SECONDLY = 'S'
- MINUTELY = 'min'
- HOURLY = 'h'
- DAILY = 'D'
- BUSINESS_DAILY = 'B'
- WEEKLY = 'W'
- MONTHLY_START = 'MS'
- MONTHLY_END = 'ME'
- QUARTERLY_START = 'QS'
- QUARTERLY_END = 'QE'
- ANNUAL_START = 'YS'
- ANNUAL_END = 'YE'
- __repr__()
Return repr(self).
- class tseda.core.types.AggMethod(*values)[source]
Aggregation functions available when resampling a TimeSeries.
The string value matches the pandas.core.resample.Resampler method name.
Examples
>>> AggMethod.MEAN.value
'mean'
- MEAN = 'mean'
- SUM = 'sum'
- MIN = 'min'
- MAX = 'max'
- MEDIAN = 'median'
- FIRST = 'first'
- LAST = 'last'
- STD = 'std'
- VAR = 'var'
- COUNT = 'count'
- __repr__()
Return repr(self).
- class tseda.core.types.DiffMethod(*values)[source]
Differencing mode for diff().
- SIMPLE
y[t] - y[t-k] (standard first or k-th difference).
- LOG
log(y[t]) - log(y[t-k]) (log return, i.e. the change measured on a log scale).
- PERCENT
(y[t] - y[t-k]) / y[t-k] (relative change).
- SIMPLE = 'simple'
- LOG = 'log'
- PERCENT = 'percent'
- __repr__()
Return repr(self).
Validators
Input validation utilities for tseda.
Every public function in this module raises a descriptive TypeError
or ValueError on bad input and returns the canonicalised value on
success. All heavy lifting of data coercion lives here so that
TimeSeries and analysis modules stay clean.
Functions
- validate_data_array
Coerce arbitrary numeric input to a 1-D float64 numpy.ndarray.
- validate_datetime_index
Coerce arbitrary input to a sorted, duplicate-free pandas.DatetimeIndex.
- validate_positive_int
Assert that a value is a positive integer.
- validate_lags
Assert that the requested lag count is sensible relative to series length.
- validate_freq_string
Assert that a string is a recognised pandas offset alias.
- tseda.core.validator.validate_data_array(data, *, name='data')[source]
Coerce data to a 1-D float64 numpy.ndarray.
- Parameters:
data (Any) – Numeric input. Accepted types: numpy.ndarray (must be 1-D); pandas.Series (values extracted; index ignored).
name (str) – Variable name used in error messages (default "data").
- Returns:
1-D array of dtype float64. NaN values are preserved.
- Return type:
ndarray
- Raises:
TypeError – If data is not a recognised type.
ValueError – If data is not 1-D or contains non-numeric elements.
Examples
>>> validate_data_array([1.0, 2.0, 3.0])
array([1., 2., 3.])
>>> validate_data_array(pd.Series([1, 2, 3]))
array([1., 2., 3.])
- tseda.core.validator.validate_datetime_index(index, *, name='index')[source]
Coerce index to a pandas.DatetimeIndex and verify that it is sorted and duplicate-free.
- Parameters:
index (Any) – Datetime-like input. Accepted types: pandas.Series with datetime dtype; list or numpy.ndarray of datetime-like strings or numpy.datetime64 values.
name (str) – Variable name used in error messages (default "index").
- Returns:
Validated, monotonically increasing, duplicate-free index.
- Return type:
DatetimeIndex
- Raises:
TypeError – If index is not a recognised type.
ValueError – If index is not monotonically increasing or contains duplicates.
Examples
>>> idx = pd.date_range("2020-01-01", periods=5, freq="D")
>>> validate_datetime_index(idx)
DatetimeIndex(['2020-01-01', ..., '2020-01-05'], dtype='datetime64[ns]', freq='D')
- tseda.core.validator.validate_positive_int(value, *, name='value')[source]
Assert that value is a positive integer.
- Parameters:
value – Value to validate.
name – Variable name used in error messages (default "value").
- Returns:
The validated integer.
- Return type:
int
- Raises:
TypeError – If value is not an integer type.
ValueError – If value is less than 1.
Examples
>>> validate_positive_int(5)
5
- tseda.core.validator.validate_lags(lags, n, *, name='lags')[source]
Assert that lags is a sensible lag count for a series of length n.
The upper bound is n // 2 because computing autocorrelations at lags approaching n produces unreliable estimates.
- Parameters:
lags – Requested number of lags.
n – Length of the series.
name – Variable name used in error messages (default "lags").
- Returns:
The validated lag count.
- Return type:
int
- Raises:
ValueError – If lags is not in [1, n // 2].
Examples
>>> validate_lags(40, 100)
40
- tseda.core.validator.validate_freq_string(freq, *, name='freq')[source]
Assert that freq is a non-empty string accepted by pandas.tseries.frequencies.to_offset().
- Parameters:
freq – Frequency string to validate.
name – Variable name used in error messages (default "freq").
- Returns:
The validated frequency string.
- Return type:
str
- Raises:
TypeError – If freq is not a string.
ValueError – If freq is not recognised by pandas.
Examples
>>> validate_freq_string("D")
'D'
>>> validate_freq_string("15min")
'15min'