Feature Extraction
tseda.features
Feature extraction for time series.
Public API
- TemporalFeatureExtractor
Calendar and cyclic time-index features → pandas.DataFrame.
- StatisticalFeatureExtractor
Distribution, complexity, and linear-structure features → single-row DataFrame.
- SpectralFeatureExtractor
Frequency-domain (FFT) features → single-row DataFrame.
- class tseda.features.TemporalFeatureExtractor
Bases: object
Extract calendar and cyclic time features from a TimeSeries.
- extract(ts, cyclic, time_index)
Return a pandas.DataFrame of feature columns with one row per timestamp, aligned to ts.index.
- Parameters:
ts (TimeSeries)
cyclic (bool)
time_index (bool)
- Return type:
pandas.DataFrame
Examples
>>> import pandas as pd, numpy as np
>>> from tseda import TimeSeries
>>> from tseda.features.temporal import TemporalFeatureExtractor
>>> idx = pd.date_range("2020-01-01", periods=5, freq="D")
>>> ts = TimeSeries([10.0, 11.0, 12.0, 11.5, 10.5], index=idx)
>>> df = TemporalFeatureExtractor().extract(ts)
>>> int(df["year"].iloc[0])
2020
>>> int(df["month"].iloc[0])
1
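Calendar columns such as year and month are plain functions of the datetime index. The following is a minimal sketch in pandas only, not tseda's implementation; the helper name calendar_features and the definition of is_weekend as Saturday or Sunday are assumptions made for illustration.

```python
import pandas as pd


def calendar_features(index: pd.DatetimeIndex) -> pd.DataFrame:
    """Hypothetical helper: derive calendar columns from a DatetimeIndex."""
    return pd.DataFrame(
        {
            "year": index.year,
            "month": index.month,
            "day": index.day,
            "dayofweek": index.dayofweek,        # Monday=0 ... Sunday=6
            "hour": index.hour,
            "quarter": index.quarter,
            "is_weekend": index.dayofweek >= 5,  # assumption: Saturday or Sunday
            "is_month_start": index.is_month_start,
            "is_month_end": index.is_month_end,
        },
        index=index,
    )


idx = pd.date_range("2020-01-01", periods=5, freq="D")
features = calendar_features(idx)
print(features[["year", "month", "dayofweek", "is_weekend"]])
```

Because every column depends only on the index, the result has one row per timestamp and can be joined back to the series values directly.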
- extract(ts, *, cyclic=True, time_index=True)
Extract temporal features aligned to ts.index.
- Parameters:
ts (TimeSeries) – Input series.
cyclic (bool, optional) – When True (default), add sine/cosine encodings for month, dayofweek, and hour.
time_index (bool, optional) – When True (default), add days_since_start and time_norm (0 → 1 over the series span).
- Returns:
Index matches ts.index. Columns:
- Always present: year, month, day, dayofweek, hour, quarter, weekofyear, is_weekend, is_month_start, is_month_end.
- When cyclic=True: month_sin, month_cos, dow_sin, dow_cos, hour_sin, hour_cos.
- When time_index=True: days_since_start, time_norm.
- Return type:
pandas.DataFrame
- Raises:
TypeError – If ts is not a TimeSeries.
Examples
>>> import pandas as pd, numpy as np
>>> from tseda import TimeSeries
>>> from tseda.features.temporal import TemporalFeatureExtractor
>>> idx = pd.date_range("2020-01-01", periods=7, freq="D")
>>> ts = TimeSeries(np.ones(7), index=idx)
>>> df = TemporalFeatureExtractor().extract(ts, cyclic=False, time_index=False)
>>> set(df.columns) >= {"year", "month", "day", "dayofweek", "is_weekend"}
True
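The two time_index columns can be sketched as follows. This is an illustration under stated assumptions rather than the library's code: days_since_start is taken to be fractional days elapsed since the first timestamp, time_norm is that value divided by the total span, and a zero-length span is mapped to zeros. The helper name time_index_features is hypothetical, and the index is assumed to be sorted.

```python
import numpy as np
import pandas as pd


def time_index_features(index: pd.DatetimeIndex) -> pd.DataFrame:
    """Hypothetical helper: elapsed-time features from a sorted DatetimeIndex."""
    elapsed = index - index[0]  # TimedeltaIndex relative to the first timestamp
    days = np.asarray(elapsed / pd.Timedelta(days=1), dtype=float)
    span = days[-1]
    if span > 0:
        time_norm = days / span          # runs from 0 at the start to 1 at the end
    else:
        time_norm = np.zeros_like(days)  # assumption: single-timestamp span maps to 0
    return pd.DataFrame(
        {"days_since_start": days, "time_norm": time_norm}, index=index
    )


week = pd.date_range("2020-01-01", periods=7, freq="D")
time_features = time_index_features(week)
print(time_features)
```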
- class tseda.features.StatisticalFeatureExtractor
Bases: object
Extract statistical features from a TimeSeries.
The extractor is stateless. It operates on the non-NaN values of the series.
- extract(ts, entropy)
Return a single-row pandas.DataFrame of features.
- Parameters:
ts (TimeSeries)
entropy (bool)
- Return type:
pandas.DataFrame
Examples
>>> import pandas as pd, numpy as np
>>> from tseda import TimeSeries
>>> from tseda.features.statistical import StatisticalFeatureExtractor
>>> idx = pd.date_range("2020", periods=100, freq="D")
>>> ts = TimeSeries(np.arange(100.0), index=idx)
>>> df = StatisticalFeatureExtractor().extract(ts)
>>> round(float(df["linear_slope"].iloc[0]), 1)
1.0
>>> round(float(df["linear_r2"].iloc[0]), 2)
1.0
- extract(ts, *, entropy=True)
Compute statistical features for ts.
- Parameters:
ts (TimeSeries) – Input series.
entropy (bool, optional) – When True (default), compute approximate entropy and sample entropy. These are O(n²); set False for large series (n > 2000) to save time.
- Returns:
One row, columns:
- Distribution: mean, std, var, skewness, kurtosis, min, max, range, median, iqr, mad, cv, trimmed_mean, q25, q75, q05, q95.
- Complexity: turning_points_ratio, mean_crossing_rate, flatness_ratio. If entropy=True: approx_entropy, sample_entropy.
- Linear structure: lag1_acf, linear_slope, linear_r2.
- Nonlinearity: n_peaks, n_troughs.
- Return type:
pandas.DataFrame
- Raises:
TypeError – If ts is not a TimeSeries.
ValueError – If ts has fewer than 4 non-NaN observations.
Examples
>>> import pandas as pd, numpy as np
>>> from tseda import TimeSeries
>>> from tseda.features.statistical import StatisticalFeatureExtractor
>>> idx = pd.date_range("2020", periods=5, freq="D")
>>> ts = TimeSeries([1.0, 2.0, 1.5, 2.5, 2.0], index=idx)
>>> df = StatisticalFeatureExtractor().extract(ts, entropy=False)
>>> float(df["mean"].iloc[0])
1.8
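The linear-structure columns can be approximated in plain numpy. The sketch below is not tseda's implementation. It assumes the slope is fitted against the sample position 0, 1, 2, ... after dropping NaNs, that R² is the usual coefficient of determination, and that lag1_acf is the biased lag-1 autocorrelation estimate. The function name linear_structure is hypothetical.

```python
import numpy as np


def linear_structure(x) -> dict:
    """Hypothetical helper: trend slope, trend R^2, and lag-1 autocorrelation."""
    x = np.asarray(x, dtype=float)
    x = x[~np.isnan(x)]                      # operate on non-NaN values only
    t = np.arange(x.size, dtype=float)       # assumption: regress on sample position

    slope, intercept = np.polyfit(t, x, deg=1)
    residuals = x - (slope * t + intercept)

    centered = x - x.mean()
    ss_tot = float(np.sum(centered**2))
    ss_res = float(np.sum(residuals**2))

    # A constant series has no variance to explain; report zeros in that case.
    r2 = 1.0 - ss_res / ss_tot if ss_tot > 0 else 0.0
    lag1 = float(np.sum(centered[1:] * centered[:-1]) / ss_tot) if ss_tot > 0 else 0.0
    return {"linear_slope": float(slope), "linear_r2": r2, "lag1_acf": lag1}


ramp = linear_structure(np.arange(100.0))
print(ramp)
```

For a perfect ramp such as np.arange(100.0), this sketch gives a slope and R² that both round to 1.0, in line with the documented example for linear_slope and linear_r2.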
- class tseda.features.SpectralFeatureExtractor
Bases: object
Extract frequency-domain features from a TimeSeries.
The extractor is stateless. NaN values in the series are replaced by linear interpolation before FFT analysis.
- extract(ts, n_bands)
Return a single-row pandas.DataFrame of spectral features.
- Parameters:
ts (TimeSeries)
n_bands (int)
- Return type:
pandas.DataFrame
Examples
>>> import pandas as pd, numpy as np
>>> from tseda import TimeSeries
>>> from tseda.features.spectral import SpectralFeatureExtractor
>>> idx = pd.date_range("2020", periods=128, freq="h")
>>> ts = TimeSeries(np.sin(2 * np.pi * np.arange(128) / 24), index=idx)
>>> df = SpectralFeatureExtractor().extract(ts)
>>> "spectral_centroid" in df.columns
True
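The class description states that NaN values are filled by linear interpolation before the FFT. One way to do that with numpy alone is sketched below. The helper name interpolate_nans is hypothetical, and the handling of leading or trailing NaNs (held at the nearest observed value, which is what np.interp does) is an assumption, since the documentation does not say how edges are treated.

```python
import numpy as np


def interpolate_nans(x) -> np.ndarray:
    """Hypothetical helper: fill NaNs by linear interpolation over sample position."""
    x = np.asarray(x, dtype=float).copy()
    missing = np.isnan(x)
    if missing.all():
        raise ValueError("series contains no finite observations")
    positions = np.arange(x.size)
    # np.interp holds the first/last observed value for NaNs at the edges.
    x[missing] = np.interp(positions[missing], positions[~missing], x[~missing])
    return x


filled = interpolate_nans([1.0, np.nan, 3.0, np.nan, np.nan, 6.0])
print(filled)  # [1. 2. 3. 4. 5. 6.]
```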
- extract(ts, *, n_bands=3)
Compute spectral features for ts.
- Parameters:
ts (TimeSeries) – Input series. NaN values are linearly interpolated before the FFT.
n_bands (int, optional) – Number of equal-width frequency bands for band power features. Default 3 (low / mid / high). Must be >= 1.
- Returns:
One row, columns:
- Energy: total_power, band_power_0 … band_power_{n_bands-1}.
- Shape: spectral_centroid, spectral_bandwidth, spectral_rolloff_0.5, spectral_rolloff_0.85, spectral_entropy, spectral_flatness.
- Peak: dominant_freq, dominant_period, n_spectral_peaks.
- Return type:
pandas.DataFrame
- Raises:
TypeError – If ts is not a TimeSeries.
ValueError – If the series has fewer than 8 non-NaN observations or n_bands < 1.
Examples
>>> import pandas as pd, numpy as np
>>> from tseda import TimeSeries
>>> from tseda.features.spectral import SpectralFeatureExtractor
>>> idx = pd.date_range("2020", periods=256, freq="D")
>>> ts = TimeSeries(np.cos(2*np.pi*np.arange(256)/7), index=idx)
>>> df = SpectralFeatureExtractor().extract(ts)
>>> int(round(float(df["dominant_period"].iloc[0])))
7
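To see why a weekly cosine yields a dominant period of about 7, the peak of the FFT power spectrum can be located by hand. This is a minimal numpy sketch, not the extractor's code. It assumes the period is expressed in samples, that the mean is removed first, and that the zero-frequency bin is excluded from the peak search. The function name dominant_period is hypothetical.

```python
import numpy as np


def dominant_period(x, sample_spacing: float = 1.0) -> float:
    """Hypothetical helper: period of the strongest non-zero frequency."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()                                  # remove the constant offset
    power = np.abs(np.fft.rfft(x)) ** 2               # one-sided power spectrum
    freqs = np.fft.rfftfreq(x.size, d=sample_spacing)
    peak = 1 + int(np.argmax(power[1:]))              # skip the zero-frequency bin
    return float(1.0 / freqs[peak])


weekly = np.cos(2 * np.pi * np.arange(256) / 7)
print(dominant_period(weekly))
```

With 256 samples the frequency 1/7 falls between two FFT bins, so the recovered period is close to 7 rather than exactly 7. That is why the documented example rounds the result before comparing.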
Temporal feature extraction for time series.
Extracts calendar-based and time-index features from a
TimeSeries. All features are deterministic functions
of the datetime index — no statistical estimation required.
Two categories are produced:
Calendar features: year, month, day, hour, day-of-week, quarter, and boolean flags (is_weekend, is_month_start, is_month_end).
Cyclic encodings: sine/cosine projections of periodic calendar fields (month, day-of-week, hour) so that month 12 and month 1 are close in feature space.
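The effect of a cyclic encoding can be checked numerically. The sketch below maps a periodic value onto the unit circle and compares distances between months. It is an illustration only: the scaling 2π · value / period is an assumption, and tseda may use a different offset or period convention. The names cyclic_encode and encoded_distance are hypothetical.

```python
import numpy as np


def cyclic_encode(values, period: float):
    """Hypothetical helper: project periodic values onto the unit circle."""
    angle = 2.0 * np.pi * np.asarray(values, dtype=float) / period
    return np.sin(angle), np.cos(angle)


months = np.arange(1, 13)
month_sin, month_cos = cyclic_encode(months, period=12)


def encoded_distance(a: int, b: int) -> float:
    """Euclidean distance between months a and b in (sin, cos) space."""
    i, j = a - 1, b - 1
    return float(np.hypot(month_sin[i] - month_sin[j], month_cos[i] - month_cos[j]))


print(encoded_distance(12, 1))  # neighbours on the circle: small distance
print(encoded_distance(12, 6))  # opposite sides of the circle: largest distance
```

On the raw integer scale, December and January are 11 units apart. In the encoded space they are as close as any other pair of consecutive months.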
Classes
- TemporalFeatureExtractor
Stateless extractor returning a
pandas.DataFrame.
Examples
>>> import pandas as pd, numpy as np
>>> from tseda import TimeSeries
>>> from tseda.features.temporal import TemporalFeatureExtractor
>>> idx = pd.date_range("2020-01-01", periods=10, freq="D")
>>> ts = TimeSeries(np.arange(10.0), index=idx)
>>> df = TemporalFeatureExtractor().extract(ts)
>>> list(df.columns[:4])
['year', 'month', 'day', 'dayofweek']
- class tseda.features.temporal.TemporalFeatureExtractor
Bases: object
Extract calendar and cyclic time features from a TimeSeries.
The extract method, its parameters, returned columns, and examples are identical to those listed under tseda.features.TemporalFeatureExtractor above.
Statistical feature extraction for time series.
Extracts a rich set of statistical descriptors that characterise the
distribution, complexity, and structure of a TimeSeries.
All features are computed in pure numpy.
Feature groups
Distribution — mean, std, skewness, kurtosis, quantiles, range, CV.
Spread / Robust — MAD, trimmed mean, IQR.
Complexity — approximate entropy, sample entropy, turning points ratio, mean-crossing rate.
Linear structure — lag-1 autocorrelation, linear-trend slope and R².
Nonlinearity — number of peaks and troughs, flatness ratio.
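Several of the descriptors in these groups have short numpy formulations. The sketch below covers MAD, IQR, mean-crossing rate, and turning points ratio. The definitions are common textbook choices and are assumptions here, not the library's confirmed formulas: MAD is the median absolute deviation from the median (unscaled), a mean crossing is a change of side relative to the mean between consecutive samples, and a turning point is an interior sample where the first difference changes sign. The function name robust_and_complexity is hypothetical.

```python
import numpy as np


def robust_and_complexity(x) -> dict:
    """Hypothetical helper: robust spread and simple complexity measures."""
    x = np.asarray(x, dtype=float)
    x = x[~np.isnan(x)]
    if x.size < 3:
        raise ValueError("need at least 3 non-NaN observations")

    median = np.median(x)
    q25, q75 = np.percentile(x, [25, 75])
    mad = float(np.median(np.abs(x - median)))  # unscaled median absolute deviation

    # Fraction of consecutive pairs that lie on opposite sides of the mean.
    above = x > x.mean()
    mean_crossing_rate = float(np.mean(above[1:] != above[:-1]))

    # Fraction of interior samples where the direction of change flips.
    diffs = np.diff(x)
    turning_points = int(np.sum(diffs[1:] * diffs[:-1] < 0))
    turning_points_ratio = turning_points / (x.size - 2)

    return {
        "mad": mad,
        "iqr": float(q75 - q25),
        "mean_crossing_rate": mean_crossing_rate,
        "turning_points_ratio": float(turning_points_ratio),
    }


zigzag = robust_and_complexity([1.0, 3.0, 2.0, 4.0, 3.0, 5.0])
print(zigzag)
```

A zigzag series turns at every interior point, so its turning points ratio is 1.0, while a monotone ramp has a ratio of 0.0.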
Classes
- StatisticalFeatureExtractor
Stateless extractor.
Examples
>>> import pandas as pd, numpy as np
>>> from tseda import TimeSeries
>>> from tseda.features.statistical import StatisticalFeatureExtractor
>>> rng = np.random.default_rng(0)
>>> idx = pd.date_range("2020", periods=200, freq="D")
>>> ts = TimeSeries(rng.standard_normal(200), index=idx)
>>> df = StatisticalFeatureExtractor().extract(ts)
>>> "mean" in df.columns and "std" in df.columns
True
>>> df.shape[0]
1
- class tseda.features.statistical.StatisticalFeatureExtractor
Bases: object
Extract statistical features from a TimeSeries.
The extractor is stateless. It operates on the non-NaN values of the series.
The extract method, its parameters, returned columns, and examples are identical to those listed under tseda.features.StatisticalFeatureExtractor above.
Spectral feature extraction for time series.
Computes frequency-domain descriptors of a TimeSeries
using the FFT power spectrum. All computations use pure numpy / scipy.
Feature groups
Energy / Power — total spectral power, power in low / mid / high bands.
Shape — spectral centroid, bandwidth, rolloff frequency, spectral entropy.
Peak — dominant frequency, dominant period, number of spectral peaks.
Temporal — spectral flatness (ratio of geometric to arithmetic mean power).
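The shape and energy descriptors can be sketched from a one-sided FFT power spectrum. This is an illustration under assumptions, not tseda's implementation: the zero-frequency bin is dropped, the centroid is the power-weighted mean frequency, the entropy is the Shannon entropy of the normalised spectrum divided by its maximum, the flatness is the geometric mean over the arithmetic mean of the power (with a small constant to avoid log(0)), and the bands are near-equal splits of the frequency bins. The function name spectral_shape is hypothetical.

```python
import numpy as np


def spectral_shape(x, n_bands: int = 3) -> dict:
    """Hypothetical helper: a few frequency-domain descriptors of a series."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    power = np.abs(np.fft.rfft(x))[1:] ** 2   # drop the zero-frequency bin
    freqs = np.fft.rfftfreq(x.size)[1:]       # cycles per sample

    total = float(power.sum())
    if total == 0.0:
        raise ValueError("constant series has no spectral content")
    p = power / total                         # normalised spectrum, sums to 1

    centroid = float(np.sum(freqs * p))
    nonzero = p[p > 0]
    entropy = float(-np.sum(nonzero * np.log(nonzero)) / np.log(p.size))

    eps = 1e-12                               # guards log(0) for empty bins
    shifted = power + eps
    flatness = float(np.exp(np.mean(np.log(shifted))) / np.mean(shifted))

    bands = [float(band.sum()) for band in np.array_split(power, n_bands)]
    return {
        "total_power": total,
        "band_power": bands,
        "spectral_centroid": centroid,
        "spectral_entropy": entropy,
        "spectral_flatness": flatness,
    }


rng = np.random.default_rng(0)
noise = spectral_shape(rng.standard_normal(512))
tone = spectral_shape(np.sin(2 * np.pi * np.arange(512) / 8))
print(noise["spectral_flatness"], tone["spectral_flatness"])
```

A pure tone concentrates its power in one bin, so its flatness and entropy are near 0. White noise spreads power across all bins and scores much higher on both.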
Classes
- SpectralFeatureExtractor
Stateless extractor.
Examples
>>> import pandas as pd, numpy as np
>>> from tseda import TimeSeries
>>> from tseda.features.spectral import SpectralFeatureExtractor
>>> idx = pd.date_range("2020", periods=256, freq="D")
>>> ts = TimeSeries(np.sin(2 * np.pi * np.arange(256) / 7), index=idx)
>>> df = SpectralFeatureExtractor().extract(ts)
>>> int(round(float(df["dominant_period"].iloc[0])))
7
- class tseda.features.spectral.SpectralFeatureExtractor
Bases: object
Extract frequency-domain features from a TimeSeries.
The extractor is stateless. NaN values in the series are replaced by linear interpolation before FFT analysis.
The extract method, its parameters, returned columns, and examples are identical to those listed under tseda.features.SpectralFeatureExtractor above.