Rolling aggregations

signal.moving_average(window, freq=None, min_periods=1)

Calculate the moving average of a signal. This is typically done to extract a cleaner signal from noisy data. For instance, daily credit card transaction data is too noisy to interpret directly, but smoothing it over, say, 90 days gives a more informative signal.

Parameters:
  • signal – The signal to calculate the moving average of.

  • window – The number of calendar days to calculate the moving average over (if ‘freq’ is not set), or the number of data points to calculate the moving average over (if ‘freq’ is set).

  • freq – Use None to interpret the window as a number of days. Specify a Pandas frequency to use a different unit. For example, if the frequency is M, the window is interpreted as a number of months. If this argument is set, it must be the same as the frequency of the signal.

  • min_periods – The minimum number of data points to require in order to calculate a value. Defaults to 1, which means that a “moving average” is calculated from the very first data point in the time series, even though it’s just an “average” of one data point (and then the next one is an average of 2 and so forth). To avoid noisy data in the beginning of the time series, increase this setting.
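As a rough illustration of how window, freq and min_periods interact, here is a sketch in plain pandas. It is not the library's implementation: the helper name moving_average_sketch and the sample data are hypothetical, and it assumes that a calendar-day window behaves like a pandas time-based rolling window.

```python
import pandas as pd


def moving_average_sketch(series, window, freq=None, min_periods=1):
    """Hypothetical pandas analogue of moving_average (an assumption)."""
    if freq is None:
        # Window counted in calendar days: use a time-based rolling window.
        return series.rolling(f"{window}D", min_periods=min_periods).mean()
    # Window counted in data points at the signal's own frequency.
    return series.rolling(window, min_periods=min_periods).mean()


# Hypothetical daily sample data.
dates = pd.date_range("2024-01-01", periods=5, freq="D")
daily = pd.Series([1.0, 2.0, 3.0, 4.0, 5.0], index=dates)

# min_periods=1: a value is produced from the very first data point.
ma_from_start = moving_average_sketch(daily, 3)

# min_periods=3: the first two points are dropped (NaN) until the window is full.
ma_full_window = moving_average_sketch(daily, 3, min_periods=3)
```

With min_periods=1 the first value is just the first data point itself, while with min_periods=3 the series starts only once three data points are available.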

Examples:

Retrieve the 90-day moving average of close price data, with a minimum of 70 data points required to average over:

close_price.moving_average(90, min_periods=70)

Retrieve the 3-month moving average of the US house price index, with a minimum of 3 data points to average over:

US_PurchaseOnlyHousePriceIndex_SA_Monthly.moving_average(3, 'MS', 3)

signal.smooth(window: int = 7)

Smooth the signal to remove noise and make the underlying trends easier to discern.

The signal is smoothed in three processing steps:

  1. fill missing values with zeros

  2. calculate the moving average with the given window length

  3. shift the time series to center the moving average in the middle of the window it’s calculated over

Parameters:

  • window – The number of calendar days to calculate the moving average over. The moving average is shifted backwards in time by floor((window-1)/2) days, so that each value is centered in its window. Defaults to 7, which is suitable for averaging out weekly seasonality in the signal.
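The three processing steps can be sketched in plain pandas as follows. This is an illustration only, not the library's implementation: the helper name smooth_sketch and the sample data are hypothetical, and it assumes a gap-free daily index and that a value is produced as soon as one data point is in the window.

```python
import numpy as np
import pandas as pd


def smooth_sketch(series, window=7):
    """Hypothetical pandas analogue of smooth (an assumption)."""
    # Step 1: fill missing values with zeros.
    filled = series.fillna(0)
    # Step 2: trailing moving average over `window` daily data points.
    averaged = filled.rolling(window, min_periods=1).mean()
    # Step 3: shift backwards by floor((window - 1) / 2) days to center the window.
    return averaged.shift(-((window - 1) // 2))


# Hypothetical daily sample data with one missing value.
dates = pd.date_range("2024-01-01", periods=5, freq="D")
spend = pd.Series([1.0, np.nan, 3.0, 4.0, 5.0], index=dates)

smoothed = smooth_sketch(spend, window=3)
```

In this sketch the shift leaves the last floor((window-1)/2) positions empty, since no later trailing average exists to move into them.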

Example:

To smooth a daily card spend signal with a 21-day window:

daily_spend_signal.smooth(21)

signal.rolling_aggregation(window, operation, freq=None, min_periods=1)

Calculate a rolling window operation in the time direction. This is a generalization of the moving_average operation.

Parameters:
  • window – The number of calendar days to apply the operation over (if ‘freq’ is not set), or the number of data points to apply the operation over (if ‘freq’ is set). The current point is included in the window.

  • operation – The operation carried out on the window. This can be a function (e.g. np.std), a lambda expression, or a string such as “mean”, “sum”, “max”, “min” or “std”.

  • freq – Use None to interpret the window as a number of days. Specify a Pandas frequency to use a different unit. For example, if the frequency is M, the window is interpreted as a number of months. If this argument is set, it must be the same as the frequency of the signal.

  • min_periods – The minimum number of data points required in the window for the transform to return a value at a given data point. With a setting above 1, some data points at the beginning of the interval are typically lost.
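To make the role of the operation argument concrete, here is a sketch in plain pandas that accepts either a string or a callable. It is not the library's implementation: the helper name rolling_aggregation_sketch and the sample data are hypothetical, and it assumes that a calendar-day window behaves like a pandas time-based rolling window.

```python
import numpy as np
import pandas as pd


def rolling_aggregation_sketch(series, window, operation, freq=None, min_periods=1):
    """Hypothetical pandas analogue of rolling_aggregation (an assumption)."""
    # Calendar days when freq is None, otherwise a count of data points.
    win = f"{window}D" if freq is None else window
    roller = series.rolling(win, min_periods=min_periods)
    if isinstance(operation, str):
        # Named aggregations such as "mean", "sum", "max", "min", "std".
        return roller.agg(operation)
    # Arbitrary callables receive the window values as a NumPy array.
    return roller.apply(operation, raw=True)


# Hypothetical daily relative price changes.
dates = pd.date_range("2024-01-01", periods=4, freq="D")
changes = pd.Series([0.01, -0.03, 0.02, -0.01], index=dates)

# Largest absolute move within each 2-point window (current point included).
largest_move = rolling_aggregation_sketch(
    changes, 2, lambda w: np.max(np.abs(w)), freq="D"
)

# The same window with a named operation.
rolling_max = rolling_aggregation_sketch(changes, 2, "max", freq="D")
```

Passing "mean" as the operation reproduces the moving average case, which is the sense in which this is a generalization of moving_average.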

Examples:

The largest daily absolute percentage price movement over the last four weeks:

close_price.relative_change(days=1)\
    .rolling_aggregation(28, lambda w: np.max(np.abs(w)), freq="D")

The largest reported sales over the last four quarters, assuming standard quarterly releases:

actual('sales').rolling_aggregation(4, "max", freq="Q")