Modifications

Transforms that modify a signal to make it easier to work with.

signal.truncate(before: str | None = None, after: str | None = None, between: tuple[str, str] | list[tuple[str, str]] | None = None)

Truncate the time series by removing the data points before, after and/or between the given date(s).

If a before date is given, then all values before this date are removed.

If an after date is given, then all values after this date are removed.

If between dates are given, the values between the given dates, inclusive, are removed.
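
The three rules above can be pictured with an ordinary pandas series. This is only an illustrative sketch, not the library's implementation: truncate_sketch is a hypothetical helper, and keeping a data point that falls exactly on the after date is an assumption.

```python
import numpy as np
import pandas as pd


def truncate_sketch(series, before=None, after=None, between=None):
    """Hypothetical pandas stand-in for the truncation rules described above."""
    idx = series.index
    keep = np.ones(len(series), dtype=bool)
    if before is not None:
        keep &= idx >= pd.Timestamp(before)  # a point on the `before` date stays
    if after is not None:
        keep &= idx <= pd.Timestamp(after)  # assumed: a point on the `after` date stays
    if between is not None:
        # Accept either one (start, end) pair or a list of pairs.
        pairs = [between] if isinstance(between, tuple) else between
        for start, end in pairs:
            inside = (idx >= pd.Timestamp(start)) & (idx <= pd.Timestamp(end))
            keep &= ~inside  # both end dates are removed (inclusive)
    return series[keep]


s = pd.Series(
    [1.0, 2.0, 3.0, 4.0],
    index=pd.to_datetime(["2020-12-31", "2021-01-01", "2023-06-30", "2024-01-01"]),
)
from_2021 = truncate_sketch(s, before="2021-01-01")
without_2023 = truncate_sketch(s, between=("2023-01-01", "2023-12-31"))
```

Here from_2021 keeps the point dated 2021-01-01, while without_2023 drops only the point dated 2023-06-30.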

Example:

Remove all data points before 1 Jan 2021. If there is a data point on 1 Jan 2021, this will remain.

signal.truncate(before='2021-01-01')

Remove all data points from 2023, but keep earlier and later data points.

signal.truncate(between=('2023-01-01', '2023-12-31'))

Remove all data points from April 2023 and August 2024, but keep all other data points.

signal.truncate(between=[('2023-04-01', '2023-04-30'), ('2024-08-01', '2024-08-31')])

signal.drop_last(n: int = 1, /)

Drop the last data point(s) in the time series.

For signals that produce multiple time series per evaluation entity, the last n rows with a valid value in any column are removed. The same dates are therefore removed from every time series.
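
The multi-series rule can be pictured with a pandas DataFrame holding one column per time series. This is a sketch under that assumption, not the library's implementation; drop_last_sketch is a hypothetical helper.

```python
import numpy as np
import pandas as pd


def drop_last_sketch(frame, n=1):
    """Hypothetical pandas stand-in: one column per time series."""
    if n <= 0:
        return frame
    # Dates on which at least one series has a valid (non-NaN) value.
    valid_dates = frame.dropna(how="all").index
    # Remove the last n of those dates from every column at once.
    return frame.drop(index=valid_dates[-n:])


frame = pd.DataFrame(
    {"a": [1.0, 2.0, 3.0], "b": [10.0, 20.0, np.nan]},
    index=pd.to_datetime(["2021-01-01", "2021-01-02", "2021-01-03"]),
)
trimmed = drop_last_sketch(frame)
```

Only 2021-01-03 is removed. Column b keeps its value on 2021-01-02, even though that is the last valid value of b, because the dropped dates are chosen across all columns together.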

signal.delay(align, days, weeks, months, years)

Delay (lag) a signal by a specified number of days, weeks, months or years.

Parameters:
  • align – Whether to align the delayed signal to the original (default False)

  • days – Number of days to delay

  • weeks – Number of weeks to delay

  • months – Number of months to delay

  • years – Number of years to delay
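
As a rough picture of what a lag does to the dates, the pandas sketch below moves every data point one calendar month later. It is an analogy only, not the library's implementation: it covers just the months case and does not model the align parameter.

```python
import pandas as pd

s = pd.Series(
    [10.0, 20.0],
    index=pd.to_datetime(["2021-01-15", "2021-02-15"]),
)

# Shift each timestamp one calendar month forward; the values are unchanged.
delayed = s.copy()
delayed.index = s.index + pd.DateOffset(months=1)
```

After the shift, the value 10.0 is dated 2021-02-15 and the value 20.0 is dated 2021-03-15.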

Example:

Delay a signal by 1 month:

signal.delay(months=1)

signal.clip(lower: float | None = None, upper: float | None = None)

Trim the values at the given threshold(s).

If a lower threshold is given, then all values below it are set to this value.

If an upper threshold is given, then all values above it are set to this value.
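
pandas offers a clip method with the same lower and upper arguments, so the two rules can be tried out on a plain series. This is an analogy for illustration; the signal transform is assumed, not confirmed, to behave identically.

```python
import numpy as np
import pandas as pd

values = pd.Series([-2.0, 0.0, 0.5, 7.0])

floored = values.clip(lower=1e-6)            # values below 1e-6 become 1e-6
capped = values.clip(upper=1.0)              # values above 1.0 become 1.0
bounded = values.clip(lower=0.0, upper=1.0)  # both thresholds at once

# Dividing by the floored series never divides by zero.
ratio = 1.0 / floored
```

Note that a lower threshold also raises negative values: -2.0 becomes 1e-6 in floored, which changes the sign of any ratio that uses it as a denominator.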

Example:

Clip small values in the denominator, in order not to get infinite values:

signal_1 / signal_2.clip(lower=1e-6)

signal.replace(to_replace, value)

Replace the values given in to_replace with value.

If to_replace and value are both single values, any instance of to_replace is replaced by value.

If to_replace is a list and value is a single value, each value equal to any of the values in to_replace is replaced by value.

If to_replace and value are both lists, they must have the same length, and each value which is equal to a value in to_replace is replaced by the corresponding element of value.

If to_replace is a dictionary, value should not be supplied. Any value equal to a key in the dictionary is replaced by the associated value.

Parameters:
  • to_replace – A single value, a list of values or a dictionary.

  • value – A single value or a list of values. If to_replace is a dictionary, this argument should not be provided.
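
The four argument combinations mirror those accepted by pandas Series.replace, so they can be tried out on a plain series. The sketch below is an analogy for illustration; the signal transform is assumed, not confirmed, to match pandas in every detail.

```python
import pandas as pd

s = pd.Series([0.0, 1.0, 2.0, 0.0])

# Single value replaced by a single value.
single = s.replace(0.0, 9.0)

# Any value in the list replaced by one single value.
many_to_one = s.replace([0.0, 1.0], 9.0)

# Lists of equal length: each value is replaced by its counterpart.
pairwise = s.replace([0.0, 1.0], [10.0, 11.0])

# Dictionary of old -> new values; no separate value argument.
mapped = s.replace({0.0: 10.0, 2.0: 20.0})
```

Values that do not appear in to_replace, such as 2.0 in the first three calls, pass through unchanged.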

signal.cumsum(*, start_date: str, freq: str, fiscal_entity: str)

Calculate the cumulative sum of a time series.

Without a frequency, this calculates the cumulative sum, starting at start_date (or 1900-01-01 if no start date has been provided).

If a frequency is provided, the signal calculates a periodic cumulative sum, which starts over at the beginning of each period.

Parameters:
  • start_date – The start date of the cumulative sum. (Not relevant if freq is provided.)

  • freq – A frequency, for example M or FQ.

  • fiscal_entity – For fiscal frequencies (e.g. FQ), the resource name of the entity whose fiscal calendar should be used. If this is not provided, the fiscal calendar of the evaluation entity is used.
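
The difference between a running sum from a start date and a periodic sum that restarts each period can be pictured in pandas. This is a sketch for illustration only: calendar months stand in for the M frequency, including the start date itself in the sum is an assumption, and fiscal calendars are not modelled.

```python
import pandas as pd

s = pd.Series(
    [1.0, 2.0, 3.0, 4.0],
    index=pd.to_datetime(["2021-01-10", "2021-01-20", "2021-02-05", "2021-02-25"]),
)

# Running sum over the points on or after a start date.
running = s[s.index >= pd.Timestamp("2021-01-20")].cumsum()

# Periodic sum: group by calendar month so the total restarts each month.
monthly = s.groupby(s.index.to_period("M")).cumsum()
```

Here running accumulates 2.0, 5.0, 9.0, while monthly accumulates 1.0, 3.0 within January and then restarts with 3.0, 7.0 in February.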

signal.cumprod(*, start_date: str, freq: str, fiscal_entity: str)

Calculate the cumulative product of a time series.

See signal.cumsum(...) above for documentation on the arguments.

signal.normalize(normalization_period)

Normalize the signal to zero mean and unit variance. Each time series is normalized separately.

The transform requires a normalization period, which is the time period over which the mean and the variance of the signal are calculated. The entire time series is then normalized using the mean and variance from the normalization period.

Parameters:
  • normalization_period – The time period over which the mean and variance of the signal are estimated. The period is specified as a tuple with a start date and an end date.

Example:

One use case for normalizing data is preparing the input to a model, since some models perform better when the input and output variables are normalized:

signal.normalize(('2017-01-01', '2018-12-31')).forecast()
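
The key point, that the statistics come from the normalization period but are applied to the entire series, can be pictured with pandas. This is a sketch for illustration, not the library's implementation: normalize_sketch is a hypothetical helper, and both the inclusive period ends and the population standard deviation (ddof=0) are assumptions.

```python
import pandas as pd


def normalize_sketch(series, normalization_period):
    """Hypothetical pandas stand-in for the normalization described above."""
    start, end = (pd.Timestamp(d) for d in normalization_period)
    # Estimate mean and standard deviation inside the period only
    # (assumed inclusive at both ends).
    window = series[(series.index >= start) & (series.index <= end)]
    mean = window.mean()
    std = window.std(ddof=0)  # assumed: population standard deviation
    # Apply those statistics to the whole series, not just the window.
    return (series - mean) / std


s = pd.Series(
    [1.0, 3.0, 5.0],
    index=pd.to_datetime(["2017-06-30", "2018-06-30", "2019-06-30"]),
)
normalized = normalize_sketch(s, ("2017-01-01", "2018-12-31"))
```

The two points inside the period end up with zero mean and unit variance, while the 2019 point, which lies outside the period, is scaled with the same statistics and so falls outside that range.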