Modifications
Transforms that modify a signal to make it easier to work with in various situations.
- signal.truncate(before: str | None = None, after: str | None = None, between: tuple[str, str] | list[tuple[str, str]])
Truncate the time series before and/or after the given date(s).
If a before date is given, then all values before this date are removed. If an after date is given, then all values after this date are removed. If between dates are given, the values between the given dates, inclusive, are removed.
Example:
Remove all data points before 1 Jan 2021. If there is a data point on 1 Jan 2021, this will remain.
signal.truncate(before='2021-01-01')
Remove all data points from 2023, but keep earlier and later data points.
signal.truncate(between=('2023-01-01', '2023-12-31'))
Remove all data points from April 2023 and August 2024, but keep all other data points.
signal.truncate(between=[('2023-04-01', '2023-04-30'), ('2024-08-01', '2024-08-31')])
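The boundary rules above can be sketched in plain Python. This is only an illustration of the documented semantics on a list of (date, value) pairs, not the library's implementation; the helper truncate() and the sample data are hypothetical.

```python
from datetime import date


def truncate(points, before=None, after=None, between=None):
    """Hypothetical stand-in for signal.truncate on (date, value) pairs."""
    # Accept either a single (start, end) tuple or a list of such tuples.
    if between is None:
        ranges = []
    elif isinstance(between, tuple):
        ranges = [between]
    else:
        ranges = list(between)

    lo = date.fromisoformat(before) if before else None
    hi = date.fromisoformat(after) if after else None
    spans = [(date.fromisoformat(s), date.fromisoformat(e)) for s, e in ranges]

    kept = []
    for d, v in points:
        if lo is not None and d < lo:
            continue  # strictly before the `before` date: removed
        if hi is not None and d > hi:
            continue  # strictly after the `after` date: removed
        if any(s <= d <= e for s, e in spans):
            continue  # inside an inclusive `between` range: removed
        kept.append((d, v))
    return kept


points = [
    (date(2020, 12, 31), 1.0),
    (date(2021, 1, 1), 2.0),
    (date(2023, 6, 15), 3.0),
    (date(2024, 1, 1), 4.0),
]
after_before = truncate(points, before="2021-01-01")
after_between = truncate(points, between=("2023-01-01", "2023-12-31"))
```

In this sketch the data point on 1 Jan 2021 survives before='2021-01-01', while the 2023 data point is removed by the inclusive between range.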
- signal.indexed(date: str = None, base: float = 100.0)
Scale the time series so that it gets the base value at the given date.
The date is optional: if a date is not provided, the start of the evaluation period is used as the index date. If you do not specify a date, the indexed() transform should be the last transform you apply, since other transforms will typically change the evaluation interval passed to the underlying signal. For example, signal.indexed().moving_average(14) would index the time series two weeks before the start of the evaluation interval you request, and only then calculate the moving average. On the other hand, signal.moving_average(14).indexed() would first calculate the moving average, and then properly index the result at the start of the evaluation period.
If there is no value on the requested date, the next data point is used instead, if any. If there are no subsequent data points, the last data point is used.
- Parameters:
date – The date on which the signal should be indexed.
base – The value on the indexation date.
Example:
Set the value to 100 on 1 January 2020:
signal.indexed('2020-01-01')
Set the value to 1 on 1 January 2020:
signal.indexed('2020-01-01', base=1)
Set the value to 100 at the start of the evaluation interval:
signal.indexed()
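As a rough illustration of the scaling and the fallback rule for a missing index date, here is a minimal plain-Python sketch. It assumes a non-empty, date-sorted list of (date, value) pairs and a non-zero reference value; the helper indexed() and the sample data are hypothetical and do not reflect how the library is implemented.

```python
from datetime import date


def indexed(points, index_date, base=100.0):
    """Hypothetical stand-in for signal.indexed on sorted (date, value) pairs."""
    target = date.fromisoformat(index_date)
    # Use the first data point on or after the index date;
    # if there is none, fall back to the last data point.
    reference = next((v for d, v in points if d >= target), points[-1][1])
    return [(d, v / reference * base) for d, v in points]


points = [
    (date(2019, 12, 31), 50.0),
    (date(2020, 1, 2), 80.0),   # first data point after 1 January 2020
    (date(2020, 1, 3), 100.0),
]
scaled = indexed(points, "2020-01-01")
```

Since there is no value on 1 January 2020 in the sample data, the value on 2 January becomes the reference and is mapped to the base value.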
- signal.drop_last(n: int = 1, /)
Drop the last data point(s) in the time series.
For signals producing multiple time series for each evaluation entity, note that it is the last n rows with a valid value in any column that are removed. That is, it is the same dates that are removed from all the time series.
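The multi-column rule can be illustrated with a small plain-Python sketch, in which each row is a date plus a dictionary of column values and None marks a missing value. The helper drop_last() below is hypothetical; in particular, leaving rows with no valid value in any column untouched is an assumption of the sketch.

```python
def drop_last(rows, n=1, /):
    """Hypothetical stand-in for signal.drop_last on (date, {column: value}) rows."""
    # Rows that have a valid (non-None) value in at least one column.
    valid = [
        i for i, (_, values) in enumerate(rows)
        if any(v is not None for v in values.values())
    ]
    # The last n such rows are removed from every column at once.
    to_drop = set(valid[-n:]) if n > 0 else set()
    return [row for i, row in enumerate(rows) if i not in to_drop]


rows = [
    ("2024-01-01", {"a": 1.0, "b": 2.0}),
    ("2024-01-02", {"a": 3.0, "b": None}),
    ("2024-01-03", {"a": None, "b": 4.0}),
]
trimmed = drop_last(rows, 2)
```

Here the rows for 2 and 3 January are removed for both columns, even though column b has no value on 2 January and column a has none on 3 January.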
- signal.delay(align, days, weeks, months, years)
Delay (lag) a signal by a specified number of days, weeks, months or years.
- Parameters:
align – Whether to align the delayed signal to the original (default False)
days – Number of days to delay
weeks – Number of weeks to delay
months – Number of months to delay
years – Number of years to delay
Example:
Delay a signal by 1 month:
signal.delay(months=1)
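A minimal sketch of what a lag means for the dates of a time series, assuming that delaying moves each observation forward in time by the given offset. The helper delay() is hypothetical and only covers days and weeks; months, years and the align option are left out because their exact behavior is not described here.

```python
from datetime import date, timedelta


def delay(points, days=0, weeks=0):
    """Hypothetical stand-in for signal.delay on (date, value) pairs."""
    shift = timedelta(days=days, weeks=weeks)
    # Assumption: each value becomes visible `shift` later than it was observed.
    return [(d + shift, v) for d, v in points]


points = [(date(2024, 1, 1), 1.0), (date(2024, 1, 2), 2.0)]
lagged = delay(points, weeks=1)
```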
- signal.clip(lower: float | None = None, upper: float | None = None)
Trim the values at the given threshold(s).
If a lower threshold is given, then all values below it are set to this value. If an upper threshold is given, then all values above it are set to this value.
Example:
Clip small values in the denominator, in order not to get infinite values:
signal_1 / signal_2.clip(lower=1e-6)
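The effect of clipping a denominator can be sketched on plain lists of floats. The helper clip() and the sample values are hypothetical and only mirror the documented thresholds.

```python
def clip(values, lower=None, upper=None):
    """Hypothetical stand-in for signal.clip on a list of floats."""
    clipped = []
    for v in values:
        if lower is not None and v < lower:
            v = lower  # values below the lower threshold are raised to it
        if upper is not None and v > upper:
            v = upper  # values above the upper threshold are lowered to it
        clipped.append(v)
    return clipped


numerator = [1.0, 2.0, 3.0]
denominator = [0.0, 0.5, 2.0]
safe_denominator = clip(denominator, lower=1e-6)
# Without clipping, the first division would raise ZeroDivisionError.
ratios = [n / d for n, d in zip(numerator, safe_denominator)]
```

The first ratio is very large but finite, which is the point of clipping before dividing.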
- signal.replace(to_replace, value)
Replaces the values in to_replace with the value.
If to_replace and value are both single values, any instance of to_replace is replaced by value.
If to_replace is a list and value is a single value, each value equal to any of the values in to_replace is replaced by value.
If to_replace and value are both lists, they must have the same length, and each value which is equal to a value in to_replace is replaced by the corresponding element of value.
If to_replace is a dictionary, value should not be supplied. Any value which is equal to a key in the dictionary is replaced by the associated value.
- Parameters:
to_replace – A single value, a list of values or a dictionary.
value – A single value or a list of values. If to_replace is a dictionary, this argument should not be provided.
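The four argument combinations can be summarized in a short plain-Python sketch operating on a list of values. The helper replace() is hypothetical; raising ValueError on invalid combinations is an assumption of the sketch, since the behavior in those cases is not documented here.

```python
def replace(values, to_replace, value=None):
    """Hypothetical stand-in for signal.replace on a list of values."""
    if isinstance(to_replace, dict):
        if value is not None:
            raise ValueError("value should not be supplied with a dictionary")
        mapping = dict(to_replace)
    elif isinstance(to_replace, list):
        if isinstance(value, list):
            if len(value) != len(to_replace):
                raise ValueError("to_replace and value must have the same length")
            mapping = dict(zip(to_replace, value))  # element-wise pairs
        else:
            mapping = {old: value for old in to_replace}  # many to one
    else:
        mapping = {to_replace: value}  # one to one
    # Values without a match are passed through unchanged.
    return [mapping.get(v, v) for v in values]


values = [0, 1, 2, 3]
single = replace(values, 0, 9)
many_to_one = replace(values, [0, 1], 9)
pairwise = replace(values, [0, 1], [10, 11])
from_dict = replace(values, {2: 20, 3: 30})
```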
- signal.cumsum(*, start_date: str, freq: str, fiscal_entity: str)
Calculate the cumulative sum of a time series.
Without a frequency, this calculates the cumulative sum, starting at start_date (or 1900-01-01 if no start date has been provided). If a frequency is provided, the signal calculates a periodic cumulative sum, which starts over at the beginning of each period.
- Parameters:
start_date – The start date of the cumulative sum. (Not relevant if freq is provided.)
freq – A frequency, for example M or FQ.
fiscal_entity – In the case of fiscal frequencies (e.g. FQ), the resource name of the entity whose fiscal calendar should be used. If this is not provided, the fiscal calendar of the evaluation entity is used.
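The difference between a running and a periodic cumulative sum can be sketched as follows. The helper cumsum() is hypothetical and supports only the M frequency, interpreted here as calendar months; skipping data points before start_date in the output is also an assumption of the sketch. Fiscal frequencies are not modeled.

```python
from datetime import date


def cumsum(points, start_date="1900-01-01", freq=None):
    """Hypothetical stand-in for signal.cumsum on sorted (date, value) pairs."""
    if freq not in (None, "M"):
        raise NotImplementedError("this sketch only handles freq=None or 'M'")
    start = date.fromisoformat(start_date)
    result, total, period = [], 0.0, None
    for d, v in points:
        if freq is None:
            if d < start:
                continue  # assumption: points before start_date are skipped
        else:
            key = (d.year, d.month)
            if key != period:
                total, period = 0.0, key  # start over at each new month
        total += v
        result.append((d, total))
    return result


points = [
    (date(2024, 1, 10), 1.0),
    (date(2024, 1, 20), 2.0),
    (date(2024, 2, 5), 3.0),
    (date(2024, 2, 25), 4.0),
]
running = cumsum(points)
monthly = cumsum(points, freq="M")
```

With the sample data, the running sum keeps growing across the month boundary, while the monthly sum restarts on 5 February.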
- signal.cumprod(*, start_date: str, freq: str, fiscal_entity: str)
Calculate the cumulative product of a time series.
See signal.cumsum(...) above for documentation on the arguments.
- signal.normalize(normalization_period)
Normalize the signal to zero mean and unit variance. Each time series is normalized separately.
The transform requires a normalization period, which is the time period over which the mean and the variance of the signal are calculated. The entire time series is then normalized using the mean and variance from the normalization period.
- Parameters:
normalization_period – The time period over which the mean and variance of the signal are estimated. The period is specified as a tuple with a start date and an end date.
Example:
One use case for normalization is preparing time series for modeling, since some models perform better when the input and output variables are normalized:
signal.normalize(('2017-01-01', '2018-12-31')).forecast()
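To make the role of the normalization period concrete, here is a minimal plain-Python sketch: the mean and standard deviation come from the period only, and are then applied to the whole series. The helper normalize() is hypothetical; treating the period as inclusive at both ends and using the population standard deviation are assumptions, since neither is specified here.

```python
from datetime import date
from statistics import fmean, pstdev


def normalize(points, normalization_period):
    """Hypothetical stand-in for signal.normalize on (date, value) pairs."""
    start, end = (date.fromisoformat(s) for s in normalization_period)
    # Statistics are estimated on the normalization period only.
    window = [v for d, v in points if start <= d <= end]
    mean, std = fmean(window), pstdev(window)
    # The entire series is then shifted and scaled with those statistics.
    return [(d, (v - mean) / std) for d, v in points]


values = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
points = [(date(2017, 1 + i, 1), v) for i, v in enumerate(values)]
points.append((date(2019, 1, 1), 11.0))  # outside the normalization period
normalized = normalize(points, ("2017-01-01", "2018-12-31"))
```

The data point from 2019 does not influence the mean or the variance, but it is still rescaled with them.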