Advanced signal transformations

signal.surge(short_period, long_period, how, decay=None)

Calculate the surge of a signal. The surge is calculated as the ratio between a moving average over a short window and a moving average over a longer window. The moving averages can be exponentially weighted.

When how is 'ma', the short and long period arguments are the window sizes in number of data points.

When how is 'ewm', the short and long period arguments are passed to the pandas ewm function, and the decay parameter controls which decay method they specify.

The signal is not resampled before the surge is calculated, so the window period parameters specify a number of data points, not a number of days.

Parameters:
  • short_period – The parameter to use for the (exponentially weighted) moving average in the numerator.

  • long_period – The parameter to use for the (exponentially weighted) moving average in the denominator.

  • how – A string specifying what kind of moving average to use, either ‘ewm’ for an exponentially weighted moving average, or ‘ma’ for a regular moving average.

  • decay – The type of decay parameter to use in the ewm function. It can only be set when the how parameter is 'ewm', and it can be one of 'com', 'halflife', 'span' and 'alpha'. See the documentation of the pandas ewm function for further details.

Examples:

Calculate the surge in close price using an exponentially weighted mean with half-lives of 5 and 20 data points:

close_price.surge(5, 20, 'ewm', 'halflife')

Calculate the surge in transactions using a regular moving average with windows 28 and 91:

TransactionVolume.surge(28, 91, 'ma')
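
For intuition, the calculation can be approximated directly in pandas. The following is a minimal sketch, assuming a plain pandas Series as input; the function name is illustrative and not part of the API:

import pandas as pd

def surge_sketch(series: pd.Series, short_period, long_period,
                 how='ma', decay=None) -> pd.Series:
    # Surge is the ratio of a short moving average to a long
    # moving average of the same series.
    if how == 'ma':
        short_avg = series.rolling(short_period).mean()
        long_avg = series.rolling(long_period).mean()
    elif how == 'ewm':
        # decay names one of the pandas ewm decay parameters:
        # 'com', 'halflife', 'span' or 'alpha'.
        short_avg = series.ewm(**{decay: short_period}).mean()
        long_avg = series.ewm(**{decay: long_period}).mean()
    else:
        raise ValueError("how must be 'ma' or 'ewm'")
    return short_avg / long_avg
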
signal.momentum(days, limit=10)

Calculate the “momentum” of a signal, defined as its relative change versus a certain number of days ago. This method is closely related to the relative_change function, but is specialized for daily time series, and automatically forward fills the underlying signal to get a smooth momentum signal without gaps.

Parameters:
  • signal – The signal to transform.

  • days – The number of days between current and prior period. For example 365 for YoY or 91 for approximately 3 months.

  • limit – The maximum number of days to forward fill the underlying signal.

Examples:

Retrieve the 3-month momentum of the share price:

close_price.momentum(91)

To get the year-over-year change in close price (smoothed with a moving average):

close_price.moving_average(90, min_periods=70).momentum(365)
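
A rough pandas equivalent, shown only for intuition (a sketch assuming a daily-indexed pandas Series; the function name is illustrative):

import pandas as pd

def momentum_sketch(series: pd.Series, days: int, limit: int = 10) -> pd.Series:
    # Forward fill gaps in the daily series (at most `limit` days),
    # then compute the relative change versus `days` days earlier.
    filled = series.resample('D').ffill(limit=limit)
    return filled / filled.shift(days) - 1
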
signal.ewm(halflife=None, *, span=None)

Calculate the exponentially weighted mean of the signal. This is a wrapper around the pandas ewm method.

NaN values are removed from the time series before the pandas ewm function is called. It is therefore recommended to ensure that the data has a known frequency, without missing values, before performing this operation.

Parameters:
  • halflife – The number of data points over which the weight decays to half its value.

  • span – Decay specified in terms of span.

Example:

The exponentially weighted mean of the close price:

close_price.ewm(halflife=14)
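
Since this is a thin wrapper, the behaviour can be approximated in pandas as follows (a sketch; exactly one of halflife and span should be given, mirroring the pandas ewm requirements):

import pandas as pd

def ewm_sketch(series: pd.Series, halflife=None, *, span=None) -> pd.Series:
    # Remove NaN values first, as described above, then delegate
    # to the pandas ewm method and take the mean.
    return series.dropna().ewm(halflife=halflife, span=span).mean()
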
signal.sector_neutral(level, transform_type='winsorized_robust')

Cross-sectional normalization of the signal, applied separately for each sector and each date. The sectors refer to the FactSet RBICS classification; a level from 1 to 6 must be specified with the level argument.

Note that the signal is normalized across the set of companies that it is evaluated for. This means that if the signal is evaluated in Plotter, the result will depend on which companies are selected to be plotted. If only a single company is selected, then the result will be a flat line with the value 0, because that’s what a single value is normalized to. For sensible results, select at least three companies within the same sector when plotting in Plotter.

The intended usage is for alpha signals. When evaluated in an alpha test or a portfolio strategy, the signal is evaluated across all the companies included in such alpha test / strategy, which means the alpha signal will be sector neutral for that run.

There are different methods available for performing the normalization:

  • ‘standard’: sklearn StandardScaler/Z-score

  • ‘robust’: sklearn RobustScaler

  • ‘winsorized_standard’: ‘standard’ followed by a soft capping

  • ‘winsorized_robust’: ‘robust’ followed by a soft capping

  • ‘uniform’: sklearn QuantileTransformer

  • ‘minmax’: sklearn MinMaxScaler, scaled to (-1, 1)

Parameters:
  • level – The level (1-6) of the FactSet RBICS classification to use.

  • transform_type – Defaults to ‘winsorized_robust’.

Example:

A use case for normalizing data by sector is to avoid sector biases in alpha tests and portfolio strategies. By making an alpha signal neutral by sector, the overall portfolio will be better balanced across sectors:

transactions_yoy.sector_neutral(level=2)
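
Under the hood, the normalization is cross-sectional: on each date, the signal values are normalized within each sector. A simplified sketch of the 'standard' (z-score) variant, assuming a DataFrame with date, sector and value columns (the column names are illustrative):

import pandas as pd

def sector_zscore_sketch(df: pd.DataFrame) -> pd.Series:
    # Z-score the value column within each (date, sector) group.
    grouped = df.groupby(['date', 'sector'])['value']
    return (df['value'] - grouped.transform('mean')) / grouped.transform('std')
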
signal.country_neutral(transform_type='winsorized_robust')

Cross-sectional normalization of the signal, applied separately for each country and each date. Each company is assigned to the country of the exchange where it has its primary listing.

Parameters:
  • transform_type – Defaults to ‘winsorized_robust’. See sector_neutral above for available options.

A use case for normalizing data by country is to avoid country biases in alpha tests and portfolio strategies. By making an alpha signal neutral by country, the overall portfolio will be better balanced across countries.
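
For example, to make an alpha signal neutral by country, using the default transform:

transactions_yoy.country_neutral()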

signal.group_normalize(group_signal, transform_type='winsorized_robust')

Cross-sectional normalization of the signal, applied separately for each group of companies and each date. A separate signal, group_signal, is used to determine the groups.

The most typical use case would be to group companies by sector (using the sector_revenue() signal as the group_signal). However, for this use case there is the shorthand sector_neutral() method above.

Parameters:
  • group_signal – The signal that determines the groups by which the signal will be normalized.

  • transform_type – Defaults to ‘winsorized_robust’. See sector_neutral above for available options.

A use case for normalizing data by groups is to avoid biases in alpha tests and portfolio strategies. Typical use cases would be to normalize by sectors or by countries.
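
An illustrative call, grouping by sector with the sector_revenue() signal mentioned above (this sketch assumes sector_revenue() can be called without arguments):

transactions_yoy.group_normalize(sector_revenue())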

signal.factor_neutral(tag, *factors, screen_frequency)

Neutralizes the effect of one or more factors on a signal by estimating a linear regression with the main signal as the target variable and the factors as the regressors. The output is the residual of the regression.

The set of companies to estimate the regression over must be specified with the tag argument. The tag can be either a fixed set of companies or a screen (where the set of companies changes over time). The signal can be evaluated for companies that were not part of the estimation. Typically, you would use the same tag as the one that is used in the alpha test or portfolio strategy, so that the alpha signal is neutralized for the same set of companies.

The regression is run separately per day. For each date and each factor, the factor values are taken from the latest date where that factor is available (for any entities). This means that e.g. a monthly factor signal can be used with a daily alpha signal as the main signal. However, no forward filling is performed, so the user is responsible for forward filling the factors if necessary.

A typical use case is to subtract the effect of style factors from an alpha signal.

Parameters:
  • tag – the resource name of the tag or screen that defines the group of companies.

  • factors – one or several factors to neutralize.

  • screen_frequency – the frequency with which to evaluate the screen, if a screen is used to define the group of companies. Defaults to 'M' for monthly evaluation. Alternatives include 'W' for weekly or 'Q' for quarterly update of the screen.

Examples:

Remove the growth style factor from an alternative data YoY growth signal:

TransactionDataYoY.factor_neutral(
  'tags/user:2a46627e-4e03-49f2-808e-d6fdadebbc61', factor_loading_growth)

Remove the size and momentum style factors from an alternative data YoY surge signal, using a screen that is updated quarterly:

TransactionDataYoY.factor_neutral(
  'screens/1265',
  factor_loading_short_term_momentum,
  factor_loading_size,
  screen_frequency='Q')
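
Conceptually, the neutralization is an ordinary least-squares regression computed separately per date, keeping the residual. A minimal numpy sketch of a single date's cross-section (illustrative, not the platform implementation):

import numpy as np

def residualize_one_date(signal_values: np.ndarray,
                         factor_matrix: np.ndarray) -> np.ndarray:
    # Regress the signal on the factors (with an intercept) and
    # return the residuals, i.e. the factor-neutral signal.
    X = np.column_stack([np.ones(len(signal_values)), factor_matrix])
    coef, *_ = np.linalg.lstsq(X, signal_values, rcond=None)
    return signal_values - X @ coef
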
signal.group_transform(transform_type, centering_type, centering_weight_signal, tag, screen_frequency)

The group_transform operation performs a cross-sectional normalization of a signal across a set of companies, as specified by a tag or screen.

Parameters:
  • transform_type – the transform to use, either 'robust', 'winsorized_robust', 'standard', 'winsorized_standard', 'uniform', 'minmax' or 'identity'. Defaults to 'identity'.

  • centering_type – the centering to use, either 'weighted_mean', 'mean', 'median' or 'none'. Defaults to 'none', which results in a centering given by the transform_type.

  • centering_weight_signal – the signal used as weights when specifying 'weighted_mean' for the centering_type. Must be specified when centering_type='weighted_mean', otherwise ignored.

  • tag – the resource name of the tag or screen that defines the group of companies.

  • screen_frequency – the frequency with which to evaluate the screen, if a screen is used to define the group of companies. Defaults to 'M' for monthly evaluation. Alternatives include 'W' for weekly or 'Q' for quarterly update of the screen.

The transform types are described below:

  • 'robust' – Applies sklearn’s RobustScaler. This transform subtracts the median, and then scales the data according to the quantile range from the 25th to the 75th percentile.

  • 'winsorized_robust' – First applies sklearn’s RobustScaler, which subtracts the median and then scales the data according to the quantile range from the 25th to the 75th percentile. Then soft-clipping is performed at ±3 standard deviations, by applying the tanh function. The number of standard deviations can be customized by specifying stdev_lim.

  • 'standard' – Applies sklearn’s StandardScaler, which subtracts the mean and then scales to unit variance.

  • 'winsorized_standard' – Optionally, outliers can be removed at the very beginning, by setting the parameter q_remove to the fraction of the data that should be removed at both ends; e.g. q_remove=0.01 will remove the first and the last percentiles. By default, this step is not applied. Then applies sklearn’s StandardScaler, which subtracts the mean and then scales to unit variance. Finally, soft-clipping is performed at ±3 standard deviations, by applying the tanh function. The number of standard deviations can be customized by specifying stdev_lim.

  • 'uniform' – Transforms the data to percentiles using sklearn’s QuantileTransformer. By default a uniform distribution is produced (evenly spread between 0 and 1). Alternatively, a normal distribution can be obtained by specifying output_distribution='normal'.

  • 'minmax' – Scales the data linearly to the range [-1, 1].

  • 'identity' – No transform, which means that only centering is applied. Rarely used in practice.
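
To make the winsorized variants concrete, here is a sketch of 'winsorized_robust' applied to a single date's cross-section, built from the sklearn and numpy pieces named above (illustrative only):

import numpy as np
from sklearn.preprocessing import RobustScaler

def winsorized_robust_sketch(values, stdev_lim=3.0):
    # Subtract the median and scale by the interquartile range,
    # then soft-clip at ±stdev_lim by applying tanh.
    scaled = RobustScaler().fit_transform(
        np.asarray(values, dtype=float).reshape(-1, 1)).ravel()
    return stdev_lim * np.tanh(scaled / stdev_lim)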

Examples:

When doing a group transform, you must specify which group of companies should be used. This is done in the following way, using the ID of a company tag:

Market_Cap_mUSD.group_transform(
  'winsorized_robust',
  tag='tags/user:2a46627e-4e03-49f2-808e-d6fdadebbc61')

It can also be done using the ID of a company screen. In this case, the set of companies included in the group is updated periodically based on the criteria of the screen. By default the screen is updated monthly, but this can be changed with the screen_frequency parameter, for instance to a quarterly update:

Market_Cap_mUSD.group_transform(
  'winsorized_robust',
  tag='screens/1265',
  screen_frequency='Q')

Apply the uniform transform:

Market_Cap_mUSD.group_transform(
  'uniform',
  tag='tags/user:2a46627e-4e03-49f2-808e-d6fdadebbc61')