Pandas operations

Signals can be transformed with pandas operations using df_function() or series_function(). These functions first evaluate one or more signals, and then invoke a function with the signal results. When the signals are evaluated for entities, the function is invoked once per entity and receives only that entity's results.

Generally, it is easier and less error-prone to use the built-in native DSL methods wherever they are available. These functions are meant only as a workaround for special cases that cannot be expressed with the normal DSL methods, and should be used with care.

df_function(function, pre_extend_months=15)(signal_1, signal_2, ..., signal_n)

Evaluate the given signals and transform the signal results with the given function. The given function is invoked once for each entity (or once, if the evaluation is without entities), with one DataFrame argument for each given signal.

If only a single signal is given and it produces an empty DataFrame, the given function is not invoked.

Parameters:
  • function – The transformation function.

  • pre_extend_months – Number of months to pre-extend signal evaluation.

  • signal_j – For j=1, 2, …, n, the signals to evaluate.
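The per-entity invocation described above can be illustrated with a rough emulation in plain pandas. This is only a sketch under stated assumptions, not how df_function() is implemented: the helper apply_per_entity, the dictionary of per-entity results, and the entity names are all hypothetical.

```python
import pandas as pd


def apply_per_entity(function, results_by_entity):
    """Hypothetical emulation: call `function` once per entity.

    `results_by_entity` maps an entity name to a tuple of DataFrames,
    one DataFrame per evaluated signal.
    """
    output = {}
    for entity, frames in results_by_entity.items():
        # A single signal with an empty result: the function is not invoked.
        if len(frames) == 1 and frames[0].empty:
            continue
        output[entity] = function(*frames)
    return output


index = pd.date_range("2020-01-31", periods=3, freq="D")
results = {
    "entity_a": (pd.DataFrame({"sales": [1.0, 2.0, 3.0]}, index=index),),
    "entity_b": (pd.DataFrame({"sales": []}, dtype=float),),  # empty result
}

doubled = apply_per_entity(lambda df: df * 2, results)
```

Here the function runs for entity_a only, because the single result for entity_b is empty.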

series_function(function, pre_extend_months=15)(signal_1, signal_2, ..., signal_n)

Evaluate the given signals and transform the signal results with the given function. The given function is invoked once for each entity (or once, if the evaluation is without entities), with one Series argument for each given signal.

Note that an error is raised if any of the given signals produces more than a single time series per entity.

If only a single signal is given and it produces an empty Series, the given function is not invoked.

Parameters:
  • function – The transformation function.

  • pre_extend_months – Number of months to pre-extend signal evaluation.

  • signal_j – For j=1, 2, …, n, the signals to evaluate.
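The single-time-series restriction can be pictured as reducing a one-column result to a Series. The helper below is a hypothetical illustration in plain pandas; the name to_single_series and the choice of ValueError are assumptions, and the error actually raised by series_function() may differ.

```python
import pandas as pd


def to_single_series(df):
    """Hypothetical helper: reduce a result with one column to a Series."""
    if df.shape[1] > 1:
        # More than one time series for the entity: refuse to guess.
        raise ValueError("expected at most one time series per entity")
    if df.shape[1] == 0:
        return pd.Series(dtype=float)
    return df.iloc[:, 0]


index = pd.to_datetime(["2020-03-31", "2020-06-30"])
one_column = pd.DataFrame({"sales": [1.0, 2.0]}, index=index)
two_columns = pd.DataFrame({"a": [1.0, 2.0], "b": [3.0, 4.0]}, index=index)

series = to_single_series(one_column)
```

Calling to_single_series(two_columns) raises the error instead of returning a Series.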

Examples:

Retrieve fundamental sales, but keep only values between 2017-01-01 and 2018-12-31 (note that this is more easily accomplished with the signal.truncate(...) method):

series_function(
    lambda series: series.truncate('2017-01-01', '2018-12-31'))(fundamental("sales"))
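To see what the lambda above does to a single entity's result, it can be applied to a small Series in plain pandas. The sample dates and values are made up for illustration.

```python
import pandas as pd

index = pd.to_datetime(
    ["2016-12-31", "2017-06-30", "2018-06-30", "2018-12-31", "2019-03-31"]
)
sales = pd.Series([1.0, 2.0, 3.0, 4.0, 5.0], index=index)

# Same call as in the lambda: both bounds are inclusive.
truncated = sales.truncate("2017-01-01", "2018-12-31")
```

The values dated 2016-12-31 and 2019-03-31 are dropped, while the value dated exactly 2018-12-31 is kept.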

Combine fundamental sales and actual sales (note that this is more easily accomplished with the signal.combine_first(...) method):

series_function(lambda fundamental, actual:
                pandas.concat([fundamental, actual], axis=1)
                    .apply(sorted, key=np.isnan, axis=1)
                    .apply(lambda array: array[0]))\
    (fundamental('sales', alignment='fp', currency='USD'),
     actual('sales', alignment='fp', currency='USD'))
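The body of the lambda above can be tried on sample data in plain pandas. For each row, sorted(..., key=np.isnan) moves non-NaN values to the front while preserving their order, so taking the first element yields the fundamental value when present and the actual value otherwise. The sample values are made up for illustration.

```python
import numpy as np
import pandas as pd

index = pd.to_datetime(["2020-03-31", "2020-06-30", "2020-09-30"])
fundamental_sales = pd.Series([100.0, np.nan, np.nan], index=index)
actual_sales = pd.Series([110.0, 210.0, np.nan], index=index)

combined = (
    pd.concat([fundamental_sales, actual_sales], axis=1)
    # Per row: stable sort that puts non-NaN values before NaN values.
    .apply(sorted, key=np.isnan, axis=1)
    # Keep the first element of each sorted row.
    .apply(lambda values: values[0])
)

# pandas' own combine_first gives the same values for this sample.
expected = fundamental_sales.combine_first(actual_sales)
```

The first row keeps the fundamental value, the second falls back to the actual value, and the third stays NaN because both inputs are missing.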

Retrieve FactSet segment sales for segments named “China” or “Greater China” (note that this is more easily accomplished with the signal[...] operator or the signal.filter_columns(...) method):

df_function(lambda df: df.loc[:, df.columns.isin(["China", "Greater China"])])\
    (graph_signal("factset.actual_sales_quarterly",
                  ["HAS_SECURITY", "HAS_PRIMARY_REGIONAL", "factset.HAS_SEGMENT"]))
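The column selection in the lambda above can be checked on a small DataFrame in plain pandas. The segment names other than "China" and "Greater China", and all values, are made up for illustration.

```python
import pandas as pd

index = pd.to_datetime(["2020-03-31", "2020-06-30"])
segments = pd.DataFrame(
    {
        "China": [1.0, 2.0],
        "Europe": [3.0, 4.0],
        "Greater China": [5.0, 6.0],
    },
    index=index,
)

# Boolean mask over the column labels; unmatched names are simply ignored.
selected = segments.loc[:, segments.columns.isin(["China", "Greater China"])]
```

Selecting with isin() does not fail when one of the names is absent for an entity, whereas indexing with a list of labels, as in segments[["China", "Greater China"]], raises a KeyError if a label is missing.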

Calculate the number of days from today (UTC) until the next quarterly report publishing date. This example nests two series_function calls: the inner function keeps only the publishing dates on or after today, and the outer function is not invoked if the inner result is an empty Series, which happens when there are no such publishing dates:

series_function(
    lambda series: pandas.Series([(series.index[0]-today()).days], index=[today()]))\
    (series_function(lambda series: series.loc[series.index>=today()])(publication_date()))
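The two steps can be traced in plain pandas with a fixed date standing in for the DSL's today(). The variable today, the sample publishing dates, and the placeholder values are assumptions made for illustration; as in the example above, the dates are taken from the index of the Series.

```python
import pandas as pd

today = pd.Timestamp("2021-05-10")  # fixed stand-in for the DSL's today()
publication_dates = pd.Series(
    1.0, index=pd.to_datetime(["2021-02-15", "2021-05-20", "2021-08-18"])
)

# Inner step: keep only publishing dates on or after today.
upcoming = publication_dates.loc[publication_dates.index >= today]

# Outer step: days from today to the first remaining date, indexed by today.
days_until = pd.Series([(upcoming.index[0] - today).days], index=[today])

# With no upcoming dates the inner result is empty, and index[0] would fail.
no_upcoming = publication_dates.loc[
    publication_dates.index >= pd.Timestamp("2022-01-01")
]
```

Because upcoming.index[0] raises an IndexError on an empty Series, the outer step is only safe when it is skipped for empty input, which is what nesting the two calls achieves.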