Resampling and aggregations
Signals which modify the timestamps and possibly the values of the time series. We use the following terminology:
 reindex
To change the index of the time series without changing the values.
 resample
To change the index of the time series and apply some rule to calculate new values based on the original time series.
 upsample
To change the index so that it has higher resolution, for example, modifying a monthly time series to a daily time series.
 downsample
To change the index so that it has lower resolution, for example modifying a daily time series to a monthly time series.
 signal.resample(freq, method, *, upsample_daily=False, min_points=1, allow_partial_start=True, allow_partial_end=True, partial, ffill=0)
Resample the signal to the given frequency. The interval used for for resampling is defined by the
freq
parameter, and the method defines how to aggregate the signal.Intermediate upsampling
If
upsample_daily
is set to True, the resampling is performed by first upsampling the underlying signal to daily frequency, and then downsampling it to the desired frequency.If the method is
sum
ormean_times_days
, the upsample operation divides the value in each interval by the number of days in the interval, so that the value is “spread over” the interval. The consequence of this is that when the signal is downsampled again, the contribution of each interval in the old time series is proportional to the overlap between the new and the old intervals.If the frequency of the underlying time series cannot be detected, the time series is assumed to already have daily resolution, and it is not altered before downsampling.
Exabel recommends setting
upsample_daily=True
if the frequency of the original data does not evenly divide the frequency being converted to. For example, if resampling a weekly time series weekly to months, there will be weeks crossing the boundaries of the months. Similarly, if resampling from a weeks/months to a fiscal frequency (likeFQ
), it is advisable to first upsample to daily. Parameters:
freq – The frequency to aggregate on, for example
M
orFQ
.method – Method to use for aggregation. Standard Pandas methods like
'mean'
,'median'
,'sum'
,'bfill'
(backfill) are supported, as well as the special'mean_times_days'
method. See table below.upsample_daily – Whether to upsample the data to daily resolution before downsampling.
min_points – Minimum number of data points required within a time period for it to be included. Note that if
upsample_daily=True
, the data points are counted after the upsampling, so the requirement applies to the number of daily data points.allow_partial_start – Whether to include the first period if it is only partially covered by the time series.
allow_partial_end – Whether to include the last period if it is only partially covered by the time series.
partial – Whether to include the first and the last period if they are only partially covered by the time series. (Included as a shortcut to set both
allow_partial_start
andallow_partial_end
.)ffill – The number of periods to forward fill the result. The forward fill is performed after aggregation. This can be used to fill gaps where there were not any or not enough data points to calculate the aggregate.
Supported methods:
method 
Description 

mean 
The mean of the values. 
median 
The median of the values. 
sum 
The sum of the values. 
mean_times_days 
The mean of the values multiplied by the number of days within the period.

count 
The number of values. 
std 
The standard deviation of the values. 
var 
The variance of the values. 
sem 
The unbiased standard error of the mean. 
first 
The first value. 
last 
The last value. 
min 
The minimum value. 
max 
The maximum value. 
bfill 
Backfill the values. This is typically done to upsample a time series 
ffill 
Forward fill the values. This is typically done to upsample a time series

nearest 
Take the value from the nearest date. 
interpolate 
Interpolate the values between the two nearest dates. 
The statistical measures 'std'
, 'var'
and 'sem'
normalize by N1 by default (delta
degrees of freedom is 1).
The parameters min_points
, allow_partial_start
and allow_partial_end
only apply to
aggregation / downsampling. It is an error to specify any of these parameters with the upsampling
methods 'bfill'
, 'ffill'
, 'nearest'
or 'interpolate'
.
Examples:
To calculate a monthly average from daily data:
signal.resample('M', 'mean')
To calculate a monthly sum from daily data, and not include the latest partial month:
signal.resample('M', 'sum', partial=False)
To sum up weekly data to fiscal quarters, with intermediate upsampling to daily (to handle weeks that fall into 2 quarters), and not include the latest partial quarter:
signal.resample('FQ', 'sum', upsample_daily=True, partial=False)
To sum up daily data to fiscal quarters, where missing data points are imputed with the mean, and we require at least 70 data points in each quarter, use:
signal.resample('FQ', 'mean_times_days', min_points=70)
To upsample quarterly data ('Q'
) to monthly, where we are certain there are no missing data
points, use:
signal.resample('M', 'bfill')
 signal.upsample(from_freq='auto', to_freq='D', divide=False)
Upsample the given signal to a higher resolution (daily by default). This is typically necessary as the first step when resampling between two frequencies which do not evenly divide each other.
For example, when resampling from weekly to monthly resolution, there will be weeks which overlap with two months. In such a situation, naïvely doing a sum over all the values with dates within the month, would mean that the value for week 52 of 2022, which runs from Dec 26, 2022 to Jan 1, 2023, would be included in the sum for January 2023, because the week is represented by the last day (Jan 1). The better approach would be to divide the value for that week over the seven days of the week, and include 6/7 of the value in the sum for December 2022, and only 1/7 in the sum for January 2023.
The same situation arises when resampling from either weekly or monthly time series to fiscal quarters.
The upsample logic needs to know which frequency the original data represents. The default setting of
from_freq='auto'
will attempt to detect the frequency of the original data, but if the frequency cannot be determined from the data, an error will be returned. If the frequency is known, it is therefore safer to specify an explicit frequency to upsample from. For instance, if a time series consists of a single data point for the date 31 Dec 2022, it is impossible to determine if that data point represents the quarter Q4 2022, the month of December 2022 or the week ending 31 Dec 2022.The
'auto'
setting only works for regular calendars such as weekly or monthly data, not for fiscal calendars. Parameters:
from_freq – Either
auto
or a frequency, such asM
orFQ
. as well as the fiscal calendars like'FQ'
or'FQ/FS'
or'FY'
.to_freq – The frequency to upsample to.
divide – Whether to divide the value by the number of days in the period. If set to
False
, the original value will be used for all dates within the corresponding period. If set toTrue
, the original value will be divided by the number of days in the period (so that e.g.signal.upsample('M').resample('M', 'sum')
brings back the original time series). Generally, if the data represents a sum of some sort, thendivide
should be set toTrue
, whereas if it represents an average, it should be set toFalse
.
Examples:
To upsample weekly data to daily, where each daily value is the value of the corresponding week:
signal.upsample('W')
To upsample monthly sums to daily numbers, where each daily value is the monthly value divided by the number of days in the month:
signal.upsample('M', divide=True)
To upsample a quarterly time series to monthly, where each monthly value is the value of the corresponding quarter:
signal.upsample('Q', 'M')
To resample monthly sums to fiscal quarters, where we also handle partial data with
'mean_times_days'
, use:
signal.upsample('M', divide=True).resample('FQ', 'mean_times_days')
To resample monthly averages to fiscal quarter averages, use:
signal.upsample('M').resample('FQ', 'mean')
 signal.calendarize(freq: str, method: str = 'blended', from_freq: str = None)
Calendarize financial data to the standard calendar.
This is typically used to compare KPIs across companies that use different fiscal calendars. The dates used in the calendarized output are as follows:
freq
Dates
Q
31 March, 30 June, 30 September and 31 December
S
30 June and 31 December
Y
31 December
Three different algorithms for calendarization are supported:
method
Description
blended
Default method. Recommended for “flow” KPIs such as revenue.
Calculated as a blend based on the number of days of overlap between the original
periods and the calendar periods. For example, if a company ends its financial year
in November, the calendarized value for 2022 is 334/365 of the value for the fiscal
year 2022 plus 31/365 of the value for the fiscal year 2023.fiscal
Recommended for “stock” KPIs such as balance sheet items, where it is desired
to take the value from the closest fiscal period.
This uses the company’s fiscal calendar to ensure a 11 match between fiscal
periods and calendar periods, such that each fiscal period is aligned to at most
one calendar period.
However, this works only on signals that have the same frequency as the company’s
fiscal calendar (quarterly/semiannual). For signals with other frequencies
(daily/weekly etc), usenearest
.nearest
Uses Pandas’ resampling algorithm to use the nearest value for each calendarized
data point. This is similar tonearest
, but has a drawback where companies with
irregular calendars may see two quarters mapped to the same calendar period.
This is equivalent tosignal.resample(freq, 'nearest')
. Parameters:
freq – The frequency to calendarize to:
'Q'
(quarterly),'S'
(semiannual) or'Y'
(annual). The aliases'H'
(for'S'
) and'A'
(for'Y'
) are also supported.method – How the calendarization is perfomed. The options are
'blended'
,'nearest'
and'fiscal'
.from_freq – Optionally, a hint about the frequency of the original time series. This is only required if the frequency cannot be detected from the time series itself.
 signal.reindex_like(index_signal, fill_method='ffill')
Returns the signal reindexed so that it has the same timeindex as
index_signal
given the methodfill_method
. This is a wrapper around the pandas reindex method. Parameters:
index_signal – A supplied signal whose timeindex is used for the resampling of signal.
fill_method – The operation used to align series to index_signal before sampling. Valid values are ‘None’, ‘pad’/’ffill’, ‘backfill’/’bfill’, ‘nearest’.
Example:
The price return on earnings release dates:
close_price.relative_change(days=1).reindex_like(fs_actual('sales',alignment='rd'))
 signal.filled_daily(fetch_prior_data=120, fetch_prior_data_from=None, stop_at_last_valid_value=False, fetch_later_data=7, limit=None, allow_forward_fill_for_current_dates=False, *, fill_value=None)
Transforms a signal by changing the frequency to daily and forward filling missing values.
 Parameters:
fetch_prior_data – the number of days of prior data to retrieve in order to forward fill; the default amount is sufficient for quarterly data (with quarters up to seventeen weeks), as long as there are no missing data
fetch_prior_data_from – the start date to use for forward filling. If set, it overrides the
fetch_prior_data
argument.stop_at_last_valid_value – if
True
, values will not be forward filled after the last available nonnull valuefetch_later_data – the number of days after the eval period to retrieve data for to determine the last available nonnull value. Only used if
stop_at_last_valid_value
isTrue
.limit – the maximum number of consecutive null values that are filled. If not set, all null values are filled (assuming there is a nonnull value before it). If set, must be set to 1 or higher.
allow_forward_fill_for_current_dates – if True, and the difference in days between the current date and the date with the last nonNaN value is less than the given limit, values will be forward filled even if
stop_at_last_valid_value
isTrue
. When evaluating with a version, the current date is assumed to be the version.fill_value – By default, the previous value is forward filled. If fill_value is specified, then this value will be used instead. The dates filled are exactly the same, and all the other parameters such as
limit
andallow_forward_fill_for_current_dates
apply as usual.
Examples:
Forward fill data in a monthly signal up to 35 days to make it a daily signal:
my_monthly_data.filled_daily(limit=35)
Fill in missing values with 0, up to 6 days forwards, but not past the last value in the series:
my_daily_signal.filled_daily(limit=6, fill_value=0, stop_at_last_valid_value=True)
 signal.align_to_dates(index_signal, max_forward=None, max_backward=None, pre_extend=None, post_extend=None)
Returns the signal with the values aligned to the dates of the index_signal.
For each value in the signal we find the date in the index_signal which is closest in time and assigns that date to it, provided that it satisfies the movement constraints (given by
max_forward
andmax_backward
).If there are two dates that are equally far away, the value is moved forwards.
If there is no date available within the movement constraints for some value, the value is discarded.
If there are multiple values in the signal that have the same the date as its closest one, the value that is closest in time is aligned to the date; the other values are discarded. If there is a tie between two dates, the one that would be move forward is used, while the other one is discarded.
 Parameters:
index_signal – A signal whose timeindex is used for the resampling of signal.
max_forward – Maximum number of days a data point can be moved forwards. Default is
None
, which means no limit.max_backward – Maximum number of days a data point can be moved backwards. Default is
None
, which means no limit.pre_extend – Offset to preextend the signal evaluation period with. By default, this offset is set equal to the
max_forward
constraint ifmax_forward
is not None and 1 year otherwise.post_extend – Offset to postextend the signal evaluation period with. By default, this offset is set equal to the
max_backward
constraint ifmax_backward
is not None and 1 year otherwise.
Examples:
Align the signal my_quarterly_signal
to fundamental sales, allowing data points to move forwards
10 days and backwards 5 days:
my_quarterly_signal.align_to_dates(fundamental('sales'), max_forward=10, max_backward=5)
Align the signal my_quarterly_signal
to fundamental sales, allowing data points to move
unrestricted:
my_quarterly_signal.align_to_dates(fundamental('sales'),\
pre_extend=pandas.DateOffset(months=6))