Data API signals

The main function for retrieving data in the Exabel platform is data(). This function is used both to retrieve data from vendors you subscribe to, and to retrieve data that you or someone in your organization has uploaded to your namespace.

In order to use the data() function, you need to know the name of the signal you want to retrieve data for. Each signal has a unique name, consisting of a namespace and a signal identifier, separated by a period, such as namespace.signal_name. The namespace is either the namespace belonging to a vendor you subscribe to, or the namespace of your organization. The signal identifier is an identifier the uploader has chosen to identify the signal in question. The namespaces allow different vendors to upload data with the same name.
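As a rough illustration of the naming scheme, the sketch below splits a signal name into its two parts. The helper `split_signal_name` is hypothetical and not part of the Exabel platform. It assumes the namespace itself contains no period, so the name is split at the first one.

```python
def split_signal_name(name: str) -> tuple:
    """Split 'namespace.signal_name' into (namespace, signal identifier).

    Hypothetical helper for illustration only; assumes the namespace
    contains no period, so the split happens at the first one.
    """
    namespace, separator, identifier = name.partition(".")
    if not separator or not namespace or not identifier:
        raise ValueError(f"expected 'namespace.signal_name', got {name!r}")
    return namespace, identifier


# Two vendors can use the same signal identifier without clashing,
# because the full names differ in the namespace part.
print(split_signal_name("vendor_a.revenue"))
print(split_signal_name("vendor_b.revenue"))
```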

When you evaluate a data() signal, you must evaluate it for an entity. An entity can be, for example, a company, a brand belonging to a company, or a sector consisting of multiple companies. When a signal is evaluated for such an entity, you get the time series associated with that signal for the given entity, if such a time series exists.

If you evaluate an unmodified data() signal, you will get the time series associated with the evaluation entities themselves. It is, however, also possible to use the signal to traverse the entity graph, which makes this a very powerful signal. By using the methods specified below, you can traverse the graph from the evaluation entities to other entities and filter or group entities according to how they are connected to other entities.

data(signal)

Retrieve the time series for the given signal that are associated with the evaluation entities.

Parameters:

signal – A data API signal name, in the format namespace.signal_name.

signal.for_type(*entity_types)

Retrieve the time series that are associated with entities of the specified entity type, following relationship paths from the evaluation entity.

This signal follows the shortest path(s) from an evaluation entity to entities of the requested type, restricted to the combined data model of the evaluation entity namespace, the target entity namespace and the global namespace.

Although it is usually enough to specify a single entity type, it is possible to pass multiple entity types to the method. In that case, the signal follows a path through nodes with each of the specified entity types, in order, in the data model. The last entity type is the entity type for which the data signal is evaluated.

Parameters:

entity_types – The entity types to fetch the signal for, in the form namespace.entity_type, separated by commas.
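The traversal described above can be pictured with a small toy model. This is only a sketch under stated assumptions, not the platform's implementation: the graph, the entity names and the functions `nearest_of_type` and `toy_for_type` are all hypothetical, relationships are treated as undirected, and namespace restrictions are ignored.

```python
# Hypothetical toy entity graph: entity -> directly related entities.
RELATED = {
    "company_a": ["brand_x", "brand_y"],
    "brand_x": ["company_a", "product_1"],
    "brand_y": ["company_a", "product_2"],
    "product_1": ["brand_x"],
    "product_2": ["brand_y"],
}
# Hypothetical entity types for the entities above.
ENTITY_TYPE = {
    "company_a": "ns.company",
    "brand_x": "ns.brand",
    "brand_y": "ns.brand",
    "product_1": "ns.product",
    "product_2": "ns.product",
}


def nearest_of_type(start, entity_type):
    """Return all entities of entity_type at the shortest distance from start."""
    seen = {start}
    frontier = [start]
    while frontier:
        matches = sorted(e for e in frontier if ENTITY_TYPE[e] == entity_type)
        if matches:
            return matches
        next_frontier = []
        for entity in frontier:
            for neighbour in RELATED[entity]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    next_frontier.append(neighbour)
        frontier = next_frontier
    return []


def toy_for_type(start, *entity_types):
    """Visit each entity type in order; the last type is the target type."""
    current = [start]
    for entity_type in entity_types:
        reached = set()
        for entity in current:
            reached.update(nearest_of_type(entity, entity_type))
        current = sorted(reached)
    return current


print(toy_for_type("company_a", "ns.brand"))
print(toy_for_type("company_a", "ns.brand", "ns.product"))
```

In this toy model, passing two entity types first reaches the brands and then continues from each brand to its products, so the products are the entities the signal would be evaluated for.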

signal.graph_filter(entity_type: str, entities: str | list[str])

Retrieve data only for entities that are connected to the specified entities.

This signal starts at each evaluation entity and follows the shortest path(s) to entities of type entity_type. Only those evaluation entities that are connected to one or more of the specified entities are retained.

Parameters:
  • entity_type – The entity type to filter by, in the form namespace.entity_type.

  • entities – The entities to filter by, either as a single entity 'namespace.entity' or as a list ['namespace.entity1', 'namespace.entity2'].
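The filtering step can be sketched with plain dictionaries. This is a toy model, not the platform's implementation: the data and the function `toy_graph_filter` are hypothetical, and the shortest-path lookup is reduced to a precomputed mapping from each product to the category it is connected to.

```python
from typing import Union

# Hypothetical toy data: the category each product is connected to,
# and a short time series per product.
CATEGORY_OF = {"shirt": "clothes", "jeans": "clothes", "laptop": "electronics"}
SERIES = {"shirt": [3, 4], "jeans": [1, 2], "laptop": [7, 8]}


def toy_graph_filter(series: dict, connected_to: dict,
                     entities: Union[str, list]) -> dict:
    """Keep only the series whose entity is connected to one of `entities`."""
    # Accept a single entity or a list of entities, as the parameter allows.
    wanted = {entities} if isinstance(entities, str) else set(entities)
    return {
        entity: values
        for entity, values in series.items()
        if connected_to.get(entity) in wanted
    }


print(toy_graph_filter(SERIES, CATEGORY_OF, "clothes"))
print(toy_graph_filter(SERIES, CATEGORY_OF, ["clothes", "electronics"]))
```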

signal.graph_group_by(entity_type, operation)

Group and aggregate the time series by a specific entity type.

This signal follows the shortest path(s) to find the entities of the given entity type. The evaluation entities are then grouped according to which entity they are connected to, and the time series for each group are aggregated, using the specified operation.

Parameters:
  • entity_type – The entity type to group by, in the form namespace.entity_type.

  • operation – The aggregation operation, either "sum" or "mean".
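The grouping and aggregation can be sketched in the same toy style. Everything here is hypothetical (the data and the function `toy_group_by`), and the sketch assumes that all time series cover the same dates, so that values can be aggregated position by position. How the platform aligns series with different dates is not covered here.

```python
from collections import defaultdict
from statistics import fmean

# Hypothetical toy data: the category each product is connected to,
# and a short time series per product (same dates for every product).
CATEGORY_OF = {"shirt": "clothes", "jeans": "clothes", "laptop": "electronics"}
SERIES = {"shirt": [3, 4], "jeans": [1, 2], "laptop": [7, 8]}


def toy_group_by(series: dict, group_of: dict, operation: str) -> dict:
    """Group the series by connected entity and aggregate each group."""
    if operation not in ("sum", "mean"):
        raise ValueError('operation must be "sum" or "mean"')
    groups = defaultdict(list)
    for entity, values in series.items():
        groups[group_of[entity]].append(values)
    aggregate = sum if operation == "sum" else fmean
    # zip(*members) yields the values of all group members for one date.
    return {
        group: [aggregate(points) for points in zip(*members)]
        for group, members in groups.items()
    }


print(toy_group_by(SERIES, CATEGORY_OF, "sum"))
print(toy_group_by(SERIES, CATEGORY_OF, "mean"))
```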

Examples:

Retrieve time series directly for an evaluation entity:

data('ns.metric')

Retrieve time series for all brands that are connected to the evaluation entity:

data('ns.metric').for_type('ns.brand')

Retrieve time series for products that are connected to the evaluation entity through a brand (the last entity type is the one the signal is evaluated for):

data('ns.metric').for_type('ns.brand', 'ns.product')
data('ns.metric').for_type('ns.product').for_type('ns.brand')  # Equivalent expression.

Calculate the sum of all the products sold that are connected to the evaluation entity:

data('namespace.products_sold').for_type('namespace.product').sum()

Retrieve all the data for products sold in the “Clothes” category:

data('ns.products_sold').for_type('ns.product').graph_filter('ns.category', 'ns.clothes')

Retrieve all the transactions in stores in a set of countries:

data('ns.transactions').for_type('ns.company_country')\
    .graph_filter('ns.country', ['ns.de', 'ns.fr'])

Get the sales of products, summed by category:

data('ns.products_sold').for_type('ns.product').graph_group_by('ns.category', 'sum')

In some cases, when performing the signal.graph_filter(..) operation, you can leave out the signal.for_type(..) traversal. The platform then attempts to identify an associative entity type between the evaluation entity type and the filtered entity type.

[Figure: Example data model]

In the example data model above, it is sufficient to evaluate the following signal for a company to get the time series for the “Teacher” occupation:

data('ns.jobs').graph_filter('ns.occupation', 'ns.teacher')

Here, the signal is requested for the evaluation company and the “Teacher” occupation, so the platform infers that the data must be fetched from the associative ns.company_and_occupation entity type.

If the associative entity is a combination of multiple entity types, one can also apply multiple filters:

data('ns.transactions')\
    .for_type('ns.category_country_channel')\
    .graph_filter('ns.category', 'ns.shoes')\
    .graph_filter('ns.channel', 'ns.online')
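The effect of chaining filters can be sketched as follows, assuming that each filter narrows the result of the previous one. The associative entities and the function `keep_connected` are hypothetical and only model the idea; they are not part of the platform.

```python
# Hypothetical associative entities: each one combines a category,
# a country and a channel.
ENTITIES = [
    {"ns.category": "ns.shoes", "ns.country": "ns.de", "ns.channel": "ns.online"},
    {"ns.category": "ns.shoes", "ns.country": "ns.fr", "ns.channel": "ns.store"},
    {"ns.category": "ns.hats", "ns.country": "ns.de", "ns.channel": "ns.online"},
    {"ns.category": "ns.shoes", "ns.country": "ns.fr", "ns.channel": "ns.online"},
]


def keep_connected(entities: list, entity_type: str, allowed: list) -> list:
    """Keep the associative entities connected to one of the allowed entities."""
    return [entity for entity in entities if entity[entity_type] in allowed]


# Apply the two filters one after the other, mirroring the chained
# graph_filter calls in the expression above.
selected = keep_connected(ENTITIES, "ns.category", ["ns.shoes"])
selected = keep_connected(selected, "ns.channel", ["ns.online"])
print([entity["ns.country"] for entity in selected])
```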