Data Source: Filtered data
Filtered data defines a type of data source that is a compound data source. That is, it includes another data source definition in its definition and outputs a modified stream of data. Specifically, a filtered data source contains one or more conditions that are applied to data from the input data source to determine whether that data is output by the compound (filtered) data source.
For example, a signed message data source may submit a stream of transactions providing hourly data for several tickers, like this:
DATA_SOURCE = SignedMessage{ pubkey=0xA45e...d6 }, gives:
{ ticker: 'TSLA', timestamp: '2021-12-31T00:00:00Z', price: 420.69}
{ ticker: 'BTCUSD', timestamp: '2021-12-31T00:00:00Z', price: 42069.303}
{ ticker: 'ETHGAS', timestamp: '2021-12-31T00:00:00Z', price: 100.1}
...
{ ticker: 'TSLA', timestamp: '2021-12-31T01:00:00Z', price: 469.20}
{ ticker: 'BTCUSD', timestamp: '2021-12-31T01:00:00Z', price: 52069.42}
{ ticker: 'ETHGAS', timestamp: '2021-12-31T01:00:00Z', price: 101.0}
...
{ ticker: 'TSLA', timestamp: '2021-12-31T02:00:00Z', price: 440.20}
{ ticker: 'BTCUSD', timestamp: '2021-12-31T02:00:00Z', price: 501.666}
{ ticker: 'ETHGAS', timestamp: '2021-12-31T02:00:00Z', price: 90.92}
... and so on ...
In order to use messages from this signer as, for example, the settlement trigger and data for a futures market, Vega needs a way to define a data source that will trigger settlement when a price is received for the correct underlying and the right expiry timestamp. For example:
DATA_SOURCE = Filter { data=SignedMessage{ pubkey=0xA45e...d6 }, filters=[
Equal { key='ticker', value='TSLA' },
Equal { key='timestamp', value='2021-12-31T23:59:59Z' }
]}
gives:
{ ticker: 'TSLA', timestamp: '2021-12-31T23:59:59Z', price: 694.20 }
Unlike the first example, this would be useful for triggering final settlement of a futures market.
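The filtering behaviour described above can be sketched in a few lines of Python. This is only an illustration, not Vega's implementation: the `filtered` function and the `(key, value)` tuple representation of an Equal filter are hypothetical names chosen for this sketch, and the stream is a shortened copy of the sample messages.

```python
from typing import Any, Dict, Iterable, Iterator, List, Tuple

Message = Dict[str, Any]
# Hypothetical representation: each Equal filter is a (key, value) pair.
EqualFilter = Tuple[str, Any]


def filtered(source: Iterable[Message],
             filters: List[EqualFilter]) -> Iterator[Message]:
    """Yield only the messages for which every Equal filter matches."""
    for msg in source:
        # A message missing a filtered key does not pass.
        if all(key in msg and msg[key] == value for key, value in filters):
            yield msg


# Shortened copy of the sample stream from the signed message source.
stream = [
    {"ticker": "TSLA", "timestamp": "2021-12-31T00:00:00Z", "price": 420.69},
    {"ticker": "BTCUSD", "timestamp": "2021-12-31T00:00:00Z", "price": 42069.303},
    {"ticker": "TSLA", "timestamp": "2021-12-31T23:59:59Z", "price": 694.20},
]

settlement = list(filtered(stream, [
    ("ticker", "TSLA"),
    ("timestamp", "2021-12-31T23:59:59Z"),
]))
```

Only the TSLA message carrying the expiry timestamp ends up in `settlement`; every other message is dropped by at least one of the two filters.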
Note that to extract the price value, this would need to be wrapped in a 'select' data source (see Data Sourcing main spec) that specifies the field of interest ('price', here), i.e.:
DATA_SOURCE = select {
    field: 'price',
    data: filter {
        data: SignedMessage{ pubkey=0xA45e...d6 },
        filters: [
            equal { key: 'ticker', value: 'TSLA' },
            equal { key: 'timestamp', value: '2021-12-31T23:59:59Z' }
        ]
    }
}
gives: 694.20
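The nesting of a 'select' source around a filtered source can be pictured as function composition over a stream. The sketch below is an assumption-laden illustration: `filter_source`, `select` and `is_tsla_expiry` are hypothetical names, and skipping messages that lack the selected field is an assumption of this sketch rather than behaviour stated in this spec.

```python
from typing import Any, Callable, Dict, Iterable, Iterator

Message = Dict[str, Any]


def filter_source(source: Iterable[Message],
                  predicate: Callable[[Message], bool]) -> Iterator[Message]:
    """Inner compound source: pass through only messages matching the predicate."""
    return (msg for msg in source if predicate(msg))


def select(source: Iterable[Message], field: str) -> Iterator[Any]:
    """Outer compound source: emit just one field of each incoming message.

    Assumption: a message lacking the field is skipped rather than raising.
    """
    for msg in source:
        if field in msg:
            yield msg[field]


def is_tsla_expiry(msg: Message) -> bool:
    """Both Equal conditions from the example, combined with AND."""
    return (msg.get("ticker") == "TSLA"
            and msg.get("timestamp") == "2021-12-31T23:59:59Z")


signed_messages = [
    {"ticker": "TSLA", "timestamp": "2021-12-31T02:00:00Z", "price": 440.20},
    {"ticker": "TSLA", "timestamp": "2021-12-31T23:59:59Z", "price": 694.20},
]

# select { field: 'price', data: filter { ... } }
prices = list(select(filter_source(signed_messages, is_tsla_expiry), "price"))
```

Because each compound source only consumes the output of the source it wraps, the select step never sees the messages that the filter discarded.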
To specify a filtered data source the following parameters can be specified:
- data: (required) another data source definition defining the input data. This can be any other data source within the data sourcing framework.
- filters: (required) a list of at least one filter to apply to the data. These specify the conditions to apply to the data. If ALL filters match, the data is emitted (note that in future we may add things like 'or' filters that combine other filters, but initially this is not required). For each filter, a key parameter is required.
Filter types:
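- Equals: data must exactly match the filter, e.g. Equals { key='ticker', value='TSLA' }
- Greater/GreaterOrEqual: data must be greater than (or greater than or equal to) the filter value, e.g. GreaterOrEqual { key='timestamp', value='2021-12-31T23:59:59' }
- Less/LessOrEqual: data must be less than (or less than or equal to) the filter value, e.g. LessOrEqual { key='timestamp', value='2021-12-31T23:59:59' }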
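One way to picture the filter types is as a lookup from a condition name to a comparison, with all conditions combined by AND. The following is a minimal sketch under stated assumptions: the `Condition` class, the `OPERATORS` table and the `passes` helper are hypothetical, and the numeric thresholds on `price` are invented purely to exercise the comparisons.

```python
import operator
from dataclasses import dataclass
from typing import Any, Callable, Dict, List

# Condition names taken from the list of filter types; the mapping is illustrative.
OPERATORS: Dict[str, Callable[[Any, Any], bool]] = {
    "Equals": operator.eq,
    "Greater": operator.gt,
    "GreaterOrEqual": operator.ge,
    "Less": operator.lt,
    "LessOrEqual": operator.le,
}


@dataclass(frozen=True)
class Condition:
    kind: str   # one of the OPERATORS keys
    key: str    # the required 'key' parameter
    value: Any  # the value to compare against

    def matches(self, data: Dict[str, Any]) -> bool:
        if self.key not in data:
            return False  # a missing field never matches
        try:
            return bool(OPERATORS[self.kind](data[self.key], self.value))
        except TypeError:
            return False  # values that cannot be compared do not match


def passes(data: Dict[str, Any], conditions: List[Condition]) -> bool:
    """Data is emitted only if ALL conditions match."""
    return all(c.matches(data) for c in conditions)


conditions = [
    Condition("Equals", "ticker", "TSLA"),
    Condition("GreaterOrEqual", "price", 400),  # illustrative threshold
    Condition("Less", "price", 450),            # illustrative threshold
]
```

With these conditions, the TSLA messages priced at 420.69 and 440.20 pass, while the one priced at 469.20 fails the `Less` condition and is not emitted.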
Data that does not pass all filters can be ignored. Ideally this would be done before accepting the transaction into a block. This would mean that, for a configured pubkey that may be submitting many transactions to a node, Vega would automatically accept only the specific messages that will be processed by a product or some other part of the system.
To be clear, this also means that if the input data is the wrong "shape" or type to allow the defined filters to be applied to it, it will also be rejected. For instance if a ticker or timestamp field that is being filtered on is not present, the data does not pass the filter.
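The "wrong shape or type" rule can be illustrated with a timestamp comparison that treats anything it cannot interpret as a non-match. This is a hedged sketch only: `parse_timestamp` and `timestamp_at_or_after` are hypothetical helpers, and treating a timestamp without an offset as UTC is an assumption of this sketch, not something this spec states.

```python
from datetime import datetime, timezone
from typing import Any, Dict, Optional


def parse_timestamp(raw: Any) -> Optional[datetime]:
    """Return a timezone-aware datetime, or None if raw is not a usable timestamp."""
    if not isinstance(raw, str):
        return None  # wrong type, e.g. a number where a date string is expected
    try:
        # Older Python versions do not accept a trailing 'Z', so normalise it.
        parsed = datetime.fromisoformat(raw.replace("Z", "+00:00"))
    except ValueError:
        return None  # a string, but not a timestamp
    if parsed.tzinfo is None:
        # Assumption for this sketch: timestamps without an offset are UTC.
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed


def timestamp_at_or_after(data: Dict[str, Any], key: str, threshold: str) -> bool:
    """GreaterOrEqual on a datetime field; unusable data does not pass."""
    value = parse_timestamp(data.get(key))
    limit = parse_timestamp(threshold)
    if value is None or limit is None:
        return False
    return value >= limit
```

A message with no timestamp field, a numeric timestamp, or an unparseable string is treated the same way as one whose timestamp is too early: it does not pass the filter.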
- Filters can be used with any data source provider (internal, signed message, Ethereum etc.)
- Create a filter for each type of source provider and ensure that only data matching the filter gets through. (0047-DSRF-001)
- Create the same filter for multiple types of provider and ensure that with the same input data, the output is the same. (0047-DSRF-002)
- All filter conditions are applied
- Create a filter with multiple "AND" conditions and ensure that data is only passed through if all conditions are met. (0047-DSRF-003)
- Create a filter using an "OR" sub-filter (if implemented) and ensure that data is passed through if any of the OR conditions are met. (0047-DSRF-004)
- Create a "greater than or equal" filter on the "timestamp" field of the signed message (not on the timestamp when the oracle transaction is submitted) (e.g. greater than or equal to "2022-04-01") and an "equal" filter on the "asset" field (e.g. equals ETH) of the signed message from the Coinbase oracle. Ensure these are applied correctly. (0047-DSRF-041)
- Data that is filtered out does not result in a data event but is recorded
- No data source event is emitted for a data source if the triggering event (SubmitData transaction, internal source, etc.) does not pass through the filter for that source. (0047-DSRF-005)
- No product/market processing is triggered by a data source when the event does not pass through the filters. (0047-DSRF-006)
- When data is filtered out and no event is emitted this is recorded either in logs or on the event bus (this may only happen on the receiving node if the event is a transaction that is rejected prior to being sequenced in a block). (0047-DSRF-007)
- Data sources are defined by the FULL definition including filters
- If two data sources originate from the same data point (transaction, event, etc.) and provider (SignedMessage signer group, internal market/object, etc.) but have different filters then data filtered out by one source can still be received by another, and vice versa. (0047-DSRF-008)
- If two data sources originate from the same data point (transaction, event, etc.) and provider (SignedMessage signer group, internal market/object, etc.) but have different filters or other properties (i.e. they are not exactly the same definition) then any data that passes through and is emitted by both data sources results in a separate event/emission for each that references the appropriate source in each case. (0047-DSRF-009)
- If two data sources originate from the same data point (transaction, event, etc.) and provider (SignedMessage signer group, internal market/object, etc.) but have different filters or other properties (i.e. they are not exactly the same definition) then any data that is filtered out by both data sources results in a separate log/event for each that references the appropriate source in each case. (0047-DSRF-010)
- If two data sources originate from the same data point (transaction, event, etc.) and provider (SignedMessage signer group, internal market/object, etc.) but have different filters or other properties (i.e. they are not exactly the same definition) and the data is filtered out by one and emitted/passes through the other, then both the filtering out and the emission of the data are recorded in logs/events that reference the appropriate source. (0047-DSRF-011)
- Data types and condition types
- Text fields can be filtered by equality (text matches filter data exactly). (0047-DSRF-012)
- Number fields can be filtered by equality (number matches filter data exactly). (0047-DSRF-013)
- Date + time fields can be filtered by equality (datetime matches filter data exactly). (0047-DSRF-014)
- Number fields can be filtered by less than (number is less than filter data). (0047-DSRF-015)
- Date + time fields can be filtered by less than (datetime is less than filter data). (0047-DSRF-016)
- Number fields can be filtered by less than or equal (number is less than or equal to filter data). (0047-DSRF-017)
- Date + time fields can be filtered by less than or equal (datetime is less than or equal to filter data). (0047-DSRF-018)
- Number fields can be filtered by greater than (number is greater than filter data). (0047-DSRF-019)
- Date + time fields can be filtered by greater than (datetime is greater than filter data). (0047-DSRF-020)
- Number fields can be filtered by greater than or equal (number is greater than or equal to filter data). (0047-DSRF-021)
- Date + time fields can be filtered by greater than or equal (datetime is greater than or equal to filter data). (0047-DSRF-022)
- Oracle data filters can be combined together using an AND operation. (0047-DSRF-023)
- The transaction containing the data source definition should be rejected when it uses filters which are deemed out of scope. (0047-DSRF-024)
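The criteria for data sources that share a data point but differ in their filters (0047-DSRF-008 to 0047-DSRF-011) can be illustrated with a small dispatch sketch. Everything here is hypothetical: `FilteredSource`, `dispatch`, the source identifiers and the `"emitted"`/`"filtered_out"` outcome labels are invented for illustration and do not describe Vega's actual events or logs.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List, Tuple

Message = Dict[str, Any]


@dataclass(frozen=True)
class FilteredSource:
    source_id: str                        # hypothetical identifier for the full definition
    predicate: Callable[[Message], bool]  # the source's combined filters


def dispatch(message: Message,
             sources: List[FilteredSource]) -> List[Tuple[str, str]]:
    """Evaluate one data point against every source independently.

    Returns one (source_id, outcome) record per source, so that both
    emissions and filtered-out data reference the source they belong to.
    """
    records = []
    for source in sources:
        outcome = "emitted" if source.predicate(message) else "filtered_out"
        records.append((source.source_id, outcome))
    return records


sources = [
    FilteredSource("tsla-settlement", lambda m: m.get("ticker") == "TSLA"),
    FilteredSource("btc-settlement", lambda m: m.get("ticker") == "BTCUSD"),
]

records = dispatch(
    {"ticker": "TSLA", "timestamp": "2021-12-31T23:59:59Z", "price": 694.20},
    sources,
)
```

Each source is evaluated on its own, so the same TSLA message is recorded as emitted for one source and filtered out for the other, with a separate record for each.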