Hybrid Configurable Feed
caution
Supported only for TimeBase streams that use "classic" 4.3 storage format with MAX distribution factor.
caution
Supported only for data providers that have a historical API.
caution
Supported only for Data Connector setups when exchange time (rather than TimeBase time) is used for timestamping messages.
Important: this exchange time must be monotonic (it must never jump backwards). In practice, this means that only single-exchange (per instrument) feeds are supported.
The Hybrid process merges historical and live data. On process start, the Aggregator downloads historical data from the feed provider and then subscribes to the live data feed.
- Edit Aggregator Processes: right-click the Aggregator box and select Edit Processes.
- New Configurable Feed Process: open the New... drop-down menu and select New Hybrid Configurable Feed. This launches the Data Connector Wizard, which is used to define a configurable feed process.
- Refer to the Configure General Process Properties section for settings.
- Specific settings for Hybrid Configurable Feed:
| Property | Description |
|---|---|
| New Instrument Backfill Depth | Backfill depth of history that will be requested for new instruments. Applied only to instruments that do not have any existing data in target stream. |
| Max backfill depth | Maximum backfill depth. Applied unconditionally. It causes a gap in data when lastMsgTimestamp < now - maxBackfillDepth for a given symbol. Use this parameter only when gaps are acceptable for your scenario. |
| Live Message Buffer Size | Number of live data messages (per instrument) that will be buffered while Aggregator restores history. |
| Min Historical Depth | Minimum time range for historical query. |
| Max Duration | Maximum process duration. If specified, the process will be terminated after reaching this limit. |
| Max History Data Lag | Maximum time historical feed may possibly lag behind live data. If historical feed has not returned any data for T > "Max History Data Lag", the aggregator can then safely assume that there was no live data between (now - T) and (now - "Max History Data Lag"). |
| Switch to Live if no History | Switch to the live feed when history cannot be loaded for longer than "Max History Data Lag". |
| Max Live Data Lag | Maximum time the live feed may be delayed (as a result of network latency or market data subscription constraints, e.g. a 10-minute delayed subscription). If the live feed has not returned any data for T > "Max Live Data Lag", the aggregator can then safely assume that there was no live data between (now - T) and (now - "Max Live Data Lag"). |
| Max Sequential Catch-Up Attempts | Maximum number of attempts to catch up with live data, after which aggregator temporarily gives up and pushes the symbol to the back of the catch up queue. |
| Historical Thread Count | Number of worker threads responsible for history restoration. |
- Proceed through the Data Connector Wizard and complete the process configuration.
- After the hybrid process starts, history and then live data are loaded into the target stream.
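The interaction between New Instrument Backfill Depth and Max backfill depth can be illustrated with a small calculation. This is a minimal sketch, not the Aggregator's actual code: the function names `backfill_start` and `leaves_gap` are hypothetical, and it assumes that "applied unconditionally" means the maximum depth also clips requests for new instruments.

```python
from datetime import datetime, timedelta
from typing import Optional


def backfill_start(now: datetime,
                   last_msg_ts: Optional[datetime],
                   new_instrument_depth: timedelta,
                   max_backfill_depth: Optional[timedelta] = None) -> datetime:
    """Return the timestamp from which history would be requested (illustrative)."""
    if last_msg_ts is None:
        # No data in the target stream yet: New Instrument Backfill Depth applies.
        start = now - new_instrument_depth
    else:
        # Existing data: continue from the last stored message.
        start = last_msg_ts
    if max_backfill_depth is not None:
        # Assumption: the maximum depth clips every request, including new instruments.
        start = max(start, now - max_backfill_depth)
    return start


def leaves_gap(now: datetime, last_msg_ts: datetime,
               max_backfill_depth: timedelta) -> bool:
    """Gap condition from the table: lastMsgTimestamp < now - maxBackfillDepth."""
    return last_msg_ts < now - max_backfill_depth


if __name__ == "__main__":
    now = datetime(2024, 1, 10, 12, 0, 0)
    stale = now - timedelta(days=5)
    print(backfill_start(now, stale, timedelta(days=30), timedelta(days=2)))
    print(leaves_gap(now, stale, timedelta(days=2)))
```

In this sketch, a symbol whose last message is five days old and a two-day maximum depth yields a request that starts two days back, so three days of data are never requested.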
History Gap Fill Algorithm
Overview
The Hybrid aggregator recovers historical market data gaps by querying a historical data source while simultaneously buffering incoming live data. The goal is to seamlessly catch up to live without data loss or duplication.
How It Works
- Start buffering live data into a fixed-size circular queue
- Query historical data for the gap period
- At the end of each history batch, check if we caught up with live:
  - If historical data timestamps reach buffered live data → switch to queue, then live
  - If not → repeat history query for remaining gap
- Continue until caught up or max attempts exceeded
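The steps above can be sketched as a small single-threaded simulation. It is illustrative only: messages are reduced to unique integer timestamps, `catch_up` and `make_history_source` are hypothetical names, and the live buffer is filled up front rather than concurrently as it would be in the real process.

```python
from collections import deque
from typing import Callable, Deque, Iterable, List, Tuple


def make_history_source(timestamps: Iterable[int],
                        batch_size: int) -> Callable[[int], List[int]]:
    """Hypothetical historical API: returns up to batch_size timestamps after after_ts."""
    def query(after_ts: int) -> List[int]:
        newer = [ts for ts in timestamps if ts > after_ts]
        return newer[:batch_size]
    return query


def catch_up(query_history: Callable[[int], List[int]],
             live_buffer: Deque[int],
             gap_start: int,
             max_attempts: int) -> Tuple[List[int], bool]:
    """Return (messages written to the stream, whether we caught up with live)."""
    written: List[int] = []
    last_ts = gap_start
    for _ in range(max_attempts):
        batch = query_history(last_ts)      # query history for the remaining gap
        written.extend(batch)
        if batch:
            last_ts = batch[-1]
        # End of batch: has history reached the earliest buffered live message?
        if live_buffer and last_ts >= live_buffer[0]:
            # Discard the overlap, drain the rest of the queue, then go live.
            written.extend(ts for ts in live_buffer if ts > last_ts)
            live_buffer.clear()
            return written, True
    return written, False                   # attempts exhausted


if __name__ == "__main__":
    live: Deque[int] = deque(maxlen=3)      # fixed-size circular queue
    for ts in (8, 9, 10, 11, 12):           # oldest entries are dropped when full
        live.append(ts)
    history = make_history_source(range(1, 11), batch_size=4)
    print(catch_up(history, live, gap_start=0, max_attempts=5))
```

Because the queue has a fixed size, the oldest live messages are overwritten while history is still behind; the catch-up succeeds only once a history batch reaches the oldest message still in the queue.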
Key Configuration Parameters
| Parameter | Description |
|---|---|
| maxReasonableHistoryLag | Maximum time historical feed may lag behind live feed (in feed timestamps, not wall clock). For example, Bloomberg BPIPE historical API typically lags ~5 seconds behind live. |
| maxReasonableLiveLag | Maximum delay of live feed relative to wall clock. For delayed subscriptions (e.g., 15-minute delayed Bloomberg data), set this to the delay interval. For real-time feeds, set to 0 or a small value. |
| switchIfNoHistory | When true, allows switching to buffered live data if historical API returns nothing but live data is flowing. Some data may be lost in the gap. |
When Do We Switch to Live?
| Scenario | Condition | Behavior |
|---|---|---|
| Normal catch-up | historyTimestamp >= earliestBufferedLiveTimestamp | Historical data overlaps with buffered live data. Discard overlapping buffered messages, feed remaining buffer to TimeBase, then switch to live. |
| No live data buffered, market inactive | historyTimestamp + maxReasonableLiveLag + maxReasonableHistoryLag < now | No live messages observed and history appears fully caught up. The formula estimates wall clock time when history "should" have caught up, accounting for feed delay and API lag. If current time exceeds this, assume no recent market activity and switch to live. |
| Live data flowing, history unavailable | switchIfNoHistory=true AND historyTimestamp + maxReasonableHistoryLag < latestBufferedLiveTimestamp | Historical API returns no new data, but live messages are being buffered. If the gap between last history and latest live exceeds maxReasonableHistoryLag, assume no data exists in the gap and switch to buffered live data. |
| Repeated failures | Max attempts or errors exceeded | After maxRescheduleAttempts full retries, or maxErrors consecutive history request failures, channel enters Error state. |
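The first three rows of the table can be expressed as a single decision function. This is a sketch under stated assumptions rather than the Aggregator's implementation: the `Decision` enum, the `decide` function and the `last_batch_empty` flag are hypothetical, and the flag encodes the "historical API returns no new data" precondition from the Behavior column. The "Repeated failures" row is not modelled.

```python
from datetime import datetime, timedelta
from enum import Enum
from typing import Optional


class Decision(Enum):
    NORMAL_CATCH_UP = "normal catch-up"
    MARKET_INACTIVE = "no live data buffered, market inactive"
    NO_HISTORY = "live data flowing, history unavailable"
    KEEP_QUERYING = "keep querying history"


def decide(now: datetime,
           history_ts: datetime,
           earliest_live: Optional[datetime],
           latest_live: Optional[datetime],
           last_batch_empty: bool,
           max_history_lag: timedelta,
           max_live_lag: timedelta,
           switch_if_no_history: bool) -> Decision:
    """Evaluate the switch-to-live conditions in table order (illustrative)."""
    # Row 1: history overlaps the buffered live data.
    if earliest_live is not None and history_ts >= earliest_live:
        return Decision.NORMAL_CATCH_UP
    # Row 2: nothing buffered and history has had enough time to catch up.
    if earliest_live is None and history_ts + max_live_lag + max_history_lag < now:
        return Decision.MARKET_INACTIVE
    # Row 3: history is silent while live messages keep arriving.
    if (switch_if_no_history and last_batch_empty and latest_live is not None
            and history_ts + max_history_lag < latest_live):
        return Decision.NO_HISTORY
    return Decision.KEEP_QUERYING


if __name__ == "__main__":
    t0 = datetime(2024, 1, 2, 12, 0, 0)
    print(decide(t0 + timedelta(minutes=26), t0, None, None, True,
                 timedelta(seconds=10), timedelta(minutes=25), True))
```

Checking the conditions in this order means an overlap with buffered live data always wins, and the lossy `switchIfNoHistory` path is taken only when neither of the lossless conditions applies.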
Formula Derivation
The "market inactive" check uses domain parameters to determine when historical data has fully caught up:
estimatedWallClock = historyGapStart + maxReasonableLiveLag
result = estimatedWallClock + maxReasonableHistoryLag < now
Interpretation:
- historyGapStart is the timestamp of the last historical message (feed time)
- Adding maxReasonableLiveLag converts feed time to estimated wall clock time
- Adding maxReasonableHistoryLag accounts for historical API latency
- If wall clock now exceeds this sum, the historical API has had sufficient time to return any data after historyGapStart
This approach works correctly for both frequently and rarely traded instruments — we only switch when the data source has had enough time to provide all available data.
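As a worked example of the formula, the snippet below computes the earliest wall clock time at which the "market inactive" check can pass. The function name `earliest_switch_time` is hypothetical, and the lag values are sample inputs chosen only for illustration.

```python
from datetime import datetime, timedelta


def earliest_switch_time(history_gap_start: datetime,
                         max_reasonable_live_lag: timedelta,
                         max_reasonable_history_lag: timedelta) -> datetime:
    """Wall clock time that `now` must exceed before the check passes."""
    estimated_wall_clock = history_gap_start + max_reasonable_live_lag
    return estimated_wall_clock + max_reasonable_history_lag


def market_inactive(now: datetime, history_gap_start: datetime,
                    max_reasonable_live_lag: timedelta,
                    max_reasonable_history_lag: timedelta) -> bool:
    """The check itself: estimatedWallClock + maxReasonableHistoryLag < now."""
    return earliest_switch_time(history_gap_start, max_reasonable_live_lag,
                                max_reasonable_history_lag) < now


if __name__ == "__main__":
    last_history = datetime(2024, 1, 2, 12, 0, 0)
    # Delayed subscription: 25 minutes of live lag, 10 seconds of history lag.
    print(earliest_switch_time(last_history, timedelta(minutes=25), timedelta(seconds=10)))
    # Real-time feed: live lag of 0, same history lag.
    print(earliest_switch_time(last_history, timedelta(0), timedelta(seconds=10)))
```

With a last historical message at 12:00:00, the delayed configuration cannot switch before 12:25:10, while the real-time configuration can switch once the wall clock passes 12:00:10.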
Example of Hybrid process configuration
This example illustrates the configuration of a Hybrid data collection process for Bloomberg BPIPE that receives Level 1 ticks (bids/asks or trades).
- maxReasonableHistoryLag – We recommend setting this parameter to 10 seconds, since the Bloomberg history API trails live data very closely (2–3 seconds according to Bloomberg Support).
- maxReasonableLiveLag – This parameter tells the process whether market data is delayed. If some of the market data subscriptions used in this data collection process are delayed by up to 20 minutes, set this to 25 minutes.
- liveMsgBufferSize – Set to 5000 or less. This is how many live messages the process keeps, per instrument, while attempting to restore history.
  Since historical queries in the case of Bloomberg work relatively fast and return data that is only 2–3 seconds behind the live feed, we need this buffer to accommodate a few seconds of live data.
  Here we assume that, in a typical case, we will be receiving fewer than 5000 messages per 3-second interval for any instrument we record.
  In the worst case, when the feed rate exceeds this, Hybrid will repeat recovery attempts over and over until the spike of market data activity is over.
  Avoid setting this parameter to an overly large value if you are recording market data for a lot of instruments. This buffer takes 50–100 bytes per message. So if you are recording 1000 instruments: 1000 × 5000 × 50 bytes = 250 MB (about 238 MiB) of RAM just for these live message queues.
- minIntervalBetweenHistoricQueryRetry and maxIntervalBetweenHistoricQueryRetry – These define how often the Hybrid process will request historical data for each instrument. Bloomberg Support recommends an interval of at least 100 milliseconds.
  However, to reduce BPIPE load on repeated attempts to catch up with history, the Hybrid process uses linearly increasing intervals from the given min up to the given max.
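Two of the sizing considerations above lend themselves to a quick calculation: the memory taken by the live message queues, and the shape of a linearly increasing retry schedule. This is a sketch only. Both function names are hypothetical, and the number of steps between the minimum and maximum interval is an assumption, since the text states only that the increase is linear.

```python
from typing import List


def live_buffer_memory_bytes(instruments: int, buffer_size: int,
                             bytes_per_message: int) -> int:
    """Worst-case memory for the live queues when every buffer is full."""
    return instruments * buffer_size * bytes_per_message


def linear_retry_intervals_ms(min_ms: int, max_ms: int, steps: int) -> List[int]:
    """Evenly spaced retry intervals from min_ms to max_ms.

    `steps` is a hypothetical parameter; the real step count is not documented here.
    """
    if steps < 2:
        return [min_ms]
    span = max_ms - min_ms
    # Integer arithmetic keeps the first and last values exactly on the bounds.
    return [min_ms + span * i // (steps - 1) for i in range(steps)]


if __name__ == "__main__":
    low = live_buffer_memory_bytes(1000, 5000, 50)
    high = live_buffer_memory_bytes(1000, 5000, 100)
    print(f"{low / 1e6:.0f} MB to {high / 1e6:.0f} MB")
    # 100X (100 ms) up to 5I (5 minutes = 300000 ms), in 4 assumed steps.
    print(linear_retry_intervals_ms(100, 300_000, 4))
```

At 50 bytes per message, 1000 instruments with a 5000-message buffer come to 250,000,000 bytes (about 238 MiB), and twice that at 100 bytes per message.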
note
Configuration parameters that measure time intervals use the following suffixes: H = hours; I = minutes; S = seconds; X = milliseconds
(Only select key parameters are shown)
<liveMsgBufferSize>5000</liveMsgBufferSize>
<minHistoricalDepth>1I</minHistoricalDepth>
<maxReasonableHistoryLag>10S</maxReasonableHistoryLag>
<switchIfNoHistory>true</switchIfNoHistory>
<maxReasonableLiveLag>25I</maxReasonableLiveLag>
<maxSequentialCatchUpAttempts>5</maxSequentialCatchUpAttempts>
<historicalThreadCount>10</historicalThreadCount>
<maxRescheduleAttempts>10000</maxRescheduleAttempts>
<maxErrors>50</maxErrors>
<minIntervalBetweenHistoricQueryRetry>100X</minIntervalBetweenHistoricQueryRetry>
<maxIntervalBetweenHistoricQueryRetry>5I</maxIntervalBetweenHistoricQueryRetry>
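To sanity-check values like these, the interval suffixes from the note can be decoded with a few lines of code. This is a minimal sketch: `parse_interval` is a hypothetical helper, it assumes each value is a single non-negative integer followed by one suffix, and the `<settings>` wrapper element exists only so that the fragment parses as XML.

```python
import re
import xml.etree.ElementTree as ET
from datetime import timedelta
from typing import Dict

# Suffixes from the note: H = hours; I = minutes; S = seconds; X = milliseconds.
_UNITS = {"H": "hours", "I": "minutes", "S": "seconds", "X": "milliseconds"}


def parse_interval(text: str) -> timedelta:
    """Convert a value such as '25I' or '100X' to a timedelta (assumed format)."""
    match = re.fullmatch(r"(\d+)([HISX])", text.strip())
    if match is None:
        raise ValueError(f"unrecognized interval: {text!r}")
    amount, suffix = match.groups()
    return timedelta(**{_UNITS[suffix]: int(amount)})


def parse_fragment(fragment: str) -> Dict[str, timedelta]:
    """Parse sibling elements and keep only those that look like intervals."""
    root = ET.fromstring(f"<settings>{fragment}</settings>")  # hypothetical wrapper
    result: Dict[str, timedelta] = {}
    for element in root:
        try:
            result[element.tag] = parse_interval(element.text or "")
        except ValueError:
            continue  # counts and booleans such as 5000 or true are skipped
    return result


if __name__ == "__main__":
    fragment = """
    <liveMsgBufferSize>5000</liveMsgBufferSize>
    <maxReasonableHistoryLag>10S</maxReasonableHistoryLag>
    <maxReasonableLiveLag>25I</maxReasonableLiveLag>
    <minIntervalBetweenHistoricQueryRetry>100X</minIntervalBetweenHistoricQueryRetry>
    """
    for name, value in parse_fragment(fragment).items():
        print(name, value)
```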