Hybrid Configurable Feed

caution

Supported only for TimeBase streams that use the "classic" 4.3 storage format with the MAX distribution factor.

caution

Supported only for data providers that have a historical API.

caution

Supported only for Data Connector setups when exchange time (rather than TimeBase time) is used for timestamping messages.

Important: the exchange time must be monotonic (it must never jump backward). In practice, this means that only single-exchange (per instrument) feeds are supported.

The Hybrid process merges historical and live data. On process start, Aggregator downloads historical data from the feed provider and then subscribes to the live data feed.

  1. Edit Aggregator Processes: right-click the Aggregator box and select Edit Processes.
  2. New Configurable Feed Process: open the New... drop-down menu and select New Hybrid Configurable Feed. This launches the Data Connector Wizard, which is used to define a configurable feed process.
  3. Refer to the Configure General Process Properties section for settings.
  4. Specific settings for Hybrid Configurable Feed:
| Property | Description |
| --- | --- |
| New Instrument Backfill Depth | Depth of history requested for new instruments. Applied only to instruments that have no existing data in the target stream. |
| Max Backfill Depth | Maximum backfill depth. Applied unconditionally. It causes a gap in the data when lastMsgTimestamp < now - maxBackfillDepth for a given symbol. Use this parameter only when gaps are acceptable for your scenario. |
| Live Message Buffer Size | Number of live data messages (per instrument) that are buffered while Aggregator restores history. |
| Min Historical Depth | Minimum time range for a historical query. |
| Max Duration | Maximum process duration. If specified, the process is terminated after reaching this limit. |
| Max History Data Lag | Maximum time the historical feed may lag behind live data. If the historical feed has not returned any data for T > "Max History Data Lag", the aggregator can safely assume that there was no live data between (now - T) and (now - "Max History Data Lag"). |
| Switch to Live if no History | Switch to the live feed when history cannot be loaded for longer than "Max History Data Lag". |
| Max Live Data Lag | Maximum time the live feed may be delayed (as a result of network latency or market data subscription constraints, e.g. 10 minutes delayed). If the live feed has not returned any data for T > "Max Live Data Lag", the aggregator can safely assume that there was no live data between (now - T) and (now - "Max Live Data Lag"). |
| Max Sequential Catch-Up Attempts | Maximum number of attempts to catch up with live data, after which the aggregator temporarily gives up and pushes the symbol to the back of the catch-up queue. |
| Historical Thread Count | Number of worker threads responsible for history restoration. |
  5. Proceed through the Data Connector Wizard and complete the process configuration.
  6. After the hybrid process starts, history and then live data are loaded into the target stream.
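
To make the interaction between the two backfill settings concrete, here is a minimal Python sketch of how a backfill start time could be derived. It is an illustration only, not the Aggregator's implementation: the function name `backfill_start` and its arguments are hypothetical, and it assumes that for instruments with existing data, history is requested from the last stored message timestamp.

```python
from datetime import datetime, timedelta
from typing import Optional


def backfill_start(now: datetime,
                   last_msg_timestamp: Optional[datetime],
                   new_instrument_depth: timedelta,
                   max_backfill_depth: Optional[timedelta] = None) -> datetime:
    """Hypothetical helper: earliest timestamp to request from history."""
    if last_msg_timestamp is None:
        # New instrument: nothing in the target stream yet.
        start = now - new_instrument_depth
    else:
        # Assumption: resume from the last message already stored.
        start = last_msg_timestamp
    if max_backfill_depth is not None:
        # Applied unconditionally; a later start leaves a gap in the data.
        start = max(start, now - max_backfill_depth)
    return start


now = datetime(2024, 1, 10, 12, 0, 0)
new_symbol = backfill_start(now, None, timedelta(days=2))
stale_symbol = backfill_start(now, datetime(2024, 1, 1),
                              timedelta(days=2), timedelta(days=3))
```

In the second call the last stored message is older than `now - max_backfill_depth`, so the start is clamped to three days back and the period between 1 January and 7 January stays unfilled. This is the gap the Max backfill depth description warns about.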

History Gap Fill Algorithm

Overview

The Hybrid aggregator recovers historical market data gaps by querying a historical data source while simultaneously buffering incoming live data. The goal is to seamlessly catch up to live without data loss or duplication.

How It Works

  1. Start buffering live data into a fixed-size circular queue
  2. Query historical data for the gap period
  3. At the end of each history batch, check if we caught up with live:
    • If historical data timestamps reach buffered live data → switch to queue, then live
    • If not → repeat history query for remaining gap
  4. Continue until caught up or max attempts exceeded
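
The steps above can be sketched in Python as follows. This is a simplified model under stated assumptions, not the Aggregator's code: messages are reduced to integer timestamps, `catch_up` and `query_history` are hypothetical names, and the circular queue is modelled with `collections.deque(maxlen=...)`.

```python
from collections import deque


def catch_up(query_history, live_buffer, gap_start, max_attempts):
    """Replay history until it reaches the buffered live data.

    Messages are modelled as bare integer timestamps for brevity.
    Returns (messages_to_write, caught_up).
    """
    written = []
    last_ts = gap_start
    for _ in range(max_attempts):
        batch = query_history(last_ts)  # messages strictly newer than last_ts
        written.extend(batch)
        if batch:
            last_ts = batch[-1]
        # End of batch: has history reached the oldest buffered live message?
        if live_buffer and last_ts >= live_buffer[0]:
            # Skip buffered messages already covered by history, keep the rest.
            written.extend(ts for ts in live_buffer if ts > last_ts)
            return written, True
    return written, False


# Hypothetical historical source that returns at most 4 messages per query.
HISTORY = list(range(1, 11))


def query_history(after_ts):
    return [ts for ts in HISTORY if ts > after_ts][:4]


# Fixed-size circular queue: once full, appending drops the oldest entry.
live = deque(maxlen=5)
for ts in range(6, 13):
    live.append(ts)  # the queue ends up holding timestamps 8..12

result, caught_up = catch_up(query_history, live, gap_start=0, max_attempts=5)
```

Because the queue has a fixed size, its oldest entry keeps moving forward while history is being replayed. History has to reach that moving boundary before the switch can happen, which is why a buffer that is too small for the feed rate leads to repeated catch-up attempts.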

Key Configuration Parameters

| Parameter | Description |
| --- | --- |
| maxReasonableHistoryLag | Maximum time historical feed may lag behind live feed (in feed timestamps, not wall clock). For example, Bloomberg BPIPE historical API typically lags ~5 seconds behind live. |
| maxReasonableLiveLag | Maximum delay of live feed relative to wall clock. For delayed subscriptions (e.g., 15-minute delayed Bloomberg data), set this to the delay interval. For real-time feeds, set to 0 or a small value. |
| switchIfNoHistory | When true, allows switching to buffered live data if historical API returns nothing but live data is flowing. Some data may be lost in the gap. |

When Do We Switch to Live?

| Scenario | Condition | Behavior |
| --- | --- | --- |
| Normal catch-up | historyTimestamp >= earliestBufferedLiveTimestamp | Historical data overlaps with buffered live data. Discard overlapping buffered messages, feed remaining buffer to TimeBase, then switch to live. |
| No live data buffered, market inactive | historyTimestamp + maxReasonableLiveLag + maxReasonableHistoryLag < now | No live messages observed and history appears fully caught up. The formula estimates wall clock time when history "should" have caught up, accounting for feed delay and API lag. If current time exceeds this, assume no recent market activity and switch to live. |
| Live data flowing, history unavailable | switchIfNoHistory=true AND historyTimestamp + maxReasonableHistoryLag < latestBufferedLiveTimestamp | Historical API returns no new data, but live messages are being buffered. If the gap between last history and latest live exceeds maxReasonableHistoryLag, assume no data exists in the gap and switch to buffered live data. |
| Repeated failures | Max attempts or errors exceeded | After maxRescheduleAttempts full retries, or maxErrors consecutive history request failures, channel enters Error state. |
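
The first three conditions in the table can be expressed as a small decision function. The sketch below is an illustration in Python rather than the Aggregator's actual logic: the function name, argument names, and returned labels are hypothetical, all times are plain numbers in seconds, and the order in which the conditions are tested is an assumption.

```python
def switch_decision(history_ts, now, earliest_live, latest_live,
                    max_history_lag, max_live_lag, switch_if_no_history):
    """Return the name of the matching scenario, or None to keep querying history.

    earliest_live / latest_live are None when no live messages are buffered.
    """
    if earliest_live is not None and history_ts >= earliest_live:
        # History overlaps the buffered live data.
        return "normal catch-up"
    if earliest_live is None and history_ts + max_live_lag + max_history_lag < now:
        # Nothing buffered and history has had enough time to report any data.
        return "market inactive"
    if (switch_if_no_history and latest_live is not None
            and history_ts + max_history_lag < latest_live):
        # Live data is flowing but history is not advancing.
        return "history unavailable"
    return None


overlap = switch_decision(100, 200, 90, 120, 10, 0, False)
quiet = switch_decision(100, 200, None, None, 10, 60, False)
too_early = switch_decision(100, 150, None, None, 10, 60, False)
no_history = switch_decision(100, 200, 150, 160, 10, 0, True)
```

The "Repeated failures" row is not modelled here, since it depends on attempt and error counters rather than on timestamps.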

Formula Derivation

The "market inactive" check uses domain parameters to determine when historical data has fully caught up:

    estimatedWallClock = historyGapStart + maxReasonableLiveLag
    result = estimatedWallClock + maxReasonableHistoryLag < now

Interpretation:

  • historyGapStart is the timestamp of the last historical message (feed time)
  • Adding maxReasonableLiveLag converts feed time to estimated wall clock time
  • Adding maxReasonableHistoryLag accounts for historical API latency
  • If wall clock now exceeds this sum, the historical API has had sufficient time to return any data after historyGapStart

This approach works correctly for both frequently and rarely traded instruments — we only switch when the data source has had enough time to provide all available data.
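
As a worked example of the formula, the Python snippet below plugs in concrete times. The values are chosen for illustration (a live lag of 25 minutes and a history lag of 10 seconds), and `market_inactive` is a hypothetical helper name.

```python
from datetime import datetime, timedelta


def market_inactive(history_gap_start, now, max_live_lag, max_history_lag):
    """Evaluate the "market inactive" check with explicit intermediate steps."""
    # Convert feed time to an estimated wall clock time.
    estimated_wall_clock = history_gap_start + max_live_lag
    # Allow for historical API latency, then compare with the current time.
    return estimated_wall_clock + max_history_lag < now


last_history = datetime(2024, 1, 10, 10, 0, 0)   # historyGapStart (feed time)
live_lag = timedelta(minutes=25)                 # maxReasonableLiveLag
history_lag = timedelta(seconds=10)              # maxReasonableHistoryLag

# The threshold is 10:25:10: five seconds before it the check fails,
# one second after it the check passes.
before = market_inactive(last_history, datetime(2024, 1, 10, 10, 25, 5),
                         live_lag, history_lag)
after = market_inactive(last_history, datetime(2024, 1, 10, 10, 25, 11),
                        live_lag, history_lag)
```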

Example of Hybrid process configuration

This example illustrates the configuration of a Hybrid data collection process for Bloomberg BPIPE configured to receive Level 1 ticks (bids/asks or trades).

  • maxReasonableHistoryLag – We recommend setting this parameter to 10 seconds, since the Bloomberg history API trails live data very closely (2–3 seconds according to Bloomberg Support).

  • maxReasonableLiveLag – This parameter reflects how far market data may be delayed. If some of the market data subscriptions used in this data collection process are delayed by up to 20 minutes, set this to 25 minutes.

  • liveMsgBufferSize – Set to 5000 or less. This is how many live messages the process will keep while attempting to restore history, per instrument.
    Since historical queries in the case of Bloomberg work relatively fast and return data that is only 2–3 seconds behind the live feed, we need this buffer to accommodate a few seconds of live data.
    Here we assume that, in a typical case, we will be receiving fewer than 5000 messages per 3-second interval for any instrument we record.
    In the worst case, when the feed rate exceeds this, Hybrid will repeat recovery attempts over and over until the spike of market data activity is over.
    Avoid setting this parameter to an overly large value if you are recording market data for a lot of instruments. This buffer takes 50–100 bytes per message. So if you are recording 1000 instruments:
    1000 × 5000 × 50 bytes = 250 MB of RAM just for these live message queues (and up to 500 MB at 100 bytes per message).

  • minIntervalBetweenHistoricQueryRetry and maxIntervalBetweenHistoricQueryRetry define how often the Hybrid process will request historical data for each instrument. Bloomberg Support recommends an interval of at least 100 milliseconds.
    However, to reduce BPIPE load on repeated attempts to catch up with history, the Hybrid process uses linearly increasing intervals from the given min up to the given max.
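
Two of the sizing points above can be checked numerically. The Python sketch below is illustrative only: both function names are hypothetical, and the retry step size (`step_ms`) is an assumption, since the source states only that intervals grow linearly from the min to the max.

```python
def live_buffer_bytes(instruments, buffer_size, bytes_per_message):
    """Worst-case memory used by full per-instrument live message queues."""
    return instruments * buffer_size * bytes_per_message


def retry_interval_ms(attempt, min_ms, max_ms, step_ms):
    """Linearly increasing retry interval, capped at max_ms.

    step_ms is a hypothetical increment per attempt.
    """
    return min(max_ms, min_ms + attempt * step_ms)


# 1000 instruments, 5000 buffered messages each, 50 to 100 bytes per message.
low = live_buffer_bytes(1000, 5000, 50)     # 250,000,000 bytes
high = live_buffer_bytes(1000, 5000, 100)   # 500,000,000 bytes

# Min interval 100 ms, max interval 5 minutes, assumed step of 1 second.
schedule = [retry_interval_ms(n, 100, 300_000, 1_000) for n in range(4)]
```

With these assumed values the first four retries wait 100, 1100, 2100 and 3100 milliseconds, and no retry ever waits longer than 5 minutes.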

note

Configuration parameters that measure time intervals use the following suffixes: H = hours; I = minutes; S = seconds; X = milliseconds

(Only select key parameters are shown)

    <liveMsgBufferSize>5000</liveMsgBufferSize>
    <minHistoricalDepth>1I</minHistoricalDepth>
    <maxReasonableHistoryLag>10S</maxReasonableHistoryLag>
    <switchIfNoHistory>true</switchIfNoHistory>
    <maxReasonableLiveLag>25I</maxReasonableLiveLag>
    <maxSequentialCatchUpAttempts>5</maxSequentialCatchUpAttempts>
    <historicalThreadCount>10</historicalThreadCount>
    <maxRescheduleAttempts>10000</maxRescheduleAttempts>
    <maxErrors>50</maxErrors>
    <minIntervalBetweenHistoricQueryRetry>100X</minIntervalBetweenHistoricQueryRetry>
    <maxIntervalBetweenHistoricQueryRetry>5I</maxIntervalBetweenHistoricQueryRetry>
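
The interval values in this configuration can be decoded with a small helper such as the Python sketch below. `parse_interval` is a hypothetical name, and the sketch assumes each value is a whole number followed by exactly one of the suffixes H, I, S or X. The product's own parser may accept other forms.

```python
import re
from datetime import timedelta

# Suffix meanings as given in the note: H = hours, I = minutes,
# S = seconds, X = milliseconds.
_UNITS = {"H": "hours", "I": "minutes", "S": "seconds", "X": "milliseconds"}


def parse_interval(text):
    """Convert a value such as "25I" or "100X" into a timedelta."""
    match = re.fullmatch(r"(\d+)([HISX])", text.strip())
    if match is None:
        raise ValueError(f"unrecognised interval: {text!r}")
    amount, suffix = int(match.group(1)), match.group(2)
    return timedelta(**{_UNITS[suffix]: amount})


history_lag = parse_interval("10S")    # maxReasonableHistoryLag
live_lag = parse_interval("25I")       # maxReasonableLiveLag
min_retry = parse_interval("100X")     # minIntervalBetweenHistoricQueryRetry
max_retry = parse_interval("5I")       # maxIntervalBetweenHistoricQueryRetry
```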