Main Concepts and Workflows: Definitions and Overview

Time Series Data

In real life, a time series is a collection of observations produced over time by sources such as IoT sensors, software, stock trading platforms, payment processing systems and many others. Time series are usually relatively large, chronologically ordered data sets. Because of their size and ordering, complex queries and routine data manipulations such as grouping and joins can be difficult and costly to perform. This is why, in most use cases, time series data is collected and consumed chronologically in streams and does not undergo complex transformations. It is also why traditional databases are less suitable for storing and processing such data.

TimeBase is a powerful time series streaming database and messaging middleware, initially designed for aggregating and streaming large volumes of ultra-high-frequency, low-latency time series data.

Messages

TimeBase stores time series events as messages, and each type of event has its own message class. A message class has a set of fields (attributes) that characterize, describe and identify that type of event. Every class also inherits two fields from the instrument message parent class: timestamp (the date and time of the event) and symbol (a unique identifier of the time series source). This makes message classes similar to classes in high-level programming languages.

In the illustration below, financial market data for a specific trading instrument (symbol) is represented as Trade and Best Bid/Offer (BBO) messages. Each message class has its own set of fields plus the two fields inherited from the parent class: symbol (the trading instrument identifier) and timestamp.
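The class analogy can be made concrete with a minimal Python sketch. It is not the TimeBase API: the class names and the payload fields (price, size and so on) are hypothetical, chosen only to show two message classes inheriting timestamp and symbol from a common parent.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class InstrumentMessage:
    """Hypothetical parent class holding the two inherited fields."""
    timestamp: int  # event time; nanoseconds since the Unix epoch is assumed here
    symbol: str     # identifier of the time series source


@dataclass(frozen=True)
class TradeMessage(InstrumentMessage):
    """Illustrative Trade class; the payload fields are assumptions."""
    price: float
    size: float


@dataclass(frozen=True)
class BestBidOfferMessage(InstrumentMessage):
    """Illustrative BBO class; the payload fields are assumptions."""
    bid_price: float
    bid_size: float
    offer_price: float
    offer_size: float


trade = TradeMessage(timestamp=1, symbol="AAPL", price=189.5, size=100.0)
bbo = BestBidOfferMessage(
    timestamp=2, symbol="AAPL",
    bid_price=189.4, bid_size=300.0, offer_price=189.6, offer_size=200.0,
)

# Both classes expose the inherited fields first, followed by their own.
trade_field_names = [f.name for f in fields(trade)]
```

Each subclass carries its own fields, while any code that only needs timestamp and symbol can treat every message as an InstrumentMessage.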

TimeBase supports a wide variety of primitive and complex data types.

Streams

Messages are stored in streams chronologically by their timestamps for each symbol. The following illustration of a stream shows messages of different classes (Best Bid/Offer (BBO) and Trade) aggregated by symbol (AAPL, AMZN and so on). All messages are arranged in the stream chronologically according to their timestamps (T1, T2 and so on). TimeBase offers nanosecond timestamp resolution.

A TimeBase stream may contain messages of more than one class. The stream schema determines which message classes can be written to that stream.
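The role of a schema can be illustrated with a small in-memory model. This is a sketch under assumptions, not the TimeBase API: the Stream and Message classes, the type_name field and the use of ValueError for a rejected write are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Message:
    timestamp: int
    symbol: str
    type_name: str  # hypothetical stand-in for the message class


class Stream:
    """Toy stream: a schema (set of allowed class names) plus stored messages."""

    def __init__(self, name, allowed_types):
        self.name = name
        self.allowed_types = frozenset(allowed_types)
        self.messages = []

    def append(self, message):
        # The schema decides which message classes may be recorded.
        if message.type_name not in self.allowed_types:
            raise ValueError(
                f"{message.type_name!r} is not in the schema of stream {self.name!r}"
            )
        self.messages.append(message)


stream = Stream("market-data", allowed_types={"Trade", "BestBidOffer"})
stream.append(Message(1, "AAPL", "Trade"))
stream.append(Message(2, "AMZN", "BestBidOffer"))

rejected = False
try:
    stream.append(Message(3, "AAPL", "News"))  # class not listed in the schema
except ValueError:
    rejected = True
```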

How to Write Data

Publishers (loaders) write data into streams. Loaders add messages to streams in chronological order. Several loaders can write to the same stream; however, each loader can write to only one stream.

How to Consume Data

Subscribers (cursors) read data from streams. A single stream may have several cursors reading from it, and each cursor can read from one or multiple streams. Cursors consume messages in exactly the same chronological order in which loaders wrote them.

Timestamp, symbol and class act like “keys” in other types of databases and are used for message indexing. Cursors use these keys to subscribe to a specific subset of data. The data a cursor consumes is a transformation of the original stream’s timeline of messages: messages are filtered by any or all of the keys and always remain in chronological order (T1, then T2 and so on).
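One way to picture a cursor that reads from several streams is a timestamp-ordered merge of already-ordered inputs. The sketch below uses Python's standard heapq.merge for this; treating a multi-stream read as a merge by timestamp is an assumption of this example, and the sample tuples are made up.

```python
import heapq
from operator import itemgetter

# Each toy stream is already in chronological order, as loaders write it.
# Tuples are (timestamp, symbol, class name); all values are illustrative.
stream_a = [(1, "AAPL", "Trade"), (4, "AMZN", "BBO"), (6, "AAPL", "BBO")]
stream_b = [(2, "AMZN", "Trade"), (3, "AAPL", "Trade"), (5, "AMZN", "Trade")]


def read_merged(*streams):
    """Lazily yield messages from all streams in timestamp order."""
    return heapq.merge(*streams, key=itemgetter(0))


merged = list(read_merged(stream_a, stream_b))
timestamps = [message[0] for message in merged]
```

Because heapq.merge is lazy and only compares the head of each input, it fits the streaming model: no input has to be loaded or sorted in full.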

Subscribe to Entire Stream

Cursors (subscribers) can consume the entire stream. In this case, a cursor receives all messages in the same chronological order in which loaders (publishers) wrote them. Messages for all symbols are aggregated into one time sequence based on their timestamps. The image below illustrates a stream timeline with all messages arranged chronologically.

Filter Subscription by Time

Cursors (subscribers) can filter subscription messages by timestamp. The original chronological order is preserved. The image below illustrates a subscription to all stream messages starting from timestamp T6.

Filter Subscription by Time and Class

Cursors (subscribers) can filter subscription messages by both timestamp and class. The original chronological order is preserved. The image below illustrates a subscription to all stream messages of the Trade class starting from timestamp T6.

Filter Subscription by Time and Symbol

Cursors (subscribers) can filter subscription messages by both timestamp and symbol. The original chronological order is preserved. The image below illustrates a subscription to all AMZN messages in the stream starting from timestamp T7.
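The four subscription modes described above (entire stream, by time, by time and class, by time and symbol) can be sketched as one filter over a chronologically ordered list. This is an illustration only, not the TimeBase API: the subscribe function, its parameters and the sample messages are hypothetical, and the data does not reproduce the illustrations.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Message:
    timestamp: int
    symbol: str
    type_name: str


def subscribe(stream, start_time=None, symbols=None, types=None):
    """Yield messages matching every supplied key, in the stream's own order."""
    for message in stream:
        if start_time is not None and message.timestamp < start_time:
            continue
        if symbols is not None and message.symbol not in symbols:
            continue
        if types is not None and message.type_name not in types:
            continue
        yield message


# Made-up stream, already in chronological order (T1..T9).
stream = [
    Message(1, "AAPL", "BBO"), Message(2, "AMZN", "Trade"),
    Message(3, "AAPL", "Trade"), Message(4, "AMZN", "BBO"),
    Message(5, "AAPL", "BBO"), Message(6, "AMZN", "Trade"),
    Message(7, "AAPL", "Trade"), Message(8, "AMZN", "BBO"),
    Message(9, "AMZN", "Trade"),
]

everything = [m.timestamp for m in subscribe(stream)]
from_t6 = [m.timestamp for m in subscribe(stream, start_time=6)]
trades_from_t6 = [m.timestamp for m in subscribe(stream, start_time=6, types={"Trade"})]
amzn_from_t7 = [m.timestamp for m in subscribe(stream, start_time=7, symbols={"AMZN"})]
```

Since the filter only skips messages and never reorders them, every subscription keeps the chronological order of the original stream.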

Comparison to Messaging Brokers and Databases

Compared with other solutions that perform similar tasks, TimeBase streams resemble tables in relational databases: you write data into streams, and schemas determine what kinds of messages can be recorded in each stream. The key difference is that time series data is written to streams and consumed sequentially, whereas relational data is accessed through SQL queries that use various groupings and joins.

On the other hand, streams and the way we work with them are similar to message brokers: multiple readers and writers can work concurrently on the same stream. In TimeBase, data can be kept in memory and also persisted to disk.

TimeBase can be used both as a traditional time series database and as a real-time data messaging/streaming server. Both use cases can be implemented on the same TimeBase instance.