
Data Distribution

The standard TimeBase use case is writing data to streams chronologically, in the form of messages. Messages are also placed in Time Slice Files (TSF) chronologically; therefore, each TSF covers a specific time range. As a result, a TimeBase stream can be seen as a chronological collection of TS files. You can use insert mode to later rewrite or modify data in TS files. Refer to Basic Concepts to learn more.

Introduction to Data Partitions (Spaces)

TimeBase also supports data that is not chronologically arranged. Although you can use insert mode to write such data, a data distribution approach maximizes write performance (higher throughput and write speed). If you plan to work with data that is not chronologically arranged, you can create dedicated data partitions (spaces) to physically distribute TS files based on their time range or other criteria.

With this approach, a loader can reference a specific space when writing data instead of just ingesting data into a stream. TimeBase allows multiple loaders to write concurrently to specific spaces, and reading performance is not affected. Similar to loaders, cursors can reference spaces to read from a specific time range.

Chronological order is preserved when working with spaces. For example, you can keep all rarely traded instruments in your portfolio in a dedicated space, or you can create several spaces, one per time range. The code example below writes data to a space that stores all messages for the year 2020.

tip

Note that one loader can write to just one TS file, while one cursor can read from more than one TS file at the same time.

caution

Note that a new space is created (for example, with options.space = "2020") only when you start writing messages to it. An empty space cannot be created.

Writing into Spaces

Write into a space
LoadingOptions options = new LoadingOptions();
options.space = "2020"; // the space is created on the first write
try (TickLoader loader = stream.createLoader(options)) {
    InstrumentMessage message = new InstrumentMessage();
    // populate message attributes
    ...
    loader.send(message);
    ...
}
info
  • Refer to TickLoader API to learn how to create a space and write into one.
  • Refer to How To to learn more about spaces and how to work with them.
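
When spaces are organized by time range, as in the "2020" example above, the space name has to be derived from each message's timestamp before a loader is chosen. The following is a minimal sketch of that step, assuming one space per UTC calendar year. YearSpaces and spaceFor are hypothetical names used for illustration only and are not part of the TimeBase API.

```java
import java.time.Instant;
import java.time.ZoneOffset;

// Hypothetical helper, not part of the TimeBase API: derives a per-year
// space name (such as "2020") from a message timestamp.
final class YearSpaces {
    private YearSpaces() {}

    /** Returns the UTC calendar year of the timestamp as a space name. */
    static String spaceFor(long epochMillis) {
        int year = Instant.ofEpochMilli(epochMillis).atZone(ZoneOffset.UTC).getYear();
        return Integer.toString(year);
    }

    public static void main(String[] args) {
        long ts = Instant.parse("2020-06-15T12:00:00Z").toEpochMilli();
        System.out.println(spaceFor(ts)); // prints 2020
    }
}
```

Because the space is set on LoadingOptions before the loader is created, a writer using this scheme would likely keep one loader per year and route each message to the loader whose space matches the returned name.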

Reading from Spaces

Select from space
SelectionOptions opts = new SelectionOptions();
opts.raw = false;
opts.live = false;
opts.space = "2020";
int count = 0;
try (TickCursor cursor = stream.select(Long.MIN_VALUE, opts)) {
    while (cursor.next())
        count++;
}
info
  • Refer to TickCursor API to learn how to read from spaces.
  • Refer to How To to learn more about spaces and how to work with them.
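
The statement that chronological order is preserved when working with spaces can be pictured as a timestamp merge over per-space sequences. The sketch below is a conceptual model only, under the assumption that a cursor reading several spaces returns their messages as one timeline. SpaceMergeModel and readAll are hypothetical names, messages are reduced to bare timestamps, and nothing here describes how TimeBase implements reading internally.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

// Conceptual model only; none of these types belong to the TimeBase API.
final class SpaceMergeModel {
    private SpaceMergeModel() {}

    /** Merges per-space timestamp lists (each already sorted) into one timeline. */
    static List<Long> readAll(List<List<Long>> spaces) {
        // Heap entry: {index of the space, position inside that space},
        // ordered by the timestamp the entry points at.
        PriorityQueue<int[]> heads = new PriorityQueue<>(
                Comparator.comparingLong((int[] h) -> spaces.get(h[0]).get(h[1])));
        for (int s = 0; s < spaces.size(); s++) {
            if (!spaces.get(s).isEmpty()) {
                heads.add(new int[] {s, 0});
            }
        }
        List<Long> merged = new ArrayList<>();
        while (!heads.isEmpty()) {
            int[] head = heads.poll();
            List<Long> space = spaces.get(head[0]);
            merged.add(space.get(head[1]));
            // Advance within the same space if it has more messages.
            if (head[1] + 1 < space.size()) {
                heads.add(new int[] {head[0], head[1] + 1});
            }
        }
        return merged;
    }

    public static void main(String[] args) {
        List<List<Long>> spaces = List.of(List.of(1L, 4L, 7L), List.of(2L, 3L, 9L));
        System.out.println(readAll(spaces)); // prints [1, 2, 3, 4, 7, 9]
    }
}
```

In this model the reader holds one pending position per space, so the bookkeeping grows with the number of spaces being read together.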

API


// Example 1

LoadingOptions options = new LoadingOptions();
options.space = "partition1";
TickLoader loader = stream.createLoader(options);
loader.send(message);

// Example 2

SelectionOptions options = new SelectionOptions();
options.space = "partition1";
TickCursor cursor = stream.select(time, options);

// Example 3

deltix.qsrv.hf.tickdb.pub.TickStream.listSpaces()
deltix.qsrv.hf.tickdb.pub.TickStream.listEntities(java.lang.String)
deltix.qsrv.hf.tickdb.pub.TickStream.getTimeRange(java.lang.String)

Spaces use textual identifiers. Start writing into a space to create it. By default, all data is written into a NULL space.

  • Define a String value for the deltix.qsrv.hf.tickdb.pub.LoadingOptions.space attribute.
  • Create a deltix.qsrv.hf.tickdb.pub.TickLoader using these options (see code example 1).
  • Use deltix.qsrv.hf.tickdb.pub.SelectionOptions.space to select data from a specific space (see code example 2).
  • Use the methods listed in code example 3 to query stream meta-information about spaces.

Use Cases

  • Classic use case – several producers write data in parallel, each using a unique set of symbols, to increase write performance.
  • Parallel data processing – several producers write data into a stream using independent time ranges.
  • Use data partitions to store data.
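
For the classic use case, each producer needs a stable rule that gives it a set of symbols no other producer writes. One possible rule, sketched below, hashes each symbol into a fixed number of spaces. This is an assumption for illustration, not a TimeBase feature: SymbolPartitioner, spaceFor, assign, and the "partition" name prefix are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper, not part of the TimeBase API: assigns every symbol
// to exactly one space so that producers never share a symbol.
final class SymbolPartitioner {
    private SymbolPartitioner() {}

    /** Picks a space name for a symbol; the "partition" prefix is illustrative. */
    static String spaceFor(String symbol, int spaceCount) {
        // floorMod keeps the bucket non-negative even for negative hash codes.
        int bucket = Math.floorMod(symbol.hashCode(), spaceCount);
        return "partition" + (bucket + 1);
    }

    /** Groups symbols by the space they are assigned to. */
    static Map<String, List<String>> assign(List<String> symbols, int spaceCount) {
        Map<String, List<String>> bySpace = new TreeMap<>();
        for (String symbol : symbols) {
            bySpace.computeIfAbsent(spaceFor(symbol, spaceCount), k -> new ArrayList<>())
                   .add(symbol);
        }
        return bySpace;
    }

    public static void main(String[] args) {
        // Each map entry would correspond to one producer with its own loader.
        System.out.println(assign(List.of("AAPL", "MSFT", "GOOG", "IBM"), 3));
    }
}
```

A hash-based rule lets every producer compute the assignment independently, but it does not balance load by message volume; an explicit symbol-to-space table may suit uneven instruments better.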

Restrictions

  • Limit the number of spaces to 10-15 for optimal performance when querying the entire data stream.
  • When working with spaces, consider giving more heap memory to TimeBase Server. Each consumer reading the entire stream requires an additional 8-10 MB of memory per space.
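
The memory guidance above can be turned into a rough sizing estimate. The sketch below assumes that the 8-10 figure is in megabytes and that the overhead scales linearly with both the number of full-stream consumers and the number of spaces; both are assumptions, and SpaceHeapEstimate is a hypothetical helper, not a TimeBase utility.

```java
// Hypothetical sizing helper, not part of TimeBase. Assumes the per-space
// overhead is 8-10 MB and scales linearly with consumers and spaces.
final class SpaceHeapEstimate {
    static final int MIN_MB_PER_SPACE = 8;
    static final int MAX_MB_PER_SPACE = 10;

    private SpaceHeapEstimate() {}

    /** Returns {low, high} additional heap in megabytes. */
    static long[] extraHeapMb(int fullStreamConsumers, int spaces) {
        long units = (long) fullStreamConsumers * spaces;
        return new long[] {units * MIN_MB_PER_SPACE, units * MAX_MB_PER_SPACE};
    }

    public static void main(String[] args) {
        // 4 consumers reading the entire stream, 12 spaces.
        long[] range = extraHeapMb(4, 12);
        System.out.println(range[0] + "-" + range[1] + " MB"); // prints 384-480 MB
    }
}
```

Treat the result as a starting point for heap tuning rather than a guarantee; actual usage should be confirmed by monitoring the server.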
info

Refer to Data Distribution for additional information about spaces.