Overview

Introduction

Cluster serves several key purposes. It stores datasets that are too large to fit on a single machine, and it tolerates failures so that data is not lost if a machine fails. It also scales when loaders write a substantial volume of data to TimeBase or when readers consume a substantial volume of data from it.

Cluster achieves these objectives in two ways. It organizes a collection of TimeBase servers into a single logical cluster coordinated by a Raft-based consensus algorithm, and it reuses the existing TimeBase storage engine, with minimal modifications, to store data.
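
As a rough illustration of what a Raft-based design implies: Raft can make progress only while a strict majority of its voting members can communicate. The sketch below computes that majority and the number of member failures a consensus group of a given size can tolerate. The function names are hypothetical, and the sketch assumes every TimeBase server is a voting member, which this page does not state. It covers the consensus group only; whether data stays available also depends on how many copies of that data exist.

```python
def quorum_size(members: int) -> int:
    """Smallest number of members that forms a strict majority."""
    if members < 1:
        raise ValueError("a consensus group needs at least one member")
    return members // 2 + 1


def tolerated_failures(members: int) -> int:
    """Members that can be down while a majority is still reachable."""
    return members - quorum_size(members)


if __name__ == "__main__":
    # Assumed group sizes, chosen only for illustration.
    for n in (1, 3, 4, 5):
        print(f"members={n} quorum={quorum_size(n)} "
              f"tolerated_failures={tolerated_failures(n)}")
```

Under these assumptions a group of 3 tolerates one failed member, and growing it to 4 does not tolerate any more, which is why odd group sizes are the usual choice.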

Features

Cluster features include:

  • Data Sharding: Stores streams that cannot fit on a single server. Data can be split manually by "spaces" and automatically by time.
  • Fault Tolerance: Copies of the data are stored on different servers.
  • High Availability: The cluster can continue to work even if some nodes are unavailable.
  • Load Distribution: Limited. Clients that access different data are more likely to access different servers.

Limitations

  1. Data Processing Capacity: Cluster does not support scenarios where a single data producer generates more data than a single TimeBase server can process. A partial workaround is to create multiple loaders, each with a distinct "space."
  2. Data Writing Mode: The TimeBase cluster version supports only "append-only" data writes. Insertions into the middle of the time range and deletions from the middle of the range are not permitted.
  3. Symbol-Specific Operations: Cluster does not provide operations specific to individual symbols, such as truncation, clearing, or deletion of a single instrument. Instead, these actions can only be applied to entire streams or a specific "space" within a stream.
  4. Client-Side Timestamps: Message timestamps are set on the client side, and Cluster allows only one concurrent loader per stream "space." To use multiple loaders on a single stream, give each loader its own designated "space."
  5. Performance Considerations: Most synchronous operations run slower in Cluster, and end-to-end message latency inherently increases when the replication factor exceeds 1.
  6. API Limitations:
    • Cluster only supports LoadingOptions.writeMode = WriteMode.APPEND.
    • No insertion in the middle of the time range, no deletions from the middle of the range.
    • Only one concurrent loader per stream partition (space).
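
The append-only and one-loader-per-space rules above can be pictured with a small in-memory model. This is a sketch only: `ToyStream`, `ToyLoader`, and both exception types are hypothetical names invented for illustration and are not part of the TimeBase API. The sketch also assumes that a message whose timestamp equals the last stored timestamp counts as an append, which this page does not specify.

```python
class SpaceBusyError(RuntimeError):
    """A second loader was opened on a space that already has one."""


class OutOfOrderError(ValueError):
    """A message would land before the last stored timestamp."""


class ToyStream:
    """In-memory stand-in for a stream (hypothetical, not the TimeBase API)."""

    def __init__(self):
        self._messages = {}   # space -> list of (timestamp, payload)
        self._active = set()  # spaces that currently have an open loader

    def open_loader(self, space):
        # Only one concurrent loader per space.
        if space in self._active:
            raise SpaceBusyError(f"space {space!r} already has a loader")
        self._active.add(space)
        self._messages.setdefault(space, [])
        return ToyLoader(self, space)

    def messages(self, space):
        return list(self._messages.get(space, []))


class ToyLoader:
    def __init__(self, stream, space):
        self._stream = stream
        self._space = space
        self._closed = False

    def send(self, timestamp, payload):
        if self._closed:
            raise RuntimeError("loader is closed")
        log = self._stream._messages[self._space]
        # Append-only: reject anything earlier than the tail of this space.
        # Assumption: an equal timestamp is still treated as an append.
        if log and timestamp < log[-1][0]:
            raise OutOfOrderError(
                f"{timestamp} is earlier than last timestamp {log[-1][0]}")
        log.append((timestamp, payload))

    def close(self):
        self._closed = True
        self._stream._active.discard(self._space)


if __name__ == "__main__":
    stream = ToyStream()
    # Two producers write concurrently by using two distinct spaces.
    loader_a = stream.open_loader("A")
    loader_b = stream.open_loader("B")
    loader_a.send(100, "tick")
    loader_b.send(90, "tick")   # independent of space "A"
    loader_a.close()
    loader_b.close()
```

In this model each space keeps its own ordering, so a producer that outgrows one loader can be split across several spaces, which is the partial workaround described in item 1.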

M3 (7.2) Release

  1. Schema Modifications: Support for schema changes is limited to those that do not impact existing data. You can add new fields to the end or introduce new messages; however, removing or editing existing fields or messages is not supported.
  2. Topic API: This release does not include support for a "topic" API.
  3. Stream Type: This release exclusively supports DURABLE streams.
  4. Logging Levels: This release logs excessively. Take into account the potential impact of extensive logging on Cluster performance and maintenance.
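
The schema rule in item 1 amounts to a prefix check: every existing message must keep its existing fields, unchanged and in the same order, and anything new must come after them. The sketch below expresses that check under stated assumptions. The schema representation (a dict mapping a message name to an ordered list of `(field name, field type)` pairs), the function name, and the sample message and field names are all hypothetical; this is not how TimeBase represents or validates schemas.

```python
def is_supported_change(old_schema, new_schema):
    """Return True if new_schema only appends fields or adds new messages.

    Both arguments map a message name to an ordered list of
    (field name, field type) pairs. Hypothetical representation.
    """
    for message, old_fields in old_schema.items():
        new_fields = new_schema.get(message)
        if new_fields is None:
            return False  # an existing message was removed
        # Existing fields must survive as an unchanged prefix.
        if list(new_fields[:len(old_fields)]) != list(old_fields):
            return False  # a field was removed, edited, or reordered
    return True


if __name__ == "__main__":
    old = {"Trade": [("price", "FLOAT"), ("size", "FLOAT")]}

    appended = {
        "Trade": [("price", "FLOAT"), ("size", "FLOAT"), ("venue", "VARCHAR")],
        "Quote": [("bid", "FLOAT"), ("ask", "FLOAT")],  # new message
    }
    edited = {"Trade": [("price", "INTEGER"), ("size", "FLOAT")]}

    print(is_supported_change(old, appended))
    print(is_supported_change(old, edited))
```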