Kinesis

Create and manage Kinesis Streams

Overview

Cumulus simplifies configuring Kinesis streams. The following sections explain how to configure your streams with Cumulus. An example configuration can be found in the Cumulus repo.

Streams

Each stream is defined in its own file, and the file name is the name of the stream. These files are located in a configurable folder. Streams are JSON objects with the following attributes:

Here is an example of a stream configuration:

{
  "retention-period": 24,
  "shards": 1,
  "tags": {
    "Key1": "Value1"
  }
}
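
As a rough illustration of the one-file-per-stream layout, the following Python sketch loads every stream definition from a folder and keys each one by its file name. The `load_streams` helper and the `.json` file extension are assumptions made for this example. They are not part of Cumulus, which may read these files differently.

```python
import json
from pathlib import Path


def load_streams(folder):
    """Return {stream_name: config} for every JSON file in `folder`.

    Assumption for this sketch: stream files end in .json, and the stream
    name is the file name without that extension.
    """
    streams = {}
    for path in sorted(Path(folder).glob("*.json")):
        with path.open() as handle:
            streams[path.stem] = json.load(handle)
    return streams
```

Given a folder containing a file named `my-stream.json` with the configuration above, `load_streams` would return a dictionary with a single `my-stream` key whose value holds the `retention-period`, `shards`, and `tags` entries.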

Shards

After a stream is created, its number of shards can only be changed by splitting and merging shards. Because Cumulus has no way of knowing how much throughput your stream needs, take care when updating the number of shards. To keep shard sizes balanced, Cumulus lets you either double or halve the number of shards in a stream. When you double the number of shards, each shard is split in half. When you halve the number of shards, each pair of adjacent shards is merged into a single shard. If this does not meet your needs, you can use the AWS API directly to perform shard splits and merges, then update the shards value in your config without running a sync.
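
To make the doubling and halving behavior concrete, here is a minimal Python sketch that models shards as ranges over the 128-bit Kinesis hash key space. It only illustrates the arithmetic of splitting a range at its midpoint and merging adjacent ranges. The function names are hypothetical, and this is not how Cumulus itself is implemented. Rejecting an odd shard count in `halve` is an assumption of the sketch.

```python
MAX_HASH_KEY = 2**128 - 1  # Kinesis hash keys are 128-bit integers


def even_ranges(count):
    """Split the full hash key space into `count` contiguous ranges."""
    size = (MAX_HASH_KEY + 1) // count
    ranges = []
    for i in range(count):
        start = i * size
        # The last range absorbs any remainder so the space is fully covered.
        end = MAX_HASH_KEY if i == count - 1 else start + size - 1
        ranges.append((start, end))
    return ranges


def double(ranges):
    """Split every (start, end) range at its midpoint."""
    result = []
    for start, end in ranges:
        mid = start + (end - start + 1) // 2  # first key of the upper half
        result.append((start, mid - 1))
        result.append((mid, end))
    return result


def halve(ranges):
    """Merge each pair of adjacent ranges into one range."""
    if len(ranges) % 2:
        # Assumption of this sketch: odd counts are simply rejected.
        raise ValueError("cannot halve an odd number of shards")
    return [(ranges[i][0], ranges[i + 1][1]) for i in range(0, len(ranges), 2)]
```

In this model, doubling a one-shard stream yields two ranges that meet at `2**127`, and halving the result restores the original single range, so the two operations are inverses of each other.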

Diffing and Syncing Streams

Cumulus’s Kinesis module has the following usage:

cumulus kinesis [diff|help|list|migrate|sync] <asset>

Streams can be diffed, listed, synced, and migrated. The four actions do the following:

Configuration

Cumulus reads configuration from a configuration file, conf/configuration.json (the location of this file can also be specified by running cumulus with the --config option). The values in configuration.json can be changed to customize some of Cumulus’s behavior. The following is a list of the values in configuration.json that pertain to the Kinesis module: