Data sources may be divided into subsets called “partitions” to enable scaling up of SQLstream processing. For example:

  • Apache Kafka topics are implemented as one or more partitions
  • Amazon Kinesis streams are implemented as one or more partitions
  • The File-VFS plugin allows you to divide sets of incoming files into logical partitions (by hashing the filename).

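The filename-hashing scheme used by the File-VFS plugin can be sketched as follows. This is an illustrative example only: the hash function and helper names here (`partition_for_file`) are assumptions, not the plugin's actual implementation.

```python
import hashlib

def partition_for_file(filename: str, num_partitions: int) -> int:
    """Map a filename to a logical partition by hashing it.
    Illustrative sketch only; the File-VFS plugin's real hash
    function may differ."""
    digest = hashlib.md5(filename.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_partitions

# Files hash to stable partitions, so the same file always
# lands in the same partition.
for f in ["orders-01.csv", "orders-02.csv", "orders-03.csv"]:
    print(f, "-> partition", partition_for_file(f, 4))
```

Because the mapping depends only on the filename, a given file is always routed to the same partition regardless of when it arrives.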
SQLstream pipelines can be scaled out by running multiple shards; the partitions are assigned as evenly as possible across the shards.

For more information about how SQLstream s-Server supports sharding of SQLstream pipelines, see the documentation for the individual plugins.

For more about the mapping of partitions to shards, see shards.