CREATE PUMP

The CREATE PUMP statement creates a pump, either in the specified schema (if the qualified-pump-name includes a schema name) or in the current schema.

A pump is a SQLstream schema object (an extension of the SQL standard) that provides a continuously running INSERT INTO stream SELECT… FROM query functionality, thereby enabling the results of a query to be continuously added into a named stream or foreign stream. You can also use a MERGE statement for the INSERT INTO clause, but only if you are inserting into a (foreign) table.

The topic MERGE in this guide provides an example of using a MERGE statement with a pump. See the topics SELECT statement and INSERT statement in this guide for more details.

All pumps are created as STOPPED by default. You can also specify that they be created as STARTED. We recommend creating pumps as STOPPED and then using ALTER PUMP to start them.

You cannot create a network of pumps with cyclic references - that is, pumps that feed streams from which their own data has originated. s-Server will return an error if you try to create such a pump. See Pumps and Cyclic References below.

Subtopics:

Syntax

CREATE [ OR REPLACE ] PUMP [<schema-name>].<pump-name> 
[ STARTED | STOPPED ]
[ OPTIONS ( <variable> 'value' [, <variable> 'value'] ) ]
[ DESCRIPTION '<string-literal>' ] 
AS <streaming-insert>

Syntax Chart

Notes

  • <schema-name> is optional if you have already set a schema (see SET SCHEMA).
  • <variable> is the name of a session variable, which is treated as a pump OPTION. For more information about setting pump OPTIONS please see User Defined Session Variables.
  • <streaming-insert> is a streaming INSERT or MERGE statement such as:
    INSERT INTO <sink-stream-name> SELECT STREAM <expression-list> FROM <source-stream>
    
  • You should normally specify a column list for both the query and the named stream (these imply a set of source-target pairs). These column lists need to match in terms of datatype, or the SQL validator will reject them. These need not list all columns in the target stream; you can set up a pump for any one or more columns. A simple pump such as the one below that doesn’t explicitly specify the source expressions and target column names will match source expressions to target columns in the order they are defined (not by column name). If columns are wrongly ordered or the number of expressions in the select list doesn’t match the number of columns in the sink stream, the pump will not be created.
    CREATE PUMP simple_pump STOPPED
    AS
    INSERT INTO sink_stream
    SELECT STREAM * from source-stream
    

Example

The following code first creates and sets a schema, then creates two streams in this schema:

  • “OrderDataWithCreateTime” which will serve as the origin stream for the pump.
  • “OrderData” which will serve as the destination stream for the pump.
CREATE SCHEMA "Test";
SET SCHEMA '"Test"';

CREATE OR REPLACE STREAM "OrderDataWithCreateTime" (
"key_order" VARCHAR(20),
"key_user" VARCHAR(20),
"key_billing_country" VARCHAR(20),
"key_product" VARCHAR(20),
"quantity" INTEGER,
"eur" DOUBLE,
"usd" DOUBLE)
DESCRIPTION 'Creates origin stream for pump';

CREATE OR REPLACE STREAM "OrderData" (
"key_order" VARCHAR(20),
"key_user" VARCHAR(20),
"country" VARCHAR(20),
"key_product" VARCHAR(20),
"quantity" INTEGER,
"eur" DOUBLE,
"usd" DOUBLE)
DESCRIPTION 'Creates destination stream for pump';

The following code uses these two streams to create a pump. Data is selected from “OrderDataWithCreateTime” and inserted into “OrderData”.

CREATE SCHEMA "Pumps";
SET SCHEMA '"Pumps"';
CREATE OR REPLACE PUMP "200-ConditionedOrdersPump" STOPPED AS
--We recommend creating pumps as stopped
--then using ALTER PUMP [..] START to start it
INSERT INTO "Test"."OrderData" (
"key_order", "key_user", "country",
"key_product", "quantity", "eur", "usd")
--note that this list matches that of the query
SELECT STREAM
"key_order", "key_user", "key_billing_country",
"key_product", "quantity", "eur", "usd"
--note that this list matches that of the insert statement
FROM "Test"."OrderDataWithCreateTime";

To start writing data, use the following code. The code below ALTERS the pump using a qualified pump name (schema and pump).

ALTER PUMP "Pumps"."200-ConditionedOrdersPump" START;

For more detail, see the topic Pumps, Streams, and Processing Pipelines in the s-Server Concepts Guide.

Pumps and Cyclic References

The nature of streaming data means that streaming applications work in pipelines, with multiple streams connected by pumps. For example, you might have stream 1 that ingests data from a log file; stream 2 that enriches this data by joining it with data from an RDBMS system; stream 3 that performs analysis on this data; and stream 4 that provides data for StreamLab or another visualization application.

s-Server disallows cyclic references, that is, pumps that feed earlier streams from later streams. In the example above, you could not add an additional pump that feeds stream 1 from stream 4. When you execute a CREATE PUMP statement, s-Server checks for cyclic references and returns an error if it finds any.

Using Topological Sort

To determine cyclic reference, we do a topological sort of all existing pumps to determine if they are cyclic. The nature of streaming data means that streaming applications work in pipelines, with multiple streams connected by pumps. For example, you might have stream 1 that ingests data from a log file; stream 2 that enriches this data by joining it with data from an RDBMS system; stream 3 that performs analysis on this data; and stream 4 that provides data for StreamLab or another visualization application. Here, the following pumps operate:

  • Pump # 2 between streams 2 and 3
  • Pump # 1 between streams 1 and 2
  • Pump # 3 between streams 3 and 4.

These streams represent a topological order, and in order for the application to function correctly, pumps need to start in order. s-Server automatically starts pumps in the correct topological order when you use a single ALTER PUMP statement referencing multiple pumps.