STEP

The STEP function was created specifically for SQLstream. It performs arithmetic similar to FLOOR, but it allows more flexibility when defining time intervals to GROUP BY in the case of a tumbling window. For instance, FLOOR can round down to MINUTE (producing 1-minute intervals) or HOUR (producing 1-hour intervals). But defining intervals of 5 minutes or 30 seconds or 2 hours would not be simple using FLOOR. With STEP, you can customize the interval division using “steps” of any width. See the section on the GROUP BY clause for additional examples using STEP.

Syntax

STEP(<datetimeExpression> BY <intervalLiteral>)

OR

STEP(<IntegerExpression> BY <integerLiteral>)

The STEP function works on either datetime or integer inputs (the first parameter), and it returns a result of the same data type. The second parameter acts as a “flooring unit”, which defines the increment for rounding. (See the topic Expressions and Literals for more information on datetimeExpressions, intervalLiterals, and IntegerExpressions.) Note: The monotonicity of the STEP function output will be the same as that of the first input.

STEP function for Integer

STEP(<IntegerExpression> BY <integerLiteral>)

is equivalent to

(<IntegerExpression> / <IntegerLiteral>) * <IntegerLiteral>

STEP function for Datetime

STEP(<datetimeExpression> BY <intervalLiteral>)

is equivalent to

(datetimeExpression - timestamp '1970-01-01 00:00:00') / <intervalLiteral> ) * <intervalLiteral>
+ timestamp '1970-01-01 00:00:00'

Note: A timestamp at the very beginning of 1970, often referred to as the “Unix Epoch”, is an arbitrary point from which to measure time in computing, which simplifies encoding and reduces storage requirements. Effectively, it is time zero.

Using the STEP Function in a Tumbling Window

Most frequently, the STEP function is used inside a GROUP BY clause to define a tumbling window.

SELECT STREAM partition_id, SUM(measure)
FROM s
GROUP BY STEP(s.ROWTIME BY INTERVAL '5' MINUTE), partition_id;

In the example above, rows are grouped according to 5-minute intervals and, within each interval, grouped by partition_id. At the end of each 5-minute interval, the total measure for each partition_id is emitted.

The output of the STEP function illustrated above is monotonically increasing because its input, s.ROWTIME, is always monotonic. The aggregator correctly generates punctuations when the STEP function is used in a GROUP BY clause. In the previous example, rows with ROWTIME between ‘2016-01-01 07:30:00’ and ‘2016-01-01 07:34.59.999’ would be aggregated as 1 or more output rows with a ROWTIME of

STEP(s.ROWTIME BY INTERVAL '5' MINUTE) + INTERVAL '5' MINUTE
= '2016-01-01 07:35:00'

In other words, when aggregated results are emitted, the aggregator also emits a punctuation for the cutoff point of the interval (‘2016-01-01 07:35:00’), not the beginning (‘2016-01-01 07:30:00’). This reflects the closure of the interval, which is the moment when aggregation calculations can be performed.