A Streaming JOIN refers to a JOIN in a streaming query where at least one of the relations being joined is a stream.
Streaming joins work just like regular table joins, but subject to the considerations implicit in dealing with streams, that is, rolling windows and rowtimes:
Streaming joins take place over a time-based subset of records. By default, these subsets are ordered by row timestamps: rows from each stream whose ROWTIME values match are joined according to the JOIN condition.
A join with an explicit window is called a windowed join. This example matches bids and asks by ticker, shares, and price within a ten-second window. It is written with an inline window specification.
CREATE OR REPLACE VIEW "MatchBidsAndAsks"
DESCRIPTION 'match bids and asks for each ticker over a 10-second time window' AS
SELECT STREAM
B.ROWTIME AS "bidTime",
A.ROWTIME AS "askTime",
B."ticker",
B."shares" AS "bidShares",
B."price" AS "bidPrice",
A."shares" AS "askShares",
A."price" AS "askPrice"
FROM SALES.BIDS OVER (RANGE INTERVAL '10' SECOND PRECEDING) AS B
JOIN SALES.ASKS OVER (ROWS CURRENT ROW) AS A ON A."ticker" = B."ticker"
AND A."shares" = B."shares"
AND A."price" = B."price";
For readability, or for reuse of a window specification within a query, you can use a WINDOW alias to define a named window. Here is the above example rewritten to use a WINDOW alias:
CREATE OR REPLACE VIEW "MatchBidsAndAsks"
DESCRIPTION 'match bids and asks for each ticker over a 10-second time window' AS
SELECT STREAM
B.ROWTIME AS "bidTime",
A.ROWTIME AS "askTime",
B."ticker",
B."shares" AS "bidShares",
B."price" AS "bidPrice",
A."shares" AS "askShares",
A."price" AS "askPrice"
FROM SALES.BIDS OVER "lastTenSeconds" AS B
JOIN SALES.ASKS OVER (ROWS CURRENT ROW)AS A ON A."ticker" = B."ticker"
AND A."shares" = B."shares"
AND A."price" = B."price"
WINDOW "lastTenSeconds" AS (RANGE INTERVAL '10' SECOND PRECEDING);
If you do not specify an OVER clause for a stream, the join evaluates for the current row only. It may be useful in some cases to omit an OVER clause for one of the streams, but you may receive useful results if you omit the OVER clause for both streams unless you know the streams will emit data at the same rowtimes. For example, both streams represent periodic aggregates (such as hourly summaries) then they will both be emitting data in the same millisecond. In this case, there is no need for a window because the prior aggregation will sync the joined streams to the same rowtime.
SELECT STREAM
​ LF."accountNumber",
​ "loginFailureCount",
​ "transactionType",
​ "amount"
FROM "SuspectLoginFailures" AS LF
JOIN "Transactions" AS TX
​ OVER (RANGE INTERVAL '1' MINUTE PRECEDING)
​ ON LF."accountNumber" = TX."accountNumber"
WHERE ("transactionType" = 'isDebit');
You can factor out the RANGE INTERVAL window by giving it an alias, as in this equivalent SQL definition:
SELECT STREAM
​ "SuspectLoginFailures"."accountNumber",
​ "loginFailureCount",
​ "transactionType",
​ "amount"
FROM "SuspectLoginFailures" OVER "lastFew"
JOIN "Transactions" OVER "lastFew"
​ ON "SuspectLoginFailures"."accountNumber" = "Transactions"."accountNumber"
WHERE ("transactionType" = 'isDebit')
WINDOW "lastFew" AS (RANGE INTERVAL '1' MINUTE PRECEDING);
Using a separate WINDOW definition is particularly useful when joining a stream to two or more other streams as it is easier to maintain (for example, if you want to increase the gap from one minute to two.
As well as using time-based windows, you can also use a defined number of rows (row-based windows).
SELECT STREAM
ticker,
MIN("price") OVER w AS PriceMin,
MAX("price") OVER w AS PriceMax,
FROM sales.bids
WINDOW w AS (ROWS BETWEEN 3 PRECEDING AND 0 FOLLOWING);
(See Windowed Aggregation on Streams.) Note: When several streams are joined, the “secondary” streams don’t all have to use the same WINDOW. They can also mix ROWS and INTERVAL windows.