SELECT ALL and SELECT DISTINCT

If the ALL keyword is specified, the query does not eliminate duplicate rows. This is the default behavior if neither ALL nor DISTINCT is specified.

If the DISTINCT keyword is specified, a query eliminates rows that are duplicates according to the columns in the SELECT clause.

Note that for these purposes, the value NULL is considered equal to itself and not equal to any other value. These are the same semantics as for GROUP BY and the IS NOT DISTINCT FROM operator.

Streaming SELECT DISTINCT

SELECT DISTINCT can be used with streaming queries as long as there is a non-constant monotonically-increasing expression in the SELECT clause. For more information on what counts as a monotonically-increasing expression, see the topic Monotonic Expressions and Operators in this guide.

SQLstream emits rows for SELECT DISTINCT as soon as they are ready.

Using ROWTIME with SELECT DISTINCT

In order to use ROWTIME as a monotonically increasing expression in SELECT DISTINCT, you must use a ROWTIME expression qualified with the stream name as the first item in the SELECT list. Doing so specifies that ROWTIME refers to the input stream for the SELECT instead of the results stream for the SELECT. (For more information on how ROWTIME is processed in each case, see the topic ROWTIME in this guide.)

Examples of successful expressions include the following:

  • s.ROWTIME
  • STEP(s.ROWTIME BY INTERVAL ‘0.005’ SECOND)
  • FLOOR(s.ROWTIME to MINUTE)
  • STEP(s.ROWTIME BY INTERVAL ‘04:11.333’ MINUTE TO SECOND)

For example:

SELECT STREAM DISTINCT FLOOR(Orders.ROWTIME TO DAY), prodId,
FROM Orders;

displays the set of unique products that are ordered in any given day.

If you want the rowtimes of the output rows to be the time they are emitted, then in the following example you would need to change from form 1 to use form 2 instead:

Form 1

select distinct floor(s.rowtime to hour), a,b,c
  from s;

Form 2

 select min(s.rowtime) as rowtime, floor(s.rowtime to hour), a, b, c

  from s

  group by floor(s.rowtime to hour), a, b, c;