Correctness Mode

SQLstream s-Server runs in correctness mode: it ensures that every query produces the correct result, even if the query’s data sources are running slowly. (This is the only mode for s-Server.)

Correctness mode is closely related to the idea of query time. SQLstream s-Server lets the data determine how fast the clock at the heart of each query ticks. Without that data flowing, the query’s clock does not tick. It does not matter whether the producer has crashed, the network connection from the producer has gone down, or the producer simply has no data to send; if the query receives no data from the producer, it waits until it does.

Example

For example, consider the following schema and query:

CREATE STREAM Orders ( ... );
CREATE STREAM Shipments ( ... );
CREATE VIEW ShippedOrders AS
SELECT STREAM *
FROM Orders
JOIN Shipments OVER (RANGE INTERVAL '1' HOUR FOLLOWING)
ON Orders.orderId = Shipments.orderId;
SELECT STREAM *
FROM ShippedOrders
WHERE city = 'New York';

Let’s suppose that there are two JDBC clients creating orders, called Client A and Client B, as shown in Figure 1. (media/correctness-mode.png) Both clients have prepared the following statement via JDBC:

INSERT INTO Orders (ROWTIME, orderId, custId, prodId, quantity)
VALUES (?, ?, ?, ?, ?)

A third client, Client C, is inserting into the Shipments stream. If Client A has just created an order timestamped 10:00, then the query on ShippedOrders cannot emit a record containing that order until it is known that Client B does not intend to create an order timestamped earlier than 10:00. It does not matter whether Client B is offline, busy doing other processing, or simply has no orders to place. The system will wait for Client B because it is currently the slowest client.

When Client B comes online and writes orders timestamped 10:15 and 10:30, Client A becomes the slowest client inserting orders into the Orders stream. Note: If Client B were to send a rowtime bound of 10:30, indicating that it had no rows before 10:30, it would have the same effect.

If Client B were to close its prepared JDBC statement, it would no longer be considered by the query clock. The query clock only needs to wait for connected producers. By disconnecting, Client B loses the right to send those orders timestamped 10:15 and 10:30. By the time it reconnects, the query clock may have moved on, say to 10:25, in which case the 10:15 record will be rejected.
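
The behavior described above can be illustrated with a small simulation. This is only a toy model written for this page, not s-Server code: the QueryClock class and its connect, insert, and close methods are hypothetical names, and the model assumes that the query clock is the minimum of the latest rowtimes seen from all connected producers, and that each producer sends rowtimes in non-decreasing order.

```python
from datetime import time


class QueryClock:
    """Toy model (hypothetical): the clock is the minimum of the latest
    rowtimes seen from all connected producers."""

    def __init__(self):
        self.marks = {}   # producer name -> latest rowtime, or None if silent
        self.now = None   # furthest point the clock has reached so far

    def connect(self, producer):
        # A newly connected producer has sent nothing yet.
        self.marks[producer] = None

    def close(self, producer):
        # A closed producer is no longer waited for.
        del self.marks[producer]
        self._advance()

    def insert(self, producer, rowtime):
        """Return True if the row is accepted, False if it arrives too late."""
        if self.now is not None and rowtime < self.now:
            return False
        self.marks[producer] = rowtime
        self._advance()
        return True

    def _advance(self):
        marks = list(self.marks.values())
        if not marks or None in marks:
            return  # a connected producer is still silent: keep waiting
        slowest = min(marks)
        if self.now is None or slowest > self.now:
            self.now = slowest


# Scenario 1: Client A writes 10:00, then Client B comes online.
clock = QueryClock()
clock.connect("A")
clock.connect("B")
clock.insert("A", time(10, 0))
waiting = clock.now is None          # True: the clock waits for Client B
clock.insert("B", time(10, 15))
clock.insert("B", time(10, 30))
print(clock.now)                     # 10:00:00, Client A is now the slowest

# Scenario 2: Client B closes its statement, then reconnects too late.
clock2 = QueryClock()
clock2.connect("A")
clock2.connect("B")
clock2.insert("A", time(10, 0))
clock2.close("B")                    # the clock stops waiting for Client B
clock2.insert("A", time(10, 25))     # the clock moves on to 10:25
clock2.connect("B")
accepted = clock2.insert("B", time(10, 15))
print(accepted)                      # False: the 10:15 record is rejected
```

In the second scenario the late 10:15 row is refused because the clock has already reached 10:25, which mirrors the rejection described above.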

Timeliness Mode

Correctness mode makes it easy to write applications that always produce the right results, even in a complex distributed system with many clients and servers, and many interacting streams, views and pumps. But correctness mode is not suitable for every application, because some applications would rather produce a result that is possibly incorrect than wait for one or two producers that are running slow.

For these applications, you can use techniques to mitigate the effects of correctness mode. There are two ways to improve timeliness:

  • Send explicit rowtime bounds. This tells SQLstream s-Server that it can safely move its query time forward to the rowtime bound.
  • Close idle INSERT statements. This tells SQLstream s-Server not to wait for that producer.
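
The effect of these two techniques can be sketched with a simplified model. The query_time function below is a hypothetical helper, not an s-Server API; it assumes that query time is the minimum, over connected producers, of the latest rowtime or rowtime bound each has sent, and that query time cannot advance while any connected producer has sent neither.

```python
from datetime import time


def query_time(producers):
    """Hypothetical model: minimum of the latest rowtime or rowtime bound
    per connected producer; None while any connected producer is silent."""
    marks = list(producers.values())
    if not marks or None in marks:
        return None
    return min(marks)


# Client A has written a row at 10:00; Client B is connected but idle.
producers = {"A": time(10, 0), "B": None}
stalled = query_time(producers)
print(stalled)        # None: query time cannot advance

# Technique 1: Client B sends a rowtime bound of 10:30.
with_bound = dict(producers, B=time(10, 30))
after_bound = query_time(with_bound)
print(after_bound)    # 10:00:00

# Technique 2: Client B closes its idle INSERT statement.
without_b = {name: mark for name, mark in producers.items() if name != "B"}
after_close = query_time(without_b)
print(after_close)    # 10:00:00
```

In both cases query time can advance to 10:00 without Client B writing any rows.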

The approaches described above, along with the advice on time and streaming data, will allow you to build an application that delivers rows with the right combination of timeliness and correctness.