Configuring s-Server Parameters

You can customize the operation of the s-Server in a number of ways, through SQLstream s-Studio (the user interface) and other components such as plugins and drivers. This topic examines each of these mechanisms and the standard parameters they control.

You can set and read parameters in several ways, including system parameters--which are set in SQL--Java parameters--which are set using a properties file--and adapter parameters--which are also set using SQL. It is important to distinguish how and where you do so because, for example, you need to know which parameters override others. You can also configure how s-Server manages trace files. The topic Using Trace Files covers how to configure tracing parameters.

This page covers the following subtopics:

Configuring Time and Time Zone

Because s-Server's streams make heavy use of timestamps, managing system clocks and time zones are both important considerations in running s-Server.

Data and Wallclock Time

SQLstream strongly recommends that administrators use NTP (Network Time Protocol) on all computers to ensure that system clocks are closely synchronized. However, the system will still operate correctly if there is an offset between clocks.

Time Zone

By default, s-Server and s-Studio are set to operate in UTC/GMT. SQLstream strongly recommends leaving this setting intact unless you are absolutely sure that all SQLstream components (including all agents) are running in the same time zone. Time zone is set by a JVM parameter called user.timezone defined in the defintAspenRuntime.sh script. See Using defineAspenRuntime.sh below for more details.

Configuring System Parameters

System parameters determine the extent and behavior of the SQLstream s-Server.

You set system parameters using SQL:

ALTER SYSTEM SET <name> = <value>;
ALTER SESSION SET <name> = <value>;

Note that parameter names are mixed-case identifiers, so you will need to enclose them in double quotes. For example, to set the databaseMaxSize parameter to 100000, you would type ALTER SYSTEM SET "databaseMaxSize" = 100000;

You can set the values for the entire system or for a particular session, using the ALTER SYSTEM and ALTER SESSION commands. Different parameters may take effect at different times, as indicated by the "Update Policy" entries in the following table.

The SYSTEM and SESSION designators determine the scope of the parameter being set. There are three levels of scope for a repository parameter: SYSTEM, DEFAULT, and SESSION, as explained in the following table.

Scope level Meaning
SYSTEM Value applies to the whole system (all nodes, all sessions).
DEFAULT Value applies to all SESSIONs that do not specifically override the value.
SESSION Value applies only to the current session.

SYSTEM and DEFAULT scope persist across restarts of the server. SESSION scope applies only to a running session. For details, see the topics ALTER SESSION and ALTER SYSTEM in the SQLstream SQL Reference Guide.

Most parameter changes take place either immediately or the next time the system is started. A few are immutable, but are included in the table for reference; treat them as read-only parameters.

Note, however, that configuring SQLstream s-Server to accept JDBC driver connections from other hosts, even if the server is behind a NAT router, requires additional preparation as described in the subtopic Configuring JDBC Driver Connections from Other Hosts below.

Parameter Type and Usage Default Value Update Policy
cachePagesInit 32-bit integer (number of pages) Sets number of memory buffers to allocate from the OS for Fennel's data cache (also used as a pool for volatile scratch memory allocated by ExecStreams). Increasing it after startup causes new free buffers to be allocated; decreasing it causes buffers to be released (flushing them first if they contain dirty data). Note that this memory is NOT part of the JVM heap, although it shares the same virtual memory address space. 5000 Immediate
cachePageSize 32-bit integer (number of bytes; must be a power of 2). Determines contiguous size of memory buffers allocated by cache. Also determines contiguous size of pages stored on disk; these are the elemental allocation units for all disk-based data structures such as indexes. 65536 Immutable after database creation.
cachePagesMax 32-bit integer (number of pages). Determines upper bound for number of pages cache can allocate. On startup, actual upper bound is set to the larger of this parameter and cachePagesInit. Subsequently, changes to cachePagesInit are limited by upper bound calculated at startup. 70000 Takes effect on startup.
cacheReservePercentage 32-bit integer (percentage between 1 and 99, inclusive). Usage: The percentage of the Fennel data cache that should be set aside for global resource usage. The remainder will be available to allocate to individual statements. 5 Immediate.
calcVirtualMachine String, one of CALCVM_FENNEL, CALCVM_JAVA, or CALCVM_AUTO. Usage: Controls which calculator virtual machine is used for evaluating SQL row expressions. CALCVM_FENNEL uses a virtual machine implemented inside Fennel. CALCVM_JAVA generates and compiles Java code on the fly. CALCVM_AUTO allows the optimizer to choose a blend of the two calculators based on cost or other factors. Changing this parameter causes all entries to be discarded from the code cache. CALCVM_AUTO Immediate.
checkpointInterval 32-bit integer (seconds, 0 to disable timer-based checkpoints). Sets the interval between automatic timed checkpoints. Regardless of this setting, checkpoints may occur at other times as well (e.g. when the system detects that free log space is low, or when an administrator issues the explicit CHECKPOINT command). 300 Takes effect on startup.
codeCacheMaxBytes 64-bit integer (number of bytes, 0 for disabled, -1 for unlimited). Usage: Sets the size of the Farrago code cache, which is used for optimizer plan caching and SQL/MED data wrapper pooling. Setting to MAX causes the cache to grow without bound except for the JVM heap size. Setting to MIN disables the code cache, discarding any currently cached entries. Setting to a finite value causes LRU victimization according to rough estimates on per-object memory usage. 2000000 Immediate.
databaseIncrementSize 32-bit integer (number of pages, 0 to disable auto-increment). Determines number of pages by which to automatically extend the db.dat file when all existing pages have been allocated. 1000 Takes effect on startup.
databaseInitSize 32-bit integer (number of pages). Determines the number of pages desired when calculating the initial size of Fennel's db.dat file. 1000 Only used at time of database creation; immutable after that.
databaseMaxSize 32-bit integer (number of pages, 0 for unlimited). Determines maximum size to which db.dat file can grow. Further allocations will fail even if space remains in the file system. 0 Takes effect on startup.
databaseShadowLogIncrementSize 32-bit integer (number of pages, 0 to disable auto-increment). Determines number of pages by which to automatically extend the shadowlog.dat file when all existing pages have been allocated. 1000 Takes effect on startup.
databaseShadowLogInitSize 32-bit integer (number of pages). Determines the number of pages desired when calculating the initial size of the shadowlog.dat file. This file is recreated every time Fennel is loaded, and is used as part of physical recovery. 2000 Takes effect on startup.
databaseTxnLogIncrementSize 32-bit integer (number of pages, 0 to disable auto-increment). Determines number of pages by which to automatically extend the txnlog.dat file when all existing pages have been allocated. 1000 Takes effect on startup.
databaseTxnLogInitSize 32-bit integer (number of pages). Determines the number of pages desired when calculating the initial size of the txnlog.dat file. This file is recreated every time Fennel is loaded, and is used as part of logical recovery. 2000 Takes effect on startup.
deviceSchedulerType String (name of scheduler implementation). Controls the scheduler used for disk I/O. threadPool means to fake asynchronous I/O via a thread pool, and use buffered file access. On Linux, aioLinux means to use libaio for asynchronous I/O, with O_DIRECT for unbuffered file access. The default is to to use libaio if available, otherwise to fallback to threadPool. On Windows, ioCompletionPort (the default) means to use asynchronous I/O via completion ports and unbuffered file access. In all cases, if an unrecognized or unsupported value is set, the system silently reverts to default behavior to avoid causing startup failures. null (let system decide)
expectedConcurrentStatements 32-bit integer (number of statements). Indicates to the resource governor what is expected to be the maximum number of concurrent statements executing at any given time. It is used by the resource governor to determine the maximum resource allocation for each statement. 4 Immediate.
fennelDisabled boolean. Determines whether Fennel native code components are loaded when the system starts up. false Takes effect on startup.
groupCommitInterval 32-bit integer (milliseconds). Determines the amount of time to wait during commit in the hope that the commit's disk write can be combined with that of other concurrent transactions. 30 Takes effect on startup.
jniHandleFile String (file-system path). See - Fennel Jni Handle Tracing {empty string} Takes effect on startup.
resourceDir String (file-system path). Sets the location of FennelResource.properties (or localized variant). $SQLSTREAM_HOME/s-Server/catalog/fennel
sdpListenerPort 32-bit integer, -1 for personality-based defaults. Sets the port number on which to listen for SDP connections (This is the port number specified in client JDBC URL's). However, for connections from other hosts, see also the Network Configuration subsection below entitled - Configuring JDBC Driver Connections from Other Hosts. -1 (implies 5570 for SQLstream personality) Takes effect on startup.
tempIncrementSize 32-bit integer (number of pages, 0 to disable auto-increment). Determines number of pages by which to automatically extend the temp.dat file when all existing pages have been allocated. 1000 Takes effect on startup.
tempInitSize 32-bit integer (number of pages). Determines the number of pages desired when calculating the initial size of the temp.dat file. This file is recreated every time Fennel is loaded. 1000 Takes effect on startup.

Reading System Parameters

To read the values of the system parameters, you can execute the following two queries, either using sqllineClient or SQLstream s-Studio:

SELECT * FROM SYS_FEM."Config"."FennelConfig";
SELECT * FROM SYS_FEM."Config"."FarragoConfig";

Each system parameter has its own column, named for the parameter. The value in the column is the current value of the parameter.

Configuring Java Parameters

Most of the SQLstream system is written in Java, so some of its behavior can be controlled using Java properties. Java properties are only used to control bootstrap behavior (startup activities before the server is up and running) such as setting initial tracing and network configuration.

Some of those parameters are passed on the command-line to the Java virtual machine by the startup script, s-Server and the auxiliary script defineAspenRuntime.sh.

You should not edit the file aspen.properties itself. Rather, if you want to override or add a value, put that parameter into a file called aspen.custom.properties in $SQLSTREAM_HOME. This enables you to make your own set of changes without having to modify the standard parameters. A sample aspen.custom.properties file can be found in $SQLSTREAM_HOME/support.

When the SQLstream s-Server starts up, it reads its Java parameters in the following order, with each successive source overriding any matching parameters set in the preceding ones:

  • Parameters from aspen.properties (read only)
  • Parameters from aspen.config (read only)
  • Parameters from aspen.custom.properties
  • Java system parameters with names beginning with "aspen."

As you can see, your entries in aspen.custom.properties can override the standard settings distributed in aspen.properties or aspen.config, so there is no need to modify those files.

Using defineAspenRuntime.sh

If you need to change the parameters passed to the Java runtime, such as the class path or memory sizes, you will need to make those changes to the script $SQLSTREAM_HOME/bin/defineAspenRuntime.sh.

You can use defineAspenRuntime.sh, for example, to change the heap size defined for the JVM. You can also use defineAspenRuntime.sh to add new database targets. s-Server supports Oracle, PostgreSql, Microsoft SQL Server, and Teradata, but adding an new JDBC driver requires making changes to defineAspenRuntime.sh.

General Parameters

Parameter Explanation
aspen.tmp.dir Defines directory under aspen.home.dir used for temporary files. Defaults to tmp.
aspen.tmp.trace.dir Defines directory used for temporary "tails" of the server trace log. Defaults to tmp/trace.

Performance Parameters

You can tweak parameters to adjust s-Server's performance. For example, using the parameter fennel.XO.bufferSize, you can increase the size of buffers (these are 1MB by default.) This allows scaling to multiple cores. By default, s-Server uses two buffers between XOs. You can switch to single buffering using the parameter aspen.xo.useDoubleBuffering.

You can also control how often s-Server clears aggregation hash tables on window boundaries using the parameter aspen.aggregate.maximumDeadBucketRatio. This affects tumbling windows where the ratio between input and out put row counts are close to one. If you tend to reuse aggregation keys, not clearing the aggregation hash table avoids the overhead of creating new entries in that table.

Parameter Explanation
aspen.aggregate.maximumDeadBucketRatio Controls when the streaming aggregate XO will clear its hashmap instead of marking aggregate buckets dead. Set to 0 to always clear it. Defaults to 3.0
aspen.aggregate.useResultBuffering Controls when the streaming aggregate XOs will buffer their results so that they can start processing input before downstream XOs have processed their output. Set to false to use less memory at the expense of throughput. Defaults to true.
aspen.ddl.maximumRowSize Controls the maximum width that streams can be declared as in bytes. Note that varchar fields are specified in number of chars. This will give a maximum of 4 bytes per char. Defaults to 16777216
aspen.xo.useDoubleBuffering Controls whether or not to use double buffering between XO's that that do not allocate their own buffers. If double buffering is set to true, this allows more parallelism at the expense of some scheduler overhead. Defaults to true.
aspen.sched.nthreads How many threads dedicated to execution objects. Auto means the number of CPU cores. Defaults to auto.
aspen.trace.sampleLateRows When false, every late row is reported in the trace log and the global error stream. When true, report the 1st event and then report every 15 secs. Defaults to true.
aspen.trace.sampleFailedRows When false, every row that causes a calculator error is reported in trace log and error stream. When true, report the 1st event and then report every 15 secs. Defaults to true.
aspen.perf.test.results Where to write performance test results. Relative paths are relative to aspen.home.dir. Defaults to performanceTestResults.json. Could be an absolute path such as aspen.perf.test.results=/var/sqlstream/testOutput/performanceTestResults.json
fennel.xo.bufferSize In bytes, controls the size of buffers used to transmit data between XOs. This will also be the maximum unit of work for a given execution. Defaults to 1048576.

Checkpointing Parameters

Since version 8.0.0, s-Server supports checkpointing of sliding window aggregations. When enabled, on restart, sliding windows will be populated with data from the previous run prior to the first row that is received. It is expected that data will be replayed starting on a rowtime boundary, in the same order as the previous run. For sliding window aggregations, once a checkpoint is committed, the next row in that stream will output only after the checkpoint is completed.

To configure checkpointing for all pumps, you can make use of the following parameters:

Parameter Explanation
aspen.checkpoint.enabled Enable / disable checkpoints for all pumps.
aspen.checkpoint.path The directory beneath which window state will be written.
aspen.checkpoint.interval How often, in milliseconds, should checkpoints be committed.

To do so, edit one of the following files:

  • $SQLSTREAM_HOME/aspen.properties
  • $SQLSTREAM_HOME/aspen.custom.properties

Alternatively, you can configure checkpointing on a pump-by-pump basis using session variables:

Variable name Explanation
"checkpoint.enabled" Enable checkpoints for this pump.
"checkpoint.path" The directory beneath which window state will be written.
"checkpoint.interval" How often, in milliseconds, should checkpoints be committed.
"checkpoint.interval.<window name>" You can specify checkpoint intervals for named windows. (See below.)

The following example shows how to enable checkpointing with a session variable:

CREATE SCHEMA 'TEST';
...
CREATE VIEW TEST.WithRunningAverage AS
SELECT STREAM *, AVG(X) OVER W1
FROM TEST.SRC
WINDOW W1 AS (PARTITION BY KEY RANGE INTERVAL '5' HOUR PRECEDING);

CREATE PUMP P STOPPED
OPTIONS ("checkpoint.interval.W1" '300000')
AS INSERT INTO TEST.DST
SELECT STREAM * FROM TEST.WithRunningAverage;
...
-- Enable checkpoints for all pumps in TEST.
ALTER PUMP TEST.*
SET "checkpoint.enabled" = 'true';

Checkpointing for Named Windows

In the example above, the window W1 is explicitly "named". In other words, OVER is followed by the named window, W1, whose definition is deferred until the WINDOW clause. Note: Several windows may be named in a WINDOW clause.

By contrast, windows can be anonymous. This occurs whenever the window definition is included inline as part of the SELECT list. In the following example, notice how the window definition is provided in parentheses directly after OVER, rather than given a label such as "W1":

CREATE VIEW TEST.WithAnonymousWindow AS
SELECT STREAM *, AVG(X) OVER (PARTITION BY KEY RANGE INTERVAL '5' HOUR PRECEDING)
FROM TEST.SRC

Caution: If you set aspen.checkpoint.interval.W1=30000, then ALL windows named "W1" will be affected. Similar care should be exercised with the session-variable approach. In the following example, all windows named "W1" that occur within the TEST view will be affected:

ALTER PUMP TEST.*
SET "checkpoint.interval.W1" = '30000'

SDP Parameters

Configuration Parameter Explanation
aspen.sdp.host Sets an optional hostname (or IP) for SDP. The default is the hostname of the machine or localhost if the hostname cannot be determined.
aspen.sdp.port Controls the port used by SQLstream for Streaming Data Protocol (SDP) connections. Typically, aspen.controlnode.url uses this same port. Defaults to 5570.
aspen.sdp.linkTimeoutMs Minimum number of milliseconds of idle time on a given link before timing out the link on the assumption that the connection is broken or that the peer has crashed. Defaults to 10000.
aspen.sdp.mtu Maximum transmission unit (MTU) for SDP connections. It is the maximum size of a TCP packet in a SDP connection. Defaults to 65536.
aspen.sdp.pingTimerMs Number of milliseconds between sending SDP EchoRequest frames. Echo frames are pings inside of an SDP connection. Defaults to 2000.
aspen.sdp.punctPingInterval Sets the default interval between "ping punctuation marks". To help the timely progress of rows inserted with an implicit rowtime (which is the current system time), the NetRxExecStream periodically emits a rowtime bound equal to the current time: this is truthful, since the rowtime of a forthcoming row will be a timestamp in the future. Because these pings cost a small amount of execution work, there is a cost to this liveness, and hence the interval between pings can be adjusted. Ideally this might be an attribute of each insert statement or of each stream. At present, there is a global value: set from this parameter. In milliseconds. Defaults to 40.
aspen.sdp.rxblocksize Number of independent blocks a driver can receive. Generally the number of statements running on a specific driver. This is rarely used. Minimum value is 2. Maximum value is maximum value of Java Integer class. Defaults to 4.
aspen.sdp.streamWindowHighWaterMarkRatio Defines the ratio of the current window to the maximum window at which point the window is published. Defaults to 0.75
aspen.sdp.throwOnProtocolError Throw exception: If true and link becomes mis-synchronized, exception is thrown. Exists for debugging; for production, should be set to false. Defaults to false.
aspen.cluster.uuid The magic cluster key. The UUID of the cluster to which this node belongs. Allows authentication of a node in the cluster and for multiple clusters to exist on the same network. Defaults to 12345678-90ab-cdef-1234-5678abcdef90.
aspen.sdp.blocksize Maximum size of a block of row data sent from the driver to the server. If multiple rows can fit inside of a buffer of size blocksize, then they will be packed into one TCP packet. MTU doesn't effect how many tuples are in a TCP packet, it only limits the maximum size of a TCP packet. Defaults to 65536, the maximum. Minimum value is 512.

Parameter Substitution

Java parameters can include other variables in their definitions. The syntax for these references is much like that used for variables in a shell script, a dollar sign followed by the name of the referenced variable inside curly braces, such as ${aspen.home.dir}. Undefined parameters are treated as empty strings.

Thus, you can define file paths relative to the SQLstream installation directory, for example:

my.file.name=${aspen.home.dir}/temp/myfile.name

The same syntax can also be used within SQL/MED and adapter parameters.

Configuring Adapter and SQL/MED Parameters

SQLstream adapters use parameters specified as OPTIONs in the DDL (as do other SQL/MED methods to access external data) for their FOREIGN DATA WRAPPERs, SERVERs and any FOREIGN STREAMs based on them.

Note that a Plugin JAR file added to the standard $SQLSTREAM_HOME/plugin directory is not automatically added to the SQLstream s-Server CLASSPATH unless a symlink to that JAR is put into the $SQLSTREAM_HOME/plugin/autocp directory (see the Readme file in the $SQLSTREAM_HOME/plugin/autocp directory).

To find the configuration options for a particular adapter, see the documentation for that adapter. SQL/MED and related adapter matters are covered in the topic SQLstream Software Development Kit (SDK) in the s-Server Integration Guide.

Note that, like Java parameters, adapter and other SQL/MED parameters can perform variable expansion and substitution. The variable reference takes the form of a dollar sign followed by the name of the variable in curly braces. Such a property is evaluated at the time the DDL is executed (define time).

For example, the following foreign server definition uses the variable aspen.home.dir to set the root directory for its log files under the SQLstream installation directory:

CREATE OR REPLACE SERVER "WebAppServer"
   FOREIGN DATA WRAPPER ECDA
   OPTIONS (ROOT '${aspen.home.dir}/temp')
   DESCRIPTION 'Server for webapp login stream';

Variables whose values are undefined are empty. For example, if aspen.home.dir were undefined, the ROOT parameter in the above example would evaluate as '/temp'.

Configuring Maximum Permitted Length of CHAR and VARCHAR columns

The default maximum permitted length for CHAR columns is 65535, and for VARCHAR is 1048575. These limits may be modified using the setMaximumCharacterLength() procedure.

To increase the maximum permitted length for VARCHAR columns:

call setMaximumCharacterLength('VARCHAR', 5000000);  

To increase the maximum permitted length for a CHAR column:

call setMaximumCharacterLength('CHAR', 200000);  

NOTES:

  • Wherever possible use VARCHAR for long character columns.
    • It is very unlikely that you need to use very long CHAR columns;
    • So it is extremely rare to need to change the maximum permitted length for CHAR.
  • Never reduce the maximum length of a datatype to be less than the longest column yet defined. This can cause unpredictable errors.