Kalman Filter UDX

This UDX allows customers to run Kalman filters on streams of sensor data. A Kalman filter is a technique for sharpening the measurements produced by blurry sensors.

The SQLstream Kalman filter is a system UDX which sharpens the blurry positions of a potentially large number of particles in a complex physical system. Example applications might be satellite tracking and smart-home climate-control.

Kalman Filter Overview

A Kalman filter is a technique for sharpening the measurements produced by blurry sensors. It can be used, for instance, to reduce uncertainty about the (location, speed) of a moving vehicle or the (temperature, humidity, airPressure, windSpeed) logged at a weather station. More generally, it is a technique which improves the measured trajectory of a particle through a complex physical system. It filters out noise and reduces uncertainty. It uses the following inputs to compute the next point on a particle’s trajectory:

  • The previous point on the trajectory.
  • The current measurement of the particle’s position.
  • An instantaneously linear model of how the particle normally moves.
  • Bell curves describing various uncertainties, including sensor noise.
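
The way these four inputs combine can be seen in a one-dimensional filter. The following Python sketch is for illustration only and is not the UDX's implementation; the function name kalman_step, the default parameter values and the sample readings are all assumptions made for this example.

```python
def kalman_step(x_prev, p_prev, z, a=1.0, h=1.0, q=1e-4, r=0.25):
    """One scalar Kalman step (hypothetical helper, for illustration only).

    x_prev, p_prev: previous estimate and its variance
    z:              current measurement
    a:              linear model of how the value normally moves
    h:              scale relating the state to the measurement
    q, r:           process noise and sensor noise variances (the bell curves)
    """
    # Predict: move the previous point forward using the linear model.
    x_pred = a * x_prev
    p_pred = a * p_prev * a + q
    # Update: blend prediction and measurement, weighted by their uncertainties.
    gain = p_pred * h / (h * p_pred * h + r)
    x_new = x_pred + gain * (z - h * x_pred)
    p_new = (1.0 - gain * h) * p_pred
    return x_new, p_new


# Made-up noisy readings of a quantity that is roughly constant.
readings = [10.4, 9.7, 10.2, 9.9, 10.1]
x, p = readings[0], 1.0  # start from the first reading with a large variance
for z in readings[1:]:
    x, p = kalman_step(x, p, z)
```

With each step the variance p shrinks, which is the sense in which the filter reduces uncertainty.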

SQLstream’s Kalman UDX is a wrapper around the KalmanFilter provided by the Apache commons-math project. The API to the UDX is phrased in the following terms, which are found in the mathematical literature on Kalman filters and in the documentation for Apache’s KalmanFilter.

We start with some notation:

  • Let M(ε × δ) denote a matrix with ε rows and δ columns.
  • Let σ be the number of sensor readings which comprise a particle position.
  • Let γ be the number of control variables in the model.
  • Let κ be a row number in the driving stream of sensor readings.

These are the inputs and outputs of the Kalman UDX:

Initialization (Model) Parameters

  • A(σ × σ): A square state transition matrix describing how the position would change if the system operated at steady-state without any external influences.
  • Q(σ × σ): A square process noise matrix. This is a fudge term to cover unmodelled forces which operate on the system.
  • P(σ × σ): A square error matrix. It describes the uncertainty about the output values.
  • B(σ × γ): A control input matrix describing external forces operating on the system. These are forces which ARE modelled. Vacuous (null) for simple systems. If specified, must have the same number of rows as A.
  • H(σ × σ): A measurement matrix of initial sensor readings, used to calibrate the model.
  • R(σ × σ): A measurement noise matrix describing the uncertainty of H.
  • I: Initial state vector of σ real values, the initial sensor measurements of the particle’s position.

Ongoing Streaming Inputs

  • zκ: Vector of σ current sensor measurements at time κ. The particle’s position at time κ.
  • uκ: Control vector of γ real values at time κ. Vacuous (null) for simple systems.

Output Stream

  • xκ: The best guess of the particle’s position at row κ. It is a vector of σ real values and it is the output of the filter at each step. The previous output, xκ-1, is fed back into the algorithm in order to compute xκ.

Internally, the algorithm calculates a noise variable wκ for each row κ.

Each output row is calculated as follows:

xκ = A(xκ-1) + B(uκ-1) + wκ-1
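
As a rough illustration of the formula above, here is a Python sketch of that calculation for σ = 2 and γ = 1. The constant-velocity model, the helper names mat_vec and predict, and all of the numbers are assumptions made for this example. The noise term w is left out, so the sketch shows only the deterministic part of the step.

```python
def mat_vec(m, v):
    """Multiply a matrix (a list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def predict(A, B, x_prev, u_prev):
    """Return A(x_prev) + B(u_prev); the noise term w is omitted."""
    ax = mat_vec(A, x_prev)
    # B is vacuous (None) for simple systems: no control contribution.
    bu = mat_vec(B, u_prev) if B is not None else [0.0] * len(ax)
    return [p + q for p, q in zip(ax, bu)]


dt = 1.0
A = [[1.0, dt],          # position += velocity * dt
     [0.0, 1.0]]         # velocity carries over unchanged
B = [[0.5 * dt * dt],    # effect of the control (acceleration) on position
     [dt]]               # effect of the control (acceleration) on velocity
x_prev = [0.0, 2.0]      # (position, velocity) at row κ-1
u_prev = [1.0]           # control reading at row κ-1

x_next = predict(A, B, x_prev, u_prev)
x_free = predict(A, None, x_prev, None)  # same step without a control input
```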

Kalman Filter UDX

The UDX takes two kinds of input:

  • Model. These parameters are the matrices and initial data described in the Overview section above: A, Q, P, B, H, R, and I. Since each particle may have its own, individually quirky sensors, each particle will have its own model. That is, each particle will have its own A, Q, P, B, H, and R matrices and its own I initial position tuple.
  • Raw position. This is the infinite stream of incoming, blurry particle positions detected by the sensors.

The application is expected to be defined and to operate as follows:

  • Schema. In addition to an incoming stream of blurry particle positions and an outgoing stream of focused positions, there is a separate stream for each of the model parameters: A, Q, P, B, H, R, and I.
  • Particle id. There is some way of assigning a unique BIGINT id to each particle.
  • Initialization. From time to time, new particles are added to the system. For instance, a new satellite may be launched. Or a new end-user may install a climate-control system at home. Before ingesting any blurry position data for that particle, the application must first insert the particle’s model into the A, Q, P, B, H, R, and I streams.
  • Steady-state. The SQLstream Kalman filter UDX will discard raw position data for particles which lack a complete model. Once a particle’s model has been fully inserted, the UDX will be able to convert blurry positions into focused positions.
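
The initialization and steady-state rules can be pictured with a small Python sketch. It is not the UDX's code: the class name ModelGate and its methods are hypothetical, and which parameters count as a complete model is an assumption supplied by the caller. The sketch also tracks only which parameters have arrived, not whether every row of each matrix has been received.

```python
class ModelGate:
    """Hypothetical sketch: pass positions only for particles with a full model."""

    def __init__(self, required=("A", "Q", "P", "B", "H", "R", "I")):
        # Assumption: the caller decides which parameters make a model complete.
        self.required = set(required)
        self.seen = {}  # particle id -> names of the parameters received so far

    def add_parameter(self, particle_id, name):
        """Record that a model parameter has been inserted for a particle."""
        self.seen.setdefault(particle_id, set()).add(name)

    def is_ready(self, particle_id):
        """True once every required parameter has been seen for the particle."""
        return self.required <= self.seen.get(particle_id, set())

    def accept(self, particle_id, position):
        """Return the position if the model is complete, else None (discarded)."""
        return position if self.is_ready(particle_id) else None


gate = ModelGate(required=("A", "Q", "H", "R"))
early = gate.accept(1, (42000.0, 1.3, 2.7))   # no model yet: discarded
for name in ("A", "Q", "H", "R"):
    gate.add_parameter(1, name)
late = gate.accept(1, (42000.0, 1.31, 2.7))   # model complete: passed on
```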

Kalman Filter UDX Signature

Let Γ(ε × δ) denote a contiguous block of ε rows. Each row has a leading BIGINT column and δ trailing DOUBLE columns. The BIGINT column is the particle id. All the rows in the contiguous block have the same particle id. The DOUBLE columns are data for the Kalman filter.
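
To make the Γ notation concrete, the Python sketch below groups contiguous rows by their leading particle id and collects the trailing values into one matrix per particle. The function name read_blocks and the sample rows are assumptions for illustration; this is not how the UDX reads its cursors.

```python
from itertools import groupby


def read_blocks(rows):
    """Group contiguous rows by their leading particle id.

    Each row is (particle_id, d1, d2, ...). Yields (particle_id, matrix),
    where matrix is the list of trailing values, one inner list per row.
    """
    for particle_id, group in groupby(rows, key=lambda row: row[0]):
        yield particle_id, [list(row[1:]) for row in group]


# Two made-up Γ(3 × 3) blocks, one for particle 1 and one for particle 2.
rows = [
    (1, 1.0, 0.0, 0.0),
    (1, 0.0, 1.0, 0.0),
    (1, 0.0, 0.0, 1.0),
    (2, 2.0, 0.0, 0.0),
    (2, 0.0, 2.0, 0.0),
    (2, 0.0, 0.0, 2.0),
]
blocks = dict(read_blocks(rows))
```

Because groupby only merges adjacent rows, the sketch relies on the same contiguity that the definition of Γ requires.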

CREATE FUNCTION sys_boot.mgmt.kalman_filter
(
    --
    -- Model parameters
    --
    A_STATE_TRANSITION_MATRIX CURSOR, -- NOT NULL
    Q_PROCESS_NOISE_MATRIX CURSOR,    -- NOT NULL
    P_ERROR_MATRIX CURSOR,
    B_CONTROL_INPUT_MATRIX CURSOR,
    H_MEASUREMENT_MATRIX CURSOR,      -- NOT NULL
    R_MEASUREMENT_NOISE CURSOR,       -- NOT NULL
    I_INITIAL_POSITION CURSOR,
    --
    -- Steady-state stream of raw, blurry particle positions.
    --
    Z_U_POSITION_AND_CONTROL CURSOR   -- NOT NULL
)
returns table
(
    Z_U_POSITION_AND_CONTROL.*
)
language java
parameter style system defined java
no sql
external name 'class com.sqlstream.aspen.syslib.KalmanFilterUDX.kalman_filter';

where

  • A_STATE_TRANSITION_MATRIX is a streaming query returning a Γ(σ × σ) block of rows for each particle id.
  • Q_PROCESS_NOISE_MATRIX is a streaming query returning a Γ(σ × σ) block of rows for each particle id.
  • P_ERROR_MATRIX is a streaming query returning a Γ(σ × σ) block of rows for each particle id. P may be NULL.
  • B_CONTROL_INPUT_MATRIX is a streaming query returning a Γ(σ × γ) block of rows for each particle id.
  • H_MEASUREMENT_MATRIX is a streaming query returning a Γ(σ × σ) block of rows for each particle id.
  • R_MEASUREMENT_NOISE is a streaming query returning a Γ(σ × σ) block of rows for each particle id.
  • I_INITIAL_POSITION is a streaming query returning a Γ(1 × σ) block (i.e., one row) for each particle id. I may be NULL.
  • Z_U_POSITION_AND_CONTROL is a streaming query.

If B is null, then the trailing columns of each row comprise a Γ(1 × σ) row, the current position for a single particle, as measured by sensors. Otherwise, the trailing columns of each row comprise a Γ(1 × (σ + γ)) row, the current position and control readings for a single particle.

The row prefix consists of other, passthrough columns.
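
The row layout described above can be sketched in Python as follows. The helper split_row and the sample row are hypothetical. The sketch assumes that the trailing columns are the particle id, then the σ position values, then the γ control values in that order; the text above does not spell out the ordering, so treat it as an assumption.

```python
def split_row(row, sigma, gamma=0):
    """Split a row into (passthrough prefix, particle id, z, u).

    Assumes the last 1 + sigma + gamma columns are the particle id, the
    position readings and the control readings, in that order.
    gamma = 0 models the case where B is null.
    """
    n = len(row)
    start = n - (1 + sigma + gamma)          # index of the particle id column
    prefix = row[:start]                     # other, passthrough columns
    particle_id = row[start]
    z = row[start + 1:start + 1 + sigma]     # current position
    u = row[start + 1 + sigma:]              # control readings (may be empty)
    return prefix, particle_id, z, u


# Made-up row: one passthrough column, then id, 3 position values, 1 control value.
row = ("2024-01-01T00:00:00", 1, 42000.0, 1.3, 2.7, 0.05)
prefix, pid, z, u = split_row(row, sigma=3, gamma=1)

# The same row without a control reading, as when B is null.
prefix0, pid0, z0, u0 = split_row(row[:5], sigma=3)
```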

We also support a simplified version of the Kalman filter UDX:

CREATE FUNCTION sys_boot.mgmt.kalman_filter_simple
(
    A_STATE_TRANSITION_MATRIX CURSOR, -- NOT NULL
    Q_PROCESS_NOISE_MATRIX CURSOR,    -- NOT NULL
    H_MEASUREMENT_MATRIX CURSOR,      -- NOT NULL
    R_MEASUREMENT_NOISE CURSOR,       -- NOT NULL
    Z_POSITION CURSOR                 -- NOT NULL
)
returns table
(
    Z_POSITION.*
)
language java
parameter style system defined java
no sql
external name 'class com.sqlstream.aspen.syslib.KalmanFilterUDX.kalman_filter';

Example

Here is an example of the Kalman filter UDX in action.

-- this is the blurry stream which needs to be sharpened
create stream satellitePosition(satelliteID bigint, radialDistance double, polarAngle double, azimuthAngle double);

-- these streams define the model for the Kalman filter
create stream A_stateTransition(satelliteID bigint, radialDistance double, polarAngle double, azimuthAngle double);
create stream Q_processNoise(satelliteID bigint, radialDistance double, polarAngle double, azimuthAngle double);
create stream H_calibration(satelliteID bigint, cal1 double, cal2 double, cal3 double);
create stream R_calibrationNoise(satelliteID bigint, cal1 double, cal2 double, cal3 double);

-- this query reads the models, constructs Kalman filters from them,
-- and applies those filters to an ongoing stream of blurry data
SELECT STREAM *
from stream
(
    sys_boot.mgmt.kalman_filter_simple
    (
        cursor(SELECT STREAM * from A_stateTransition),
        cursor(SELECT STREAM * from Q_processNoise),
        cursor(SELECT STREAM * from H_calibration),
        cursor(SELECT STREAM * from R_calibrationNoise),
        cursor(SELECT STREAM * from satellitePosition)
    )
);

--
-- load the models
--

-- insert 3 rows (a 3 × 3 matrix) per satellite
insert into test.A_stateTransition values ...

-- insert 3 rows (a 3 × 3 matrix) per satellite
insert into test.Q_processNoise values ...

-- insert 3 rows (a 3 × 3 matrix) per satellite
insert into test.H_calibration values ...

-- insert 3 rows (a 3 × 3 matrix) per satellite
insert into test.R_calibrationNoise values ...

-- finally, insert some real-time measurements from the satellite tracking sensors
insert into test.satellitePosition values (1, 42000, 1.3, 2.7), (1, 42000, 1.31, 2.7), (1, 42000, 1.32, 2.7);