Matched Filter UDX

Matched filters work by evaluating a template against a signal, producing a correlation coefficient that indicates how closely the template matches at each point within the signal.

The Matched Filter UDX operates not in the time domain but in a linear domain that is independent of time. It was first developed to examine speed patterns within a road network and detect bottlenecks that likely indicated a traffic incident. To do this, it maintained speed data for every 10 meters of the network. For every pair of consecutive speed reports from a vehicle that stayed on a single road, we updated the speed data for the network and reevaluated the matched filter for every part of the network that had changed.

The Matched Filter UDX works on arbitrary linear data. It normalizes the data against the template, so only the shape is compared, not the amplitude or mean. It runs at high speed by taking advantage of Intel’s array processing extensions. Guavus has released it under an Apache 2 license, so you are free to modify it as needed.
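To make the "shape only" comparison concrete, here is a minimal Python sketch of a normalized (Pearson) correlation. It is an illustration of the idea, not the UDX's actual implementation: the function name, the choice of Pearson correlation, and the handling of flat data are all assumptions made for this example.

```python
import math

def normalized_correlation(window, template):
    """Pearson correlation between two equal-length sequences.

    Subtracting each mean removes the offset, and dividing by the norms
    removes the amplitude, so only the shape is compared.
    """
    n = len(template)
    if len(window) != n:
        raise ValueError("window and template must be the same length")
    mean_w = sum(window) / n
    mean_t = sum(template) / n
    dev_w = [w - mean_w for w in window]
    dev_t = [t - mean_t for t in template]
    norm = math.sqrt(sum(x * x for x in dev_w) * sum(x * x for x in dev_t))
    if norm == 0.0:
        return 0.0  # assumption: flat data has no shape, so report no match
    return sum(a * b for a, b in zip(dev_w, dev_t)) / norm

# A parabolic template and two variants of it.
template = [(20 - v) * v for v in range(21)]
scaled_and_shifted = [3.0 * t + 50.0 for t in template]  # same shape
inverted = [-t for t in template]                        # opposite shape

print(normalized_correlation(scaled_and_shifted, template))
print(normalized_correlation(inverted, template))
```

The scaled and shifted copy correlates at (very nearly) 1 and the inverted copy at (very nearly) -1, which is the behavior the normalization is meant to provide.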

We provide the following UDXs for evaluating matched filters. You will need to build and install them before use. See Writing a Java UDX in this guide.

MatchedFilterSegmented

create or replace function matchedFilterSegmented(
filterData CURSOR,
segmentData CURSOR,
halfLife double,
minReportCorrelation double,
samples CURSOR
)
returns TABLE(
ROWTIME TIMESTAMP NOT NULL,
firstElement int,
numElements int,
MATCH_ELEMENT int,
MATCH_CORRELATION double,
MATCH_RMS double,
MATCH_MEAN double
)
language java
parameter style system defined java
no sql
external name 'test:com.sqlstream.plugin.matchedfilter.MatchFinder.findMatches';

Parameters:

  • filterData Must be a relational table expression with a single column giving a sequence of values to use as a filter template.
  • segmentData Must be a relational table expression with a single column giving a sequence of boundaries for segments within the linear domain. Example: CURSOR(values 0, 1000, 2000) would give two separate segments, 0-999 and 1000-1999. The filter is only evaluated within individual segments.
  • halfLife The rate (in seconds) at which data within the network is aged. Newer data is treated as having higher confidence; confidence is reduced by a factor of two for every half-life that elapses.
  • minReportCorrelation Minimum correlation for which matches are reported.
  • samples A stream of samples. Each sample describes a series of elements within the network to be updated. Values are given for the first and last elements. Values in between are interpolated linearly. Confidences are given for the first, last, and midpoint elements. Confidences in between are linearly interpolated.
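The segmentData and halfLife parameters can be sketched in a few lines of Python. This is only an illustration of the descriptions above: the helper names are hypothetical, and the assumption that a template window must fit entirely inside one segment is inferred from "only evaluated within individual segments" rather than taken from the UDX source.

```python
def segments_from_boundaries(boundaries):
    """Turn boundaries such as [0, 1000, 2000] into inclusive ranges."""
    return [(lo, hi - 1) for lo, hi in zip(boundaries, boundaries[1:])]

def window_starts(segment, filter_length):
    """Start positions where a template of filter_length fits in the segment.

    Assumption: a window never straddles a segment boundary.
    """
    first, last = segment
    return range(first, last - filter_length + 2)

def aged_confidence(confidence, age_seconds, half_life_seconds):
    """Halve the confidence for every half-life that has elapsed."""
    return confidence * 0.5 ** (age_seconds / half_life_seconds)

segments = segments_from_boundaries([0, 1000, 2000])
print(segments)                               # two segments: 0-999 and 1000-1999
print(len(window_starts(segments[0], 21)))    # candidate positions in segment one
print(aged_confidence(1.0, 10.0, 10.0))       # one half-life old
```

With a half-life of 10 seconds, a sample that arrived 10 seconds ago counts half as much as a fresh one, and one that arrived 20 seconds ago counts a quarter as much.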

The sample stream must start with the following columns:

  • firstElement int,
  • numElements int,
  • firstValue double,
  • firstConfidence double,
  • lastValue double,
  • lastConfidence double,
  • midConfidence double

It can be followed by other columns that can be passed through to any matches.
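As a rough illustration of how one sample row could expand into per-element updates, the Python sketch below interpolates values linearly from first to last, and confidences linearly from first to midpoint and from midpoint to last. The function name is hypothetical, and two details are assumptions: that a sample covers elements firstElement through firstElement + numElements - 1, and that the midpoint sits exactly halfway along the sample.

```python
def expand_sample(first_element, num_elements, first_value, first_confidence,
                  last_value, last_confidence, mid_confidence):
    """Return (element, value, confidence) for each element in a sample."""
    rows = []
    for i in range(num_elements):
        # t runs from 0.0 at the first element to 1.0 at the last.
        t = i / (num_elements - 1) if num_elements > 1 else 0.0
        value = first_value + t * (last_value - first_value)
        if t <= 0.5:
            # First half: interpolate from first confidence to midpoint.
            confidence = first_confidence + (t / 0.5) * (mid_confidence - first_confidence)
        else:
            # Second half: interpolate from midpoint to last confidence.
            confidence = mid_confidence + ((t - 0.5) / 0.5) * (last_confidence - mid_confidence)
        rows.append((first_element + i, value, confidence))
    return rows

for row in expand_sample(100, 5, 10.0, 1.0, 30.0, 1.0, 0.5):
    print(row)
```

For this input the values step evenly from 10.0 to 30.0 across elements 100 to 104, while the confidence dips from 1.0 to 0.5 at the middle element and rises back to 1.0.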

Returned table:

Can include any columns from the samples CURSOR. These will be passed through unchanged. Can also include any or all of:

  • MATCH_ELEMENT int, – first element of match against filter
  • MATCH_CORRELATION double, – correlation of match.
  • MATCH_RMS double, – root mean square of elements matched.
  • MATCH_MEAN double – mean of elements matched.

MatchedFilter

Just like matchedFilterSegmented, but instead of taking a cursor for segmentData, it takes a single integer and assumes a single segment of that size.

LinearFilter

create or replace function linearFilter(filterSize int)
returns TABLE(
val double not null
)
language java
parameter style system defined java
no sql
external name 'test:com.sqlstream.plugin.matchedfilter.MatchFinder.linearFilter';

Returns a table of increasing values starting at 0, increasing by 1 each row. Can be used for creating filters based on evaluation of an expression. Example:

create or replace view parabolicFilter as
select (20 - val) * val as val from table(linearFilter(21));
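To see what template this view produces, the same expression can be evaluated in plain Python. This sketch assumes that linearFilter(21) yields the 21 values 0 through 20, which is consistent with the description above but is not confirmed by the UDX source; linear_filter here is a hypothetical stand-in, not the UDX itself.

```python
def linear_filter(filter_size):
    """Stand-in for linearFilter: values 0, 1, ..., filter_size - 1 (assumed)."""
    return list(range(filter_size))

# Same expression as the parabolicFilter view.
parabolic = [(20 - val) * val for val in linear_filter(21)]

print(parabolic)
```

Under that assumption the template is a symmetric arch: it is 0 at both ends and peaks at 100 in the middle (val = 10).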

GenerateTestData

create or replace function generateTestData(
maxElements int,
numSamples int,
matchStart int,
matchData CURSOR,
lowRandValue double,
randRange double
)
returns TABLE(
ROWTIME TIMESTAMP NOT NULL,
firstElement int,
numElements int,
firstValue double,
firstConfidence double,
lastValue double,
lastConfidence double,
midConfidence double
)
language java
parameter style system defined java
no sql
external name 'test:com.sqlstream.plugin.matchedfilter.TestData.generateTestData';

Generates test data for the matched filter. The data is random except within a specific range, where you specify the exact values.

Parameters:

  • maxElements Size of single segment to generate data for.
  • numSamples How many samples to generate.
  • matchStart At what point in the segment to specify exactly the output.
  • matchData Table expression with one value per row where you want to specify exactly the output.
  • lowRandValue, randRange Range for random values.

Returns: Table of samples. Each will specify 5 elements unless that would cross between the portion that should be randomly generated and the portion that is specified exactly.

Note: When a table function returns an infinite result, it needs to be wrapped with a STREAM(…) marker when invoked. s-Server will return an error otherwise.

Example of Use

create or replace view parabolicFilter as
select (20 - val) * val / 100 as val from table(linearFilter(21));

create or replace view testDataStream as
SELECT STREAM * from stream(generateTestData(
1000,
1000000,
200,
cursor(select val * 100 from parabolicFilter), 0, 100));

create or replace view foundMatches as
SELECT STREAM * from stream(matchedFilter(
CURSOR(select * from parabolicFilter),
1000,
10.0, 0.9,
CURSOR(SELECT STREAM * from testDataStream)));

SELECT STREAM * from foundMatches;
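The following Python sketch mimics the example above offline, as a way to reason about what the query should report. It is a simplified model, not the UDX: it ignores confidences and half-life aging, assumes random values are drawn uniformly from 0 to 100, and uses a plain Pearson correlation. All function names are hypothetical.

```python
import math
import random

def correlation(window, template):
    """Pearson correlation between two equal-length sequences."""
    n = len(template)
    mean_w = sum(window) / n
    mean_t = sum(template) / n
    dev_w = [w - mean_w for w in window]
    dev_t = [t - mean_t for t in template]
    norm = math.sqrt(sum(x * x for x in dev_w) * sum(x * x for x in dev_t))
    return 0.0 if norm == 0.0 else sum(a * b for a, b in zip(dev_w, dev_t)) / norm

def find_matches(data, template, min_correlation):
    """Slide the template over the data and keep positions that correlate well."""
    n = len(template)
    matches = []
    for start in range(len(data) - n + 1):
        c = correlation(data[start:start + n], template)
        if c >= min_correlation:
            matches.append((start, c))
    return matches

random.seed(0)  # fixed seed so the sketch is repeatable

# Template as in the parabolicFilter view: (20 - val) * val / 100.
template = [(20 - v) * v / 100 for v in range(21)]

# One segment of 1000 elements of random data, with the template
# (scaled by 100) written in starting at element 200.
data = [random.uniform(0, 100) for _ in range(1000)]
data[200:221] = [t * 100 for t in template]

matches = find_matches(data, template, 0.9)
best = max(matches, key=lambda m: m[1])
print(best)
```

The strongest match is at element 200 with a correlation very close to 1, even though the planted data is 100 times larger than the template, because only the shape is compared.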

Building and Installing Matched Filter

Before you can use the Matched Filter UDX, you need to install it. Matched Filter must be installed separately from Guavus SQLstream because the UDX makes use of hardware-specific array processing extensions and therefore needs to be built for each specific platform.

The following are instructions for building under Ubuntu 18.04. Instructions for installing prerequisites, setting the environment, and so on may differ for your Linux environment.

Prerequisites: Maven and libcblas. To get these, enter the following code at a command prompt:

sudo apt-get install maven libatlas-base-dev

Set SQLSTREAM_HOME as an Environment Variable

Replace nnnnn with your installed version:

export SQLSTREAM_HOME=/opt/sqlstream/nnnnn/s-Server

Install Guavus SQLstream Maven Artifacts

cd $SQLSTREAM_HOME/examples/sdk
./install.sh

Build and Install Jar

cd matchedFilter
mvn install
cp target/matchedFilter*.jar $SQLSTREAM_HOME/plugin/matchedFilter.jar

Build and Install jni Component

cd src/main/native
make
cp lib/libmatchedFilter.so $SQLSTREAM_HOME/plugin

Test It

cd $SQLSTREAM_HOME
run sqllineClient
!run examples/sdk/matchedFilter/src/test/resources/unitsql/testMatch.sql

You should see most matches reported at element 200, where the otherwise random data has been seeded with a pattern that matches the filter.