Formatting Data as CSV

s-Server's CSV formatter converts rows into character-separated output based on options supplied through the options clause of a foreign table or foreign stream.

When using SQL, the CSV formatter converts rows to CSV that you define in a foreign table or foreign stream, outputting these rows to a data writer that you have defined for the foreign table or stream. When you INSERT into the foreign table or stream, s-Server begins writing CSV to the data writer that you specify.

When using the ECD agent, the CSV converts rows that you define in a properties file to CSV, outputting these rows to a data writer that you have defined for the foreign table or stream..

For a list of writers, and details on how to use them, see the Writing to Other Destinations.

For performance reasons, most formatting should be done in native SQL and passed into ECDA, not pushed into ECDA formatters.

To write CSV data, you need to give the Extensible Common Data Adapter the following information:

  • A formatter of CSV
  • A character encoding type for the file.

Using SQL to Write CSV Data

To write CSV data, you create a foreign table or stream that references one of s-Server's prebuilt server objects. Like all tables and streams, foreign tables and streams must be created within a schema. The example below creates and sets a schema called csv_data and creates a foreign stream called csv_writer_stream that writes data to the file system.

To write data over other input/output systems, such as Kafka, Kinesis, a network socket, a WebSocket, HTTP or AMQP, you would need to specify options for these formats. See Writing to Other Destinations for more details.

CREATE OR REPLACE SCHEMA csv_data;
SET SCHEMA 'csv_data';

CREATE OR REPLACE FOREIGN STREAM csv_writer_stream
("STR1" VARCHAR(32))
SERVER FILE_SERVER
OPTIONS
(directory 'path/to/myfile',
formatter 'CSV',
filename_date_format 'yyyy-MM-dd-HH:mm:ss',
filename_prefix 'test-',
filename_suffix '.csv',
character_encoding 'US-ASCII',
formatter_include_rowtime 'false',
file_rotation_size '20K');

To actually write to a file in path/to/myfile, you need to write a pump containing an INSERT statement along the following lines:

CREATE OR REPLACE PUMP "writerPump" STARTED AS
INSERT INTO "FileWriterStream"
SELECT STREAM "MyStream";
--where "MyStream" is a currently existing stream

Foreign Stream Options for Formatting CSV Data

Option Definition
FORMATTER This needs to be CSV.
CHARACTER_ENCODING Character set for data.
FORMATTER_INCLUDE_ROWTIME Whether or not to include rowtime when writing data. Defaults to 'true'.
WRITE_HEADER Whether or not to write header information (a header row) into the CSV data. Defaults to 'false'.
ROW_SEPARATOR Character(s) separating rows in CSV data. Defaults to '\n' (a newline). Supports multiple characters, and supports use of Unicode literals such as U&'\000D\000A'
SEPARATOR Character(s) separating field values within a row. Defaults to ','. Supports multi-character strings like '$$' or '@!@', and single or multi-character Unicode literals.
QUOTE_CHARACTER Lets you specify a quotation character to wrap the output value if the SEPARATOR string is present in the column value. There is no default for quote character. Only a single one-byte character may be used, which limits to code points between 0 and 127.
STARTING_OUTPUT_ROWTIME If this is set, then all rows with ROWTIME less than the specified value are silently dropped. It is specified using an ISO 8601 format like '2020-02-06T12:34:56.789'.

Using the ECD Agent to Write CSV Data

You can use the ECD agent to CSV Data to remote locations. See Writing Data to Remote Locations for more details.

The ECD agent takes similar options, but these options need to be formatted in a properties file along the lines of the following. These properties correspond to those defined for the adapter above.

# Location, date format, prefix, suffix
FORMATTER=CSV'
CHARACTER_ENCODING=UTF-8
ROW_SEPARATOR=\000A
SEPARATOR=,
WRITE_HEADER=false
DIRECTORY=/home/guavus/output
ORIGINAL_FILENAME=stocks-output.csv
FILENAME_PREFIX=output-
FILENAME_SUFFIX=.csv
FILENAME_DATE_FORMAT=yyyy-MM-dd-HH:mm:ss
FILE_ROTATION_SIZE=20K
FORMATTER_INCLUDE_ROWTIME=true
# Schema, name, and parameter signature of origin stream
SCHEMA_NAME=stocks
TABLE_NAME=output_file
#columns
ROWTYPE=RECORDTYPE(TIMESTAMP order_time, INTEGER amount, VARCHAR(100) ticker)

To invoke the agent, from the directory $SQLSTREAM_HOME/../clienttools/EcdaAgent/ enter the following at the command line ``` $ ./commondataagent.sh --output --props sample.properties --io file