Striim 3.9.4 / 3.9.5 documentation

ADLSGen1Writer

Writes to files in Azure Data Lake Storage Gen1. A common use case is to write data from on-premises sources to an ADLS staging area, from which it can be consumed by Azure-based analytics tools.

authtokenendpoint (java.lang.String)
  The token endpoint URL for your web application (see "Generating the Service Principal" under Using Client Keys).

clientid (java.lang.String)
  The application ID for your web application (see "Generating the Service Principal" under Using Client Keys).

clientkey (com.webaction.security.Password)
  The key for your web application (see "Generating the Service Principal" under Using Client Keys).

compressiontype (java.lang.String)
  Set to gzip when the input is in gzip format. Otherwise, leave blank.

datalakestorename (java.lang.String)
  The name of your Data Lake Storage Gen1 account.

directory (java.lang.String)
  The full path to the directory in which to write the files. See Setting output names and rollover / upload policies for advanced options.

filename (java.lang.String)
  The base name of the files to be written. See Setting output names and rollover / upload policies.

rolloveronddl (java.lang.Boolean, default: True)
  Has effect only when the input stream is the output stream of a CDC reader source. With the default value of True, rolls over to a new file when a DDL event is received. Set to False to keep writing to the same file.

rolloverpolicy (java.lang.String, default: eventcount:10000,interval:30s)
  See Setting output names and rollover / upload policies.

This adapter has a choice of formatters. See Supported writer-formatter combinations for more information.

Data is written in 4 MB batches or whenever rollover occurs.
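For example, to upload smaller files more frequently, the default rollover policy can be overridden in the target's properties. The following sketch declares a hypothetical target (the name, credential placeholders, and the eventcount and interval values are illustrative, not recommendations):

CREATE TARGET FrequentRolloverTarget USING ADLSGen1Writer (
  directory:'mydir',
  filename:'myfile.json',
  datalakestorename:'mydlsname.azuredatalakestore.net',
  clientid:'<your application ID>',
  authtokenendpoint:'<your token endpoint URL>',
  clientkey:'<your application key>',
  rolloverpolicy:'eventcount:1000,interval:10s'
)
FORMAT USING JSONFormatter ()
INPUT FROM PosSource_TransformedStream;

With this policy, the writer rolls over to a new file after 1,000 events or after 10 seconds, whichever comes first.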

Sample application:

CREATE APPLICATION testADLSGen1;

CREATE SOURCE PosSource USING FileReader ( 
  wildcard: 'PosDataPreview.csv',
  directory: 'Samples/PosApp/appData',
  positionByEOF:false
)
PARSE USING DSVParser (
  header:Yes,
  trimquote:false
) 
OUTPUT TO PosSource_Stream;

CREATE CQ PosSource_Stream_CQ 
INSERT INTO PosSource_TransformedStream
SELECT TO_STRING(data[1]) AS MerchantId,
  TO_DATE(data[4]) AS DateTime,
  TO_DOUBLE(data[7]) AS AuthAmount,
  TO_STRING(data[9]) AS Zip
FROM PosSource_Stream;

CREATE TARGET testADLSGen1Target USING ADLSGen1Writer (
  directory:'mydir',
  filename:'myfile.json',
  datalakestorename:'mydlsname.azuredatalakestore.net',
  clientid:'********-****-****-****-************',
  authtokenendpoint:'https://login.microsoftonline.com/********-****-****-****-************/oauth2/token',
  clientkey:'********************************************'
)
FORMAT USING JSONFormatter ()
INPUT FROM PosSource_TransformedStream;

END APPLICATION testADLSGen1;

Since the test data set contains fewer than 10,000 events, and ADLSGen1Writer is using the default rollover policy, the data will be uploaded when the policy's 30-second interval elapses.