Striim 3.9.4 / 3.9.5 documentation

S3Writer

Writes to Amazon S3.

property

type

default value

notes

accesskeyid

java.lang.String

Specify an AWS access key ID (created on the AWS Security Credentials page) for a user with "Write objects" permission on the bucket.

When Striim is running in Amazon EC2 and there is an IAM role with that permission associated with the VM, leave accesskeyid and secretaccesskey blank to use the IAM role.

bucketname

java.lang.String

The S3 bucket name. If it does not exist, it will be created.

See Setting output names and rollover / upload policies for advanced options. If you use dynamic bucket names, you must specify a value for the Region property.

Note the limitations in Amazon's Rules for Bucket Naming.

clientconfiguration

java.lang.String

Optionally, specify one or more of the following property-value pairs, separated by commas.

If you access S3 through a proxy server, specify it here using the syntax ProxyHost=<IP address>,ProxyPort=<port number>,ProxyUserName=<user name>,ProxyPassword=<password>. Omit the user name and password if not required by your proxy server.

Specify any of the following to override Amazon's defaults:

  • ConnectionTimeout=<timeout in milliseconds>: how long to wait to establish the HTTP connection, default is 50000

  • MaxErrorRetry=<number of retries>: the number of times to retry failed requests (for example, 5xx errors), default is 3

  • SocketErrorSizeHints=<size in bytes>: TCP buffer size, default is 2000000

See http://docs.aws.amazon.com/sdk-for-java/v1/developer-guide/section-client-configuration.html for more information about these settings.
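As an illustration of the syntax described above, the following sketch routes S3 traffic through a proxy that does not require credentials and raises the retry count. The target name, proxy address, port, and retry value are hypothetical placeholders, and the input stream name is borrowed from the sample application at the end of this section.

```
-- Sketch only: the proxy address, port, and retry count are made-up example values.
CREATE TARGET S3ViaProxy USING S3Writer (
  bucketname:'mybucket',
  objectname:'myfile.json',
  accesskeyid:'********************',
  secretaccesskey:'******************************',
  -- property-value pairs separated by commas, as described above
  clientconfiguration:'ProxyHost=192.0.2.10,ProxyPort=8080,MaxErrorRetry=5'
)
FORMAT USING JSONFormatter ()
INPUT FROM PosSource_TransformedStream;
```

Any setting left out of the clientconfiguration string keeps its default value.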

compressiontype

java.lang.String

Set to gzip when the input is in gzip format. Otherwise, leave blank.

foldername

java.lang.String

Optionally, specify a folder within the specified bucket. If it does not exist, it will be created.

See Setting output names and rollover / upload policies for advanced options.

objectname

java.lang.String

The base name of the files to be written. See Setting output names and rollover / upload policies.

objecttags

java.lang.String

Optionally, specify one or more object tags (see Object Tagging) to be associated with the file as key-value pairs <tag name>=<value> separated by commas. Values may include field, metadata, and/or userdata values (see Setting output names and rollover / upload policies) and/or environment variables (specified as $<variable name>).
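A minimal sketch of the tag syntax, using two static tags. The target name, tag names, and tag values are hypothetical, and the input stream name is borrowed from the sample application at the end of this section. Field, metadata, and userdata references are omitted here because their syntax is covered in Setting output names and rollover / upload policies.

```
-- Sketch only: the tag names and values below are made-up examples.
CREATE TARGET S3Tagged USING S3Writer (
  bucketname:'mybucket',
  objectname:'myfile.json',
  accesskeyid:'********************',
  secretaccesskey:'******************************',
  -- <tag name>=<value> pairs separated by commas
  objecttags:'project=posdemo,stage=test'
)
FORMAT USING JSONFormatter ()
INPUT FROM PosSource_TransformedStream;
```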

ParallelThreads

java.lang.Integer

See Creating multiple writer instances.

PartitionKey

java.lang.String

If you enable ParallelThreads, specify a field to be used to partition the events among the threads. Events will be distributed among multiple S3 folders based on this field's values.

If the input stream is of any type except WAEvent, specify the name of one of its fields.

If the input stream is of the WAEvent type, specify a field in the METADATA map (see HP NonStop reader WAEvent fields, MySQLReader WAEvent fields, OracleReader WAEvent fields, or MSSQLReader WAEvent fields) using the syntax @METADATA(<field name>), or a field in the USERDATA map (see Adding user-defined data to WAEvent streams), using the syntax @USERDATA(<field name>). If appropriate, you may concatenate multiple METADATA and/or USERDATA fields.
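The following sketch shows how ParallelThreads and PartitionKey might be combined for a typed (non-WAEvent) input stream. It partitions on the MerchantId field of the stream used in the sample application at the end of this section. The target name and thread count are hypothetical, and quoting the integer value is an assumption about TQL property syntax.

```
-- Sketch only: the thread count is an arbitrary example value,
-- and quoting it as a string is an assumption.
CREATE TARGET S3Partitioned USING S3Writer (
  bucketname:'mybucket',
  foldername:'myfolder',
  objectname:'myfile.json',
  accesskeyid:'********************',
  secretaccesskey:'******************************',
  ParallelThreads:'4',
  -- for a typed stream, name one of its fields
  PartitionKey:'MerchantId'
)
FORMAT USING JSONFormatter ()
INPUT FROM PosSource_TransformedStream;
```

For a WAEvent input stream, the PartitionKey value would instead follow the @METADATA(<field name>) or @USERDATA(<field name>) syntax, for example PartitionKey:'@METADATA(TableName)', assuming the reader in use populates a TableName metadata field.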

region

java.lang.String

Optionally, specify an AWS region. This is required to use dynamic bucket names (see Setting output names and rollover / upload policies).

rolloveronddl

java.lang.Boolean

True

Has effect only when the input stream is the output stream of a CDC reader source. With the default value of True, rolls over to a new file when a DDL event is received. Set to False to keep writing to the same file.

secretaccesskey

com.webaction.security.Password

Specify the AWS secret access key for the specified access key.

uploadpolicy

java.lang.String

eventcount:10000, interval:5m

The upload policy may include eventcount, interval, and/or filesize (see Setting output names and rollover / upload policies for syntax). Cached data is written to S3 every time any of the specified values is exceeded. With the default value, data will be written every five minutes or sooner if the cache contains 10,000 events. When the app is undeployed, all remaining data is written to S3.
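As a sketch of a non-default policy, the following target would write cached data to S3 every minute, or sooner if the cache contains 1,000 events. The target name and both threshold values are hypothetical. The value format mirrors the default shown above, and the input stream name is borrowed from the sample application at the end of this section.

```
-- Sketch only: 1000 events and 1m are arbitrary example thresholds.
CREATE TARGET S3FrequentUpload USING S3Writer (
  bucketname:'mybucket',
  objectname:'myfile.json',
  accesskeyid:'********************',
  secretaccesskey:'******************************',
  -- upload when either threshold is exceeded, whichever comes first
  uploadpolicy:'eventcount:1000,interval:1m'
)
FORMAT USING JSONFormatter ()
INPUT FROM PosSource_TransformedStream;
```

Lower thresholds make data visible in S3 sooner, at the cost of more frequent uploads.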

This adapter has a choice of formatters. See Supported writer-formatter combinations for more information.

For example:

CREATE APPLICATION testS3;

CREATE SOURCE PosSource USING FileReader ( 
  wildcard: 'PosDataPreview.csv',
  directory: 'Samples/PosApp/appData',
  positionByEOF:false )
PARSE USING DSVParser (
  header:Yes,
  trimquote:false ) 
OUTPUT TO PosSource_Stream;

CREATE CQ PosSource_Stream_CQ 
INSERT INTO PosSource_TransformedStream
SELECT TO_STRING(data[1]) AS MerchantId,
  TO_DATE(data[4]) AS DateTime,
  TO_DOUBLE(data[7]) AS AuthAmount,
  TO_STRING(data[9]) AS Zip
FROM PosSource_Stream;

CREATE TARGET testS3target USING S3Writer (
  bucketname:'mybucket',
  objectname:'myfile.json',
  accesskeyid:'********************',
  secretaccesskey:'******************************',
  foldername:'myfolder')
FORMAT USING JSONFormatter ()
INPUT FROM PosSource_TransformedStream;

END APPLICATION testS3;

Note that since the test data set contains fewer than 10,000 events, and the application uses the default upload policy, the data will be uploaded to S3 after five minutes, or when you undeploy the application, whichever comes first.