Striim 3.9.4 / 3.9.5 documentation

MonitorLogs: web server log data

The web server logs are in Apache NCSA extended/combined log format, with the response time appended as a final field:

"%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-agent}i\" %D"

(See apache.org for more information.) Here are four sample log entries:

216.103.201.86 - EHernandez [10/Feb/2014:12:13:51.037 -0800]  "GET http://cloud.saas.me/login&jsessionId=01e3928f-e059-6361-bdc5-14109fcf2383 HTTP/1.1" 200 21560 "-" "Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.0; Trident/5.0)" 1606
216.103.201.86 - EHernandez [10/Feb/2014:12:13:52.487 -0800]  "GET http://cloud.saas.me/create?type=Partner&id=01e3928f-e05a-9be1-bdc5-14109fcf2383&jsessionId=01e3928f-e059-6361-bdc5-14109fcf2383 HTTP/1.1" 200 63523 "-" "Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.0; Trident/5.0)" 1113
216.103.201.86 - EHernandez [10/Feb/2014:12:13:52.543 -0800]  "GET http://cloud.saas.me/query?type=ChatterMessage&id=01e3928f-e05a-9be2-bdc5-14109fcf2383&jsessionId=01e3928f-e059-6361-bdc5-14109fcf2383 HTTP/1.1" 200 46556 "-" "Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.0; Trident/5.0)" 1516
216.103.201.86 - EHernandez [10/Feb/2014:12:13:52.578 -0800]  "GET http://cloud.saas.me/retrieve?type=ContractHistory&id=01e3928f-e05a-9be3-bdc5-14109fcf2383&jsessionId=01e3928f-e059-6361-bdc5-14109fcf2383 HTTP/1.1" 200 44556 "-" "Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.0; Trident/5.0)" 39
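
As a reading aid, the directives in the format string above can be listed against the positions they later occupy in the parsed data array. This is a minimal Python sketch, not part of Striim. The directive descriptions paraphrase the Apache mod_log_config documentation, and the names LOG_FORMAT, MEANINGS, and directives are hypothetical.

```python
import re

# The log format string from this page (raw string keeps the backslashes).
LOG_FORMAT = r'%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-agent}i\" %D'

# Descriptions paraphrased from the Apache mod_log_config documentation.
MEANINGS = {
    "%h": "remote host (client IP address)",
    "%l": "remote logname from identd, usually '-'",
    "%u": "authenticated user name",
    "%t": "time the request was received",
    "%r": "first line of the request",
    "%>s": "final HTTP status code",
    "%b": "response size in bytes, excluding headers",
    "%{Referer}i": "Referer request header",
    "%{User-agent}i": "User-agent request header",
    "%D": "time taken to serve the request",
}

# Extract each directive in order: optional {header}, optional '>', one letter.
directives = re.findall(r"%(?:\{[^}]+\})?>?[A-Za-z]", LOG_FORMAT)

for index, directive in enumerate(directives):
    print(f"data[{index}]  {directive:<16} {MEANINGS[directive]}")
```

Each directive yields one field, so a parsed entry has ten fields, indexed 0 through 9.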

In MultiLogApp, these logs are read by AccessLogSource:

CREATE SOURCE AccessLogSource USING FileReader (
  directory:'Samples/MultiLogApp/appData',
  wildcard:'access_log',
  blocksize: 10240,
  positionByEOF:false
)
PARSE USING DSVParser (
  columndelimiter:' ',
  ignoreemptycolumn:'Yes',
  quoteset:'[]~"',
  separator:'~'
)
OUTPUT TO RawAccessStream;

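The log format is space-delimited, so the columndelimiter value is one space. The quoteset value defines two quote sets, separated by the separator character (~): square brackets and double quotes. Text enclosed by either is treated as a single field even when it contains spaces. Because ignoreemptycolumn is set to Yes, the empty column produced by the two consecutive spaces after the timestamp is discarded. With these settings, the first log entry above is output as a WAEvent data array with the following values: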

"216.103.201.86",
"-",
"EHernandez",
"10/Feb/2014:12:13:51.037 -0800",
"GET http://cloud.saas.me/login&jsessionId=01e3928f-e059-6361-bdc5-14109fcf2383 HTTP/1.1",
"200",
"21560",
"-",
"Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.0; Trident/5.0)",
"1606"

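The field splitting described above can be approximated outside Striim. The following Python sketch is an illustration only, not DSVParser's implementation. It assumes that a quote character opens a quoted field only at the start of a field, and the names QUOTE_PAIRS, split_fields, and SAMPLE are hypothetical.

```python
# Opening quote character -> closing quote character, mirroring quoteset '[]~"'.
QUOTE_PAIRS = {"[": "]", '"': '"'}

# First sample log entry from this page (note the two spaces after the timestamp).
SAMPLE = (
    '216.103.201.86 - EHernandez [10/Feb/2014:12:13:51.037 -0800]  '
    '"GET http://cloud.saas.me/login&jsessionId=01e3928f-e059-6361-bdc5-14109fcf2383 HTTP/1.1" '
    '200 21560 "-" '
    '"Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.0; Trident/5.0)" 1606'
)


def split_fields(line, delimiter=" ", ignore_empty=True):
    """Split line on delimiter, keeping [...] and "..." spans as single fields."""
    fields, current = [], []
    closing = None   # closing character being waited for, if inside quotes
    quoted = False   # True once the current field has opened a quote

    for ch in line + delimiter:  # trailing delimiter flushes the last field
        if closing is not None:
            if ch == closing:
                closing = None          # end of quoted span; drop the quote char
            else:
                current.append(ch)      # delimiters inside quotes are literal
        elif ch in QUOTE_PAIRS and not current and not quoted:
            closing = QUOTE_PAIRS[ch]   # start of quoted span; drop the quote char
            quoted = True
        elif ch == delimiter:
            if current or quoted or not ignore_empty:
                fields.append("".join(current))
            current, quoted = [], False
        else:
            current.append(ch)
    return fields


for index, value in enumerate(split_fields(SAMPLE)):
    print(index, value)
```

Calling split_fields(SAMPLE, ignore_empty=False) instead keeps the empty field between the timestamp and the request, which shifts every later index by one. That is the situation ignoreemptycolumn avoids here.
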
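This array is in turn processed by the ParseAccessLog CQ, which selects fields by index, extracts the session ID from the request string, and converts the timestamp and numeric fields from strings: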

CREATE CQ ParseAccessLog 
INSERT INTO AccessStream
SELECT data[0],
  data[2],
  MATCH(data[4], ".*jsessionId=(.*) "),
  TO_DATE(data[3], "dd/MMM/yyyy:HH:mm:ss.SSS Z"),
  data[4],
  TO_INT(data[5]),
  TO_INT(data[6]),
  data[7],
  data[8],
  TO_INT(data[9])
FROM RawAccessStream;

After the AccessLogEntry type is applied, the event looks like this (accessTime is shown in milliseconds since the Unix epoch):

srcIp: "216.103.201.86"
userId: "EHernandez"
sessionId: "01e3928f-e059-6361-bdc5-14109fcf2383"
accessTime: 1392063231037
request: "GET http://cloud.saas.me/login&jsessionId=01e3928f-e059-6361-bdc5-14109fcf2383 HTTP/1.1"
code: 200
size: 21560
referrer: "-"
userAgent: "Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.0; Trident/5.0)"
responseTime: 1606
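
The conversions performed by ParseAccessLog can be checked by hand with a short Python sketch. This is an approximation under stated assumptions, not Striim code: re.match and datetime.strptime stand in for MATCH and TO_DATE, the strptime pattern is a translation of the date pattern used in the CQ, and the names DATA, EPOCH, and parse_access_log are hypothetical.

```python
import re
from datetime import datetime, timedelta, timezone

# The WAEvent data array for the first sample log entry, as listed on this page.
DATA = [
    "216.103.201.86",
    "-",
    "EHernandez",
    "10/Feb/2014:12:13:51.037 -0800",
    "GET http://cloud.saas.me/login&jsessionId=01e3928f-e059-6361-bdc5-14109fcf2383 HTTP/1.1",
    "200",
    "21560",
    "-",
    "Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.0; Trident/5.0)",
    "1606",
]

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def parse_access_log(data):
    """Apply the same field selections and conversions as the CQ's SELECT list."""
    # Same pattern as the CQ: capture everything between 'jsessionId=' and the
    # last space in the request string.
    session = re.match(r".*jsessionId=(.*) ", data[4])
    # Stand-in for TO_DATE(data[3], "dd/MMM/yyyy:HH:mm:ss.SSS Z").
    when = datetime.strptime(data[3], "%d/%b/%Y:%H:%M:%S.%f %z")
    return {
        "srcIp": data[0],
        "userId": data[2],
        "sessionId": session.group(1) if session else None,
        # Integer division of timedeltas gives exact epoch milliseconds.
        "accessTime": (when - EPOCH) // timedelta(milliseconds=1),
        "request": data[4],
        "code": int(data[5]),
        "size": int(data[6]),
        "referrer": data[7],
        "userAgent": data[8],
        "responseTime": int(data[9]),
    }


event = parse_access_log(DATA)
for name, value in event.items():
    print(f"{name}: {value!r}")
```

The data[1] value (the remote logname, "-" in these logs) is not selected by the CQ, so it has no counterpart in the resulting event.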

The web server log data is now in a format that Striim can analyze. AccessStream is used by the HackerCheck, LargeRTCheck, ProxyCheck, and ZeroContentCheck flows.