Striim 3.9.4 / 3.9.5 documentation

Kafka streams

Striim natively integrates Apache Kafka, a high-throughput, low-latency, massively scalable message broker. For a technical explanation, see kafka.apache.org.

In simple terms, Kafka gives Striim users the ability to persist real-time streaming source data to disk at the same time Striim loads it into memory, then replay that data later. If data arrives faster than the built-in Kafka broker can handle, an external Kafka system may be used instead and scaled up as necessary.
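For example, here is a minimal TQL sketch of persisting a stream to an external Kafka broker. All names are hypothetical, and the property-set keys (zk.address, bootstrap.brokers) are assumptions to verify against your release's Kafka property reference; persisting with Striim's default Kafka properties instead would use the built-in broker.

    -- Hypothetical property set pointing to an external Kafka cluster
    -- (key names are assumptions; check the Kafka property set reference).
    CREATE PROPERTYSET MyKafkaProps (
      zk.address: '203.0.113.10:2181',
      bootstrap.brokers: '203.0.113.10:9092'
    );

    CREATE TYPE PosDataType (
      merchantId String,
      amount     Double,
      ts         DateTime
    );

    -- PERSIST USING writes every event to a Kafka topic on disk while
    -- the stream also flows through memory as usual.
    CREATE STREAM PosDataStream OF PosDataType PERSIST USING MyKafkaProps;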

Replaying from Kafka has many potential uses. For example:

  • If you put a source persisted to a Kafka stream in one application and the associated CQs, windows, caches, targets, and WActionStores in another, you can bring down the second application to update its code; when you restart it, processing of source data automatically continues from the point where it left off, with zero data loss and no duplicates (see the sketch after this list).

  • Developers can use a persisted stream to A/B test variations of a TQL application, or to run other experiments.

  • You can perform forensics on historical data, mining a persisted stream for data you didn't know would be useful. For example, if you were troubleshooting a security alert, you could write new queries against a persisted stream to gather additional data that was not captured in a WActionStore.

  • By persisting sources to an external Kafka broker, you can enable zero-data-loss recovery after a Striim cluster failure for sources that are normally not recoverable, such as HTTPReader, TCPReader, and UDPReader (see Recovering applications).

  • Persisting to an external Kafka broker can also allow recovery of sources running on a remote host using the Forwarding Agent.
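A minimal sketch of that first pattern follows. All names are hypothetical, MyKafkaProps is the property set from the earlier sketch, and both applications are assumed to be deployed in the same namespace so the second can reference the persisted stream created by the first.

    -- Application 1: only the source and the persisted stream.
    CREATE APPLICATION IngestApp;

    -- WAEvent is the generic type emitted by sources using DSVParser.
    CREATE STREAM RawStream OF Global.WAEvent PERSIST USING MyKafkaProps;

    CREATE SOURCE PosSource USING FileReader (
      directory: '/data/pos',    -- hypothetical input directory
      wildcard: '*.csv',
      positionByEOF: false
    )
    PARSE USING DSVParser ()
    OUTPUT TO RawStream;

    END APPLICATION IngestApp;

    -- Application 2: the downstream logic. It can be stopped, edited,
    -- and restarted; unread events wait in the Kafka topic, so nothing
    -- is lost or duplicated.
    CREATE APPLICATION ProcessApp;

    CREATE CQ ProcessCQ
    INSERT INTO ProcessedStream
    SELECT * FROM RawStream;     -- filters, windows, joins would go here

    END APPLICATION ProcessApp;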

You can use a Kafka stream like any other stream: reference it in a CQ, put a window over it, and so on. Alternatively, you can use it as a Kafka topic:

  • You can read the Kafka topic with KafkaReader, allowing events to be consumed later using messaging semantics rather than immediately using event semantics (see the sketch after this list).

  • You can read the Kafka topic with an external Kafka consumer, allowing development of custom applications or integration with third-party Kafka consumers.
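As an illustration of the first option, here is a sketch of reading a persisted stream's topic back with KafkaReader. The topic name, version string, and parser are assumptions; the parser must match how the events in the topic were serialized, and the property names should be checked against the KafkaReader adapter reference.

    CREATE SOURCE ReplaySource USING KafkaReader VERSION '0.11.0' (
      brokerAddress: 'localhost:9092',
      Topic: 'ns1_RawStream',  -- hypothetical topic backing the persisted stream
      startOffset: 0           -- 0 replays from the earliest retained event
    )
    PARSE USING DSVParser ()   -- must match the topic's serialization format
    OUTPUT TO ReplayStream;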

For additional information, see: