Streaming optimizations - CS 591 K1: Data Stream Processing and Analytics Spring 2020
… merge cannot receive data because another channel is empty … Operator fission: data parallelism, replication [diagram: replicas of operator A between split and merge] … Vasiliki Kalavri | Boston University 2020 … if operator is costly … computations on small time intervals • Keep intermediate state in memory • Use Spark's RDDs instead of replication • Parallel recovery mechanism in case of failures … input stream, time-based micro-batches
0 credits | 54 pages | 2.83 MB | 1 year ago

Introduction to Apache Flink and Apache Kafka - CS 591 K1: Data Stream Processing and Analytics Spring 2020
… A consumer instance sees records in the order they are stored in the log. • For a topic with replication factor N, we will tolerate up to N-1 server failures without losing any records committed to the …
0 credits | 26 pages | 3.33 MB | 1 year ago

2 results in total