Introduction to Apache Flink and Apache Kafka - CS 591 K1: Data Stream Processing and Analytics Spring 2020
id) .max("temp") maxTemp.print() env.execute("Compute max sensor temperature”) } } Example: Sensor Readings 7 Vasiliki Kalavri | Boston University 2020 case class Reading(id: max("temp") maxTemp.print() env.execute("Compute max sensor temperature”) } } Sensor id, timestamp, temperature reading Example: Sensor Readings 8 Vasiliki Kalavri | Boston University 2020 0))) .keyBy(_.id) .max("temp") maxTemp.print() env.execute("Compute max sensor temperature”) } } Flink programs are defined in regular Scala/Java methods Set up the0 码力 | 26 页 | 3.33 MB | 1 年前3Windows and triggers - CS 591 K1: Data Stream Processing and Analytics Spring 2020
.keyBy(_.id) .timeWindow(Time.minutes(1)) .max("temp") } } 3 Example: Window sensor readings Vasiliki Kalavri | Boston University 2020 In the DataStream API, you can use the time for the application env.setStreamTimeCharacteristic(TimeCharacteristic.EventTime) // ingest sensor stream val sensorData: DataStream[SensorReading] = env.addSource(...) } } Or ProcessingTIme .aggregate(new AvgTempFunction) // An AggregateFunction to compute the average temperature per sensor. // The accumulator holds the sum of temperatures and an event count. class AvgTempFunction extends0 码力 | 35 页 | 444.84 KB | 1 年前3Notions of time and progress - CS 591 K1: Data Stream Processing and Analytics Spring 2020
checkAndGetNextWatermark( r: Reading, extractedTS: Long): Watermark = { if (r.id == "sensor_1") { // emit watermark if reading is from sensor_1 new Watermark(extractedTS - bound) } else { // do not emit a watermark event time characteristic env.setStreamTimeCharacteristic(TimeCharacteristic.EventTime) // ingest sensor stream val readings: DataStream[Reading] = env.addSource(new SensorSource) // assign timestamps0 码力 | 22 页 | 2.22 MB | 1 年前3Stream processing fundamentals - CS 591 K1: Data Stream Processing and Analytics Spring 2020
index. This can model time-series data streams: • a sequence of measurements from a temperature sensor • the volume of NASDAQ stock trades over time This model poses a severe limitation on the stream: views Vasiliki Kalavri | Boston University 2020 Stream representation matters Consider streams of sensor readings from a temperature probe 1. a reading of the current temperature every 1s? 2. the difference 0/9.0))) .keyBy(_.id) .max("temp") maxTemp.print() env.execute("Compute max sensor temperature”) } } Example: Apache Flink DataStream API 42 Vasiliki Kalavri | Boston University0 码力 | 45 页 | 1.22 MB | 1 年前3State management - CS 591 K1: Data Stream Processing and Analytics Spring 2020
University 2020 val sensorData: DataStream[Reading] = ??? // partition and key the stream on the sensor ID val keyedData: KeyedStream[Reading, String] = sensorData Using state in Flink 18 3. get state value 4. update state This is the state of the current key (sensor id) Vasiliki Kalavri | Boston University 2020 Use keyed state to store and access state in the0 码力 | 24 页 | 914.13 KB | 1 年前3Stream ingestion and pub/sub systems - CS 591 K1: Data Stream Processing and Analytics Spring 2020
later querying, and so on. • Data streaming from various processes or devices • a residential sensor can stream data to backend servers hosted in the cloud. 24 A publisher application creates a0 码力 | 33 页 | 700.14 KB | 1 年前3Course introduction - CS 591 K1: Data Stream Processing and Analytics Spring 2020
Kalavri | Boston University 2020 21 Online recommendations Vasiliki Kalavri | Boston University 2020 Sensor measurements analysis • Monitoring applications • Complex filtering and alarm activation • Aggregation0 码力 | 34 页 | 2.53 MB | 1 年前3
共 7 条
- 1