Content
In recent years, advances in hardware technology have made it possible to collect data continuously. In many applications, such as network monitoring, the volume of such data is so large that it may be impossible to store it on disk. Furthermore, even when the data can be stored, the incoming volume may be so large that no record can be processed more than once. As a result, many database operations and data analysis algorithms, such as filtering, indexing, classification, and clustering, become significantly more challenging in this context.
The course has the following main topics:
- Basic concepts and terminology of data stream management, introduction to data streams, examples (telephone networks, automotive electronics, avionics, transport management, building monitoring, etc.)
- Basic concepts of technical information systems, modeling of data streams
- Data sources, requirements structuring, requirements of data stream management systems (DSMS)
- Reference architecture of a DSMS, architecture modeling
- Modeling of the functionality, logical architecture. Description of the technical architecture, interface definition, behavior modeling
- Data stream processing: windowing, the sliding-window computation model and results
- Synopsis Construction in Data Streams (Sampling, Wavelets, Sketches and Histograms)
- Filtering, counting in data streams
- Data stream analysis: Classification & Clustering
- Data processing in sensor networks
- Modeling examples (automotive electronics, avionics)
- Prototype Systems (Aurora, STREAM, TelegraphCQ)
- Frameworks (Flink, Spark, Storm)
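
To give a flavor of the one-pass constraint described above, the following is a minimal Python sketch of reservoir sampling, one common technique in the sampling family listed under synopsis construction. It is an illustration only, not taken from the course material; the function name reservoir_sample and the parameters k and seed are hypothetical. The sketch keeps a uniform random sample of k records while reading each record exactly once and storing at most k records.

```python
import random
from typing import Iterable, List, Optional, TypeVar

T = TypeVar("T")


def reservoir_sample(stream: Iterable[T], k: int, seed: Optional[int] = None) -> List[T]:
    """Return a uniform random sample of up to k items from a stream.

    Hypothetical helper for illustration: each item is inspected exactly
    once, and memory use is bounded by k regardless of stream length.
    """
    rng = random.Random(seed)
    reservoir: List[T] = []
    for i, item in enumerate(stream):
        if i < k:
            # Fill the reservoir with the first k items.
            reservoir.append(item)
        else:
            # Pick a slot in [0, i]; randint includes both endpoints.
            # The new item replaces a stored one with probability k / (i + 1).
            j = rng.randint(0, i)
            if j < k:
                reservoir[j] = item
    return reservoir


if __name__ == "__main__":
    # Simulated stream: a generator that cannot be rewound or re-read.
    readings = (x * x for x in range(1_000_000))
    print(reservoir_sample(readings, k=5))
```

The stream is passed as a generator so that the sketch cannot revisit earlier records, which mirrors the setting in which data is too large to store or re-read.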