#40494 / #10

Since summer semester (SoSe) 2024

English, German

BDSPRO Big Data Systems Project

9

Markl, Volker

Graded

Portfolio examination

English

Affiliation


Faculty IV

Institut für Softwaretechnik und Theoretische Informatik

34351500 FG Datenbanksysteme und Informationsmanagement (DIMA)

Not specified

Contact


EN 7

Soto, Juan

sekr@dima.tu-berlin.de

Learning Outcomes

In this course, students will learn how to systematically analyze a current issue in the area of Big Data management and how to develop and implement a problem-oriented solution as part of a team. Students will learn to cooperate as team members, to contribute to project organization, quality assurance, and documentation, and to evaluate the quality of their solution through analysis, systematic experiments, and test cases. After the course, students will be able to understand the architecture of large-scale data systems and to solve problems arising in the end-to-end design, implementation, and testing of large-scale data management systems. They will be capable of designing and implementing solutions that improve the performance and feature-completeness of large-scale data systems in a collaborative team.

Content

Both the sciences and industry are currently undergoing a profound transformation: large-scale, diverse data sets, derived from Internet-of-Things sensors, the web, or crowdsourcing, present a huge opportunity for data-driven decision making. This data poses new challenges in several dimensions: its unprecedented volume, the speed at which it is generated (its velocity), and the variety of data sources that need to be integrated. The field of Big Data Systems deals with the technological means of processing high volumes of data to gain insights from them. A whole new breed of systems and paradigms has been developed to cope with these challenges. However, current systems still fall short in addressing many user needs. Therefore, further systems research is necessary to achieve robust query performance in any scenario.

Students will therefore conduct projects on topics related to current data management trends, such as Modern Hardware, Stateful Data Stream Processing, Sensor Data Management, Compiler Technology, Query Optimization, Machine Learning for Databases, and many others. The scope of the project will be adjusted to the final group size to reflect the overall workload of the course (i.e., 270h of work per student). To this end, students will learn the algorithms, system design, and actual implementation of so-called Distributed Processing Platforms (e.g., Flink, NebulaStream, Spark). These are systems that execute parallel computations on terabytes of data on clusters as well as on distributed Internet-of-Things topologies of up to several thousand machines.

At the start of the project, a student will receive a topic as well as some background material. The team, with the assistance of the lecturer, will decide on a project environment with suitable tools for teamwork, project communication, development, and testing. Next, the problem will be analyzed, modelled, and decomposed into individual components, from which tasks are derived that are subsequently assigned to smaller teams or individuals. At weekly project meetings, the project team presents its progress and the milestones it has reached. The next steps are decided in consultation with the lecturer. The project concludes with a final presentation, which includes a demonstration of the prototype.

Module Components

Compulsory area

The following courses are mandatory for the module:

Course | Type | Number | Cycle | Language | SWS
BDAPRO - Big Data Analytics Project | PJ | 0434 L 484 | WiSe/SoSe | Not specified | 6

Workload and Credit Points

BDAPRO - Big Data Analytics Project (PJ):

Workload description | Multiplier | Hours | Total
Preparation Phase and Design | 1.0 | 40.0h | 40.0h
Participation in Meetings | 20.0 | 3.0h | 60.0h
Documentation, Presentation | 1.0 | 40.0h | 40.0h
Implementation, Tests, Experiments | 1.0 | 130.0h | 130.0h
Total | | | 270.0h (~9 LP)
The workload of the module adds up to 270.0 hours. The module therefore comprises 9 credit points (LP).
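The workload arithmetic above can be checked with a short sketch. The item names, multipliers, and hours are taken from the table. The conversion factor of 30 hours per credit point is an assumption inferred from the stated 270.0h (~9 LP), and the function names are purely illustrative.

```python
# Workload items from the table: (description, multiplier, hours per unit).
WORKLOAD = [
    ("Preparation Phase and Design", 1.0, 40.0),
    ("Participation in Meetings", 20.0, 3.0),
    ("Documentation, Presentation", 1.0, 40.0),
    ("Implementation, Tests, Experiments", 1.0, 130.0),
]

# Assumption: 30 hours of work per credit point, implied by 270.0h ~ 9 LP.
HOURS_PER_CREDIT_POINT = 30.0


def total_hours(items):
    """Sum multiplier * hours over all workload items."""
    return sum(multiplier * hours for _, multiplier, hours in items)


def credit_points(hours, hours_per_point=HOURS_PER_CREDIT_POINT):
    """Convert a workload in hours to credit points."""
    return hours / hours_per_point


if __name__ == "__main__":
    hours = total_hours(WORKLOAD)
    print(f"{hours:.1f}h -> {credit_points(hours):.1f} LP")  # 270.0h -> 9.0 LP
```

The weekly meetings account for 20 x 3.0h = 60.0h, so the single largest item remains the 130.0h of implementation, tests, and experiments.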

Description of Teaching and Learning Methods

Guided and self-organized project work.

Requirements for Participation / Examination

Desirable prerequisites for participation in the courses:

Desired prerequisite knowledge and skills include: (a) general computer science (e.g., algorithms, data structures, systems architecture, distributed systems) obtained via the completion of a Bachelor's degree (e.g., B.Sc. in Computer Science), (b) mathematics (e.g., linear algebra, statistics), (c) specific subfields in computer science (e.g., Database Technology / DBT, Data Management on Modern Hardware / DMH, Machine Learning / ML1 or MI1, Compilers), (d) solid programming skills in at least one of the following programming languages: Java, C++, Scala, or Python, (e) functional programming, (f) basic knowledge of distributed source control management systems (i.e., Git) and software development processes such as Agile, (g) completion of DBT or an equivalent course on database internals prior to enrolling in BDSPRO (otherwise, concurrent enrollment in DBT and BDSPRO in the same semester).

Mandatory requirements for module examination registration:

This module has no examination prerequisites.

Completion of the Module

Grading

Graded

Type of examination

Portfolio examination

Type of portfolio examination

100 points in total

Language(s)

English

Examination elements

Name | Points | Category | Duration/Scope
(Learning Process Review) Experiment Design and Execution | 20 | practical | about 30h
(Deliverable Assessment) Experiments Analysis | 20 | practical | about 30h
(Deliverable Assessment) Intermediate Presentation | 10 | oral | about 10-15 minutes
(Deliverable Assessment) Final Presentation | 20 | oral | about 20 minutes
(Learning Process Review) Prototype with Test Cases and Documentation | 30 | practical | about 60h

Grading scale

Grading scale "Notenschlüssel 2: Fak IV (2)"

Total points | 1.0 | 1.3 | 1.7 | 2.0 | 2.3 | 2.7 | 3.0 | 3.3 | 3.7 | 4.0
100.0pt | 95.0pt | 90.0pt | 85.0pt | 80.0pt | 75.0pt | 70.0pt | 65.0pt | 60.0pt | 55.0pt | 50.0pt
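As an illustration of how the portfolio points and the grading scale fit together, the following sketch sums the points of the examination elements and looks up the resulting grade. The maximum points and thresholds are taken from the tables above. Three things are assumptions and not stated on this page: that each threshold is an inclusive lower bound, that a total below 50.0 points yields the failing grade 5.0, and the sample scores, which are hypothetical. The names `MAX_POINTS`, `total_points`, and `grade_for` are illustrative only.

```python
# Maximum points per examination element, from the table above.
MAX_POINTS = {
    "Experiment Design and Execution": 20,
    "Experiments Analysis": 20,
    "Intermediate Presentation": 10,
    "Final Presentation": 20,
    "Prototype with Test Cases and Documentation": 30,
}

# (minimum points, grade) pairs from the grading scale, highest first.
# Assumption: each threshold is an inclusive lower bound.
GRADE_THRESHOLDS = [
    (95.0, "1.0"), (90.0, "1.3"), (85.0, "1.7"), (80.0, "2.0"), (75.0, "2.3"),
    (70.0, "2.7"), (65.0, "3.0"), (60.0, "3.3"), (55.0, "3.7"), (50.0, "4.0"),
]

# Assumption: totals below the lowest threshold map to the failing grade 5.0.
FAILING_GRADE = "5.0"


def total_points(scores):
    """Sum the element scores after checking each against its maximum."""
    for name, points in scores.items():
        if name not in MAX_POINTS:
            raise KeyError(f"unknown examination element: {name}")
        if not 0 <= points <= MAX_POINTS[name]:
            raise ValueError(f"{name}: {points} is outside 0..{MAX_POINTS[name]}")
    return float(sum(scores.values()))


def grade_for(points):
    """Return the grade for a point total between 0 and 100."""
    if not 0.0 <= points <= 100.0:
        raise ValueError("points must be between 0 and 100")
    for threshold, grade in GRADE_THRESHOLDS:
        if points >= threshold:
            return grade
    return FAILING_GRADE


if __name__ == "__main__":
    # Hypothetical scores for one student.
    sample = {
        "Experiment Design and Execution": 17,
        "Experiments Analysis": 16,
        "Intermediate Presentation": 8,
        "Final Presentation": 18,
        "Prototype with Test Cases and Documentation": 25,
    }
    points = total_points(sample)
    print(points, grade_for(points))  # 84.0 2.0
```

The thresholds are kept as an ordered list so that the first matching lower bound determines the grade.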

Examination description (Completion of the Module)

Students are graded individually, i.e., final grades within a group can vary depending on the amount of work carried out by each person. The final grade according to § 68 (2) AllgStuPO will be calculated with the faculty grading table 2.

Duration of the Module

The following number of semesters is estimated for taking and completing the module:
1 semester.

This module can be started in the following semesters:
Winter and summer semester.

Maximum Number of Participants

The maximum number of participants is 12.

Registration Procedures

Admission to the lecture is limited. Please have a look at https://www.tu.berlin/dima/studium-lehre/kursangebote before the lecture period starts to get information on how you can register.

Recommended Reading, Lecture Notes

Lecture notes in paper form

Availability: not available

Lecture notes in electronic form

Availability: not available

Literature

Recommended literature
Project-specific literature will be announced in the first lecture.

Assigned Degree Programs


This module version is used in the following degree programs:

Degree program / StuPO | StuPOs | Uses | First use | Last use
Computer Engineering (M. Sc.) | 1 | 6 | SoSe 2024 | SoSe 2025
Computer Science (Informatik) (M. Sc.) | 1 | 9 | SoSe 2024 | SoSe 2025
Electrical Engineering (Elektrotechnik) (M. Sc.) | 1 | 6 | SoSe 2024 | SoSe 2025
ICT Innovation (M. Sc.) | 1 | 3 | SoSe 2024 | SoSe 2025
Information Systems Management (Wirtschaftsinformatik) (M. Sc.) | 1 | 6 | SoSe 2024 | SoSe 2025
Media Informatics (Medieninformatik) (M. Sc.) | 1 | 3 | SoSe 2024 | SoSe 2025
Media Technology (Medientechnik) (M. Sc.) | 1 | 6 | SoSe 2024 | SoSe 2025
Industrial Engineering and Management (Wirtschaftsingenieurwesen) (M. Sc.) | 1 | 3 | SoSe 2024 | SoSe 2025
This course is oriented towards Master's students pursuing degrees in Computer Science, Information Systems Management, Computer Engineering, or Industrial Engineering & Management, in particular those with an interest in database systems and information management.

Miscellaneous

Not specified