
#41086 / #1

Since summer semester (SoSe) 2023


Large-scale Data Engineering


Böhm, Matthias




Faculty IV

Institut für Softwaretechnik und Theoretische Informatik

34352900 FG Big Data Engineering

No information


TEL 8-1

Damme, Patrick


No information

Learning Outcomes

In this combined seminar/project module, students learn how to critically read scientific publications, search for scientific literature on a given topic, write a high-quality scientific paper, create prototypes for specific projects, and give high-quality presentations on papers and prototypes. All of these aspects are covered with a special focus on the areas of data engineering, data management, and machine learning systems. Together, the programming project and the seminar form a solid foundation for subsequent master's theses, both at the methodological level and with respect to specific topics.


This module comprises a seminar and a programming project in the broader context of big data engineering, i.e., topics related to scalable data and ML systems. In detail, the module is structured as follows:

A) Seminar on selected topics related to data and ML systems
* 3 lectures on scientific methods (structure of scientific papers, scientific reading and writing, experiments and reproducibility)
* Reading selected papers and writing a 6-page summary paper (in LaTeX with provided template)
* 20 min oral presentation of the summarized topic

B) Programming projects on data and ML systems
* Selection of a generic or seminar-topic-specific project
* Discussion rounds on design, implementation, tests, and experiments
* Prototype implementation, tests, and experiments
* 15 min oral presentation of the created prototype

Module Components


All courses are mandatory.

Course Name | Type | Number | Cycle | Language | SWS | VZ
Large-scale Data Engineering | Project | - | WiSe/SoSe | English | 4 | -
Large-scale Data Engineering | Seminar | - | WiSe/SoSe | English | 2 | -

Workload and Credit Points

Large-scale Data Engineering (Project):

Workload description | Multiplier | Hours | Total
Attendance Discussion Rounds | 4.0 | 2.0 h | 8.0 h
Prototype Implementation | 1.0 | 200.0 h | 200.0 h
Tests, Documentation, Experiments | 1.0 | 40.0 h | 40.0 h
Talk Preparation and Presentation | 1.0 | 20.0 h | 20.0 h
Total: 268.0 h (~9 LP)

Large-scale Data Engineering (Seminar):

Workload description | Multiplier | Hours | Total
Attendance Lectures | 3.0 | 2.0 h | 6.0 h
Paper reading and writing | 1.0 | 65.0 h | 65.0 h
Talk Preparation and Presentation | 1.0 | 15.0 h | 15.0 h
Total: 86.0 h (~3 LP)
The workload of the module sums to 354.0 hours. The module therefore carries 12 credit points (LP).
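
The workload arithmetic above can be checked with a short sketch. It assumes a conversion of 30 working hours per credit point (LP), which this module description does not state explicitly; the names `HOURS_PER_LP`, `total_hours`, and `credit_points` are hypothetical and chosen only for illustration.

```python
# Assumption (not stated in the module description): 1 LP = 30 working hours.
HOURS_PER_LP = 30

# (multiplier, hours per unit) for each workload item, copied from the tables.
PROJECT = {
    "Attendance Discussion Rounds": (4.0, 2.0),
    "Prototype Implementation": (1.0, 200.0),
    "Tests, Documentation, Experiments": (1.0, 40.0),
    "Talk Preparation and Presentation": (1.0, 20.0),
}
SEMINAR = {
    "Attendance Lectures": (3.0, 2.0),
    "Paper reading and writing": (1.0, 65.0),
    "Talk Preparation and Presentation": (1.0, 15.0),
}


def total_hours(items):
    """Sum multiplier * hours over all workload items of one course."""
    return sum(multiplier * hours for multiplier, hours in items.values())


def credit_points(hours):
    """Convert hours to LP, rounded to the nearest whole credit point."""
    return round(hours / HOURS_PER_LP)


project_hours = total_hours(PROJECT)
seminar_hours = total_hours(SEMINAR)
module_hours = project_hours + seminar_hours

print(f"Project: {project_hours} h (~{credit_points(project_hours)} LP)")
print(f"Seminar: {seminar_hours} h (~{credit_points(seminar_hours)} LP)")
print(f"Module:  {module_hours} h ({credit_points(module_hours)} LP)")
```

Under that assumption, the rounded per-course values (~9 LP and ~3 LP) add up to the 12 credit points stated for the module.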

Description of Teaching and Learning Methods

Guided and self-organized reading of scientific papers, literature search, and writing of a summary paper. Guided and self-organized project work. At the beginning of the semester, students attend lectures on reading scientific papers, finding related work, writing high-quality scientific papers, and giving a high-quality scientific presentation. Each student is assigned an initial paper to read and understand. After that, students search for related work and write a short summary of the assigned paper, including some remarks on related work. At the end of the semester, each student gives a slide presentation in front of the group, followed by a discussion of the topic. Concurrently or in a subsequent semester, students also pick a programming project from a provided list, devise an initial design, and then implement a prototype including documentation, tests, and relevant experiments. These programming projects are accompanied by regular discussion rounds and a final presentation of the obtained results.

Requirements for participation and examination

Desirable prerequisites for participation in the courses:

Completed basic courses on applied machine learning and data management

Mandatory requirements for the module test application:

This module has no requirements.

Module completion



Type of exam

Portfolio examination

Type of portfolio examination

100 points in total



Test elements

(Deliverable assessment) Seminar Paper | 25 points | written | 6 pages
(Deliverable assessment) Seminar Presentation | 15 points | oral | 20 min
(Deliverable assessment) Project Implementation, Tests, Docs | 50 points | practical | N/A
(Deliverable assessment) Project Presentation | 10 points | oral | 15 min

Grading scale

Grading key »Notenschlüssel 2: Fak IV (2)«


Test description (Module completion)

The project can be conducted in teams of 1 to 3 students and is graded for the team as a whole. All other parts of the portfolio exam are graded individually for each student.

Duration of the Module

The following number of semesters is estimated for taking and completing the module:
1 Semester.

This module may be commenced in the following semesters:
Winter and summer semester.

Maximum Number of Participants

The maximum capacity of students is 20.

Registration Procedures

Registration via email to Patrick Damme (patrick.damme@tu-berlin.de)

Recommended reading, Lecture notes

Lecture notes

Availability:  unavailable


Electronic lecture notes

Availability:  unavailable



Recommended literature
Seminar-/project-specific literature will be discussed during the first lecture.

Assigned Degree Programs

This module is used in the following Degree Programs (new System):

Degree program / StuPO | StuPOs | Uses | First use | Last use
Computer Engineering (M. Sc.) | 1 | 15 | SoSe 2023 | SoSe 2024
Computer Science (Informatik) (M. Sc.) | 1 | 18 | SoSe 2023 | SoSe 2024
Elektrotechnik (M. Sc.) | 1 | 9 | SoSe 2023 | SoSe 2024
Informatik (B. Sc.) | 1 | 3 | SoSe 2023 | SoSe 2024
Information Systems Management (Wirtschaftsinformatik) (M. Sc.) | 1 | 6 | SoSe 2023 | SoSe 2024
Technische Informatik (B. Sc.) | 1 | 3 | SoSe 2023 | SoSe 2024
Wirtschaftsinformatik (B. Sc.) | 2 | 6 | SoSe 2023 | SoSe 2024

Students of other degrees can participate in this module without capacity testing.

