#41135 / #4

WiSe 2023/24 - WiSe 2023/24

English

ROC Foundations for Graduate Research in Data Management and Machine Learning Systems

9

Markl, Volker

graded

Portfolio examination

Affiliation


Faculty IV

Institut für Softwaretechnik und Theoretische Informatik

34351500 FG Datenbanksysteme und Informationsmanagement (DIMA)

No information

Contact


EN 7

Markl, Volker

sekr@dima.tu-berlin.de

Learning Outcomes

Big Data and Machine Learning are key drivers of the current wave of innovation in artificial intelligence and data science, and they have had a profound impact on both the economy and the sciences. This course targets research-oriented students who aim to pursue a PhD in Big Data Management or in Data Science and Engineering Systems/Technologies. Upon completion of this course, students will have learned contemporary research methodology (scientific reading, writing, presenting, prototyping, and experimental design), gained both theoretical and practical skills in data management and big data technologies, and become attuned to today’s major research challenges in scalable data management and processing. The course is designed principally to impart technical skills (20%), method skills (40%), systems skills (20%), and social skills (20%).

Content

The central focus of this module is on contemporary research methodology (CRM), data management technologies, and current research challenges. The module opens with a presentation on CRM, covering scientific reading, writing, presenting, prototyping, and experimental design. In subsequent lectures, students read about foundational data management methods and technologies and give a presentation, which is followed by an instructor-led presentation addressing related advanced topics. Topics of discussion include data storage and indexing, specification and compilation of data analysis programs, query optimization and self-tuning, adaptive methods, processing of data science pipelines, and responsible data management. In an accompanying lab component, students prototype and evaluate the discussed methods, technologies, and settings in a methodical and scientific way, and produce a scientific report on their findings.

Module Components

Compulsory group:

All courses are mandatory.

Course Name | Type | Number | Cycle | Language | SWS | VZ
ROC-PRO Project on the Foundations for Graduate Research in Data Management and Machine Learning Systems | PJ | | WiSe | English | 4 |
ROC-SEM Seminar on the Foundations for Graduate Research in Data Management and Machine Learning Systems | SEM | | WiSe | English | 2 |

Workload and Credit Points

ROC-PRO Project on the Foundations for Graduate Research in Data Management and Machine Learning Systems (PJ):

Workload description | Multiplier | Hours | Total
Lab Course (Experimental Setup) | 15.0 | 2.0h | 30.0h
Lab Course (Programming) | 15.0 | 2.0h | 30.0h
Lab Course (System Setup) | 15.0 | 2.0h | 30.0h
Report | 15.0 | 2.0h | 30.0h
Lab Course (Performance Evaluation) | 15.0 | 4.0h | 60.0h
Total: 180.0h (~6 LP)

ROC-SEM Seminar on the Foundations for Graduate Research in Data Management and Machine Learning Systems (SEM):

Workload description | Multiplier | Hours | Total
Plenary Sessions | 15.0 | 4.0h | 60.0h
Preparation and Presentation (including reading and literature research) | 15.0 | 2.0h | 30.0h
Total: 90.0h (~3 LP)
The workload of the module sums to 270.0 hours. The module therefore carries 9 credits.
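
The workload arithmetic above can be checked with a short sketch. The rows are copied from the two workload tables; the conversion factor of 30 hours per credit point is an assumption inferred from the stated totals (180.0h ~ 6 LP, 90.0h ~ 3 LP) and is not stated explicitly in this description. The names `ROC_PRO`, `ROC_SEM`, `total_hours`, and `credits` are illustrative only.

```python
# Workload rows as (description, multiplier, hours per unit), copied from the tables.
ROC_PRO = [
    ("Lab Course (Experimental Setup)", 15.0, 2.0),
    ("Lab Course (Programming)", 15.0, 2.0),
    ("Lab Course (System Setup)", 15.0, 2.0),
    ("Report", 15.0, 2.0),
    ("Lab Course (Performance Evaluation)", 15.0, 4.0),
]
ROC_SEM = [
    ("Plenary Sessions", 15.0, 4.0),
    ("Preparation and Presentation", 15.0, 2.0),
]

# Assumption: 30 hours per credit point, inferred from 180.0h ~ 6 LP and 90.0h ~ 3 LP.
HOURS_PER_CREDIT = 30.0


def total_hours(rows):
    """Sum multiplier * hours over all workload rows of one course."""
    return sum(multiplier * hours for _, multiplier, hours in rows)


def credits(hours):
    """Convert a workload in hours to credit points under the assumed factor."""
    return hours / HOURS_PER_CREDIT


pro_hours = total_hours(ROC_PRO)
sem_hours = total_hours(ROC_SEM)
module_hours = pro_hours + sem_hours

print(pro_hours, sem_hours, module_hours, credits(module_hours))
# prints: 180.0 90.0 270.0 9.0
```

Under that assumption, the per-course totals and the 9-credit module total agree with the figures listed in the tables.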

Description of Teaching and Learning Methods

This module (comprising ROC-PRO and ROC-SEM) encompasses: (a) lectures on key concepts, (b) discussions, (c) student-led presentations (including literature search), (d) a systems research project including system setup, prototyping, experimental design, and performance evaluation, and (e) a presentation and report on the findings. Active participation in and contribution to all parts of ROC are essential.

Requirements for participation and examination

Desirable prerequisites for participation in the courses:

Desired prerequisite knowledge and skills are as follows: (a) computer science topics addressed in TU Berlin modules in the Bachelor’s curriculum, particularly ISDA (Information Systems and Data Analysis) and DBPRA (Practical Database Systems Lab) or their equivalents; (b) good programming skills in C, Java, and SQL; (c) an undergraduate course in linear algebra, probability, and statistics; (d) knowledge from master's-level coursework in database technology (DBT) and advanced information management (e.g., MDS, DMH); (e) strong English language skills.

Mandatory requirements for the module test application:

This module has no requirements.

Module completion

Grading

graded

Type of exam

Portfolio examination

Type of portfolio examination

100 points in total

Language

English

Test elements

Name | Points | Category | Duration/Extent
Effective and Efficient Interaction with the Mentor (PRO) | 10 | flexible | 30 to 60 minutes per need
Evaluation Report (PRO) | 50 | written | 8 pages, conference style
Interaction with the Mentor (SEM) | 10 | flexible | 30 to 60 minutes per need
Performance Evaluation Presentation (PRO) | 40 | oral | 20 min + 10 min discussion
Quiz on Database Technology and Research Methodology (SEM) | 50 | written | 60 minutes
Technology Presentation (SEM) | 40 | oral | 20 min + 10 min discussion (20 slides)

Grading scale

Grading key »Notenschlüssel 2: Fak IV (2)« (Grading Scale 2, Faculty IV)

Total points | 1.0 | 1.3 | 1.7 | 2.0 | 2.3 | 2.7 | 3.0 | 3.3 | 3.7 | 4.0
100.0pt | 95.0pt | 90.0pt | 85.0pt | 80.0pt | 75.0pt | 70.0pt | 65.0pt | 60.0pt | 55.0pt | 50.0pt
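
As an illustration of how the grading scale above might be applied, the following sketch maps a point total to a grade. It rests on two assumptions that this description does not state: each listed value is read as the inclusive minimum number of points for the grade in the same column, and a total below 50.0 points is reported as the failing grade 5.0. The name `grade_for` is hypothetical.

```python
# (minimum points, grade) pairs read from the grading scale, best grade first.
# Assumption: each point value is the inclusive lower bound for its grade.
GRADE_THRESHOLDS = [
    (95.0, "1.0"),
    (90.0, "1.3"),
    (85.0, "1.7"),
    (80.0, "2.0"),
    (75.0, "2.3"),
    (70.0, "2.7"),
    (65.0, "3.0"),
    (60.0, "3.3"),
    (55.0, "3.7"),
    (50.0, "4.0"),
]

# Assumption: totals below the lowest threshold count as a fail, written here as 5.0.
FAIL_GRADE = "5.0"


def grade_for(points):
    """Return the grade for a point total between 0 and 100."""
    if not 0.0 <= points <= 100.0:
        raise ValueError("points must lie between 0 and 100")
    for minimum, grade in GRADE_THRESHOLDS:
        if points >= minimum:
            return grade
    return FAIL_GRADE


print(grade_for(87.0))  # prints: 1.7
print(grade_for(50.0))  # prints: 4.0
```

The binding rules are those of the official grading table; the sketch only shows the threshold lookup under the assumptions named above.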

Test description (Module completion)

For ROC-PRO, the exam is worth 100 points, composed of the evaluation report (50 points), interaction with the mentor (10 points), and the performance evaluation presentation (40 points). For ROC-SEM, the exam is worth 100 points, composed of the technology presentation (40 points), the quiz on database technology and research methodology (50 points), and interaction with the mentor (10 points). For both ROC-PRO and ROC-SEM, the final grade is computed according to Grading Table 2 of Faculty IV, according to German law, § 68 (2) AllgStuPO TU Berlin.

Duration of the Module

The following number of semesters is estimated for taking and completing the module:
1 Semester.

This module may be commenced in the following semesters:
Winter semester.

Maximum Number of Participants

The module is limited to a maximum of 8 students.

Registration Procedures

Students are required to register for the course in the official TUB examination system within six weeks of the first lecture or by the due date of the first graded assignment, whichever comes first.

Recommended reading, Lecture notes

Lecture notes

Availability: unavailable

Electronic lecture notes

Availability: unavailable

Literature

Recommended literature
Peter Bailis, Joseph M. Hellerstein, Michael Stonebraker (editors): Readings in Database Systems, 5th Edition, http://www.redbook.io/
Tom White: Hadoop: The Definitive Guide, 4th Edition, O’Reilly Media, 2015
Raj Jain: The Art of Computer Systems Performance Analysis: Techniques for Experimental Design, Measurement, Simulation, and Modeling (Wiley Professional Computing), 1991
Various research papers, made available during the first lecture
Supplementary reading material may be assigned to complement course lectures.

Assigned Degree Programs


This module is used in the following degree programs (new system):

Degree program / StuPO | StuPOs | Uses | First use | Last use
This module is not used in any degree program.

Miscellaneous

This course targets research-oriented Bachelor’s and Master’s students interested in focusing on Database Systems and Information Management in Computer Science (Major: System Engineering), Computer Engineering (Major: Information Systems and Software Engineering), and Industrial Engineering, as well as students pursuing the Data Science and Engineering Master’s Track.