
CS 557 — Distributed Data Management
(German: Verteiltes Datenmanagement)

Level, degree of commitment: Advanced module, depends on importing study program
Forms of teaching and learning, workload: Lecture (4 SWS), recitation class (2 SWS); 270 hours (90 h attendance, 180 h private study)
Credit points, formal requirements: 9 CP
  Course requirement(s): Successful completion of at least 50 percent of the points from the weekly exercises as well as at least 2 presentations of the tasks.
  Examination type: Written or oral examination (individual examination)
Language, grading: German/English; the grading is done with 0 to 15 points according to the examination regulations for the degree program B.Sc. Computer Science.
Origin: B.Sc. Computer Science
Duration, frequency: One semester, each winter semester
Person in charge of the module's outline: Prof. Dr. Thorsten Papenbrock

Contents

  • Actor-, service-, batch-, and stream-based distributed programming
  • Big Data systems
  • Data serialization and message passing
  • Data structures for distributed data management
  • OSI model and communication protocols
  • Data partitioning and replication
  • Consistency and reconciliation protocols
  • Time synchronization and change propagation
  • Distributed request scheduling

Qualification Goals

Students will

  • know the challenges in building distributed systems (Distributed Systems),
  • know reactive distributed programming (Actor Programming),
  • know techniques for digital representation and serialization of data (Encoding),
  • know how computer networks function (Communication),
  • know standards for structuring and querying data (Data Models and Query Languages),
  • know algorithms and data structures for distributed work with data (Storage and Retrieval),
  • know techniques for ensuring reliability and availability (Replication and Partitioning),
  • know techniques for ensuring consistency and agreement (Consistency and Consensus),
  • know algorithms for distributed transaction management (Transactions),
  • know frameworks for distributed batch processing of data-intensive tasks (Batch Processing),
  • know frameworks for distributed data stream processing (Stream Processing),
  • know the functionality of distributed database management systems (Distributed DBMSs),
  • know the basics of distributed query processing (Distributed Query Optimization),
  • be able to apply this knowledge practically in programming data-intensive distributed algorithms,
  • be able to apply scientific working methods when independently identifying, formulating, and solving problems,
  • be able to speak freely about scientific content, both in front of an audience and in a discussion.

Prerequisites

None. The competences taught in the following modules are recommended: Algorithms and Data Structures, Database Systems.


Applicability

The module can be attended at FB12 in the following study programs:

  • B.Sc. Data Science
  • B.Sc. Computer Science
  • B.Sc. Business Informatics
  • M.Sc. Data Science
  • M.Sc. Computer Science
  • M.Sc. Business Informatics
  • LAaG Computer Science

In the B.Sc. Computer Science program, this module can be taken in the study area Compulsory Elective Modules in Computer Science.

The module can also be used in other study programs (export module).


Recommended Reading

  • Designing Data-Intensive Applications: The Big Ideas Behind Reliable, Scalable, and Maintainable Systems, Martin Kleppmann, 2017, 978-1449373320
  • Distributed Systems, Maarten van Steen and Andrew S. Tanenbaum, 2017, 978-1543057386
  • Principles of Distributed Database Systems, M. Tamer Özsu and Patrick Valduriez, 2011, 978-1441988331
  • Web-Scale Data Management for the Cloud, Wolfgang Lehner and Kai-Uwe Sattler, 2013, 1489997717
  • Introduction to Parallel Computing, Zbigniew J. Czech, 2017, 978-1107174399
  • Designing Distributed Systems: Patterns and Paradigms for Scalable, Reliable Services, Brendan Burns, 2017, 978-1491983645
  • Spark: Big Data Cluster Computing in Production, Ilya Ganelin and Ema Orhian and Kai Sasaki and Brennon York, 2016, 978-1119254010
  • Reactive Messaging Patterns with the Actor Model, Vaughn Vernon, 2015, 978-0133846836
  • Mining Massive Datasets, Jure Leskovec and Anand Rajaraman and Jeffrey David Ullman, 2014, 978-1107077232
  • Algorithmische Geometrie, Rolf Klein, 2005, 978-3540209560



Please note:

This page describes a module according to the latest valid module guide, that of winter semester 2023/24. Most rules that apply to a module are not covered by the examination regulations and can therefore be updated every semester. The following versions are available in the online module guide:

  • Winter 2016/17 (no corresponding element)
  • Summer 2018 (no corresponding element)
  • Winter 2018/19 (no corresponding element)
  • Winter 2019/20 (no corresponding element)
  • Winter 2020/21 (no corresponding element)
  • Summer 2021 (no corresponding element)
  • Winter 2021/22 (no corresponding element)
  • Winter 2022/23 (no corresponding element)
  • Winter 2023/24

The module guide contains all modules, regardless of which courses are currently offered. Please check the current course catalogue in Marvin.

The information in this online module guide was generated automatically. Only the information in the examination regulations (Prüfungsordnung) is legally binding. If you notice any discrepancies or errors, we would be grateful if you let us know.