CS 671 — Data Integration
(German: Datenintegration)
| Level, degree of commitment | Specialization module, compulsory elective module |
| Forms of teaching and learning, workload | Lecture (2 SWS), recitation class (2 SWS), 180 hours (60 h attendance, 120 h private study) |
| Credit points, formal requirements | 6 CP. Course requirement(s): At least 50 percent of the points from the weekly exercises and at least 2 presentations of the tasks. Examination type: Written or oral examination (individual examination) |
| Language, Grading | English. Grading is on a scale of 0 to 15 points according to the examination regulations for the degree program M.Sc. Data Science. |
| Duration, frequency | One semester, each summer semester |
| Person in charge of the module's outline | Prof. Dr. Thorsten Papenbrock, Prof. Dr. Bernhard Seeger |
Contents
- Data models and query languages
- Data extraction and preparation
- Similarity measures for simple and complex data types
- Metadata and dependency search
- Schema transformation and mapping
- Data transformation and cleaning
- Entity search and resolution
- Architectures of integrated information systems
- Practical exercises in data integration
Qualification Goals
Students
- can describe basic similarity measures for simple and complex data types (data matching),
- can explain methods for metadata extraction and for determining data dependencies (data profiling),
- can explain techniques for mapping, integrating, and transforming schemas and their data (schema alignment),
- can explain and apply algorithms for detecting and resolving duplicates and other data errors (entity resolution),
- can explain the architectures and workings of modern integrated information systems,
- can work with heterogeneous, dirty data and its integration,
- are able to apply scientific working methods when independently identifying, formulating, and solving problems,
- are able to speak freely about scientific content, both in front of an audience and in a discussion.
Prerequisites
None. The competences taught in the following modules are recommended: either Algorithms and Data Structures or Practical Informatics II: Data Structures and Algorithms for Pre-Service-Teachers, as well as Database Systems.
Applicability
Module imported from M.Sc. Data Science.
It can be attended at FB12 in the following study programs:
- B.Sc. Data Science
- B.Sc. Computer Science
- M.Sc. Data Science
- M.Sc. Computer Science
- M.Sc. Mathematics
- M.Sc. Business Informatics
- M.Sc. Business Mathematics
- LAaG Computer Science
When studying B.Sc. Data Science, this module can be attended in the study area Free Compulsory Elective Modules.
The module is assigned to Computer Science. Further information on eligibility can be found in the description of the study area.
Recommended Reading
- Ulf Leser, Felix Naumann: Informationsintegration (dpunkt, 2006)
- AnHai Doan, Alon Halevy, Zachary Ives: Principles of Data Integration (Morgan Kaufmann, 2012)
- Ziawasch Abedjan, Lukasz Golab, Felix Naumann, Thorsten Papenbrock: Data Profiling. Synthesis Lectures on Data Management (Morgan & Claypool, 2018)
- George Papadakis, Ekaterini Ioannou, Emmanouil Thanos, Themis Palpanas: The Four Generations of Entity Resolution (Morgan & Claypool, 2021)
Please note:
This page describes the module according to the latest valid module guide, as of winter semester 2025/26. Most rules that apply to a module are not covered by the examination regulations and can therefore be updated every semester. The following versions are available in the online module guide:
- Winter 2016/17
- Summer 2018
- Winter 2018/19
- Winter 2019/20
- Winter 2020/21
- Summer 2021
- Winter 2021/22
- Winter 2022/23
- Winter 2023/24
- Winter 2025/26
The module guide contains all modules, regardless of whether they are currently offered. Please check the current course catalogue in Marvin.
The information in this online module guide was generated automatically. Only the information in the examination regulations (Prüfungsordnung) is legally binding. If you notice any discrepancies or errors, we would be grateful if you let us know.