Our research group is mainly interested in techniques for the efficient processing of queries on large databases. For the last 20 years, our focus has been on the development of techniques for object-relational databases, with a strong emphasis on spatial, temporal and spatio-temporal index structures. Recent improvements in network technology allow querying massive remote data sources. Against this background, the focus of our work has broadened in two directions. On the one hand, we investigate the management of data streams created by massive numbers of small sensors. On the other hand, we consider geospatial databases in the context of scientific applications and explore techniques for querying and analyzing big spatial databases.
Over the next few years, a tremendous number of sensors will be installed in our
environment. More and more data is continuously delivered from these devices
as a stream. In general, a large number of streams are required to provide the
desired information and each of the streams outputs a large number of data items.
Ideally, users pose ad-hoc queries on streams, similar to a traditional DBMS.
There are, however, fundamental differences: a query runs until the user explicitly
stops it, and streaming data items are generally valid for a short period of
time only. This leads to a substantial change in query processing. Therefore,
systems for streams are primarily designed for the management of queries, of
which there might be millions running simultaneously, whereas data items of
the streams are kept in the system only temporarily.
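To illustrate the difference from a one-shot query, the following minimal sketch shows a continuous query that keeps stream items only while they are valid. It is not code from our systems: the function name windowed_average, the (timestamp, value) item layout and the time-based window are assumptions made for this example, and timestamps are assumed to arrive in non-decreasing order with a window length greater than zero.

```python
from collections import deque
from typing import Deque, Iterable, Iterator, Tuple

Reading = Tuple[float, float]  # hypothetical item layout: (timestamp, value)


def windowed_average(stream: Iterable[Reading],
                     window: float) -> Iterator[Tuple[float, float]]:
    """Continuous query: for every arriving item, yield (timestamp, average)
    over all items whose timestamp lies in (timestamp - window, timestamp].

    Assumes non-decreasing timestamps and window > 0.
    """
    buffer: Deque[Reading] = deque()  # holds only the currently valid items
    total = 0.0
    for ts, value in stream:
        buffer.append((ts, value))
        total += value
        # Expired items are dropped for good; the stream itself is never stored.
        while buffer[0][0] <= ts - window:
            _, old_value = buffer.popleft()
            total -= old_value
        yield ts, total / len(buffer)


if __name__ == "__main__":
    readings = [(0.0, 10.0), (1.0, 20.0), (2.0, 30.0), (5.0, 40.0)]
    for ts, avg in windowed_average(readings, window=2.0):
        print(ts, avg)
```

Because the query is written as a generator, it produces a result for each arriving item and runs for as long as the stream delivers data, or until the consumer stops asking for results.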
Our research group addresses the following issues in stream processing:
We have addressed these research issues in a project called PIPES and apply the results in challenging new applications for monitoring mission-critical infrastructures, such as our ACCEPT project, which is currently supported by the Bundesministerium für Bildung und Forschung (BMBF).
Though database systems are considered a mature technology, there are still research challenges due to new demanding applications. The efficient processing of joins has long been a major issue, but surprisingly little work has been done on supporting complex join predicates such as similarity. Similarity joins are important when users are interested in integrating different data sources. Another application of similarity joins arises in data mining, where they are used to detect similar patterns. We are particularly interested in efficiently supporting such unusual joins, especially when the input consists of more than two relations and the output is produced progressively.
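As a simple illustration of what progressive output means for a similarity join, the sketch below reads its two inputs alternately and reports every matching pair as soon as both partners have been seen. It is a plain symmetric nested-loop scheme restricted to two inputs, not one of our published algorithms; the function name progressive_similarity_join, the distance callback and the threshold epsilon are assumptions made for this example.

```python
from itertools import zip_longest
from typing import Callable, Iterable, Iterator, List, Tuple, TypeVar

L = TypeVar("L")
R = TypeVar("R")

_END = object()  # sentinel marking an exhausted input


def progressive_similarity_join(
    left: Iterable[L],
    right: Iterable[R],
    distance: Callable[[L, R], float],
    epsilon: float,
) -> Iterator[Tuple[L, R]]:
    """Yield all pairs (l, r) with distance(l, r) <= epsilon.

    Both inputs are consumed alternately, and each new item is compared
    against the items already seen from the other input, so results appear
    long before either input has been read completely.
    """
    seen_left: List[L] = []
    seen_right: List[R] = []
    for l_item, r_item in zip_longest(left, right, fillvalue=_END):
        if l_item is not _END:
            for r_old in seen_right:
                if distance(l_item, r_old) <= epsilon:
                    yield l_item, r_old
            seen_left.append(l_item)
        if r_item is not _END:
            # seen_left already contains l_item, so each pair is emitted once.
            for l_old in seen_left:
                if distance(l_old, r_item) <= epsilon:
                    yield l_old, r_item
            seen_right.append(r_item)


if __name__ == "__main__":
    pairs = progressive_similarity_join(
        [1.0, 5.0, 9.0], [1.2, 8.8, 5.1, 20.0],
        distance=lambda a, b: abs(a - b), epsilon=0.5)
    for pair in pairs:
        print(pair)
```

The sketch keeps all seen items in memory and compares exhaustively, which is only reasonable for small inputs; an efficient join would replace the two lists with suitable index structures.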
One of the subjects we are very well known for is index structures. Our R*-tree and MVBT are index structures that are already available in commercial systems such as Oracle. For many years we have been studying the design and evaluation of heuristics for improving the R*-tree. Moreover, bulk operations such as loading a tree from a given set of objects have been an important topic for our research group. Indexing and storing XML data is also a subject we are working on. One major focus has been on native storage structures for XML and on supporting bulk-loading in our XML storage. Recently, we have extended our studies to new fields of application such as preference databases and to demanding new technologies, for example location-based services and peer-to-peer systems.
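To make the idea of bulk-loading concrete, here is a minimal sketch that builds a tree of minimum bounding rectangles bottom-up from a given set of rectangles instead of inserting them one by one. It uses a deliberately simple sort-and-pack strategy (sort by center, cut into full nodes) and is neither the R*-tree insertion algorithm nor one of our bulk-loading techniques; the names Node, bulk_load and search as well as the rectangle layout are assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Iterator, List, Sequence, Tuple

Rect = Tuple[float, float, float, float]  # (xmin, ymin, xmax, ymax)


@dataclass
class Node:
    mbr: Rect        # minimum bounding rectangle of everything below
    children: list   # Rects in a leaf, Nodes in an inner node
    is_leaf: bool


def union(rects: Sequence[Rect]) -> Rect:
    return (min(r[0] for r in rects), min(r[1] for r in rects),
            max(r[2] for r in rects), max(r[3] for r in rects))


def intersects(a: Rect, b: Rect) -> bool:
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def chunks(items: list, size: int) -> List[list]:
    return [items[i:i + size] for i in range(0, len(items), size)]


def bulk_load(rects: Sequence[Rect], capacity: int) -> Node:
    """Build the tree level by level from a set of rectangles."""
    if not rects:
        raise ValueError("need at least one rectangle")
    if capacity < 2:
        raise ValueError("capacity must be at least 2")
    # Sort by center point so that neighbouring rectangles share a leaf.
    ordered = sorted(rects, key=lambda r: ((r[0] + r[2]) / 2, (r[1] + r[3]) / 2))
    level = [Node(union(c), c, True) for c in chunks(ordered, capacity)]
    while len(level) > 1:  # pack nodes into parents until one root remains
        level = [Node(union([n.mbr for n in c]), c, False)
                 for c in chunks(level, capacity)]
    return level[0]


def search(node: Node, query: Rect) -> Iterator[Rect]:
    """Yield all stored rectangles that intersect the query rectangle."""
    if not intersects(node.mbr, query):
        return
    for child in node.children:
        if node.is_leaf:
            if intersects(child, query):
                yield child
        else:
            yield from search(child, query)
```

Packing sorted data into full nodes touches every object only once per level, which is the main attraction of bulk-loading over repeated insertion; the quality of the resulting tree depends strongly on the chosen sort order.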
Though researchers in the database area are often interested in the development
of a prototype database system, we have followed a different approach and developed
a library called XXL, which may of course very well serve as a platform for
building database systems. XXL provides the query processing functionality required
for a database system, such as a set of demand-driven operators, a rich collection
of index structures, and a rule-based optimizer. It supports processing of both
relational and XML data. All packages of XXL come with full documentation,
so people outside of our group can quickly familiarize themselves with
the functionality of XXL. It is very important to us that we generally use XXL
to implement the new techniques presented in our research papers. The reference
implementations available in XXL allow for quick experimental comparisons,
for example. We found that XXL improves the quality and speed of our coding when
implementing new ideas, since it provides a rich infrastructure of low- and
high-level components. XXL is a live library to which new functionality is continuously
added. The library is publicly available under the GNU LGPL.
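The notion of demand-driven operators can be illustrated with a small sketch in which every operator produces a result only when its consumer asks for one. The sketch uses Python generators and does not show the API of XXL; the operator names scan, selection, projection and limit, the dictionary-based rows and the log list are assumptions made for this example.

```python
from typing import Any, Callable, Dict, Iterable, Iterator, List

Row = Dict[str, Any]


def scan(rows: Iterable[Row], log: List[Any]) -> Iterator[Row]:
    """Leaf operator; records in log which rows were actually read."""
    for row in rows:
        log.append(row["id"])
        yield row


def selection(child: Iterator[Row],
              predicate: Callable[[Row], bool]) -> Iterator[Row]:
    for row in child:
        if predicate(row):
            yield row


def projection(child: Iterator[Row], columns: List[str]) -> Iterator[Row]:
    for row in child:
        yield {c: row[c] for c in columns}


def limit(child: Iterator[Row], n: int) -> Iterator[Row]:
    """Stop demanding input as soon as n results have been delivered."""
    if n <= 0:
        return
    for count, row in enumerate(child, start=1):
        yield row
        if count >= n:
            return


if __name__ == "__main__":
    rows = [{"id": i, "temp": t}
            for i, t in enumerate([18, 25, 30, 17, 28], start=1)]
    log: List[Any] = []
    plan = limit(projection(selection(scan(rows, log),
                                      lambda r: r["temp"] > 20), ["id"]), 2)
    print(log)         # nothing has been read yet
    print(list(plan))  # the two qualifying rows
    print(log)         # only the rows needed for two results were read
```

Composing the operators only builds the plan; data flows through it one row at a time when the topmost operator is asked for a result, so a plan that needs two results never reads the whole input.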