Data Mining on Structured Data ("Graph Mining")

In the life sciences, one often studies schemaless data with a complex internal structure. Such data cannot be mapped onto "flat" feature vectors of fixed length without losing essential information. The aim of my research is to develop methods for analyzing such data and to apply these methods to problems in bioinformatics and structure-based drug design. Because the objects of interest are no longer represented by a fixed set of descriptors, one must first find a correspondence among the features of different objects, which can be formulated as a combinatorial optimization problem. Once a correspondence is found, existing data mining methods can be applied to identify interesting patterns and to correlate the presence or absence of certain features with the class membership of the respective objects.
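
To make the correspondence step concrete, the following is a minimal sketch that assumes the problem is modeled as a one-to-one assignment between the features of two objects, minimizing a summed mismatch cost. This formulation, the toy feature tuples, and the names best_correspondence and mismatch are illustrative assumptions, not the specific method developed in this research.

```python
from itertools import permutations

def best_correspondence(features_a, features_b, cost):
    """Exhaustively search one-to-one mappings from features_a into features_b.

    Returns (mapping, total_cost), where mapping[i] is the index in
    features_b assigned to features_a[i]. Hypothetical helper for illustration.
    """
    if len(features_a) > len(features_b):
        raise ValueError("features_a must not be longer than features_b")
    best_cost, best_map = float("inf"), None
    # Each permutation picks a distinct partner in B for every feature in A.
    for partners in permutations(range(len(features_b)), len(features_a)):
        total = sum(cost(features_a[i], features_b[j])
                    for i, j in enumerate(partners))
        if total < best_cost:
            best_cost, best_map = total, partners
    return best_map, best_cost

# Assumed toy objects: each feature is a (label, numeric property) pair,
# and the two objects have different numbers of features.
obj_a = [("donor", 1.0), ("acceptor", 3.5)]
obj_b = [("acceptor", 3.0), ("hydrophobic", 7.0), ("donor", 1.2)]

def mismatch(f, g):
    # Assumed cost: a fixed penalty for differing labels plus the
    # absolute difference of the numeric properties.
    return (0.0 if f[0] == g[0] else 10.0) + abs(f[1] - g[1])

mapping, total_cost = best_correspondence(obj_a, obj_b, mismatch)
print(mapping, round(total_cost, 3))
```

Exhaustive search is only feasible for a handful of features, since the number of candidate mappings grows factorially. For the plain assignment case, polynomial-time algorithms such as the Hungarian method exist; richer formulations that also respect relations between features are generally harder.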