SIMILARITY SEARCH AND DATA MINING: DATABASE TECHNIQUES
SUPPORTING NEXT DECADE'S APPLICATIONS
Christian Böhm
University for Health Informatics and Technology
Innsbruck, Austria
Abstract
Similarity Search and Data Mining have become widespread problems
in modern database applications involving complex objects
from domains such as Multimedia, CAD, Molecular Biology, and
Sequence Analysis. Search problems in such databases are rarely
based on exact matches but rather on some application-specific
notion of similarity. A common approach to capturing the intuitive
idea of similarity formally is to translate complex objects
into multidimensional vectors by a feature transformation.
This allows retrieval of the objects most similar to a given
query object (similarity search) as well as analysis of the complete
set of complex objects with respect to clusters, outliers,
correlations, etc. (data mining). In this contribution we identify
several areas of applications where the classical feature
approach is not sufficient. Example applications include Biometric
Identification, Medical Imaging, Electronic Commerce and Share
Price Analysis. We show that existing feature-based similarity
models fail for different reasons, e.g. because they do
not cope with the uncertainty that is inherent in their feature
vectors (biometric identification) or because they do not
integrate application-specific methods into the similarity
model (share price analysis, medical imaging). We survey the
challenges and possible solutions to these problems to direct
future research.
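
The classical feature approach described above can be illustrated
with a minimal sketch. It assumes that a feature transformation has
already mapped each complex object to a numeric vector and that
Euclidean distance serves as the dissimilarity measure; the object
names, the vectors, and the function names are hypothetical and are
not taken from this contribution.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two feature vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def k_nearest(query, database, k):
    """Return the names of the k objects whose feature vectors are
    closest to the query vector (linear scan, no index structure)."""
    ranked = sorted(database.items(),
                    key=lambda item: euclidean(query, item[1]))
    return [name for name, _ in ranked[:k]]

# Hypothetical output of a feature transformation: object name -> vector.
features = {
    "obj_a": (0.0, 0.0),
    "obj_b": (1.0, 1.0),
    "obj_c": (5.0, 5.0),
}

print(k_nearest((0.9, 1.2), features, 2))
```

The sketch treats every feature vector as exact and uses one fixed
distance function. These are the two assumptions that the
applications named above call into question.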
Biography
Christian Böhm (christian.boehm@umit.at) works in
the research fields of data mining, query processing, indexing
high-dimensional data spaces, and similarity search and has
published extensively in these areas. In 1994, he received
his diploma degree in Computer Science from the Technische Universität
München. He received his Ph.D. degree in 1998 and his
habilitation degree in 2001 from the Ludwig Maximilians University
Munich. Since 2002, he has been an associate professor of computer
science and head of the Unit for Database Systems at the University
for Health Informatics and Technology, Innsbruck, Austria.