Title: Learning Euclidean Embeddings for Indexing and Classification
Authors: Vassilis Athitsos, Joni Alon, Stan Sclaroff, George Kollios
Date: April 7, 2004
Abstract:
BoostMap is a recently proposed method for efficient approximate
nearest neighbor retrieval in arbitrary non- Euclidean
spaces with computationally expensive and possibly
non-metric distance measures. Database and query objects
are embedded into a Euclidean space, in which similarities
can be rapidly measured using a weighted Manhattan distance.
The key idea is formulating embedding construction
as a machine learning task, where AdaBoost is used
to combine simple, 1D embeddings into a multidimensional
embedding that preserves a large amount of the proximity
structure of the original space. This paper demonstrates
that, using the machine learning formulation of BoostMap,
we can optimize embeddings for indexing and classification,
in ways that are not possible with existing alternatives for
constructive embeddings, and without additional costs in retrieval
time. First, we show how to construct embeddings
that are query-sensitive, in the sense that they yield a different
distance measure for different queries, so as to improve nearest
neighbor retrieval accuracy for each query. Second, we
show how to optimize embeddings for nearest neighbor classification
tasks, by tuning them to approximate a parameter
space distance measure, instead of the original feature-based distance
measure.