This paper casts automated image annotation as an image retrieval problem in which the distance metric expressing similarity between images is learnt from existing distance metrics defined on several image descriptors. Rather than formulating this learning task as an optimization problem, we treat it as a regression problem. On a limited dataset of images of buildings taken in the city center of Brussels, we illustrate that the combined distance metric outperforms each of the individual distance metrics considered for automated image annotation.
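
To make the regression view concrete, the following is a minimal sketch and not the method of the paper. It assumes a linear combination of per-descriptor distances, a hypothetical 0/1 target dissimilarity (0 for image pairs sharing an annotation, 1 otherwise), an ordinary least-squares fit, and nearest-neighbour annotation. The function names, the toy distances, and the labels are all illustrative assumptions.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix [a | b], copied so the inputs are left untouched.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def fit_weights(pair_distances, targets):
    """Least-squares weights w such that sum_k w[k] * d[k] approximates the target.

    pair_distances[p][k] is the distance between the two images of pair p
    under descriptor k; targets[p] is the assumed 0/1 dissimilarity of pair p.
    """
    k = len(pair_distances[0])
    # Normal equations: (D^T D) w = D^T t.
    gram = [[sum(row[i] * row[j] for row in pair_distances) for j in range(k)]
            for i in range(k)]
    rhs = [sum(row[i] * t for row, t in zip(pair_distances, targets))
           for i in range(k)]
    return solve_linear_system(gram, rhs)


def combined_distance(weights, distances):
    """Weighted sum of the per-descriptor distances for one image pair."""
    return sum(w * d for w, d in zip(weights, distances))


def annotate(weights, query_distances, labels):
    """Return the label of the annotated image closest to the query.

    query_distances[i][k] is the distance from the query to annotated
    image i under descriptor k.
    """
    best = min(range(len(labels)),
               key=lambda i: combined_distance(weights, query_distances[i]))
    return labels[best]


# Toy data (made up): descriptor 0 tracks the target, descriptor 1 does not.
pair_distances = [[0.1, 0.9], [0.2, 0.1], [0.9, 0.2], [0.8, 0.8]]
targets = [0.0, 0.0, 1.0, 1.0]
weights = fit_weights(pair_distances, targets)

label = annotate(weights, [[0.1, 0.9], [0.9, 0.1]], ["building_a", "building_b"])
```

In this sketch the fit is unconstrained, so individual weights may come out negative and the combined function is not guaranteed to satisfy the metric axioms; a non-negativity constraint would be one way to address that.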