Learning Geographical Hierarchy Features via a Compositional Model Academic Article


  • © 1999-2012 IEEE. Image location prediction estimates the geolocation where an image was taken, which is important for many image applications, such as image retrieval, browsing, and organization. Since a social image contains heterogeneous content, such as visual and textual content, effectively incorporating this content to predict location is nontrivial. Moreover, image content patterns and the locations where they may appear are observed to correlate hierarchically. Traditional image location prediction methods mainly adopt a single-level architecture and assume that images are independently distributed in geographical space, so they do not adapt directly to this hierarchical correlation. In this paper, we propose a geographically hierarchical bi-modal deep belief network (GH-BDBN) model, a compositional learning architecture that integrates a multi-modal deep learning model with a non-parametric hierarchical prior model. GH-BDBN learns a joint representation that captures the correlations among different types of image content using a bi-modal DBN, and places a geographically hierarchical prior over the joint representation to model the hierarchical correlation between image content and location. An efficient inference algorithm is then proposed to learn the model parameters and the hierarchical structure of geographical locations. Experimental results demonstrate the superiority of our model for image location prediction.

author list (cited authors)

  • Zhang, X., Hu, X., Wang, S., Yang, Y., Li, Z., & Zhou, J.

citation count

  • 4

complete list of authors

  • Zhang, Xiaoming; Hu, Xia; Wang, Senzhang; Yang, Yang; Li, Zhoujun; Zhou, Jianshe

publication date

  • May 2016