A General Embedding Framework for Heterogeneous Information Learning in Large-Scale Networks

abstract

Network analysis has been widely applied in many real-world tasks, such as gene analysis and targeted marketing. To extract effective features for these analysis tasks, network embedding automatically learns a low-dimensional vector representation for each node, such that the meaningful topological proximity is well preserved. While the embedding algorithms on pure topological structure have attracted considerable attention, in practice, nodes are often abundantly accompanied with other types of meaningful information, such as node attributes, second-order proximity, and link directionality. A general framework for incorporating the heterogeneous information into network embedding could be potentially helpful in learning better vector representations. However, it remains a challenging task to jointly embed the geometrical structure and a distinct type of information due to the heterogeneity. In addition, the real-world networks often contain a large number of nodes, which put demands on the scalability of the embedding algorithms. To bridge the gap, in this article, we propose a general embedding framework named Heterogeneous Information Learning in Large-scale networks (HILL) to accelerate the joint learning. It enables the simultaneous node proximity assessing process to be done in a distributed manner by decomposing the complex modeling and optimization into many simple and independent sub-problems. We validate the significant correlation between the heterogeneous information and topological structure, and illustrate the generalizability of HILL by applying it to perform attributed network embedding and second-order proximity learning. A variation is proposed for link directionality modeling. Experimental results on real-world networks demonstrate the effectiveness and efficiency of HILL.

authors

Zou, Na

published proceedings

ACM TRANSACTIONS ON KNOWLEDGE DISCOVERY FROM DATA

author list (cited authors)

Huang, X., Li, J., Zou, N. a., & Hu, X.

citation count

11

complete list of authors

Huang, Xiao||Li, Jundong||Zou, Na||Hu, Xia

publication date

October 2018

publisher

Association for Computing Machinery (ACM) Publisher

A General Embedding Framework for Heterogeneous Information Learning in Large-Scale Networks Academic Article

Overview

abstract

authors

published proceedings

author list (cited authors)

citation count

complete list of authors

publication date

publisher

Research

keywords

Identity

Digital Object Identifier (DOI)

Additional Document Info

start page

end page

volume

issue

Other

URL