The (un)supervised NMF methods for discovering overlapping communities as well as hubs and outliers in networks

Citation

Wang, Xiao; Cao, Xiaochun; Jin, Di; Cao, Yixin; & He, Dongxiao (2016). The (un)supervised NMF methods for discovering overlapping communities as well as hubs and outliers in networks. Physica A: Statistical Mechanics and its Applications. vol. 446 pp. 22-34

Abstract

Because community detection is crucial to the study of large-scale networks, many researchers have devoted themselves to detecting communities in various networks. It is now widely agreed that communities usually overlap with each other. In some communities, there exist members that play a special role as hubs (also known as leaders), whose importance merits special attention. Moreover, it has also been observed that some members of the network do not belong to any community in a convincing way, and are hence recognized as outliers. Failure to detect and exclude outliers will distort, sometimes significantly, the detected communities. In short, it is preferable for a community detection method to detect all three structures together. This becomes even more interesting, and also more challenging, under the unsupervised assumption, that is, when the number K of communities is not known in advance. Our approach here is to define a novel generative model and formalize the detection of overlapping communities as well as hubs and outliers as an optimization problem on it. When K is given, we propose a normalized symmetric nonnegative matrix factorization algorithm based on Kullback–Leibler (KL) divergence to learn the parameters of the model. Otherwise, by combining KL divergence with a prior model on the parameters, we introduce another parameter learning method, based on Bayesian symmetric nonnegative matrix factorization, that learns the parameters of the model while determining K. We therefore present a community detection method, arguably in the most general sense, which detects all three structures together without prior knowledge of the number of communities. Finally, we test the proposed method on various real-world networks. The experimental results, compared with several state-of-the-art algorithms, indicate its superior performance in terms of both clustering accuracy and community quality.

URL

http://www.sciencedirect.com/science/article/pii/S0378437115009954

Keyword(s)

Overlapping community; Hubs; Outliers; (Bayesian) NMF

Reference Type

Journal Article

Journal Title

Physica A: Statistical Mechanics and its Applications

Author(s)

Wang, Xiao
Cao, Xiaochun
Jin, Di
Cao, Yixin
He, Dongxiao

Year Published

2016

Volume Number

446

Pages

22-34

Edition

November 29, 2015

DOI

10.1016/j.physa.2015.11.016

Reference ID

9139