Tsinghua Science and Technology  2019, Vol. 24 Issue (06): 716-727    doi: 10.26599/TST.2018.9010106
 REGULAR ARTICLES
Local Community Detection Based on Network Motifs
Yunlei Zhang, Bin Wu*, Yu Liu, Jinna Lv
∙ Yunlei Zhang, Bin Wu, Yu Liu, and Jinna Lv are with the Beijing Key Laboratory of Intelligence Telecommunications Software and Multimedia, School of Computer Science, Beijing University of Posts and Telecommunications, Beijing 100876, China. E-mail: yunlei0518@bupt.edu.cn; imyuliu@outlook.com; lvjinna@bupt.edu.cn.

Abstract

Local community detection aims to find a cluster of nodes by exploring a small region of the network. Local community detection methods are faster than traditional global community detection methods because their runtime does not depend on the size of the entire network. However, most existing methods do not take the higher-order connectivity patterns crucial to the network into consideration. In this paper, we develop a new Local Community Detection method based on network Motif (LCD-Motif) which incorporates the higher-order network information. LCD-Motif adopts the local expansion of a seed set to identify the local community with minimal motif conductance, representing a generalization of the conductance metric for network motifs. In contrast to PageRank-like diffusion methods, LCD-Motif finds the community by seeking a sparse vector in the span of the local spectra, such that the seeds are in its support vector. We evaluate our approach using real-world datasets across various domains and synthetic networks. The experimental results show that LCD-Motif can achieve a higher performance than state-of-the-art methods.

Received: 19 March 2018      Published: 20 June 2019
Corresponding Authors: Bin Wu
 Fig. 1 Network motifs (13 motifs in three-node motifs). M𝟔, $𝒮?𝒮$:1 denotes that node 1 is the seed node, the middle part is the motif adjacency matrix transformed from the original network, the right part is the local community which contains seed node 1 and three instances of motif $M𝟔$)."> Fig. 2 Illustration of motif adjacency matrix and local community detection (the given motif type is $M𝟔$, $𝒮?𝒮$:1 denotes that node 1 is the seed node, the middle part is the motif adjacency matrix transformed from the original network, the right part is the local community which contains seed node 1 and three instances of motif $M𝟔$). Table 1 Parameters for the LFR datasets. Table 2 Statistics for the networks. n: number of nodes; m: number of edges; CC: clustering coefficient of the network; GT: ground truth. Table 3 Average F1 score on LFR1–LFR8 datasets corresponding mixing parameter from 0.1 to 0.8. Fig. 3 Average F1 score on LFR datasets with varying mixing parameter from 0.1 to 0.8. Table 4 Performance of methods on 5 public realworld networks (MC denotes motif conductance). Fig. 4 Performance of methods on realworld networks. Fig. 5 Motif conductance on realworld networks with varying dimension l and random walk step k, respectively. y^."> Fig. 6 Motif conductance of different motifs in HepPh network. Size of set is the size of the community truncated from the sorted vector $y^$.