TY - UNPB

T1 - Robust subgraph counting with distribution-free random graph analysis

AU - Leeuwaarden, Johan S. H. van

AU - Stegehuis, Clara

N1 - 14 pages

PY - 2021/7/21

Y1 - 2021/7/21

N2 - Subgraphs such as cliques, loops and stars form crucial connections in the topologies of real-world networks. Random graph models provide estimates for how often certain subgraphs appear, which in turn can be tested against real-world networks. These subgraph counts, however, crucially depend on the assumed degree distribution. Fitting a degree distribution to network data is challenging, in particular for scale-free networks with power-law degrees. In this paper we develop robust subgraph counts that do not depend on the entire degree distribution, but only on the mean and mean absolute deviation (MAD), summary statistics that are easy to obtain for most real-world networks. By solving an optimization problem, we provide tight (the sharpest possible) bounds for the subgraph counts, for all possible subgraphs, and for all networks with degree distributions that share the same mean and MAD. We identify the extremal random graph that attains the tight bounds as the graph with a specific three-point degree distribution. We leverage the bounds to obtain robust scaling laws for how the numbers of subgraphs grow as a function of the network size. The scaling laws indicate that sparse power-law networks are not the most extreme networks in terms of subgraph counts, but dense power-law networks are. The robust bounds are also shown to hold for several real-world data sets.

AB - Subgraphs such as cliques, loops and stars form crucial connections in the topologies of real-world networks. Random graph models provide estimates for how often certain subgraphs appear, which in turn can be tested against real-world networks. These subgraph counts, however, crucially depend on the assumed degree distribution. Fitting a degree distribution to network data is challenging, in particular for scale-free networks with power-law degrees. In this paper we develop robust subgraph counts that do not depend on the entire degree distribution, but only on the mean and mean absolute deviation (MAD), summary statistics that are easy to obtain for most real-world networks. By solving an optimization problem, we provide tight (the sharpest possible) bounds for the subgraph counts, for all possible subgraphs, and for all networks with degree distributions that share the same mean and MAD. We identify the extremal random graph that attains the tight bounds as the graph with a specific three-point degree distribution. We leverage the bounds to obtain robust scaling laws for how the numbers of subgraphs grow as a function of the network size. The scaling laws indicate that sparse power-law networks are not the most extreme networks in terms of subgraph counts, but dense power-law networks are. The robust bounds are also shown to hold for several real-world data sets.

KW - cs.SI

KW - math.PR

KW - physics.soc-ph

M3 - Working paper

BT - Robust subgraph counting with distribution-free random graph analysis

PB - arXiv.org

ER -