TY - JOUR

T1 - Robust subgraph counting with distribution-free random graph analysis

AU - van Leeuwaarden, Johan S.H.

AU - Stegehuis, Clara

N1 - Funding Information:
J. van Leeuwaarden acknowledges NWO VICI grant 202.068. C. Stegehuis acknowledges NWO VENI grant 202.001.
Publisher Copyright:
©2021 American Physical Society

PY - 2021/10

Y1 - 2021/10

N2 - Subgraphs such as cliques, loops, and stars play a crucial role in real-world networks. Random graph models can provide estimates for how often certain subgraphs appear, which can be tested against real-world networks. These estimated subgraph counts, however, crucially depend on the assumed degree distribution. Fitting a degree distribution to network data is challenging, in particular, for scale-free networks with power-law degrees. Therefore, in this paper we develop robust subgraph counts that do not depend on the entire degree distribution but only on its mean and mean absolute deviation (MAD), summary statistics that are easy to obtain for most real-world networks. By solving an optimization problem, we provide tight (the sharpest possible) bounds for the subgraph counts, for all possible subgraphs, and for all networks with degree distributions that share the same mean and MAD. We identify the extremal random graph that attains these tight bounds as the graph with a specific three-point degree distribution. We leverage the bounds on the maximal subgraph counts to obtain robust scaling laws for how the number of subgraphs grows as a function of the network size. The scaling laws indicate that sparse power-law networks are not the most extreme networks in terms of subgraph counts but dense power-law networks are. The robust bounds are also shown to hold for several real-world data sets.

AB - Subgraphs such as cliques, loops, and stars play a crucial role in real-world networks. Random graph models can provide estimates for how often certain subgraphs appear, which can be tested against real-world networks. These estimated subgraph counts, however, crucially depend on the assumed degree distribution. Fitting a degree distribution to network data is challenging, in particular, for scale-free networks with power-law degrees. Therefore, in this paper we develop robust subgraph counts that do not depend on the entire degree distribution but only on its mean and mean absolute deviation (MAD), summary statistics that are easy to obtain for most real-world networks. By solving an optimization problem, we provide tight (the sharpest possible) bounds for the subgraph counts, for all possible subgraphs, and for all networks with degree distributions that share the same mean and MAD. We identify the extremal random graph that attains these tight bounds as the graph with a specific three-point degree distribution. We leverage the bounds on the maximal subgraph counts to obtain robust scaling laws for how the number of subgraphs grows as a function of the network size. The scaling laws indicate that sparse power-law networks are not the most extreme networks in terms of subgraph counts but dense power-law networks are. The robust bounds are also shown to hold for several real-world data sets.

UR - http://www.scopus.com/inward/record.url?scp=85117736032&partnerID=8YFLogxK

U2 - 10.1103/PhysRevE.104.044313

DO - 10.1103/PhysRevE.104.044313

M3 - Article

AN - SCOPUS:85117736032

VL - 104

JO - Physical review E: covering statistical, nonlinear, biological, and soft matter physics

JF - Physical review E: covering statistical, nonlinear, biological, and soft matter physics

SN - 2470-0045

IS - 4

M1 - A41

ER -