TY - JOUR
T1 - Robust subgraph counting with distribution-free random graph analysis
AU - van Leeuwaarden, Johan S.H.
AU - Stegehuis, Clara
N1 - Funding Information:
J. van Leeuwaarden acknowledges NWO VICI grant 202.068. C. Stegehuis acknowledges NWO VENI grant 202.001.
Publisher Copyright:
©2021 American Physical Society
PY - 2021/10
Y1 - 2021/10
N2 - Subgraphs such as cliques, loops, and stars play a crucial role in real-world networks. Random graph models can provide estimates for how often certain subgraphs appear, which can be tested against real-world networks. These estimated subgraph counts, however, crucially depend on the assumed degree distribution. Fitting a degree distribution to network data is challenging, in particular, for scale-free networks with power-law degrees. Therefore, in this paper we develop robust subgraph counts that do not depend on the entire degree distribution but only on its mean and mean absolute deviation (MAD), summary statistics that are easy to obtain for most real-world networks. By solving an optimization problem, we provide tight (the sharpest possible) bounds for the subgraph counts, for all possible subgraphs, and for all networks with degree distributions that share the same mean and MAD. We identify the extremal random graph that attains these tight bounds as the graph with a specific three-point degree distribution. We leverage the bounds on the maximal subgraph counts to obtain robust scaling laws for how the number of subgraphs grows as a function of the network size. The scaling laws indicate that sparse power-law networks are not the most extreme networks in terms of subgraph counts but dense power-law networks are. The robust bounds are also shown to hold for several real-world data sets.
AB - Subgraphs such as cliques, loops, and stars play a crucial role in real-world networks. Random graph models can provide estimates for how often certain subgraphs appear, which can be tested against real-world networks. These estimated subgraph counts, however, crucially depend on the assumed degree distribution. Fitting a degree distribution to network data is challenging, in particular, for scale-free networks with power-law degrees. Therefore, in this paper we develop robust subgraph counts that do not depend on the entire degree distribution but only on its mean and mean absolute deviation (MAD), summary statistics that are easy to obtain for most real-world networks. By solving an optimization problem, we provide tight (the sharpest possible) bounds for the subgraph counts, for all possible subgraphs, and for all networks with degree distributions that share the same mean and MAD. We identify the extremal random graph that attains these tight bounds as the graph with a specific three-point degree distribution. We leverage the bounds on the maximal subgraph counts to obtain robust scaling laws for how the number of subgraphs grows as a function of the network size. The scaling laws indicate that sparse power-law networks are not the most extreme networks in terms of subgraph counts but dense power-law networks are. The robust bounds are also shown to hold for several real-world data sets.
UR - http://www.scopus.com/inward/record.url?scp=85117736032&partnerID=8YFLogxK
U2 - 10.1103/PhysRevE.104.044313
DO - 10.1103/PhysRevE.104.044313
M3 - Article
AN - SCOPUS:85117736032
SN - 2470-0045
VL - 104
JO - Physical Review E
JF - Physical Review E
IS - 4
M1 - A41
ER -