A singular perturbation approach for choosing the PageRank damping factor

Konstantin Avrachenkov, Nelly Litvak, Kim Son Pham

Research output: Contribution to journal › Article › Academic › peer-review


Abstract

We study the PageRank mass of principal components in a bow-tie web graph as a function of the damping factor $c$. It is known that the web graph can be divided into three principal components: SCC, IN, and OUT. The giant strongly connected component (SCC) contains a large group of pages having a hyperlink path connecting them. The pages in the IN (OUT) component have a path to (from) the SCC, but not back. Using a singular perturbation approach, we show that the PageRank share of the IN and SCC components remains high even for very large values of the damping factor, in spite of the fact that it drops to zero when $c$ tends to one. However, a detailed study of the OUT component reveals the presence of "dead ends" (small groups of pages linking only to each other) that receive an unfairly high ranking when $c$ is close to 1. We argue that this problem can be mitigated by choosing $c$ as small as ½.
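
The effect described in the abstract can be illustrated on a toy graph. The sketch below is not taken from the paper: the five-node graph, the node numbering, and the helper names `pagerank` and `dead_end_mass` are assumptions made for illustration, and it uses the standard PageRank definition with uniform teleportation, $\pi = c\,\pi P + \frac{1-c}{n}\mathbf{1}^T$. Node 0 plays the role of IN, nodes 1 and 2 form the SCC, and nodes 3 and 4 are a "dead end" in OUT that links only to itself.

```python
def pagerank(links, c, iters=5000):
    """Power iteration for pi = c * pi * P + (1 - c) / n * 1^T.

    `links` maps each node 0..n-1 to a non-empty list of out-neighbours,
    so there are no dangling nodes and pi always sums to 1.
    """
    n = len(links)
    pi = [1.0 / n] * n
    for _ in range(iters):
        nxt = [(1.0 - c) / n] * n          # uniform teleportation term
        for src, outs in links.items():
            share = c * pi[src] / len(outs)  # mass pushed along each out-link
            for dst in outs:
                nxt[dst] += share
        pi = nxt
    return pi


# Hypothetical bow-tie toy graph (an assumption, not data from the paper):
#   IN = {0}, SCC = {1, 2}, dead end in OUT = {3, 4}
LINKS = {0: [1], 1: [2], 2: [1, 3], 3: [4], 4: [3]}
DEAD_END = (3, 4)


def dead_end_mass(c):
    """Total PageRank held by the dead-end pages for damping factor c."""
    pi = pagerank(LINKS, c)
    return sum(pi[i] for i in DEAD_END)


if __name__ == "__main__":
    for c in (0.5, 0.85, 0.99):
        print(f"c = {c:.2f}: dead-end PageRank mass = {dead_end_mass(c):.3f}")
```

In this toy graph the two dead-end pages make up 40% of the nodes, yet their PageRank share grows with $c$ and approaches the whole mass as $c$ tends to one, which is the qualitative behaviour the abstract attributes to dead ends in OUT.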
Original language: English
Pages (from-to): 47-69
Number of pages: 23
Journal: Internet mathematics
Volume: 5
Issue number: 1-2
Publication status: Published - 2008

Keywords

  • EWI-17732
  • IR-71099
  • METIS-268957

Cite this

Avrachenkov, Konstantin ; Litvak, Nelly ; Pham, Kim Son. / A singular perturbation approach for choosing the PageRank damping factor. In: Internet mathematics. 2008 ; Vol. 5, No. 1-2. pp. 47-69.
@article{8b63b3dcd2ab4d92a6852dd8c3052ea1,
title = "A singular perturbation approach for choosing the PageRank damping factor",
abstract = "We study the PageRank mass of principal components in a bow-tie web graph as a function of the damping factor $c$. It is known that the web graph can be divided into three principal components: SCC, IN, and OUT. The giant strongly connected component (SCC) contains a large group of pages having a hyperlink path connecting them. The pages in the IN (OUT) component have a path to (from) the SCC, but not back. Using a singular perturbation approach, we show that the PageRank share of the IN and SCC components remains high even for very large values of the damping factor, in spite of the fact that it drops to zero when $c$ tends to one. However, a detailed study of the OUT component reveals the presence of {"}dead ends{"} (small groups of pages linking only to each other) that receive an unfairly high ranking when $c$ is close to 1. We argue that this problem can be mitigated by choosing $c$ as small as ½.",
keywords = "EWI-17732, IR-71099, METIS-268957",
author = "Konstantin Avrachenkov and Nelly Litvak and Pham, {Kim Son}",
year = "2008",
language = "English",
volume = "5",
pages = "47--69",
journal = "Internet mathematics",
issn = "1542-7951",
publisher = "Taylor \& Francis",
number = "1-2",
}
