Improved smoothed analysis of the k-means method

Bodo Manthey, Heiko Röglin

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Academic › peer-review



The k-means method is a widely used clustering algorithm. One of its distinguishing features is its speed in practice. Its worst-case running-time, however, is exponential, leaving a gap between practical and theoretical performance. Arthur and Vassilvitskii [3] aimed at closing this gap, and they proved a bound of poly(n^k, σ^{-1}) on the smoothed running-time of the k-means method, where n is the number of data points and σ is the standard deviation of the Gaussian perturbation. This bound, though better than the worst-case bound, is still much larger than the running-time observed in practice.
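To make the smoothed setting concrete, the following is a minimal sketch, not the authors' code: input points are perturbed by independent Gaussian noise with standard deviation σ, and the k-means method is then run on the perturbed points while counting its iterations. The function names `perturb` and `kmeans_iterations`, the initial centers, and the `max_iter` safety cap are assumptions made for illustration only.

```python
import random


def perturb(points, sigma, rng):
    """Add independent Gaussian noise N(0, sigma^2) to every coordinate."""
    return [tuple(x + rng.gauss(0.0, sigma) for x in p) for p in points]


def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans_iterations(points, centers, max_iter=10_000):
    """Run the k-means method; return (number of iterations, final centers).

    An iteration is counted each time the assignment of points to
    centers changes. max_iter is an illustrative safety cap.
    """
    centers = [tuple(c) for c in centers]
    assignment = None
    for it in range(1, max_iter + 1):
        # Assignment step: each point goes to its nearest center.
        new_assignment = [
            min(range(len(centers)), key=lambda j: sq_dist(p, centers[j]))
            for p in points
        ]
        if new_assignment == assignment:
            return it - 1, centers
        assignment = new_assignment
        # Update step: each center moves to the mean of its cluster.
        for j in range(len(centers)):
            cluster = [p for p, a in zip(points, assignment) if a == j]
            if cluster:  # an empty cluster keeps its previous center
                dim = len(cluster[0])
                centers[j] = tuple(
                    sum(p[i] for p in cluster) / len(cluster) for i in range(dim)
                )
    return max_iter, centers


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the run is reproducible
    base = [(0.0, 0.0), (0.0, 1.0), (5.0, 0.0), (5.0, 1.0)]
    noisy = perturb(base, sigma=0.1, rng=rng)
    iters, final_centers = kmeans_iterations(noisy, centers=noisy[:2])
    print(iters, final_centers)
```

The smoothed running-time is the expected iteration count over the random perturbation, maximized over the choice of the unperturbed points; a single run like the one above only samples one perturbation.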

We improve the smoothed analysis of the k-means method by showing two upper bounds on the expected running-time of k-means. First, we prove that the expected running-time is bounded by a polynomial in n^{√k} and σ^{-1}. Second, we prove an upper bound of k^{kd}·poly(n, σ^{-1}), where d is the dimension of the data space. The polynomial is independent of k and d, and we obtain a polynomial bound for the expected running-time for k, d ∈ O(√(log n / log log n)).
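The step from the second bound to a polynomial bound can be checked with a short calculation. The sketch below reads the second bound as k^{kd}·poly(n, σ^{-1}) and assumes k, d ≤ c·√(log n / log log n) for some constant c ≥ 1 and n large enough that log log n ≥ 1; the constant c is introduced here for illustration and does not appear in the abstract.

```latex
\begin{align*}
kd &\le c^2 \, \frac{\log n}{\log\log n}, \\
\log k &\le \log c + \tfrac{1}{2}\log\frac{\log n}{\log\log n}
        \le \log c + \tfrac{1}{2}\log\log n, \\
k^{kd} = e^{kd \log k}
  &\le \exp\!\Bigl( c^2 \, \frac{\log n}{\log\log n}
        \bigl( \log c + \tfrac{1}{2}\log\log n \bigr) \Bigr)
   = n^{\,c^2/2 \,+\, c^2 \log c / \log\log n}
   = n^{O(1)}.
\end{align*}
```

Under these assumptions the factor k^{kd} is itself polynomial in n, so the product k^{kd}·poly(n, σ^{-1}) remains polynomial in n and σ^{-1}.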

Finally, we show that k-means runs in smoothed polynomial time for one-dimensional instances.
Original language: English
Title of host publication: Proceedings of the 20th Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 2009
Editors: C. Mathieu
Place of Publication: Philadelphia
Number of pages: 10
Publication status: Published - 2009
Event: 20th Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 2009 - New York, United States
Duration: 4 Jan 2009 to 6 Jan 2009
Conference number: 20


Conference: 20th Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 2009
Abbreviated title: SODA
Country/Territory: United States
City: New York


  • $k$-means method
  • IR-68852
  • Smoothed Analysis
  • Clustering
  • EWI-16976
  • METIS-264227

