The Price of Explainability for Clustering

Gupta, Anupam; Pittu, Madhusudhan Reddy; Svensson, Ola; Yuan, Rachel

doi:10.1109/FOCS57990.2023.00067

2023

Abstract

Given a set of points in d-dimensional space, an explainable clustering is one where the clusters are specified by a tree of axis-aligned threshold cuts. Dasgupta et al. (ICML 2020) posed the question of the price of explainability: the worst-case ratio of the cost of the best explainable clustering to that of the best clustering.

We show that the price of explainability for k-medians is at most 1 + H_{k-1}; in fact, we show that the popular Random Thresholds algorithm has exactly this price of explainability, matching the known lower bound constructions. We complement our tight analysis of this particular algorithm by constructing instances where the price of explainability (using any algorithm) is at least (1 - o(1)) ln k, showing that our result is best possible, up to lower-order terms. We also improve the price of explainability for the k-means problem to O(k ln ln k) from the previous O(k ln k), considerably closing the gap to the lower bound of Ω(k). Finally, we study the algorithmic question of finding the best explainable clustering: we show that explainable k-medians and k-means cannot be approximated better than O(ln k), under standard complexity-theoretic conjectures. This essentially settles the approximability of explainable k-medians and leaves open the intriguing possibility of getting significantly better approximation algorithms for k-means than its price of explainability.
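
As an illustration of the k-medians bound stated in the abstract, the short Python sketch below evaluates 1 + H_{k-1}, where H_n = 1 + 1/2 + ... + 1/n is the n-th harmonic number, and prints it next to ln k, the leading term of the lower bound. The function names `harmonic` and `k_medians_upper_bound` are hypothetical and chosen only for this example; the sketch computes the bound itself and does not implement the Random Thresholds algorithm.

```python
from fractions import Fraction
import math


def harmonic(n: int) -> Fraction:
    """Return H_n = 1 + 1/2 + ... + 1/n exactly, with H_0 = 0."""
    return sum((Fraction(1, i) for i in range(1, n + 1)), Fraction(0))


def k_medians_upper_bound(k: int) -> Fraction:
    """Return 1 + H_{k-1}, the k-medians bound quoted in the abstract."""
    if k < 1:
        raise ValueError("k must be at least 1")
    return 1 + harmonic(k - 1)


if __name__ == "__main__":
    # Compare the upper bound with ln k, the leading term of the
    # (1 - o(1)) ln k lower bound; the o(1) term is not modelled here.
    for k in (2, 3, 10, 100):
        bound = k_medians_upper_bound(k)
        print(f"k={k:>3}  1 + H_(k-1) = {float(bound):.4f}  ln k = {math.log(k):.4f}")
```

Exact rational arithmetic is used so that small cases can be checked by hand, for example k = 3 gives 1 + 1 + 1/2 = 5/2.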

Details

Title The Price of Explainability for Clustering

Author(s) Gupta, Anupam ; Pittu, Madhusudhan Reddy ; Svensson, Ola ; Yuan, Rachel

Published in 2023 IEEE 64th Annual Symposium on Foundations of Computer Science (FOCS)

Pages 1131-1148

Conference 64th Annual IEEE Symposium on the Foundations of Computer Science (FOCS), November 6-9, 2023, Santa Cruz, CA

Date 2023-01-01

Publisher IEEE Computer Society, Los Alamitos

ISSN 0272-5428

ISBN 979-8-3503-1894-4

Keywords

K-Means; K-Medians; Explainable Clustering; Approximation Algorithms; Randomized Algorithms

DOI https://doi.org/10.1109/FOCS57990.2023.00067

Laboratories THL2

Record Appears in Scientific production and competences > I&C - School of Computer and Communication Sciences > IINFCOM > THL2 - Theory of Computation Laboratory 2
Peer-reviewed publications
Conference Papers
Work produced at EPFL
Published

Grant Swiss National Science Foundation: 200021-184656

Record creation date 2024-02-23