Nearly-Tight and Oblivious Algorithms for Explainable Clustering

Gamlath, Buddhima; Jia, Xinrui; Polak, Adam Teodor; Svensson, Ola Nils Anders

2021

Abstract

We study the problem of explainable clustering in the setting first formalized by Dasgupta, Frost, Moshkovitz, and Rashtchian (ICML 2020). A k-clustering is said to be explainable if it is given by a decision tree where each internal node splits data points with a threshold cut in a single dimension (feature), and each of the k leaves corresponds to a cluster. We give an algorithm that outputs an explainable clustering that loses at most a factor of O(log^2 k) compared to an optimal (not necessarily explainable) clustering for the k-medians objective, and a factor of O(k log^2 k) for the k-means objective. This improves over the previous best upper bounds of O(k) and O(k^2), respectively, and nearly matches the previous Ω(log k) lower bound for k-medians and our new Ω(k) lower bound for k-means. The algorithm is remarkably simple. In particular, given an initial not necessarily explainable clustering in R^d, it is oblivious to the data points and runs in time O(dk log^2 k), independent of the number of data points n. Our upper and lower bounds also generalize to objectives given by higher ℓ_p-norms. © 2021 Neural information processing systems foundation.
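
To make the notion of an explainable clustering concrete, the following is a minimal Python sketch of a threshold tree built from k cluster centers alone, without looking at the data points. It is an illustration under stated assumptions, not the procedure analyzed in the paper: the recursive random-cut rule, the function names `build_tree` and `assign`, and the dictionary layout are all hypothetical, and the centers are assumed to be pairwise distinct.

```python
import random


def build_tree(centers, rng=None, _ids=None):
    """Build a threshold tree whose leaves each hold exactly one center.

    Assumption (not from the source): each node draws a random axis-aligned
    cut inside the bounding box of its own centers. Centers must be distinct.
    """
    rng = rng or random.Random(0)
    ids = list(range(len(centers))) if _ids is None else _ids
    if len(ids) == 1:
        return {"cluster": ids[0]}  # leaf: one cluster

    d = len(centers[0])
    lows = [min(centers[j][i] for j in ids) for i in range(d)]
    highs = [max(centers[j][i] for j in ids) for i in range(d)]
    widths = [hi - lo for lo, hi in zip(lows, highs)]

    while True:
        # Pick a dimension with probability proportional to its extent,
        # then a threshold uniformly inside that extent.
        i = rng.choices(range(d), weights=widths)[0]
        theta = rng.uniform(lows[i], highs[i])
        left = [j for j in ids if centers[j][i] <= theta]
        right = [j for j in ids if centers[j][i] > theta]
        if left and right:  # keep only cuts that separate some centers
            break

    return {
        "dim": i,
        "threshold": theta,
        "left": build_tree(centers, rng, left),
        "right": build_tree(centers, rng, right),
    }


def assign(tree, point):
    """Follow the threshold cuts from the root to a leaf; return its cluster id."""
    while "cluster" not in tree:
        side = "left" if point[tree["dim"]] <= tree["threshold"] else "right"
        tree = tree[side]
    return tree["cluster"]


centers = [(0.0, 0.0), (4.0, 0.5), (1.0, 5.0)]
tree = build_tree(centers, random.Random(1))
labels = [assign(tree, c) for c in centers]  # each center reaches its own leaf
```

Because the tree is built from the centers only, the construction cost in this sketch does not depend on the number of data points; any point can afterwards be labeled with `assign`, and the path of threshold comparisons it follows serves as the explanation for its cluster.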

Details

Title Nearly-Tight and Oblivious Algorithms for Explainable Clustering

Author(s) Gamlath, Buddhima ; Jia, Xinrui ; Polak, Adam Teodor ; Svensson, Ola Nils Anders

Published in Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, NeurIPS 2021

Series Advances in Neural Information Processing Systems, 34

Volume 34

Pages 28929–28939

Conference 35th Conference on Neural Information Processing Systems, NeurIPS 2021, virtual, December 6-14, 2021

Date 2021

Laboratories DISOPT

Record Appears in Scientific production and competences > SB - School of Basic Sciences > MATH - Institute of Mathematics > DISOPT - Chair of Discrete Optimization
Scientific production and competences > SB - School of Basic Sciences > Mathematics
Peer-reviewed publications
Conference Papers
Work produced at EPFL

Record creation date 2022-11-17