HPCache: memory-efficient OLAP through proportional caching revisited

Nicholson, Hamish; Chrysogelos, Periklis; Ailamaki, Anastasia

doi:10.1007/s00778-023-00828-7

2023

Abstract

Analytical engines rely on in-memory data caching to avoid storage accesses and provide timely responses by keeping the most frequently accessed data in memory. Purely frequency- and time-based caching decisions, however, are a proxy for the expected query execution speedup only when storage accesses are significantly slower than in-memory query processing. Fast storage, on the other hand, offers loading times that approach fully in-memory query response times, so purely frequency-based statistics cannot capture the impact of a caching decision on query execution. For example, caching the input of a frequent query that spends most of its time processing joins is less beneficial than caching a page for a slightly less frequent but scan-heavy query. Thus, existing caching policies waste valuable memory space on caching input data that offer little-to-no acceleration for analytics. This paper proposes HPCache, a buffer management policy that enables fast analytics on high-bandwidth storage by efficiently using the available in-memory space. HPCache caches data based on the speedup potential instead of relying on frequency-based statistics. We show that, with fast storage, the benefit of in-memory caching varies significantly across queries; therefore, we quantify the efficiency of caching decisions and formulate an optimization problem. We implement HPCache in Proteus and show that (i) estimating speedup potential improves memory space utilization, and (ii) simple runtime statistics suffice to infer speedup. We show that HPCache achieves up to a 1.75x speedup over frequency-based caching policies by caching column proportions and automatically tuning them. Overall, HPCache enables efficient use of the in-memory space for input caching in the presence of fast storage, without requiring workload predictions.
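
The idea of caching by speedup potential rather than by access frequency can be illustrated with a minimal sketch. This is not the HPCache algorithm or the optimization problem from the paper: it is a plain greedy fractional knapsack, and every name in it (`Column`, `est_seconds_saved`, `plan_cache`) is hypothetical. It also assumes that the time saved grows linearly with the cached fraction of a column, which the abstract does not state.

```python
from dataclasses import dataclass


@dataclass
class Column:
    name: str
    size_bytes: int            # must be > 0
    est_seconds_saved: float   # assumed: query time saved if fully cached


def plan_cache(columns, budget_bytes):
    """Return {column name: cached fraction} for a given memory budget.

    Greedy fractional knapsack: columns are ranked by estimated time
    saved per byte, and the budget is filled in that order.
    """
    ranked = sorted(
        columns,
        key=lambda c: c.est_seconds_saved / c.size_bytes,
        reverse=True,
    )
    plan = {}
    remaining = budget_bytes
    for col in ranked:
        if remaining <= 0 or col.est_seconds_saved <= 0:
            plan[col.name] = 0.0   # no budget left or no benefit
            continue
        take = min(col.size_bytes, remaining)
        plan[col.name] = take / col.size_bytes
        remaining -= take
    return plan


# Illustrative numbers only: the join-heavy query's input is accessed
# often but gains little from caching; the scan-heavy input gains more.
columns = [
    Column("join_input", size_bytes=100, est_seconds_saved=1.0),
    Column("scan_input", size_bytes=100, est_seconds_saved=4.0),
]
plan = plan_cache(columns, budget_bytes=150)
print(plan)
```

With these made-up inputs, the scan-heavy column is cached in full and the join-heavy column only in part, which mirrors the example in the abstract.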

Details

Title HPCache: memory-efficient OLAP through proportional caching revisited

Author(s) Nicholson, Hamish ; Chrysogelos, Periklis ; Ailamaki, Anastasia

Published in VLDB Journal

Date 2023-12-22

Publisher Springer, New York

ISSN 1066-8888
0949-877X

Keywords (free)

Analytical Query Processing; Storage Engines; Storage-Resident Data; NVMe; High-Bandwidth Storage

DOI https://doi.org/10.1007/s00778-023-00828-7

Other identifier(s) View the publication in Web of Science

Laboratories DIAS

The document appears in Scientific production and competences > I&C - School of Computer and Communication Sciences > IINFCOM > DIAS - Data-Intensive Applications and Systems Laboratory
Peer-reviewed publications
Work produced at EPFL
Journal articles
Published

Grant Schweizerischer Nationalfonds zur Förderung der Wissenschaftlichen Forschung: 200021_178894/1
SNSF project "Efficient Real-time Analytics on General-Purpose GPUs"

Record creation date 2024-02-20