linear elastic caching cuts cloud memory costs

source: google research: optimizing cloud economics with linear elastic caching

level: technical

modern cloud databases rely on in-memory caching for fast performance, but memory is expensive. traditional fixed-size caches force a trade-off: too small and performance drops, too large and money is wasted on idle memory. google research proposes linear elastic caching, which treats memory as a variable cost that accumulates over time. the system dynamically adjusts cache size based on real-time workloads, aiming to minimize the total cost of memory and cache misses.
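the objective described above can be sketched as a small cost function. this is only an illustration, not the paper's formulation: the function name, parameters, and all numbers are assumptions chosen to show how memory rent (size multiplied by time) and miss charges add up to one total.

```python
def total_cost(sizes_bytes, misses, rent_per_byte_sec, miss_cost, step_sec=1.0):
    """Total cost = memory rent accrued over time + a charge per cache miss.

    sizes_bytes: cache size sampled once per time step (hypothetical input).
    rent_per_byte_sec and miss_cost are assumed prices, not measured values.
    """
    byte_seconds = sum(size * step_sec for size in sizes_bytes)
    return byte_seconds * rent_per_byte_sec + misses * miss_cost


# Illustrative numbers only: a fixed cache holds 100 bytes for all four steps,
# while an elastic cache shrinks during the two quiet steps and takes one more miss.
fixed_cost = total_cost([100, 100, 100, 100], misses=2, rent_per_byte_sec=0.5, miss_cost=10)
elastic_cost = total_cost([100, 40, 40, 100], misses=3, rent_per_byte_sec=0.5, miss_cost=10)
```

with these made-up prices the elastic schedule comes out cheaper even though it takes one extra miss, because the rent saved outweighs the miss charge. with a higher miss_cost the comparison can flip.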

the approach frames cache eviction as a ski rental problem. each cached page faces a choice: keep it in memory and keep paying rent, or evict it and pay a one-time buy cost, the cache miss, if it is needed again. the solution separates eviction policy from rental duration. a lightweight machine learning model, a shallow decision tree, predicts a time-to-live for each page based on access patterns and costs, and the page is dropped when that time expires. if the cache fills up before then, a standard eviction policy like least recently used takes over.
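a minimal sketch of the two-layer idea, under stated assumptions: the class and function names are hypothetical, time is passed in explicitly to keep the example deterministic, and the learned decision tree is replaced by the classic ski rental break-even rule (keep a page until its accumulated rent equals its miss cost). that rule is a stand-in for illustration, not the model the source describes.

```python
from collections import OrderedDict


def break_even_ttl(miss_cost, rent_per_sec):
    """Ski rental break-even: seconds of rent that add up to one miss charge."""
    return miss_cost / rent_per_sec


class ElasticCache:
    """TTL decides how long a page is rented; LRU applies only when the cache is full."""

    def __init__(self, max_pages, predict_ttl):
        self.max_pages = max_pages
        self.predict_ttl = predict_ttl  # hypothetical stand-in for the learned model
        self.pages = OrderedDict()      # key -> (value, expires_at), oldest use first

    def _expire(self, now):
        # Drop every page whose rental period has ended, shrinking the cache.
        expired = [k for k, (_, expires_at) in self.pages.items() if expires_at <= now]
        for key in expired:
            del self.pages[key]

    def get(self, key, now, load):
        """Return (value, hit). `load` fetches the page from backing storage on a miss."""
        self._expire(now)
        hit = key in self.pages
        value = self.pages.pop(key)[0] if hit else load(key)
        # Re-insert at the most recently used end with a fresh TTL.
        self.pages[key] = (value, now + self.predict_ttl(key))
        if len(self.pages) > self.max_pages:
            self.pages.popitem(last=False)  # LRU fallback when the cache is full
        return value, hit
```

for example, with `predict_ttl=lambda key: break_even_ttl(10.0, 2.0)` every page is kept for 5 seconds after its last access, so a quiet workload lets the cache shrink to nothing while a busy one grows it up to `max_pages`.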

tests on google spanner production workloads showed a 15.5% reduction in memory usage with only a 5.5% increase in cache misses, leading to a 5% lower total cost of ownership. the extra misses were on cheap-to-fetch data, so i/o costs barely rose. experiments on public cache traces confirmed the benefits, with elastic caching consistently outperforming fixed-size caches, especially as memory costs rise relative to miss costs. the method adapts cache size to workload, reducing both cost and miss rate.

why it matters: it lets cloud services cut memory costs with only a small increase in cache misses, using simple machine learning to size the cache to the workload.
