Commit 8c02606

Merge pull request #942 from haochengxia/main
Midterm blog post for CacheBench
2 parents 2220cb9 + 3588ea6 commit 8c02606

1 file changed, 54 additions and 0 deletions:
  • content/report/osre25/harvard/cachebench/2025-08-06-haochengxia

@@ -0,0 +1,54 @@
---
title: "Midterm Report: Simulation, Comparison, and Conclusion of Cache Eviction"
subtitle: ""
summary: "Develop a comprehensive benchmarking suite, CacheBench, for evaluating the performance of cache systems in modern computing environments."
authors:
- haochengxia
tags: ["osre25","reproducibility","storage systems"]
categories: ["SummerofReproducibility24"]
date: 2025-08-06
lastmod: 2025-08-06
featured: true
draft: false

image:
  caption: "Haocheng Xia, SoR 2025"
  focal_point: "Center"
  preview_only: false
---

## Project Overview

**CacheBench** is a benchmarking suite designed for comprehensive cache performance evaluation, with a particular focus on analyzing the miss ratios of various cache eviction algorithms.
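
To make the miss ratio metric concrete, here is a minimal sketch that replays a toy trace through two of the simplest policies, LRU and FIFO. It is plain Python written for illustration, not the libCacheSim implementation; the function names and the toy trace are assumptions, and every object is assumed to have the same size.

```python
from collections import OrderedDict, deque


def lru_miss_ratio(trace, cache_size):
    """Replay `trace` through an LRU cache that holds `cache_size` objects."""
    cache = OrderedDict()  # keys ordered from least to most recently used
    misses = 0
    for obj in trace:
        if obj in cache:
            cache.move_to_end(obj)  # hit: mark as most recently used
        else:
            misses += 1
            if len(cache) >= cache_size:
                cache.popitem(last=False)  # evict the least recently used object
            cache[obj] = True
    return misses / len(trace)


def fifo_miss_ratio(trace, cache_size):
    """Replay `trace` through a FIFO cache that holds `cache_size` objects."""
    queue = deque()  # objects in insertion order
    cached = set()   # fast membership test
    misses = 0
    for obj in trace:
        if obj not in cached:
            misses += 1
            if len(cached) >= cache_size:
                cached.remove(queue.popleft())  # evict the oldest insertion
            queue.append(obj)
            cached.add(obj)
    return misses / len(trace)


# Hypothetical toy trace of object IDs (not taken from the real datasets).
toy_trace = ["a", "b", "a", "c", "a", "d", "a", "b"]
print("LRU :", lru_miss_ratio(toy_trace, cache_size=2))
print("FIFO:", fifo_miss_ratio(toy_trace, cache_size=2))
```

On this toy trace LRU misses less often than FIFO because the frequently reused object `a` is kept in the cache; on other traces the ordering can flip, which is why evaluating on many traces matters.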

At the core of CacheBench lie two key components: the high-performance cache simulator, [libCacheSim](https://github.com/1a1a11a/libCacheSim), and the extensive [open-source cache datasets](https://github.com/cacheMon/cache_dataset), which collectively contain over 8,000 traces from diverse applications. This ensures broad coverage across a range of realistic workloads.

Our primary goal is to evaluate all major and widely used cache eviction algorithms on thousands of traces, in order to gain insights into their behaviors and design trade-offs. Additionally, we aim to identify and distill representative workloads, making benchmarking more efficient and comprehensive for future cache research.

## Progress and Pain Points

We began by benchmarking prevalent eviction algorithms, including FIFO, LRU, CLOCK, LFU, Random, Belady (BeladySize), CAR, ARC, LIRS, LHD, Hyperbolic, GDSF, W-TinyLFU, 2Q, SLRU, S3-FIFO, SIEVE, and LeCaR. As we developed the suite, we made progressive improvements to both the simulator and dataset infrastructure. Our progress can be summarized as follows:

- Collected miss ratio results for all listed algorithms across 8,000+ traces.
- Identified the best- and worst-performing traces for each algorithm, and conducted feature analysis of these traces.
- Developed Python bindings: to increase accessibility, we provided a Python package that lets users easily download traces and run simulation analyses using [libCacheSim](https://github.com/1a1a11a/libCacheSim) and the [cache datasets](https://github.com/cacheMon/cache_dataset).
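
As an illustration of what trace feature analysis can look like, the sketch below computes a few simple features from a sequence of object IDs. The chosen features (`one_hit_wonder_ratio`, `compulsory_miss_ratio`) and the function name are assumptions made for this example; they are not necessarily the features used in CacheBench.

```python
from collections import Counter


def trace_features(trace):
    """Summarize a request trace given as a sequence of object IDs."""
    counts = Counter(trace)          # number of requests per object
    num_requests = len(trace)
    num_objects = len(counts)
    one_hit = sum(1 for c in counts.values() if c == 1)
    return {
        "num_requests": num_requests,
        "num_objects": num_objects,
        # Fraction of objects that are requested exactly once.
        "one_hit_wonder_ratio": one_hit / num_objects,
        # With a cold cache and no prefetching, the first request to each
        # object must miss, so this is a lower bound on any miss ratio.
        "compulsory_miss_ratio": num_objects / num_requests,
    }


# Hypothetical toy trace (not taken from the real datasets).
toy_trace = ["a", "b", "a", "c", "a", "d", "a", "b"]
features = trace_features(toy_trace)
for name, value in features.items():
    print(f"{name}: {value}")
```

Comparing such features between the traces where an algorithm does well and those where it does poorly is one simple way to start looking for patterns.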

However, the analysis remains challenging because there is no universally accepted metric or baseline for objectively comparing the performance of cache eviction algorithms across all workloads.
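
One possible normalization, shown here only as a sketch, is to report each algorithm's miss ratio reduction relative to a fixed baseline such as FIFO, so that traces with very different absolute miss ratios can be averaged. The choice of FIFO as the baseline, the algorithm name `ALGO_X`, and all numbers below are made up for illustration; they are not CacheBench results.

```python
from statistics import mean


def miss_ratio_reduction(miss_ratio, baseline_miss_ratio):
    """Relative improvement over the baseline; negative means worse."""
    if baseline_miss_ratio == 0:
        return 0.0  # the baseline never misses, so nothing can be reduced
    return (baseline_miss_ratio - miss_ratio) / baseline_miss_ratio


def mean_reduction(results, algorithm, baseline="FIFO"):
    """Average the per-trace reduction of `algorithm` over `baseline`."""
    return mean(
        [miss_ratio_reduction(r[algorithm], r[baseline]) for r in results.values()]
    )


# Made-up miss ratios, indexed as results[trace][algorithm].
results = {
    "trace_1": {"FIFO": 0.50, "LRU": 0.40, "ALGO_X": 0.25},
    "trace_2": {"FIFO": 0.20, "LRU": 0.25, "ALGO_X": 0.15},
}

for algo in ("LRU", "ALGO_X"):
    print(algo, round(mean_reduction(results, algo), 3))
```

The result still depends on the chosen baseline and on how traces are weighted, which is part of why no single number settles the comparison.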

## Next Steps

For the second half of the project, my focus will shift to:

- **Evaluating More Complex Eviction Algorithms**: Having concentrated mainly on static eviction policies so far (which are generally more deterministic and easier to understand), I will now investigate learning-based eviction algorithms such as LRB and 3L-Cache. These algorithms incorporate learning components and incur additional computational overhead, which makes simulations slower and more complex.

- **Detailed Trace Analysis**: Since eviction algorithms can have highly variable performance on the same trace, I plan to analyze why certain algorithms excel on specific traces while others do not. Understanding these factors is crucial to characterizing both the algorithms and the workload traces.

- **Constructing Representative Workload Sets**: Based on ongoing simulations and trace analyses, I aim to identify a minimal but representative subset of traces that can serve as a basic evaluation suite, simplifying testing and improving accessibility.
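
As a rough idea of how a representative subset could be chosen, the sketch below applies greedy farthest-point (k-center) selection to per-trace feature vectors. This is only one candidate technique, not the method the project has committed to; the trace names and two-dimensional feature vectors are hypothetical.

```python
import math


def select_representatives(features, k):
    """Greedy k-center: repeatedly add the trace farthest from those chosen."""
    names = sorted(features)
    chosen = [names[0]]  # arbitrary but deterministic starting trace

    def distance_to_chosen(name):
        # Euclidean distance to the nearest already-selected trace.
        return min(math.dist(features[name], features[c]) for c in chosen)

    while len(chosen) < min(k, len(names)):
        remaining = [n for n in names if n not in chosen]
        chosen.append(max(remaining, key=distance_to_chosen))
    return chosen


# Hypothetical feature vectors, e.g. two normalized trace statistics.
features = {
    "trace_a": (0.10, 0.90),
    "trace_b": (0.12, 0.88),
    "trace_c": (0.80, 0.20),
    "trace_d": (0.50, 0.50),
}

print(select_representatives(features, k=3))
```

Here `trace_b` is skipped because it is nearly identical to `trace_a`, which is the behavior one would want when trimming thousands of traces down to a small evaluation suite.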

## Reflection

This project has truly been the highlight of my summer. By evaluating a wide range of cache eviction algorithms, I've significantly deepened my understanding of cache design and its underlying principles.

I'm especially grateful to my mentors for their constant support, patience, and guidance throughout this journey. It’s been a privilege to learn from you!

I'm excited to see the final results of CacheBench!
