Cache-aware orchestration for LLM agents. Fork helpers that share cached prefixes, detect cache breaks, and cut token costs by 38%+.
Updated Apr 2, 2026 - Python
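A minimal sketch of the idea behind prefix-sharing forks (hypothetical helper names, not this project's API): branches of a conversation keep the shared prefix byte-identical, since provider prompt caches only hit on identical prefixes, and the point where two branches diverge is the cache break.

```python
def fork(messages, new_user_turn):
    """Return a new branch that reuses the existing prefix unchanged.

    Prompt caching only applies when the prefix is byte-identical, so the
    shared turns are kept as-is and only the new turn is appended.
    """
    branch = list(messages)  # shallow copy: prefix message objects are reused
    branch.append({"role": "user", "content": new_user_turn})
    return branch

def cache_break_index(a, b):
    """Index of the first turn where two branches diverge (the cache break)."""
    for i, (x, y) in enumerate(zip(a, b)):
        if x != y:
            return i
    return min(len(a), len(b))

base = [
    {"role": "system", "content": "You are a careful research assistant."},
    {"role": "user", "content": "Summarize the design doc."},
]
branch_a = fork(base, "Now list the open questions.")
branch_b = fork(base, "Now draft an implementation plan.")
print(cache_break_index(branch_a, branch_b))  # 2: only the last turn differs
```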
Computer Systems Architecture (Arhitectura Sistemelor de Calcul) - UPB 2020
Comparison of parallel matrix multiplication methods using OpenMP, focusing on cache efficiency, runtime, and performance analysis with Intel VTune.
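The central technique such comparisons evaluate is loop tiling (cache blocking). The repository itself uses OpenMP in C; the sketch below is only a language-neutral illustration of the blocked access pattern, written in Python for readability.

```python
def matmul_tiled(A, B, tile=32):
    """Multiply square matrices block by block so sub-blocks stay in cache."""
    n = len(A)
    C = [[0.0] * n for _ in range(n)]
    for ii in range(0, n, tile):
        for kk in range(0, n, tile):
            for jj in range(0, n, tile):
                # Work on tile x tile blocks of A, B, and C at a time.
                for i in range(ii, min(ii + tile, n)):
                    for k in range(kk, min(kk + tile, n)):
                        aik = A[i][k]
                        for j in range(jj, min(jj + tile, n)):
                            C[i][j] += aik * B[k][j]
    return C
```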
University of Pittsburgh ECE 1175
Enhanced Learned Bloom Filter with cache optimization, incremental learning, and adaptive threshold control
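A hedged sketch of the learned-Bloom-filter structure this description refers to (names are illustrative, not this repository's API): a learned scorer answers most membership queries, and a small backup filter stores the keys the model scores below the threshold, so the combined structure has no false negatives.

```python
import hashlib

class TinyBloom:
    """Small standard Bloom filter used as the backup structure."""
    def __init__(self, bits=1024, hashes=3):
        self.bits, self.hashes, self.array = bits, hashes, bytearray(bits)

    def _positions(self, key):
        for i in range(self.hashes):
            h = hashlib.sha256(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(h[:8], "big") % self.bits

    def add(self, key):
        for p in self._positions(key):
            self.array[p] = 1

    def __contains__(self, key):
        return all(self.array[p] for p in self._positions(key))

class LearnedBloom:
    def __init__(self, score_fn, threshold, keys):
        self.score_fn, self.threshold = score_fn, threshold
        self.backup = TinyBloom()
        # Keys the model would reject must go into the backup filter,
        # otherwise the structure could produce false negatives.
        for k in keys:
            if score_fn(k) < threshold:
                self.backup.add(k)

    def __contains__(self, key):
        return self.score_fn(key) >= self.threshold or key in self.backup

keys = {"alpha", "beta", "gamma"}
lbf = LearnedBloom(lambda k: 1.0 if k.startswith("a") else 0.0, 0.5, keys)
print(all(k in lbf for k in keys))  # True: no false negatives on inserted keys
```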
Solved tasks from the master's degree courses in the specialty "Algorithms and Systems for Big Data Processing".
Optimize multi-agent LLM workflows by sharing prompt prefixes and cutting cached input token costs
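A rough illustration of why a shared prompt prefix cuts cached input costs across agents (the prompts, helper names, and the ~4-characters-per-token heuristic below are all assumptions for the example, not this project's data):

```python
from os.path import commonprefix

SHARED_PREFIX = (
    "You are part of a research team. Tools: search, summarize, critique.\n"
    "Always cite sources and keep answers under 200 words.\n"
)

agent_prompts = [
    SHARED_PREFIX + "Role: searcher. Find relevant papers on prompt caching.",
    SHARED_PREFIX + "Role: summarizer. Condense the findings into bullets.",
    SHARED_PREFIX + "Role: critic. Point out gaps in the summary.",
]

def approx_tokens(text):
    return max(1, len(text) // 4)  # crude chars-per-token heuristic

shared = commonprefix(agent_prompts)
cached = approx_tokens(shared) * (len(agent_prompts) - 1)  # reused after the first call
total = sum(approx_tokens(p) for p in agent_prompts)
print(f"~{cached / total:.0%} of input tokens could be served from the prefix cache")
```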