Building a Virtual Memory System for KV Cache
LLM Inference & Memory Systems DS practice problem on Onlearn.
Difficulty: hard.
Topics: Understanding PagedAttention and KV Cache Memory Management, PagedAttention, KV Cache Block Table, Physical Block Pool, Logical-to-Physical Mapping, External Fragmentation Mitigation, Computer Architecture, Operating Systems, Deep Learning Systems, Data Structures, Memory Management, Virtual Memory, Cache Locality, Memory Fragmentation, Resource Allocation, Reference Counting.
Implement a 'BlockManager' class that manages the physical memory of a KV cache as a fixed pool of equally sized blocks. The system should support:
1. Allocating blocks for a specific sequence.
2. Freeing a sequence's blocks.
3. Mapping logical block indices to physical block indices.
4. Tracking reference counts for physical blocks that are shared between sequences.
An allocation request should return the list of physical block indices assigned to the requested number of blocks, and should raise an error if the physical block pool cannot satisfy the request.
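
One possible shape for a solution is sketched below. It is a minimal sketch, not a reference answer. The problem statement does not fix an interface, so the method names (allocate, fork, free, get_physical_block), the OutOfBlocksError exception, and the choice of a FIFO free list are all assumptions made for illustration. The fork method is included only to show how a block comes to be shared, so that the reference counts have something to track.

```python
from collections import deque
from typing import Deque, Dict, List


class OutOfBlocksError(RuntimeError):
    """Hypothetical error type: the physical block pool is exhausted."""


class BlockManager:
    def __init__(self, num_physical_blocks: int) -> None:
        # Every physical block starts out free, with a reference count of 0.
        self.free_blocks: Deque[int] = deque(range(num_physical_blocks))
        self.ref_counts: List[int] = [0] * num_physical_blocks
        # Block table: seq_id -> physical block indices, ordered by logical index.
        self.block_tables: Dict[int, List[int]] = {}

    def allocate(self, seq_id: int, num_blocks: int) -> List[int]:
        """Append num_blocks fresh physical blocks to the sequence's table."""
        if num_blocks > len(self.free_blocks):
            raise OutOfBlocksError(
                f"requested {num_blocks} blocks, {len(self.free_blocks)} free"
            )
        new_blocks = [self.free_blocks.popleft() for _ in range(num_blocks)]
        for block in new_blocks:
            self.ref_counts[block] = 1
        self.block_tables.setdefault(seq_id, []).extend(new_blocks)
        return new_blocks

    def fork(self, parent_id: int, child_id: int) -> None:
        """Let child_id share all of parent_id's blocks (assumed sharing API)."""
        if child_id in self.block_tables:
            raise ValueError(f"sequence {child_id} already has a block table")
        shared = list(self.block_tables[parent_id])
        for block in shared:
            self.ref_counts[block] += 1
        self.block_tables[child_id] = shared

    def free(self, seq_id: int) -> None:
        """Drop the sequence's references; recycle blocks that reach zero."""
        for block in self.block_tables.pop(seq_id, []):
            self.ref_counts[block] -= 1
            if self.ref_counts[block] == 0:
                self.free_blocks.append(block)

    def get_physical_block(self, seq_id: int, logical_idx: int) -> int:
        """Translate a logical block index into a physical block index."""
        return self.block_tables[seq_id][logical_idx]
```

In this sketch any free block can serve any request, so a sequence's blocks need not be contiguous in physical memory; that is how a block table avoids external fragmentation. With a pool of 4 blocks, allocate(0, 2) returns [0, 1]. After fork(0, 1), freeing sequence 0 does not return those blocks to the pool, because sequence 1 still holds a reference to each of them.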