Cache
A cache is a system in which fast memory is used to accelerate access to a larger, but slower, main memory. The cache memory contains a copy of a portion of the content of the main memory.
Overview
Cache memory is fast memory that contains a copy of slower memory in the system. The purpose of the cache is to minimize delays experienced by the processor, thus making a system faster.
Caches are typically used in systems with clock speeds of 200 MHz and higher. In large systems, such as PCs and servers, there are typically multiple levels of caching, and these are labelled L1 (first-level cache), L2, and L3.
Cache sizes vary. Typically, L1 cache sizes are around 16 kB. In systems with multi-level caches, the higher levels (L2 and L3) are larger than the L1 cache. In many systems, two L1 caches exist in parallel, one for data (D-cache) and one for instructions (I-cache), which are typically backed by a unified L2 cache.
Cache types
Read-only cache
A read-only cache is typically used as an instruction cache. It caches read accesses but cannot handle write accesses, which go straight through to main memory. If main memory is updated, the cache needs to be invalidated, either for just the affected lines or for the entire cache. Invalidating a cache is an inexpensive operation, as only the valid flags of the affected lines (or of all lines) must be cleared.
Read-write (data) cache
A data cache caches not only read operations, but also write operations. When reading data, it works just like a read-only cache: if the requested information is already in the cache, it is returned to the requester (the CPU). If it is not, the underlying memory is read and the data is transferred into the cache as well as delivered to the requester.
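The read path described above can be sketched as a small simulation. This is an illustrative model, not a real cache controller: the class name, the direct-mapped organization, and the tiny geometry (four lines of 32 bytes) are assumptions chosen for clarity.

```python
LINE_SIZE = 32   # bytes per cache line
NUM_LINES = 4    # deliberately tiny for illustration

class SimpleCache:
    """Direct-mapped read cache over a bytearray acting as main memory."""

    def __init__(self, memory):
        self.memory = memory
        self.valid = [False] * NUM_LINES
        self.tags = [0] * NUM_LINES
        self.lines = [bytes(LINE_SIZE)] * NUM_LINES
        self.hits = 0
        self.misses = 0

    def read(self, addr):
        index = (addr // LINE_SIZE) % NUM_LINES      # which line slot
        tag = addr // (LINE_SIZE * NUM_LINES)        # identifies the memory block
        if self.valid[index] and self.tags[index] == tag:
            self.hits += 1                           # hit: serve from the cache
        else:
            self.misses += 1                         # miss: fill the entire line
            base = (addr // LINE_SIZE) * LINE_SIZE
            self.lines[index] = bytes(self.memory[base:base + LINE_SIZE])
            self.valid[index] = True
            self.tags[index] = tag
        return self.lines[index][addr % LINE_SIZE]
```

Reading address 0 misses and fills the whole line; a subsequent read of address 1 then hits, because it falls into the line that was just filled.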
Write through
A write-through cache writes data immediately to main memory as well as to the cache. This approach is simple and ensures consistency, but it is significantly slower for write-intensive workloads.
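A minimal sketch of the write-through policy, under the same illustrative assumptions as above (the class name and byte-granular bookkeeping are hypothetical, chosen for brevity):

```python
class WriteThroughCache:
    """Every write updates main memory immediately; a cached copy, if
    present, is updated as well, so cache and memory never diverge."""

    def __init__(self, memory):
        self.memory = memory      # backing store, e.g. a bytearray
        self.cached = {}          # address -> cached byte

    def read(self, addr):
        if addr not in self.cached:
            self.cached[addr] = self.memory[addr]   # fill on miss
        return self.cached[addr]

    def write(self, addr, value):
        self.memory[addr] = value                   # write goes through...
        if addr in self.cached:
            self.cached[addr] = value               # ...and updates the copy
```

Because every write pays the main-memory latency, this policy is simple and always consistent, but slow for write-intensive workloads.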
Write back
A write-back cache stores modified data only in the cache initially. A “dirty flag” marks lines that need to be written back to main memory. Some caches use multiple dirty flags to minimize unnecessary memory writes. Write-back caches are faster for both reads and writes, but require careful handling in cases where main memory must be consistent with the cache (e.g., before DMA transfers, since DMA operates on main memory).
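The write-back policy with a single dirty flag per line can be sketched as follows (again an illustrative model, not a real controller API; the clean-before-DMA scenario motivates the `clean` method):

```python
LINE = 32  # bytes per cache line

class WriteBackCache:
    """Writes stay in the cache until the line is cleaned; main memory
    may temporarily hold stale data."""

    def __init__(self, memory):
        self.memory = memory
        self.lines = {}           # line base address -> bytearray copy
        self.dirty = set()        # line bases that differ from main memory

    def write(self, addr, value):
        base = addr - addr % LINE
        if base not in self.lines:                  # allocate on write miss
            self.lines[base] = bytearray(self.memory[base:base + LINE])
        self.lines[base][addr % LINE] = value
        self.dirty.add(base)                        # main memory is now stale

    def clean(self):
        # Write all dirty lines back, e.g. before starting a DMA transfer.
        for base in self.dirty:
            self.memory[base:base + LINE] = self.lines[base]
        self.dirty.clear()
```

Note that after `write`, main memory still holds the old value; only `clean` makes it consistent again, which is exactly the hazard before a DMA transfer.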
Cache maintenance operations
A cache typically provides at least two types of maintenance operations:
- Invalidate - The affected cache lines are marked as not valid and will not be used on subsequent read or write operations. Before invalidating a cache line, it should be cleaned to avoid data loss.
- Clean - The affected (dirty) cache line(s) are copied into main memory. Note that they remain valid.
Normally, different variants of the invalidate and clean operations are available: by address, by location in the cache (by set or way), or on the entire cache (the latter usually only for a read-only I-cache).
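The two maintenance operations above can be sketched on a per-line model. All names here are illustrative, not a real cache controller interface; the point is that clean copies data back while keeping the line valid, whereas invalidate only clears valid flags:

```python
LINE_SIZE = 32

class Line:
    def __init__(self):
        self.valid = False
        self.dirty = False
        self.data = bytearray(LINE_SIZE)

class MaintCache:
    def __init__(self, memory, num_lines=4):
        self.memory = memory
        self.lines = [Line() for _ in range(num_lines)]

    def _index(self, addr):
        return (addr // LINE_SIZE) % len(self.lines)

    def clean_by_address(self, addr):
        line = self.lines[self._index(addr)]
        if line.valid and line.dirty:               # copy back; line stays valid
            base = addr - addr % LINE_SIZE
            self.memory[base:base + LINE_SIZE] = line.data
            line.dirty = False

    def invalidate_by_address(self, addr):
        # Clean first, so a dirty line's data is not lost.
        self.clean_by_address(addr)
        self.lines[self._index(addr)].valid = False

    def invalidate_all(self):
        # Cheap whole-cache invalidate: only the valid flags are cleared.
        for line in self.lines:
            line.valid = False
```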
Cache organization
A cache consists of a number of cache lines, organized in sets and ways.
Cache lines
A cache line is typically 32 or 64 bytes. Every time a miss occurs, the entire cache line is filled from main memory.
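How an address maps onto this organization can be shown with a worked example. The parameters below (a 4-way set-associative cache with 32-byte lines and 16 sets, i.e. 2 kB total) are illustrative, not taken from a specific device:

```python
LINE_SIZE = 32   # bytes per line -> low 5 address bits select the byte
NUM_SETS = 16    # next 4 bits select the set
NUM_WAYS = 4     # each set holds 4 lines; the tag picks among blocks

def decompose(addr):
    """Split an address into (tag, set index, byte offset)."""
    offset = addr % LINE_SIZE                     # byte within the line
    set_index = (addr // LINE_SIZE) % NUM_SETS    # which set the line maps to
    tag = addr // (LINE_SIZE * NUM_SETS)          # distinguishes blocks in a set
    return tag, set_index, offset

# Total cache size = lines per set * sets * line size
cache_size = NUM_WAYS * NUM_SETS * LINE_SIZE      # 2048 bytes
```

For example, address 0x1234 decomposes into tag 9, set 1, offset 20: all addresses sharing set index 1 compete for the 4 ways of that set.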
FAQ
Q: Which memory areas are cached? A: It depends. Usually, this is configurable. In systems with an MMU or MPU, the definitions of the memory regions also contain information on how they are cached.