A cache, in a computer or a computer program, is a mechanism for speeding up access to data. In light of the well-known principle that there is no such thing as a free lunch, it is instructive to see how this trick is accomplished, and to appreciate the trade-offs that are made when you or your computer use a cache.
A computer remembers data by storing it in memory. If it took the same time to access any part of memory, there would be no need for caches. Unfortunately, that is not the case. Even today, when we have gigabytes of main memory in a garden-variety desktop computer, there are several different kinds of memory, each of which has a different access time. The table below lists common kinds and amounts of memory and the typical time it takes to get a single item of data from each kind.
Memory type           Typical access time   Typical amount
CPU cache             10 ns                 512 KB
Main memory (DRAM)    150 ns                1024 MB
Hard disk             140 μs                150 GB
USB memory            800-5000 μs           1000 MB
Plainly, there is a great deal of variation in how quickly we can fetch data into the CPU so we can use it. To speed things up, it would be nice if every data reference were as fast as the fastest memory type. Happily, we can see a way to create that behavior by observing that very little of the available data is actually needed at any one time. Therefore, we can move the small amount of needed data into a small, fast memory, the cache, before it is used, and everything will be speedy.
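The idea can be sketched in a few lines of Python. This is only an illustration: the backing store, the millisecond delay, and all the names are invented stand-ins for a real slow memory.

```python
import time

# Hypothetical slow backing store; the contents and the delay
# are invented purely for illustration.
SLOW_STORE = {addr: addr * 2 for addr in range(100)}

cache = {}

def slow_read(addr):
    """Simulate an expensive access to the slower memory."""
    time.sleep(0.001)          # pretend this fetch is costly
    return SLOW_STORE[addr]

def cached_read(addr):
    """Return the data at addr, filling the cache on a miss."""
    if addr not in cache:      # miss: go to the slow store once
        cache[addr] = slow_read(addr)
    return cache[addr]         # hit: fast dictionary lookup
```

After the first `cached_read(5)`, the data sits in the fast dictionary, and every later read of address 5 skips the slow store entirely.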
But we have traded one problem for another. How are we to know which data are needed? Fortunately, a great many studies have shown that when we fetch one data item from its original spot, we will very likely want nearby items shortly thereafter. This property of computer programs is called locality of reference, and it is one of the things that makes caches work at all.