The first and most obvious spot in a computer system where we find a cache is between the CPU and the main memory. This cache exists because it is possible to build a much faster memory on the same silicon chip as the CPU than one that is external to the CPU package. As manufacturing processes in the semiconductor industry have improved, the number of transistors that can be packed onto a chip has increased steadily. This has made it possible to include a cache memory on the same chip as the CPU at an acceptable cost, both in dollars and in heat dissipation. The access-time ratio for this cache system is usually around 10:1; to achieve good performance, the miss ratio should be kept under about 3%.
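To see why those two numbers matter together, a simple average-access-time model helps. The sketch below uses the 10:1 ratio and 3% miss ratio from the text; the absolute 1 ns hit time is an assumed figure chosen only to make the arithmetic concrete.

```python
# Simple average-access-time model for an on-chip cache.
# The 10:1 access-time ratio and 3% miss ratio come from the text;
# the 1 ns hit time is an assumption for illustration only.
cache_time = 1.0       # ns, assumed on-chip cache hit time
memory_time = 10.0     # ns, main memory access (10:1 ratio)
miss_ratio = 0.03      # 3% of accesses miss the cache

average = (1 - miss_ratio) * cache_time + miss_ratio * memory_time
print(f"Average access time: {average:.2f} ns")   # 1.27 ns, close to cache speed
```

With a 3% miss ratio the average access time stays within about 30% of the raw cache speed; at a 10% miss ratio it would already be nearly double.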
Typically, designers include two caches on-chip: one to contain only instructions, and the other to contain data. This arrangement has two advantages. Locality is improved, since the instruction stream generally comes from a different memory segment than the data. Cache control is also simplified: because instructions are not normally altered by program execution on modern processors, there is never a dirty line in the instruction cache, and thus no write-back to incur extra processing time. There is also a small additional performance benefit because the instruction stream is needed at a different spot on the CPU chip than the data; with separate caches, the cache output data do not have far to travel on the silicon chip.
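That simplification is easy to see in a toy model. The sketch below shows a minimal split-cache arrangement; the direct-mapped organization, line count, block size, and write-back policy are all assumptions chosen for brevity, not a description of any particular processor.

```python
# Minimal sketch of a split cache: two independent direct-mapped caches,
# one for instructions and one for data. Sizes and policies are assumed
# for illustration only.

class DirectMappedCache:
    def __init__(self, num_lines, block_size):
        self.num_lines = num_lines
        self.block_size = block_size
        self.tags = [None] * num_lines     # tag stored per cache line
        self.dirty = [False] * num_lines   # only ever set for the data cache

    def access(self, address, write=False):
        block = address // self.block_size
        index = block % self.num_lines
        tag = block // self.num_lines
        hit = self.tags[index] == tag
        writeback = False
        if not hit:
            # On a miss, an occupied dirty line must be written back first.
            # Instruction fetches never set the dirty bit, so the I-cache
            # never pays this extra cost.
            writeback = self.tags[index] is not None and self.dirty[index]
            self.tags[index] = tag
            self.dirty[index] = False
        if write:
            self.dirty[index] = True
        return hit, writeback

icache = DirectMappedCache(num_lines=256, block_size=32)  # instructions only
dcache = DirectMappedCache(num_lines=256, block_size=32)  # data only

hit, wb = icache.access(0x1000)              # instruction fetch: read only
hit, wb = dcache.access(0x8000, write=True)  # data store: may dirty a line
```

Because the instruction cache is only ever read, its dirty bits and the write-back path they imply could be dropped entirely, which is exactly the control simplification described above.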
Another kind of cache is the virtual memory system provided by most modern operating systems; this scheme makes data stored on disk available as if it were in main memory. The design tradeoffs are quite different in this case: the access-time ratio is near 1000:1, so to achieve good performance of the memory/disk system, the miss ratio must be kept quite small (under 0.5%). Doing the bookkeeping necessary to achieve that small a miss ratio is not at all trivial; whole books have been written on this topic.
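One taste of that bookkeeping is page replacement: on each miss (page fault), the operating system must decide which resident page to evict. The sketch below uses a deliberately simplified least-recently-used (LRU) policy; real kernels rely on cheaper approximations such as reference bits and clock algorithms, and the frame count and reference trace here are assumptions for illustration only.

```python
# Simplified LRU page-replacement bookkeeping for a virtual memory system.
# The frame count and the page reference trace are illustrative assumptions.
from collections import OrderedDict

def simulate_lru(page_trace, num_frames):
    """Return the miss (page fault) ratio for an LRU-managed set of frames."""
    frames = OrderedDict()   # resident pages, ordered from least to most recent
    faults = 0
    for page in page_trace:
        if page in frames:
            frames.move_to_end(page)          # mark as most recently used
        else:
            faults += 1
            if len(frames) >= num_frames:
                frames.popitem(last=False)    # evict the least recently used page
            frames[page] = None
    return faults / len(page_trace)

# A trace whose working set of 4 pages fits in 4 frames: only compulsory faults.
trace = [0, 1, 2, 3, 0, 2, 1, 3] * 150
print(f"Miss ratio: {simulate_lru(trace, num_frames=4):.3%}")   # 0.333%
```

Because the trace's working set fits in the available frames, the only faults are the compulsory ones, which is how a miss ratio well under 0.5% becomes achievable when programs exhibit good locality.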