Is there a good resource for understanding the hardware implementation of the various cache memory mapping techniques (direct-mapped, fully associative, and k-way set associative)? Specifically, how many comparators, multiplexers, etc. are required, and how do they work?

Please read the cache memory part of Hennessy and Patterson.
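
As a rough companion to that reading, here is a minimal Python sketch of the usual textbook bookkeeping for the three mappings. It assumes power-of-two sizes and a 32-bit byte address; the function name `cache_hardware` and its parameters are made up for illustration. The counts are logical ones (one tag comparator per way, plus a ways-to-1 multiplexer to pick the hitting way's block); real designs, especially fully associative ones, may implement the same function differently, for example with a CAM.

```python
def cache_hardware(cache_bytes, block_bytes, ways, addr_bits=32):
    """Field widths and logical comparator/mux counts for one cache.

    ways == 1                -> direct-mapped
    ways == number of blocks -> fully associative
    anything in between      -> k-way set associative
    All sizes are assumed to be powers of two (hypothetical helper).
    """
    blocks = cache_bytes // block_bytes
    assert blocks % ways == 0, "ways must divide the number of blocks"
    sets = blocks // ways

    # log2 of a power of two, computed exactly with integer arithmetic
    offset_bits = block_bytes.bit_length() - 1
    index_bits = sets.bit_length() - 1
    tag_bits = addr_bits - index_bits - offset_bits

    return {
        "offset_bits": offset_bits,
        "index_bits": index_bits,      # 0 for fully associative (one set)
        "tag_bits": tag_bits,          # width of each tag comparator
        "comparators": ways,           # one tag comparator per way
        "way_mux_inputs": ways,        # ways-to-1 mux; 1 means no mux needed
    }


if __name__ == "__main__":
    size, block = 16 * 1024, 64        # 16 KiB cache, 64-byte blocks
    for name, ways in [("direct-mapped", 1),
                       ("4-way set associative", 4),
                       ("fully associative", size // block)]:
        print(name, cache_hardware(size, block, ways))
```

The pattern to notice: the index selects a set, every way in that set is compared against the address tag in parallel, and the match signals drive the multiplexer that selects the data. Going from direct-mapped to fully associative, the index shrinks to zero bits while the tag and the number of comparators grow.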
