Please use this identifier to cite or link to this item:
http://arks.princeton.edu/ark:/88435/dsp0176537138f
Full metadata record
DC Field | Value | Language |
---|---|---|
dc.contributor.advisor | Martonosi, Margaret | en_US |
dc.contributor.author | Wu, Carole-Jean | en_US |
dc.contributor.other | Electrical Engineering Department | en_US |
dc.date.accessioned | 2012-08-01T19:34:27Z | - |
dc.date.available | 2012-08-01T19:34:27Z | - |
dc.date.issued | 2012 | en_US |
dc.identifier.uri | http://arks.princeton.edu/ark:/88435/dsp0176537138f | - |
dc.description.abstract | Given the emerging dominance of chip-multiprocessor (CMP) systems, an important research problem concerns application memory performance in the face of deep memory hierarchies, where one or more caches are shared by multiple cores. Often, when several applications compete for capacity in shared caches, the performance of multiprogrammed and parallel workloads degrades significantly and becomes unpredictable. This happens because the commonly used Least-Recently-Used replacement policy does not distinguish between processes and their distinct memory needs. Therefore, processes often suffer from such inter-application cache interference, and overall system throughput can be reduced by as much as 55%. In addition to managing multiple applications sharing the last-level cache (LLC), managing a single application's memory performance is far from straightforward even in an idealized setup that considers only user accesses. It becomes even more challenging in real-machine environments, where interference can stem from operating system (OS) activities, and even from an application's own prefetch requests and page table walks caused by Translation Lookaside Buffer (TLB) misses. Using hardware performance counters on existing CMPs, this thesis characterizes such intra-application cache interference in the LLC and shows that application data references represent much less than half of the LLC misses, with hardware prefetching and page table walks causing considerable intra-application cache interference. The primary focus of my thesis is to address the challenges of both inter- and intra-application cache interference through hardware and software design mechanisms.
My thesis focuses on each of these issues across a range of computing application domains and tackles an overarching research problem: addressing intra- as well as inter-application cache interference, stemming from user applications, the OS, and hardware prefetching, via dynamic management to achieve better and more predictable performance. The intelligent LLC management proposed in this thesis can speed up execution for a diverse range of applications by taking into account the memory requirements of co-scheduled applications, OS reference characteristics, and hardware prefetching. To mitigate contention when multiple applications access the shared LLC simultaneously, my thesis proposes OS priority-aware and signature-based cache capacity management techniques. In particular, by correlating memory reuse characteristics with each memory request's unique signature, such as an instruction's program counter or a sequence of instruction types, my thesis demonstrates that the proposed capacity management techniques allow the shared LLC to be utilized more effectively than other state-of-the-art techniques. While inter-application interference has received significant research attention, intra-application interference is less well studied. Based on a detailed characterization of intra-application cache interference in the face of a modern OS and current, aggressive hardware prefetchers on an existing system, my thesis proposes and evaluates dynamic management techniques to address inter- as well as intra-application cache interference. Furthermore, my thesis calls for cache research proposals to account carefully for real-system effects when proposing and evaluating new cache management designs.
Overall, this thesis offers a mix of real-system characterizations and detailed evaluations of hardware and system proposals that can help guide future OS and architecture work regarding the importance and challenges of both inter- and intra-application cache interference. Via signature-based, prefetch- and OS-aware approaches, performance is improved by as much as 65%, with an average execution-time speedup of 22%. These techniques establish the importance of making user application performance more predictable in the face of system complexities and deep memory hierarchies. | en_US |
dc.language.iso | en | en_US |
dc.publisher | Princeton, NJ : Princeton University | en_US |
dc.relation.isformatof | The Mudd Manuscript Library retains one bound copy of each dissertation. Search for these copies in the library's main catalog (http://catalog.princeton.edu). | en_US |
dc.subject | Cache interference | en_US |
dc.subject | Chip-multiprocessor caches | en_US |
dc.subject | Intra-application cache interference | en_US |
dc.subject | Operating system cache effects | en_US |
dc.subject | Prefetcher management | en_US |
dc.subject | Re-reference interval prediction | en_US |
dc.subject.classification | Computer engineering | en_US |
dc.title | Dynamic Techniques for Mitigating Inter- and Intra-Application Cache Interference | en_US |
dc.type | Academic dissertations (Ph.D.) | en_US |
pu.projectgrantnumber | 690-2143 | en_US |
Appears in Collections: | Electrical Engineering |
Files in This Item:
File | Description | Size | Format | |
---|---|---|---|---|
Wu_princeton_0181D_10156.pdf | | 25 MB | Adobe PDF | View/Download |
Items in Dataspace are protected by copyright, with all rights reserved, unless otherwise indicated.