SC is the International Conference for High Performance Computing, Networking, Storage and Analysis

SCHEDULE: NOV 13-19, 2010

The Sharing Tracker: Using Ideas from Cache Coherence Hardware to Reduce Off-Chip Memory Traffic with Non-Coherent Caches

SESSION: Optimization Strategies on the Node


TIME: 3:30PM - 4:00PM

SESSION CHAIR: Karl Fuerlinger

AUTHOR(S): David Tarjan, Kevin Skadron


Graphics Processing Units (GPUs) have recently emerged as a new platform for high-performance, general-purpose computing, due to their combination of high peak performance and high memory bandwidth. Because current GPUs employ deep multithreading to hide latency, they have only small per-core caches to capture reuse and eliminate unnecessary off-chip accesses. We show that for general-purpose workloads, the ability to copy cache lines between private caches captures inter-core temporal locality and provides substantial reductions in off-chip bandwidth requirements. We introduce the sharing tracker, which tracks the cache lines held in the private caches on a chip. Because it serves only as a performance hint, its tracking can be imprecise. The sharing tracker is so effective at capturing inter-core reuse that the L2 cache can be eliminated entirely. It is motivated by, but not specific to, the GPU and hence could be used in other manycore organizations.

Chair/Author Details:

Karl Fuerlinger (Chair) - University of California, Berkeley

David Tarjan - University of Virginia

Kevin Skadron - University of Virginia

