Many-core and PSPI: Mixing fine-grain and coarse-grain parallelism
Niagara2 is the second-generation Chip Multi-Threading (CMT) CPU design from Sun Microsystems, Inc. It has eight computation cores with 4 megabytes of shared L2 cache and four dual-channel FBDIMM memory controllers. It also contains integrated networking units, a PCI-Express unit, an embedded wire-speed cryptography coprocessor, and built-in virtualization support. Each core has two integer execution pipes and one floating-point execution pipe shared by eight fine-grained hardware threads. In all, Niagara2 provides 64 hardware threads and combines all major server and network functions on a single chip. It is well suited for power-efficient, secure data-center applications and for thread-level parallel computing.
The idea behind the chip design is that most applications are memory bound: most of their time is spent waiting to retrieve data from either cache or main memory. By attaching several (in this case eight) simultaneous tasks to each processing unit, the memory latency can be hidden. Figure 1 illustrates this concept. The 'M' shows a thread waiting for a memory request, while the 'C' shows computation. At each clock cycle computation is being performed, and the time associated with memory requests is hidden.
Figure 1. The idea behind the Niagara architecture. The 'M' shows a thread waiting for a memory request, while the 'C' shows computation. At each clock cycle computation is being performed, and the time associated with memory requests is hidden. [NR]
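The latency-hiding idea can be illustrated with a toy model. The sketch below is not a description of the actual Niagara2 scheduler: it assumes a single shared execution pipe, a fixed memory latency, and round-robin thread selection, and the function name `simulate_core` and the cycle counts are chosen purely for illustration.

```python
def simulate_core(num_threads, compute_cycles, memory_cycles, total_cycles):
    """Return the fraction of cycles in which the shared pipe does computation.

    Toy model (assumption, not the real hardware): each thread repeats
    `compute_cycles` cycles of computation ('C'), then waits `memory_cycles`
    cycles on a memory request ('M'). Only one thread may compute per cycle.
    """
    compute_left = [compute_cycles] * num_threads  # 'C' cycles before next request
    ready_at = [0] * num_threads                   # cycle when each thread may run again
    busy = 0
    next_thread = 0                                # round-robin starting point

    for cycle in range(total_cycles):
        for offset in range(num_threads):
            t = (next_thread + offset) % num_threads
            if ready_at[t] <= cycle:
                busy += 1                          # this cycle is a 'C' cycle
                compute_left[t] -= 1
                if compute_left[t] == 0:
                    # The thread issues a memory request and stalls ('M').
                    compute_left[t] = compute_cycles
                    ready_at[t] = cycle + 1 + memory_cycles
                next_thread = (t + 1) % num_threads
                break                              # at most one thread computes per cycle
    return busy / total_cycles


if __name__ == "__main__":
    # Illustrative numbers only: 1 cycle of compute per 7 cycles of memory wait.
    for n in (1, 2, 4, 8):
        print(n, "threads: utilization", simulate_core(n, 1, 7, 800))
```

With these illustrative numbers, a single thread keeps the pipe busy one cycle in eight, while eight threads keep it busy on every cycle, which is the situation drawn in Figure 1.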
The Niagara2 platform performs well on an application when two requirements are met. First, the problem must be truly memory bound, which depends on memory access speed, the memory hierarchy, and the compute engines of a given core. Second, the parallelism granularity of the application must not require a significant level of synchronization.