Nest accelerator – Architecture and technical overview
By Isabella Ward / December 13, 2023 / Hardware management console overview, IBM Certification Exam, Simultaneous multithreading
2.1.10 Nest accelerator
The Power10 processor has an on-chip accelerator that is called the nest accelerator (NX) unit. The coprocessor features that are available on the Power10 processor are similar to the features on the POWER9 processor. These coprocessors provide specialized functions, such as the following examples:
• IBM proprietary data compression and decompression
• Industry-standard gzip compression and decompression
• AES and Secure Hash Algorithm (SHA) cryptography
• Random number generation
Figure 2-6 shows a block diagram of the NX unit.
Figure 2-6 Block diagram of the NX unit
Each of the AES/SHA engines, the data compression unit, and the gzip unit constitutes a coprocessor type, so the NX unit features three coprocessor types. The NX unit also includes supporting hardware for coprocessor invocation by user code, use of effective addresses, high-bandwidth storage accesses, and interrupt notification of job completion.
The direct memory access (DMA) controller of the NX unit helps to start the coprocessors and moves data on their behalf. The SMP interconnect unit (SIU) provides the interface between the Power10 SMP interconnect and the DMA controller.
The NX coprocessors can be started transparently through library or operating system kernel calls to speed up operations that are related to data compression, Live Partition Mobility (LPM) migration, IPsec, JFS2 encrypted file systems, PKCS #11 encryption, random number generation, and the most recently announced function, logical volume encryption.
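As a minimal sketch of what "transparently through library calls" means for application code, the following Python program uses only the standard zlib interface to write and read a gzip stream. Nothing in the code refers to the NX unit. Whether the work is offloaded is assumed to depend on the platform supplying an NX-enabled zlib build, which this sketch neither detects nor requires. The helper names gzip_compress and gzip_decompress are illustrative only.

```python
import zlib

# wbits = 16 + MAX_WBITS selects the gzip container (gzip header and trailer)
# instead of the default zlib wrapper.
GZIP_WBITS = 16 + zlib.MAX_WBITS


def gzip_compress(data: bytes, level: int = 6) -> bytes:
    """Compress data into a gzip stream through the ordinary zlib API."""
    comp = zlib.compressobj(level, zlib.DEFLATED, GZIP_WBITS)
    return comp.compress(data) + comp.flush()


def gzip_decompress(blob: bytes) -> bytes:
    """Decompress a gzip stream through the ordinary zlib API."""
    return zlib.decompress(blob, GZIP_WBITS)


if __name__ == "__main__":
    payload = b"Power10 NX unit " * 4096
    packed = gzip_compress(payload)
    # The round trip must return the original bytes regardless of which
    # zlib implementation (software or accelerated) did the work.
    assert gzip_decompress(packed) == payload
    print(len(payload), "->", len(packed), "bytes")
```

The point of the sketch is that the application-facing interface stays the same; any acceleration would happen below the library boundary.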
In effect, the on-chip NX unit on Power10 systems implements a high-throughput engine that can perform the equivalent work of multiple cores. System performance can benefit from offloading these expensive operations to the on-chip accelerators, which reduces CPU usage and can improve application performance.
The accelerators are shared among the logical partitions (LPARs) under the control of the PowerVM hypervisor and are accessed through hypervisor calls. The operating system, together with the PowerVM hypervisor, provides a send address space that is unique to each process that requests coprocessor access. This configuration allows the user process to post entries directly to the first-in, first-out (FIFO) queues that are associated with the NX accelerators. Each NX coprocessor type has a unique receive address space that corresponds to a unique FIFO for each of the accelerators.
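The queueing arrangement can be pictured with a toy model. This is only a conceptual sketch in Python, not the NX hardware or hypervisor interface: the class names NxModel and SendWindow, the request tuple layout, and the coprocessor-type labels are all assumptions made for illustration. It shows one receive FIFO per coprocessor type and a per-process send handle that posts entries to that FIFO.

```python
from collections import deque
from dataclasses import dataclass, field

# Illustrative labels for the three coprocessor types named in the text.
COPROCESSOR_TYPES = ("aes_sha", "compression", "gzip")


@dataclass
class NxModel:
    """Toy stand-in for the NX unit: one receive FIFO per coprocessor type."""
    fifos: dict = field(
        default_factory=lambda: {t: deque() for t in COPROCESSOR_TYPES}
    )

    def open_window(self, pid: int, ctype: str) -> "SendWindow":
        """Give a process its own send handle for one coprocessor type."""
        if ctype not in self.fifos:
            raise ValueError(f"unknown coprocessor type: {ctype}")
        return SendWindow(self, pid, ctype)

    def drain(self, ctype: str) -> list:
        """Remove and return queued requests in arrival (FIFO) order."""
        fifo = self.fifos[ctype]
        out = []
        while fifo:
            out.append(fifo.popleft())
        return out


@dataclass
class SendWindow:
    """Hypothetical per-process send handle bound to one receive FIFO."""
    nx: NxModel
    pid: int
    ctype: str

    def post(self, request: str) -> None:
        # Entries are tagged with the posting process and appended in order.
        self.nx.fifos[self.ctype].append((self.pid, request))


if __name__ == "__main__":
    nx = NxModel()
    a = nx.open_window(pid=101, ctype="gzip")
    b = nx.open_window(pid=202, ctype="gzip")
    a.post("job-1")
    b.post("job-2")
    a.post("job-3")
    print(nx.drain("gzip"))
    # [(101, 'job-1'), (202, 'job-2'), (101, 'job-3')]
```

In the model, two processes hold separate send handles but share the single gzip FIFO, so requests are served in the order in which they were posted.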
For more information about the use of the xgzip tool that uses the gzip accelerator engine, see the following resources:
• IBM support article Using the POWER9 NX (gzip) accelerator in AIX
• IBM Power Systems community article Power9 GZIP Data Acceleration with IBM AIX
• AIX community article Performance improvement in openssh with on-chip data compression accelerator in power9
• IBM Documentation: nxstat Command