Power10 processor overview – Architecture and technical overview
By Isabella Ward / July 13, 2023 / IBM Certification Exam, Power and performance management, Simultaneous multithreading
2.1.1 Power10 processor overview
The Power10 processor is a superscalar symmetric multiprocessor that is manufactured in complementary metal-oxide-semiconductor (CMOS) 7 nm lithography with 18 layers of metal. The processor contains up to 15 cores, each of which supports eight-way simultaneous multithreading (SMT8), that is, eight independent execution contexts.
Each core has private access to 2 MB of L2 cache and local access to 8 MB of L3 cache capacity. The local L3 cache region of a specific core is also accessible from all other cores on the processor chip. The cores of one Power10 processor share up to 120 MB of latency-optimized non-uniform cache access (NUCA) L3 cache.
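The per-chip totals follow directly from the per-core figures above. The following is a minimal sketch of that arithmetic for the maximum 15-core configuration. The constant and function names are illustrative and do not come from IBM documentation.

```python
# Per-core figures quoted in the text; all names are illustrative only.
MAX_CORES = 15          # maximum number of cores per Power10 chip
THREADS_PER_CORE = 8    # SMT8: eight execution contexts per core
L2_MB_PER_CORE = 2      # private L2 cache per core, in MB
L3_MB_PER_CORE = 8      # local L3 cache region per core, in MB


def chip_totals(cores: int = MAX_CORES) -> dict:
    """Return hardware-thread and cache totals for a chip with `cores` cores.

    Assumption: totals scale linearly with the number of cores.
    """
    return {
        "hardware_threads": cores * THREADS_PER_CORE,
        "l2_mb": cores * L2_MB_PER_CORE,
        "l3_mb": cores * L3_MB_PER_CORE,
    }


if __name__ == "__main__":
    print(chip_totals())
```

For 15 cores, the L3 total is 15 x 8 MB = 120 MB, which matches the shared NUCA L3 capacity quoted above.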
The processor supports the following three distinct functional interfaces, all of which can run at a signaling rate of up to 32 GTps [2]:
• Open memory interface
The Power10 processor has eight memory controller unit (MCU) channels, each of which supports one open memory interface (OMI) port with two OMI links [3]. One OMI link aggregates 8 lanes running at 32 GTps and connects to one memory-buffer-based differential DIMM (DDIMM) slot to access main memory. Physically, the OMI interface is implemented in two separate die areas of 8 OMI links each. The maximum theoretical full-duplex bandwidth aggregated over all 128 OMI lanes is 1 TBps.
• SMP fabric interconnect (PowerAXON)
A total of 144 lanes are available in the Power10 processor to facilitate connectivity to other processors in a symmetric multiprocessing (SMP) architecture configuration. Each SMP connection requires 18 lanes: eight data lanes plus one spare lane per direction (2 x (8+1)). In this way, the processor can support a maximum of eight SMP connections with a total of 128 data lanes per processor. This configuration yields a maximum theoretical full-duplex bandwidth aggregated over all SMP connections of 1 TBps.
The generic nature of the interface implementation also allows the 128 data lanes to be used to connect accelerator or memory devices through the OpenCAPI protocols. The interface can also support memory cluster and memory interception architectures.
Because of the versatile characteristic of the technology, it is also referred to as the PowerAXON interface (Power A-bus/X-bus/OpenCAPI/Networking [4]). The OpenCAPI, memory clustering, and memory interception use cases can be pursued in the future and are currently not used by available technology products.
[2] Giga transfers per second (GTps)
[3] The OMI links are also referred to as OMI sub-channels.
[4] A-busses and X-busses provide SMP fabric ports used between CEC drawers or within CEC drawers, respectively.
Chapter 2. Architecture and technical overview 51
• PCIe Version 5.0 interface
To support external I/O connectivity and access to internal storage devices, the Power10 processor provides differential Peripheral Component Interconnect Express version 5.0 interface busses (PCIe Gen 5) with a total of 32 lanes. The lanes are grouped in two sets of 16 lanes that can be used in one of the following configurations:
– 1 x16 PCIe Gen 4
– 2 x8 PCIe Gen 4
– 1 x8, 2 x4 PCIe Gen 4
– 1 x8 PCIe Gen 5, 1 x8 PCIe Gen 4
– 1 x8 PCIe Gen 5, 2 x4 PCIe Gen 4
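The lane counts and the 1 TBps figures quoted for the three interfaces can be cross-checked with a few lines of arithmetic. The sketch below is illustrative only: the names are hypothetical, and the bandwidth formula assumes that every counted lane moves one bit per transfer in each direction with no encoding or protocol overhead. The source does not state this, but the assumption reproduces the quoted numbers.

```python
SIGNALING_GTPS = 32  # giga transfers per second per lane, as quoted in the text


def full_duplex_gbytes_per_s(lanes: int, gtps: float = SIGNALING_GTPS) -> float:
    """Theoretical full-duplex bandwidth in GB/s for `lanes` lanes.

    Assumption (not from the source): one bit per transfer per lane in each
    direction, with no encoding or protocol overhead.
    """
    per_direction_gbits = lanes * gtps   # Gbit/s in one direction
    return per_direction_gbits / 8 * 2   # convert to bytes, count both directions


# Open memory interface: 8 MCU channels x 2 OMI links x 8 lanes per link.
omi_links = 8 * 2
omi_lanes = omi_links * 8

# SMP fabric: 144 lanes, 18 per connection (2 x (8 data + 1 spare)).
smp_connections = 144 // 18
smp_data_lanes = smp_connections * 2 * 8

# PCIe: link widths of each listed configuration for one 16-lane set.
pcie_options = [(16,), (8, 8), (8, 4, 4), (8, 8), (8, 4, 4)]

if __name__ == "__main__":
    print("OMI links:", omi_links, "lanes:", omi_lanes)
    print("OMI bandwidth (GB/s):", full_duplex_gbytes_per_s(omi_lanes))
    print("SMP connections:", smp_connections, "data lanes:", smp_data_lanes)
    print("SMP bandwidth (GB/s):", full_duplex_gbytes_per_s(smp_data_lanes))
    print("PCIe options use 16 lanes:", all(sum(o) == 16 for o in pcie_options))
```

Under these assumptions, 128 lanes yield 1024 GB/s, which corresponds to the rounded 1 TBps stated for both the OMI and the SMP interface, and every listed PCIe configuration uses exactly the 16 lanes of one set.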
Figure 2-2 shows the Power10 processor die with several functional units labeled. Note that 16 SMT8 processor cores are shown, but only 10-, 12-, or 15-core processor options are available for Power E1080 server configurations.
Figure 2-2 The Power10 processor chip (Die photo courtesy of Samsung Foundry)
52 IBM Power E1080: Technical Overview and Introduction
Important Power10 processor characteristics are listed in Table 2-1.
Table 2-1 Summary of the Power10 processor chip and processor core technology
a. Complementary metal-oxide-semiconductor (CMOS)
b. Power instruction set architecture (Power ISA)
The Power10 processor is packaged as a single-chip module (SCM) for exclusive use in the Power E1080 servers. The SCM contains the Power10 processor plus additional logic that is needed to facilitate power supply and external connectivity to the chip. It also holds the connectors to plug SMP cables directly onto the socket to build 2-, 3-, and 4-node Power E1080 servers.
Figure 2-3 shows the logical diagram of the Power10 SCM.
Figure 2-3 Power10 single chip module
As indicated in Figure 2-3, the PowerAXON interface lanes are grouped in two sets of 72 lanes each. One set provides four interface ports (OP1, OP2, OP4, and OP6), which are accessible to SMP connectors that are physically placed on the top of the SCM module. The second set of ports (OP0, OP3, OP5, and OP7) is used to implement the fully connected SMP fabric between the four sockets within a system node. Eight open memory interface ports (OMI0 to OMI7) with two OMI links each provide access to the buffered main memory differential DIMMs (DDIMMs). The 32 PCIe Gen 5 lanes are grouped in two PCIe host bridges (E0, E1).
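The port grouping can be checked against the lane counts given for the SMP interface. The following sketch uses the port names from the text. The dictionary layout, key names, and function name are illustrative assumptions, as is the use of 18 lanes per port (2 x (8 data + 1 spare)) taken from the description of an SMP connection.

```python
LANES_PER_PORT = 18  # 2 x (8 data + 1 spare), per the SMP connection description

# Port names come from the text; the grouping keys are hypothetical labels.
POWERAXON_PORT_SETS = {
    # Ports routed to the SMP connectors on top of the SCM
    "external_connectors": ("OP1", "OP2", "OP4", "OP6"),
    # Ports used for the SMP fabric between the four sockets of a node
    "intra_node": ("OP0", "OP3", "OP5", "OP7"),
}


def lanes_per_set(port_sets: dict = POWERAXON_PORT_SETS) -> dict:
    """Return the number of PowerAXON lanes in each port set."""
    return {name: len(ports) * LANES_PER_PORT for name, ports in port_sets.items()}


if __name__ == "__main__":
    per_set = lanes_per_set()
    print(per_set)
    print("total lanes:", sum(per_set.values()))
```

Four ports of 18 lanes each give the 72 lanes per set, and the two sets together account for the 144 PowerAXON lanes of the processor.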
Figure 2-4 shows a physical diagram of the Power10 SCM. The eight SMP connectors (OP1A, OP1B, OP2A, OP2B, OP4A, OP4B, OP6A, and OP6B) externalize 4 SMP busses, which are used to connect system node drawers in 2-, 3-, and 4-node Power E1080 configurations. The OpenCAPI connectivity options are also indicated, although they are not used by any commercially available product.
Figure 2-4 Power10 single chip module physical diagram
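The relationship between the eight SMP connectors and the four externalized SMP busses can be illustrated by grouping the connector names by their port prefix. This is a sketch under an assumption: the pairing of an A and a B connector per bus is inferred from the naming scheme and is not stated explicitly in the text. The function name is hypothetical.

```python
from collections import defaultdict

# Connector names as listed in the text.
SMP_CONNECTORS = ["OP1A", "OP1B", "OP2A", "OP2B", "OP4A", "OP4B", "OP6A", "OP6B"]


def group_by_port(connectors: list) -> dict:
    """Group connector names such as 'OP1A' by their port prefix ('OP1').

    Assumption: the trailing letter distinguishes the two connectors of one bus.
    """
    busses = defaultdict(list)
    for name in connectors:
        busses[name[:-1]].append(name)
    return dict(busses)


if __name__ == "__main__":
    for port, names in group_by_port(SMP_CONNECTORS).items():
        print(port, names)
```

The grouping yields four entries (OP1, OP2, OP4, and OP6) with two connectors each, which is consistent with eight connectors externalizing four SMP busses.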