At 16 lanes, a PCIe device has a theoretical bandwidth of 16 GB/sec over the bus and, in my experience with GPUs, an effective throughput of around 12 GB/sec.
Now, if a CPU manufacturer offers a CPU with many more than 16 lanes - say, 64 - does that mean it can communicate at full speed with four 16-lane devices?
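(As a quick back-of-the-envelope check of those figures, assuming PCIe 3.0 signalling at 8 GT/s per lane with 128b/130b encoding - the gap down to ~12 GB/sec is protocol overhead and real-world workload behaviour:)

```python
# Back-of-the-envelope PCIe 3.0 bandwidth check: 8 GT/s per lane with
# 128b/130b encoding; TLP/DLLP protocol overhead is ignored, which is
# roughly the gap between the ~16 GB/s theoretical and ~12 GB/s observed.
raw_rate = 8e9              # 8 GT/s signalling rate per lane
encoding = 128 / 130        # 128b/130b line-encoding efficiency

per_lane = raw_rate * encoding / 8          # bits -> bytes: ~0.985 GB/s
print(f"per lane: {per_lane / 1e9:.2f} GB/s")
print(f"x16 link: {16 * per_lane / 1e9:.2f} GB/s")   # ~15.75 GB/s
```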
2 Answers
Yes... Maybe.
It really depends on how these lanes are connected on the motherboard. Don't forget that PCIe lanes are now being used by other things like M.2 NVMe SSDs as well as the standard PCIe slots.
If you purchase a motherboard that has 4x 16-lane PCIe slots, and also has an NVMe compatible M.2 slot or two, then chances are that some of those 16-lane PCIe slots are actually not fully connected.
Read the CPU and motherboard manuals to be sure about what you're getting.
In response to your question:
So you're saying it would mean that board manufacturers could make this happen, right? But if the CPU supports this, it should not be some difficult challenge?
Correct... it should be possible to physically do this.
*Talking on speculation here*: The problem arises when you have a range of CPUs that can be installed into a given socket. It will certainly be interesting to see what happens with the Threadripper and i9 CPUs... Linus certainly has some concerns, and I think he's probably somewhat correct... If a family of CPUs (or a particular socket) can support such a wide range of lane counts (e.g. from 16 lanes to 64 or more), then there is no good way for a motherboard manufacturer to offer a fully functional product for every CPU that could be installed - you might buy a motherboard that "supports 4x 16-lane PCIe slots", but a CPU that only has 16 lanes available... What should happen in this situation? Provide 8 lanes to two slots, leaving the other two completely non-functional?
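To make that dilemma concrete, here is a hypothetical lane-allocation policy for a board with four mechanical x16 slots - the mappings below are purely illustrative, not taken from any real product:

```python
# Hypothetical lane-allocation policy for a board with four mechanical
# x16 slots; the slot-to-lane mappings are made up for illustration.
def allocate_lanes(cpu_lanes: int) -> list[int]:
    """Return the electrical lane width wired to each mechanical x16 slot."""
    if cpu_lanes >= 64:
        return [16, 16, 16, 16]   # every slot fully wired
    if cpu_lanes >= 32:
        return [16, 16, 0, 0]     # or [8, 8, 8, 8] - a design trade-off
    if cpu_lanes >= 16:
        return [8, 8, 0, 0]       # the exact dilemma described above
    return [cpu_lanes, 0, 0, 0]

for lanes in (64, 32, 16):
    print(lanes, "->", allocate_lanes(lanes))
```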
You also have to remember that, while a CPU with 64 PCIe lanes can support 4x 16-lane PCIe slots, it isn't likely to, due to things like NVMe... some of the lanes are going to be dedicated to "other things", so you'd probably need a CPU with more than 64 PCIe lanes before a real motherboard supports 4x 16-lane PCIe slots. At the moment, I highly doubt that any manufacturer would design a motherboard without support for NVMe.
It very much depends. For example, here's your typical motherboard diagram:

[diagram: a mainstream desktop layout - 16 PCIe lanes from the CPU, everything else behind the chipset]
There are 16 lanes from the CPU; everything else is connected to the CPU via DMI 3.0, which is the equivalent of PCIe 3.0 x4. But it is entirely possible your motherboard maker has other priorities:

[diagram: the CPU's x16 lanes drawn with a switch routing them either to an x16 slot or to U.2 connectors]
(Note that in reality that switch doesn't exist; rather, you can buy two different models.) This is a Supermicro X11SSL-nF, and they decided their customers needed full bandwidth on the U.2 connectors more than an electrically x16 slot. Given how few cards are actually limited by running at PCIe 3.0 x8 instead of PCIe 3.0 x16, this is a great decision, but one that desktop motherboard makers are loath to make, because their customers do not want to accept that for all desktop applications it's not a limit (GPU computing is the only case where it actually is).
There are two CPUs we need to talk about here. The first is Ryzen Threadripper, which has 64 PCIe lanes. Now, you don't really have 64 for the slots: 4 are used for the chipset, at least 4 but likely 8 will be used for 1-2 NVMe (U.2/M.2) devices, and then you perhaps want 10 GbE as well, plus native USB 3.1/Thunderbolt taking four lanes... Realistically, 48 lanes will be available for PCIe slots, and indeed AMD has said they plan to support three x16 cards. Dynamic allocation is very likely to be used here: I can easily imagine boards populated with six or seven mechanically x16 slots where, depending on what you want, you can run three x16 cards, six x8 cards, or perhaps 2 x16 + 2 x8, and so forth. Four x16 is just not possible (without a PCIe switch, but costs skyrocket with one) because, if nothing else, the chipset eats up 4 lanes.
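Tallying that budget (using the estimates above; the exact reservations will vary by board):

```python
# Threadripper lane budget from the estimates above (illustrative only).
TOTAL_LANES = 64
reserved = {
    "chipset link": 4,
    "NVMe (1-2 x4 devices)": 8,             # "at least 4 but likely 8"
    "10 GbE + USB 3.1 / Thunderbolt": 4,
}
for_slots = TOTAL_LANES - sum(reserved.values())
print(f"lanes left for slots: {for_slots}")            # 48
print(f"full x16 cards possible: {for_slots // 16}")   # 3
```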
The other situation is much simpler. EPYC motherboards will have six or seven x16 slots and that's about it: 96-112 lanes out of 128, with the rest used for NVMe (M.2/U.2/OCuLink) devices, the chipset connection, networking, etc. Some boards will feature significantly more NVMe but fewer slots -- I can very well imagine a 2U machine with 20-24 CPU-connected NVMe disks, one or two 100 GbE connections, and not much else (what else do you need, really?). Or consider that Supermicro has an Intel-based server supporting 48 NVMe disks; that's 192 lanes, but it's not like one or two Intel Xeons have 192 lanes: they are using 32 lanes and eight 4-to-24-lane PCIe switch ICs. Using 48 uplink lanes instead (still feasible in a single-CPU system) would be an absolute no-brainer, and it would give a 50% bandwidth advantage. I think dynamic allocation will not be a feature in this segment, because servers are more purpose-built and simplicity is important both for support and for the sales pitch.
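Spelling out that switch arithmetic (numbers taken from the example above):

```python
# Switch oversubscription in the 48-NVMe Supermicro example (illustrative).
switches = 8
downstream = switches * 24   # eight 4-to-24-lane switches: 192 lanes to disks

for uplink_total in (32, 48):              # total lanes from switches to CPU(s)
    ratio = downstream / uplink_total
    print(f"{uplink_total} uplink lanes -> {ratio:.1f}:1 oversubscription")
# 32 lanes -> 6.0:1; 48 lanes -> 4.0:1, i.e. 50% more host-side bandwidth
```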
Footnote: for a server, M.2 is not ideal, because without a heatsink it can produce a relatively large amount of heat (15-22 W), while disks in the front bays have a much larger surface area formed into a heatsink. You don't want a server part to thermally throttle. Whether the server runs U.2 or OCuLink cables inside the box is irrelevant; the hot-swap caddies use the SFF-8639 edge connector anyway.