PCIe Devices

How GPUs, NICs, capture cards, and storage controllers use PCIe lane widths, and what x8 versus x16 actually means for real workloads.

Published February 18, 2026

Different devices need different amounts of bandwidth. Lane width is how PCIe matches the pipe to the demand.

Understanding this means you stop wondering why your GPU slot is x16 and your NIC slot is x4, and you start making better decisions about what to plug in where.

Lane Widths

The x number on a PCIe slot describes how many lanes are active. More lanes means a wider pipe.

x1   = 1 lane   = ~1 GB/s (Gen 3), ~2 GB/s (Gen 4)
x4   = 4 lanes  = ~4 GB/s (Gen 3), ~8 GB/s (Gen 4)
x8   = 8 lanes  = ~8 GB/s (Gen 3), ~16 GB/s (Gen 4)
x16  = 16 lanes = ~16 GB/s (Gen 3), ~32 GB/s (Gen 4)

A physical x16 slot can operate at x16, x8, x4, or x1 depending on what is negotiated. The slot shape stays the same; the active lanes change.
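The rounded figures above follow from the per-lane transfer rate and the line encoding. Here is a minimal sketch of that arithmetic, assuming the published per-lane rates (8 GT/s for Gen 3, 16 GT/s for Gen 4) and 128b/130b encoding. The function name and table layout are illustrative, and the result ignores packet-level protocol overhead, so real throughput is somewhat lower.

```python
# Per-lane transfer rate (GT/s) and line-encoding efficiency per PCIe generation.
# Gen 1 and 2 use 8b/10b encoding; Gen 3 through 5 use 128b/130b.
GENERATIONS = {
    1: (2.5, 8 / 10),
    2: (5.0, 8 / 10),
    3: (8.0, 128 / 130),
    4: (16.0, 128 / 130),
    5: (32.0, 128 / 130),
}

# Link widths commonly seen on consumer hardware.
COMMON_WIDTHS = (1, 2, 4, 8, 16)


def link_bandwidth_gb_per_s(gen, lanes):
    """Approximate one-direction bandwidth in GB/s, before protocol overhead."""
    if lanes not in COMMON_WIDTHS:
        raise ValueError(f"unsupported lane count: {lanes}")
    gt_per_s, efficiency = GENERATIONS[gen]
    # GT/s * efficiency gives usable Gbit/s per lane; divide by 8 for bytes.
    return gt_per_s * efficiency / 8 * lanes


if __name__ == "__main__":
    for lanes in (1, 4, 8, 16):
        gen3 = link_bandwidth_gb_per_s(3, lanes)
        gen4 = link_bandwidth_gb_per_s(4, lanes)
        print(f"x{lanes:<3} Gen 3: {gen3:5.2f} GB/s   Gen 4: {gen4:5.2f} GB/s")
```

One Gen 3 lane comes out just under 1 GB/s, and x16 Gen 4 at about 31.5 GB/s, which is where the ~1 and ~32 GB/s figures come from.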

GPUs

Graphics cards use x16 for historical and practical reasons. The GPU needs to stream large textures and framebuffers to and from system memory, and x16 ensures there is headroom for even the most bandwidth-hungry scenarios.

RTX 4090 bandwidth needs by scenario:
 
  1440p gaming      ~5–8 GB/s actual bus usage
  4K gaming         ~8–12 GB/s actual bus usage
  AI inference      ~10–18 GB/s actual bus usage
 
  x16 Gen 4 available: ~32 GB/s
  x8  Gen 4 available: ~16 GB/s
  x4  Gen 4 available: ~8 GB/s

The practical takeaway is that x8 Gen 4 provides ~16 GB/s, which covers the gaming figures above with headroom and all but the top of the AI inference range. Real-world gaming benchmarks show a 1–3% performance difference between x16 and x8 Gen 4. At x4, some scenarios show a 10–20% loss, because ~8 GB/s starts to genuinely limit the GPU.
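The comparison can be made mechanical. The sketch below takes the upper end of each usage range from the listing above and checks it against each Gen 4 link width. The numbers are the article's approximations rather than measurements, and the names are illustrative.

```python
# Approximate figures taken from the listing above (Gen 4, GB/s).
AVAILABLE = {"x16": 32, "x8": 16, "x4": 8}

# Upper end of each "actual bus usage" range for the RTX 4090 scenarios.
PEAK_DEMAND = {"1440p gaming": 8, "4K gaming": 12, "AI inference": 18}


def headroom_table():
    """Return {(workload, width): (utilization, verdict)} for every pairing."""
    table = {}
    for workload, demand in PEAK_DEMAND.items():
        for width, available in AVAILABLE.items():
            utilization = demand / available
            # Demand equal to the limit counts as limited: no headroom is left.
            verdict = "fits" if demand < available else "limited"
            table[(workload, width)] = (utilization, verdict)
    return table


if __name__ == "__main__":
    for (workload, width), (utilization, verdict) in headroom_table().items():
        print(f"{workload:<14} {width:<4} {utilization:6.0%}  {verdict}")
```

With these inputs, x8 covers both gaming rows, while the top of the AI inference range (18 GB/s) exceeds it. Peak 1440p demand equals the x4 limit, so it is marked limited.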

Bifurcation: Running Two GPUs

When a board bifurcates an x16 slot into two x8 slots, it is splitting the same physical wiring in half. Each GPU gets 8 lanes instead of 16.

Single GPU:
  PCIe x16 slot  →  GPU at x16  (32 GB/s)
 
Dual GPU (bifurcated):
  PCIe x16 slot  →  GPU A at x8  (16 GB/s)
                 →  GPU B at x8  (16 GB/s)

The total bandwidth is the same. Each card just gets half. For gaming this barely matters. For AI training that moves large volumes of data between host memory and the GPUs, x8 has a measurable but usually acceptable cost.

Other Devices and Their Lane Requirements

Most devices do not need x16. The GPU is the exception, not the rule.

Device              Typical width  Bandwidth needed       Notes
GPU (gaming)        x16            8–16 GB/s peak         x8 Gen 4 is effectively equal
GPU (AI training)   x16            up to 32 GB/s          benefits from full x16
NVMe SSD            x4             up to 16 GB/s (Gen 5)  CPU-direct slot preferred
10 Gigabit NIC      x4             ~1.25 GB/s             needs x2 Gen 3 or x1 Gen 4 (one Gen 3 lane is ~1 GB/s)
25 Gigabit NIC      x8             ~3.1 GB/s              x4 Gen 3 usually enough
4K capture card     x4             ~1–2 GB/s              x4 Gen 3 handles 4K/60fps
RAID controller     x8             3–8 GB/s               depends on drive count
Thunderbolt 4 card  x4             up to 5 GB/s           PCIe 3.0 x4 per port
WiFi 6E card        x1             ~600 MB/s              rarely saturates a single lane
Audio interface     x1             <100 MB/s              latency matters more than bandwidth
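To sanity-check a row, you can work out the narrowest link that still carries a given bandwidth. This is a rough sketch: the per-lane values are approximate usable rates after 128b/130b encoding, the function name is made up for illustration, and it ignores protocol overhead, so treat a borderline result as too tight.

```python
# Approximate usable GB/s per lane, per generation (after 128b/130b encoding).
PER_LANE_GB_S = {3: 0.985, 4: 1.969, 5: 3.938}

# Link widths commonly seen on consumer hardware.
WIDTHS = (1, 2, 4, 8, 16)


def min_width(required_gb_s, gen):
    """Return the narrowest common width that meets the requirement, or None."""
    for lanes in WIDTHS:
        if lanes * PER_LANE_GB_S[gen] >= required_gb_s:
            return lanes
    return None  # even x16 is not enough at this generation


if __name__ == "__main__":
    devices = {
        "10 Gigabit NIC": 1.25,
        "25 Gigabit NIC": 3.125,
        "WiFi 6E card": 0.6,
    }
    for name, need in devices.items():
        for gen in (3, 4):
            print(f"{name:<15} Gen {gen}: x{min_width(need, gen)}")
```

A 10 Gigabit NIC at line rate lands on x2 for Gen 3 and x1 for Gen 4, because a single Gen 3 lane carries just under 1 GB/s.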

Physical Slot Shapes vs. Electrical Widths

A physical x16 slot does not always run at x16. This is a common source of confusion.

Board layout example (mid-range ATX):
 
  Slot 1 (physical x16)  →  electrically x16, CPU-direct  ← primary GPU
  Slot 2 (physical x16)  →  electrically x4, chipset      ← looks like x16, runs at x4
  Slot 3 (physical x1)   →  electrically x1, chipset      ← NIC, WiFi card
 
Slot 2 will accept a GPU physically. The GPU will negotiate x4.
At Gen 4 that is ~8 GB/s — fine for lightweight AI, not great for a gaming card.

The board manual specifies the electrical width for every slot. The physical slot shape tells you what card fits, not how fast it will run.
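On Linux you can also read the negotiated width directly instead of inferring it from the manual. The kernel exposes `current_link_width`, `max_link_width`, and `current_link_speed` for PCIe devices under `/sys/bus/pci/devices`. The sketch below assumes a Linux system with sysfs mounted; the helper names are illustrative. Note that `max_link_width` reflects what the device supports, so an x16 card in an electrically x4 slot shows 4 against a maximum of 16.

```python
from pathlib import Path

SYSFS_PCI = Path("/sys/bus/pci/devices")  # standard Linux sysfs location


def _read(dev, name):
    """Return a sysfs attribute as stripped text, or None if unreadable."""
    try:
        return (dev / name).read_text().strip()
    except OSError:
        return None


def link_status(base=SYSFS_PCI):
    """List negotiated vs. maximum link width for each PCIe device under base."""
    results = []
    if not base.is_dir():
        return results
    for dev in sorted(base.iterdir()):
        current = _read(dev, "current_link_width")
        maximum = _read(dev, "max_link_width")
        if current is None or maximum is None:
            continue  # not a PCIe device, or the attribute is not exposed
        try:
            cur, mx = int(current), int(maximum)
        except ValueError:
            continue  # unexpected content; skip rather than guess
        results.append({
            "address": dev.name,
            "current_width": cur,
            "max_width": mx,
            "current_speed": _read(dev, "current_link_speed"),
        })
    return results


if __name__ == "__main__":
    for entry in link_status():
        note = "  <- below device maximum" if entry["current_width"] < entry["max_width"] else ""
        print(f'{entry["address"]}  x{entry["current_width"]} of x{entry["max_width"]}'
              f'  {entry["current_speed"]}{note}')
```

A device flagged as below its maximum is not necessarily misconfigured; it may simply sit in a slot that is wired for fewer lanes than the card supports.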

Lane Sharing Between Slots

On many consumer boards, enabling certain slots disables others because they share the same CPU or chipset lanes.

A common pattern is that installing a device in slot 2 disables one M.2 slot, because both draw from the same pool of chipset lanes. The manual will note this as something like "M.2_2 slot is disabled when PCIE_3 is populated."

This is not a flaw. It is a lane budget decision the board makes to keep costs reasonable. Workstation and HEDT (High-End Desktop) platforms have more CPU-direct lanes and do not have this constraint as often.
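One way to plan a build is to write the manual's sharing notes down as rules and check a proposed layout against them. The sketch below models a hypothetical board with a single rule taken from the example wording above. The slot names and the rule table are placeholders; a real board's manual is the source of truth.

```python
# Hypothetical rule table transcribed from a board manual:
# populating the key slot disables every slot in its list.
DISABLE_RULES = {
    "PCIE_3": ["M.2_2"],
}


def check_layout(populated):
    """Return (usable, conflicts) for a set of populated slot names.

    usable    -- populated slots that stay enabled
    conflicts -- populated slots that another populated slot disables
    """
    populated = set(populated)
    disabled = set()
    for slot in populated:
        disabled.update(DISABLE_RULES.get(slot, []))
    conflicts = populated & disabled
    return populated - disabled, conflicts


if __name__ == "__main__":
    usable, conflicts = check_layout({"PCIE_1", "PCIE_3", "M.2_2"})
    print("usable:   ", sorted(usable))
    print("conflicts:", sorted(conflicts))
```

Here a card in PCIE_3 and a drive in M.2_2 cannot both work, so the drive shows up as a conflict and should move to another M.2 slot.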
