Supply Chain Bottlenecks: What Actually Gates AI Buildout Speed

The popular framing of the AI infrastructure crunch is "we need more GPUs." That framing captures only a little over half of the problem. GPUs are the single largest line item (call it 55–65% of a training cluster's bill of materials at current pricing), but they sit on top of a stack of components, each with its own multi-quarter lead time (6–18 months for most IT components, longer for electrical gear) and its own capacity-constrained supplier base. Doubling Nvidia's GPU output requires roughly doubling TSMC CoWoS capacity, HBM stack output, optical transceiver assembly, switch silicon, power-delivery silicon, liquid-cooling CDUs, transformers, and switchgear simultaneously. None of those chains scales on a quarterly cadence. The "stacked lead time" problem is what determines actual buildout pace.

This page collects what the primary sources say about each link.


1. TSMC CoWoS: The Single Hardest Constraint

CoWoS (Chip-on-Wafer-on-Substrate) is TSMC's 2.5D advanced packaging technology: the interposer that joins a GPU compute die to multiple HBM stacks in one package. Every Nvidia datacenter GPU since A100 ships on CoWoS, including the Blackwell parts (B200, GB200) and Rubin. AMD MI300/MI325/MI350 use it. Broadcom's custom ASICs for Google TPU and Meta MTIA use it. Marvell's custom silicon for AWS Trainium2 uses it. There is no second source. SPIL, ASE, and Amkor are years behind on equivalent capability.

Capacity ramp. TSMC has been transparent on earnings calls about the trajectory, even if specific wafer counts are usually triangulated through analyst notes rather than disclosed directly:

| Year | CoWoS capacity (wafers/month, end of year) | Source |
|---|---|---|
| 2023 | ~13–15k | TrendForce CoWoS expansion note |
| 2024 | ~40k | TrendForce; DigiTimes CoWoS coverage |
| 2025 (target) | ~75–80k | Reuters on TSMC CoWoS doubling; Bloomberg |
| 2026 (target) | ~135k+ | TrendForce 2026 outlook |

C.C. Wei, TSMC's chairman/CEO, framed it on the Q1 2024 earnings call: "the demand is much, much more than our ability to supply… even though we are doubling capacity, we cannot meet the customer's demand" (TSMC Q1 2024 earnings call transcript, Seeking Alpha). The same framing recurred, in varying words, through the Q2, Q3, and Q4 2024 calls.
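
As a quick arithmetic check on the "doubling" language, the year-over-year multiples implied by the capacity table can be computed directly. This is a minimal sketch that assumes the midpoint of each quoted range (14k for 2023, 77.5k for 2025) and treats "135k+" as 135k; those point values are simplifications for illustration, not disclosed figures.

```python
# End-of-year CoWoS capacity in thousands of wafers/month.
# Midpoints of the table's ranges are an assumption made for illustration.
capacity_kwpm = {2023: 14.0, 2024: 40.0, 2025: 77.5, 2026: 135.0}


def yoy_multiples(series):
    """Return {year: capacity / prior-year capacity} for consecutive years."""
    years = sorted(series)
    return {cur: series[cur] / series[prev] for prev, cur in zip(years, years[1:])}


growth = yoy_multiples(capacity_kwpm)
for year, multiple in growth.items():
    print(f"{year}: {multiple:.2f}x the prior year-end capacity")
```

On these midpoints the 2024 step is closer to a tripling, 2025 lands just under a doubling, and the 2026 target is a smaller multiple on a much larger base.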

CoWoS-S vs CoWoS-L. The split matters because Blackwell architecture forced a packaging-type transition:

  • CoWoS-S (silicon interposer, the original): used for H100, H200, A100, MI300X. The reticle limit caps interposer area at ~3.3x reticle, which in turn caps how many compute dies and HBM stacks fit in one package.
  • CoWoS-L (local silicon interconnect, or LSI, bridge dies embedded in an RDL interposer): used for B200, GB200, B300, and later parts. Enables larger packages (>5x reticle) that fit two compute dies plus eight HBM stacks. The yield curve was painful in 2024: Nvidia delayed initial Blackwell volume partly because of CoWoS-L mask and yield issues (The Information on Blackwell delay; Semianalysis Blackwell deep-dive).

Through 2025, TSMC has been actively converting CoWoS-S lines to CoWoS-L as Blackwell crowds out Hopper. TrendForce estimated CoWoS-L would exceed CoWoS-S capacity by mid-2025.

Allocation. TSMC does not break out customer allocation, but Semianalysis and Morgan Stanley/Citi notes have triangulated rough 2025 shares:

| Customer | Approx 2025 CoWoS allocation share | Primary products |
|---|---|---|
| Nvidia | ~60–65% | B100/B200/GB200/B300, H200 tail |
| AMD | ~8–10% | MI300X, MI325X, MI350 |
| Broadcom | ~12–15% | Google TPU v5p/v6e/v7, Meta MTIA v2 |
| Marvell | ~5–7% | AWS Trainium2/Inferentia |
| Others (Intel Gaudi, AWS direct, Tesla, Apple) | ~5–10% | misc. |

The custom-silicon shift matters here: Broadcom and Marvell's CoWoS draw has roughly doubled year-over-year as Google TPU v5p and AWS Trainium2 ramped, and that capacity is taken directly from the Nvidia pool. Hock Tan (Broadcom CEO) said on the Q4 FY24 call that AI revenue would reach "$60–90B SAM across three hyperscaler customers by FY27" — almost all of that is CoWoS-packaged (Broadcom Q4 FY24 transcript, Motley Fool).
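
To make the shares concrete, the sketch below converts them into implied wafers per month. It assumes the 2025 end-of-year capacity range from the capacity table (75–80k wafers/month) applies uniformly to every customer, which overstates the full-year average because capacity ramps through the year. The pairing of low share with low capacity (and high with high) is also a simplification.

```python
# 2025 end-of-year CoWoS capacity range, wafers/month (from the capacity table).
CAPACITY_WPM = (75_000, 80_000)

# Approximate 2025 allocation shares (low, high) from the allocation table.
SHARES = {
    "Nvidia": (0.60, 0.65),
    "AMD": (0.08, 0.10),
    "Broadcom": (0.12, 0.15),
    "Marvell": (0.05, 0.07),
    "Others": (0.05, 0.10),
}


def implied_wafers(share, capacity=CAPACITY_WPM):
    """Bracket the implied wafers/month: low share x low capacity, high x high."""
    return (round(share[0] * capacity[0]), round(share[1] * capacity[1]))


implied = {name: implied_wafers(share) for name, share in SHARES.items()}
share_sum = (
    sum(share[0] for share in SHARES.values()),
    sum(share[1] for share in SHARES.values()),
)

for name, (low, high) in implied.items():
    print(f"{name:9s} ~{low:,} to {high:,} wafers/month")
print(f"Share ranges sum to {share_sum[0]:.0%} to {share_sum[1]:.0%}")
```

The share ranges sum to between 90% and 107%, so the triangulated estimates bracket 100% rather than adding up exactly.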

TSMC's CoWoS capex. TSMC has pulled forward >$30B of advanced-packaging capex into the AP6/AP7/AP8 fabs in Chiayi and Zhunan. AP6 (Chiayi) is the first dedicated CoWoS-L mega-fab; AP7 is the second. Construction started 2023 and 2024 respectively, with full ramp 2026 (Nikkei Asia on TSMC AP6 fab; TSMC Q3 2024 transcript). Even with this, Wei said on Q3 2024 that "we will continue to double [CoWoS capacity] in 2025 and continue to grow in 2026" — without committing to closing the gap with demand.


2. HBM: The Sold-Out Decade

High Bandwidth Memory is the second hard constraint. Every CoWoS package needs 4–8 HBM stacks; each stack is itself an advanced 3D-packaged product (8-Hi or 12-Hi DRAM dies on a base logic die, TSV-connected). Three suppliers globally: SK Hynix, Samsung, Micron. Hynix has dominated the AI cycle.

Per-GPU HBM content:

| GPU | HBM generation | Capacity | Stacks | Bandwidth |
|---|---|---|---|---|
| H100 | HBM3 | 80 GB | 5 active (of 6) | 3.35 TB/s |
| H200 | HBM3e | 141 GB | 6× 24 GB | 4.8 TB/s |
| B200 | HBM3e | 192 GB | 8× 24 GB | 8 TB/s |
| B300 (Blackwell Ultra) | HBM3e 12-Hi | 288 GB | 8× 36 GB | 8 TB/s |
| Rubin (R100, 2026) | HBM4 | 288 GB (est.) | 8 stacks | ~13 TB/s (target) |
| Rubin Ultra (2027) | HBM4 | 1 TB (per package) | 16 stacks | ~32 TB/s |

Sources: Nvidia GTC 2024 / GTC 2025 keynotes; Tom's Hardware Blackwell deep-dive; AnandTech HBM3e coverage.
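
The table's columns can be cross-checked against each other. The sketch below multiplies stack count by per-stack capacity and divides quoted bandwidth by stack count, using only the shipping parts. The 16 GB per-stack figure for H100 is inferred here from 80 GB across 5 active stacks; it is not stated in the table.

```python
# (active stacks, GB per stack, quoted capacity in GB, quoted bandwidth in TB/s)
# H100's 16 GB per stack is an inference from 80 GB / 5 active stacks.
GPUS = {
    "H100": (5, 16, 80, 3.35),
    "H200": (6, 24, 141, 4.8),
    "B200": (8, 24, 192, 8.0),
    "B300": (8, 36, 288, 8.0),
}


def summarize(stacks, gb_per_stack, quoted_gb, quoted_tbps):
    """Compare raw stacked capacity with the quoted figure; derive per-stack bandwidth."""
    raw_gb = stacks * gb_per_stack
    return {
        "raw_gb": raw_gb,
        "gap_gb": raw_gb - quoted_gb,
        "tbps_per_stack": quoted_tbps / stacks,
    }


summary = {name: summarize(*row) for name, row in GPUS.items()}
for name, s in summary.items():
    print(f"{name}: raw {s['raw_gb']} GB, gap {s['gap_gb']} GB, "
          f"{s['tbps_per_stack']:.2f} TB/s per stack")
```

Only H200 shows a gap between raw and quoted capacity (144 GB raw vs. 141 GB quoted). Per-stack bandwidth rises from roughly 0.67 TB/s on H100 to 1.0 TB/s on B200 and B300.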

"Sold out" — the actual quotes. SK Hynix CEO Kwak Noh-jung said on the Q1 2024 earnings call that HBM was "sold out for 2024 and nearly sold out for 2025" (Reuters). On the Q3 2024 call, the company tightened that: "our 2025 HBM capacity has already been sold out, and we are now in discussions for 2026" (SK Hynix Q3 2024 earnings call, company release). By Q2 2025, the framing was "discussions for 2026 are essentially complete; conversations on 2027 are active."

Qualification matters more than capacity. Nvidia gates HBM suppliers through a multi-quarter qualification process. As of late 2025:

  • SK Hynix: primary HBM3e supplier (8-Hi and 12-Hi). Has supplied >70% of H200/B200 HBM volume (Nikkei).
  • Micron: HBM3e 8-Hi qualified for H200 in Q1 2024, 12-Hi later in 2025. Limited share but ramping (Micron F2Q24 earnings release).
  • Samsung: the laggard. 8-Hi HBM3e qualified late 2024 for limited Nvidia volume; 12-Hi qualification reportedly slipped into 2025 (Reuters on Samsung HBM struggles; The Information). Samsung's chairman publicly acknowledged the lag in late 2024 and committed to recovering on HBM4.

HBM4 timing. First HBM4 mass-production samples from SK Hynix and Micron shipped late 2025 for Rubin qualification; Nvidia's Rubin platform targets H2 2026 production launch. HBM4 moves to a 2048-bit interface (vs. 1024-bit on HBM3e) and shifts the base die to a logic process. Hynix is using TSMC N12 for its HBM4 base die, which is itself a foundry capacity ask (SK Hynix HBM4 announcement; AnandTech HBM4 overview).
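
The effect of the wider interface follows from a simple relation: peak per-stack bandwidth is interface width times per-pin data rate, divided by 8 bits per byte. The sketch below holds the pin rate fixed at 8 Gb/s for both generations. That rate is an assumption chosen for illustration, not a figure from this page, and shipping parts run at vendor-specific speeds.

```python
def stack_bandwidth_tbps(interface_bits, pin_rate_gbps):
    """Peak per-stack bandwidth in TB/s: width x per-pin Gb/s, over 8 bits per byte."""
    return interface_bits * pin_rate_gbps / 8 / 1000


PIN_RATE_GBPS = 8.0  # assumed for illustration; not a sourced figure

hbm3e_tbps = stack_bandwidth_tbps(1024, PIN_RATE_GBPS)
hbm4_tbps = stack_bandwidth_tbps(2048, PIN_RATE_GBPS)

print(f"HBM3e (1024-bit): {hbm3e_tbps:.3f} TB/s per stack")
print(f"HBM4  (2048-bit): {hbm4_tbps:.3f} TB/s per stack")
print(f"Ratio at equal pin rate: {hbm4_tbps / hbm3e_tbps:.1f}x")
```

At the assumed pin rate, eight HBM3e stacks come to about 8.2 TB/s, in line with the 8 TB/s quoted for B200 in the table above. Doubling the width doubles per-stack bandwidth without requiring faster pins.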

HBM capex. SK Hynix's M15X fab in Cheongju (Korea) is purpose-built for HBM and will cost ~$15B through 2028 (Korea Times). Hynix's total 2024–2028 HBM capex envelope is >$40B. Samsung is expanding P4 phase 2 in Pyeongtaek; Micron's Idaho and New York fabs are partly HBM-targeted with $50B+ U.S. capex through 2030 (CHIPS-Act-supported) (Micron press release).


3. Foundry Leading-Edge: N3, N2, and the Single-Source Problem

Nvidia GPUs are made on TSMC, period. H100 used a custom "4N" process (a tuned N5 derivative). B200 uses N4P. Rubin (R100) is on N3P. Feynman (2028) is rumored N2 with CFETs.

N3 ramp. TSMC said on the Q4 2023 call that N3 contribution would be "high single digits" of 2024 revenue, reaching "mid teens" in 2025. N3 capacity utilization has been near 100% since mid-2024, with Apple holding ~50% allocation and Nvidia, AMD, MediaTek, and Qualcomm splitting most of the rest (TSMC Q4 2024 transcript).

N2 ramp. First production wafers H2 2025 from Hsinchu Fab 20; Kaohsiung Fab 22 ramps 2026. TSMC has been explicit that "N2 demand exceeds N3 demand at the same stage of ramp" — Apple has the first slot, Nvidia the second (Nikkei on N2 allocation).

Intel Foundry: not yet relevant. Intel 18A ramps in 2025–2026 but external customer commitments remain thin. Microsoft and a handful of small customers are public; Nvidia and AMD are not. Intel's foundry business posted $17.5B of revenue and $13.4B of operating loss in 2024 (Intel 10-K 2024). The foundry is a 2027+ story for AI silicon, not 2026.

Samsung Foundry: losing ground. Samsung's 3nm GAA yields lagged TSMC's FinFET N3 through 2023–2024. Qualcomm moved Snapdragon 8 Gen 3 to TSMC; Nvidia consumer parts (Blackwell GeForce) are on TSMC. Samsung's leading-edge customer wins for AI silicon are minimal — primarily its own Exynos and a small Tesla Dojo allocation (Reuters Samsung Foundry struggles).

SMIC and China. SMIC's "N+2" (claimed 7nm-class, no EUV) is the technology floor for Huawei's Ascend 910C. Without EUV, yields are reportedly 20–30%, and capacity is in the low thousands of wafers/month for advanced nodes (Bloomberg on SMIC Ascend yields; FT). Domestic substitution within China is real but supply-constrained: Huawei has reportedly committed nearly all SMIC N+2 capacity through 2025–2026.


4. Networking Optics: The Quiet Bottleneck

Every GPU in an AI cluster is connected to at least one optical transceiver, and at 1.6T speeds the count is often 2–4 transceivers per GPU. A single GB200 NVL72 rack carries 72 GPUs and thousands of optical lanes through the spine. The optics chain shipped record volumes in 2024–2025 but is repeatedly described as supply-constrained.

Suppliers and shares (2024–2025, 800G + 1.6T transceivers):

| Vendor | Approx share | Notes |
|---|---|---|
| InnoLight (China) | ~25–30% | Largest single 800G/1.6T supplier to Nvidia |
| Coherent (COHR) | ~15–20% | InP laser + transceiver vertical play |
| Eoptolink (China) | ~10–15% | Aggressive 1.6T ramp |
| Lumentum (LITE) | ~10–15% | Lasers + transceivers; ramped EML capacity 2024 |
| Fabrinet (FN) | n/a (contract manufacturer for COHR, Cisco, Nvidia) | Thai EMS; major Nvidia switch + DGX builder |
| Accelink, HG Tech, Source Photonics | balance | |

Sources: LightCounting transceiver market reports; Coherent Q4 FY25 earnings call; Lumentum Q3 FY25 earnings call.

Shortage cadence. Coherent and Lumentum management both characterized 800G transceiver demand as exceeding supply throughout 2024 — EML (electro-absorption modulated laser) chips were the binding upstream component. Lumentum on the F4Q24 call: "EML demand is well in excess of our current capacity, and we are aggressively adding capacity through CY25" (Lumentum F4Q24 transcript, Motley Fool). Nvidia explicitly called out networking-optics supply as a gating factor on Q3 FY25's call (Nvidia FY25 Q3 transcript).

800G to 1.6T transition. Standard 800G transceivers (8x100G PAM4) shipped in volume from 2023; 1.6T (8x200G PAM4) volume ramp is 2025, full mainstream in 2026. Nvidia's Spectrum-X and Quantum-X switches (announced GTC 2025) move the network from 800G to 1.6T natively.
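
The lane arithmetic behind the module names, and the cluster-level volume it implies, can be sketched as follows. The 100k-GPU cluster size is borrowed from the sizing used elsewhere on this page, and the 2–4 transceivers per GPU range comes from the top of this section. Both are rough planning figures rather than a specific deployment.

```python
def module_rate_gbps(lanes, gbps_per_lane):
    """Aggregate module rate: number of PAM4 lanes times the per-lane rate."""
    return lanes * gbps_per_lane


def cluster_transceivers(gpu_count, per_gpu_range):
    """Low/high transceiver count for a cluster, given a per-GPU range."""
    low, high = per_gpu_range
    return gpu_count * low, gpu_count * high


rate_800g = module_rate_gbps(8, 100)   # 8 x 100G PAM4
rate_1600g = module_rate_gbps(8, 200)  # 8 x 200G PAM4
modules = cluster_transceivers(100_000, (2, 4))

print(f"8x100G = {rate_800g} Gb/s, 8x200G = {rate_1600g} Gb/s")
print(f"100k-GPU cluster: {modules[0]:,} to {modules[1]:,} transceivers")
```

The lane count stays at eight across the transition, so the move to 1.6T depends on doubling the per-lane rate, which is why the laser and modulator supply chain is the sensitive part.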

Co-Packaged Optics (CPO). The interesting medium-term shift. CPO eliminates pluggable transceivers by integrating optics directly next to switch ASICs, cutting power per port by ~50%. Nvidia announced Quantum-X Photonics (CPO InfiniBand switch) and Spectrum-X Photonics (CPO Ethernet) at GTC 2025, targeting late 2025 / early 2026 (Nvidia CPO announcement; Semianalysis CPO analysis). Broadcom shipped its first CPO Tomahawk (TH5 with CPO option) in 2024 and is sampling TH6 with CPO (Broadcom CPO press release). CPO is structurally bad for pluggable-transceiver vendors but creates new business for InP/silicon-photonics chip suppliers — Coherent, Lumentum, and TSMC's COUPE silicon-photonics platform.


5. Switch Silicon and InfiniBand

A 100k-GPU cluster needs roughly 8–12k network switches. The silicon inside them is concentrated among three vendors.

  • Nvidia / Mellanox: dominant in InfiniBand (Quantum-2 400G, Quantum-X 800G/1.6T) — effectively monopoly for low-latency training clusters. Also pushing Spectrum-X Ethernet for AI fabrics. InfiniBand revenue inside Nvidia networking grew to ~$13B annualized run-rate by FY25 (Nvidia FY25 earnings).
  • Broadcom: Tomahawk 5 (51.2 Tbps, mainstream 2024) and Tomahawk 6 (102.4 Tbps, sampling 2025); Jericho3-AI for routed AI fabrics (Broadcom Tomahawk 6 announcement). Broadcom's AI networking revenue >$4B in FY24, guiding toward $10B+ FY27.
  • Marvell: Teralynx (acquired Innovium) is the smaller third option, popular with Oracle Cloud and some Tier-2 hyperscalers.
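
A switch ASIC's headline throughput translates into port count (radix) by dividing by port speed. The sketch below applies that to the two Tomahawk figures quoted above. It is a simplification that ignores SerDes lane configuration and any ports reserved for management, so treat the results as nominal maximums.

```python
# Aggregate switching capacity in Gb/s, from the figures quoted above.
TOMAHAWK5_GBPS = 51_200    # 51.2 Tbps
TOMAHAWK6_GBPS = 102_400   # 102.4 Tbps


def radix(asic_gbps, port_gbps):
    """Nominal number of full-rate ports the ASIC can expose at a given port speed."""
    return asic_gbps // port_gbps


th5_at_800g = radix(TOMAHAWK5_GBPS, 800)
th6_at_800g = radix(TOMAHAWK6_GBPS, 800)
th6_at_1600g = radix(TOMAHAWK6_GBPS, 1_600)

print(f"51.2T at 800G:  {th5_at_800g} ports")
print(f"102.4T at 800G: {th6_at_800g} ports")
print(f"102.4T at 1.6T: {th6_at_1600g} ports")
```

Each capacity doubling can be spent either on twice the ports at the same speed or on the same port count at twice the speed. Higher radix reduces the number of switches and tiers a fabric needs.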

System OEMs: Arista (ANET) is the dominant Ethernet AI switch builder, with $7B+ AI cluster orders disclosed and Meta + Microsoft as anchor customers (Arista Q4 2024 transcript, Seeking Alpha). Cisco (Silicon One G200) competes but has been slower to penetrate hyperscale; Juniper (acquired by HPE in 2025) targets enterprise AI fabric.


6. Power Delivery Silicon

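GB200 NVL72 racks pull ~120 kW, vs. ~30 kW for an HGX H100 rack, and both distribute power inside the rack on a ~48V DC busbar. At 120 kW that bus carries thousands of amps, which is what pushes the roadmap toward the 800V DC architecture in the last item below. The transition forces a near-rebuild of the power chain inside the rack.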

  • Monolithic Power Systems (MPWR): primary point-of-load (POL) supplier for H100 and B200 carrier boards. MPS gross margins held above 55% through 2024 as AI-related design wins ramped (MPWR 2024 10-K). MPS reportedly lost some Nvidia share in Blackwell to competing power-stage vendors in 2024 (Reuters on MPS Nvidia share).
  • Vicor (VICR): 48V-to-PoL factorized power; design wins in some hyperscaler custom silicon platforms but smaller scale.
  • Infineon, Renesas, ON Semi, Texas Instruments: discrete MOSFETs, gate drivers, controllers, increasingly SiC and GaN at the rack PSU level.
  • 800V DC rack architecture: Nvidia announced 800V DC rack power at GTC 2025 (replacing 415V AC + 48V DC), enabling MW-scale racks (Nvidia GTC 2025 800V announcement). Adoption requires new busway, PDUs, and SST (solid-state transformer) products from Schneider, Eaton, ABB, and Vertiv.
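
The case for a higher-voltage bus is basic arithmetic: current is power divided by voltage, and resistive loss in a given conductor scales with current squared. The sketch below compares a 120 kW rack on a 48V bus and on an 800V bus. It assumes the same conductor resistance in both cases, which would not hold in a real redesign, so the loss ratio is illustrative only.

```python
def bus_current_amps(power_w, voltage_v):
    """DC bus current for a given load: I = P / V."""
    return power_w / voltage_v


RACK_POWER_W = 120_000  # ~120 kW rack, as quoted above

amps_48v = bus_current_amps(RACK_POWER_W, 48)
amps_800v = bus_current_amps(RACK_POWER_W, 800)

# For a fixed conductor resistance R, loss is I^2 * R, so the ratio is (I1 / I2)^2.
loss_ratio = (amps_48v / amps_800v) ** 2

print(f"48V bus:  {amps_48v:,.0f} A")
print(f"800V bus: {amps_800v:,.0f} A")
print(f"Resistive loss ratio at equal resistance: {loss_ratio:.0f}x")
```

Moving the same 120 kW from 48V to 800V cuts bus current from 2,500 A to 150 A. That is what makes MW-scale racks plausible without impractically large copper busbars.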

7. Liquid Cooling and Thermal

Air cooling tops out around 50–60 kW per rack. Above that — and especially for GB200's 120 kW — direct-to-chip liquid cooling is mandatory. The cooling capex per MW roughly doubles vs. air-cooled designs (see colocation-reits.md §4). The supplier chain:

| Component | Key suppliers | Notes |
|---|---|---|
| CDUs (coolant distribution units) | Vertiv (VRT), Schneider, Motivair, CoolIT, Boyd | Vertiv guided >$700M AI liquid-cooling revenue for 2025 (Vertiv Q4 2024 earnings) |
| Cold plates | CoolIT, Asetek, Boyd, Cooler Master, Auras | Asetek pulled out of consumer to focus on AI server liquid-cooling 2024 |
| Manifolds, quick-disconnects | Parker Hannifin, CPC (Colder Products), Stäubli | QD lead times stretched to 30+ weeks in 2024 |
| Rear-door heat exchangers | nVent (Schroff), Vertiv, Motivair | Used for retrofit and hybrid air/liquid sites |
| Immersion (single-phase + two-phase) | Submer, GRC, LiquidStack, Iceotope | Niche: adoption is real (Microsoft, Meta pilots) but small share |

Vertiv has been the single biggest beneficiary among public names: order growth >30% YoY through 2024 and >$8B backlog by Q4 2024 (Vertiv Q4 2024 transcript, Motley Fool). Lead times on CDUs reached 9–12 months by mid-2024 and have only modestly compressed.
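
For a sense of what a CDU has to deliver, the required coolant flow follows from the heat balance Q = ṁ · c_p · ΔT. The sketch below makes three assumptions that are not from this page: the coolant is plain water (real loops often use a glycol mix with lower heat capacity), the loop runs a 10 K temperature rise, and all 120 kW of rack heat goes to liquid (in practice some fraction is still air-cooled).

```python
WATER_CP_J_PER_KG_K = 4186.0  # specific heat of water
WATER_KG_PER_LITRE = 1.0      # approximate density of water


def coolant_flow_lpm(heat_w, delta_t_k,
                     cp=WATER_CP_J_PER_KG_K, density=WATER_KG_PER_LITRE):
    """Litres per minute needed to carry heat_w at a coolant temperature rise of delta_t_k."""
    mass_flow_kg_s = heat_w / (cp * delta_t_k)
    return mass_flow_kg_s / density * 60


# Assumed: 120 kW fully liquid-cooled, 10 K rise across the rack.
flow_lpm = coolant_flow_lpm(120_000, 10.0)
print(f"Required flow: {flow_lpm:.0f} L/min")
```

Under these assumptions a single rack needs on the order of 170 L/min. A tighter temperature rise or a glycol mix pushes that figure higher, which is part of why manifolds and quick-disconnects become long-lead items alongside the CDUs.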


8. The Other Long-Lead Items

The non-IT side of the datacenter has lead times that rival the chip side and, in some cases, exceed them.

| Item | Typical 2024–2025 lead time | Notes / sources |
|---|---|---|
| Large power transformers (>50 MVA) | 80–130 weeks | Up from 25–40 weeks pre-2022 (Wood Mackenzie; Hitachi Energy) |
| Medium-voltage switchgear | 50–80 weeks | ABB, Eaton, Schneider, Siemens all back-ordered |
| Diesel generators (2.5–3 MW class) | 80–130 weeks | Caterpillar AI-related power orders backlog disclosed at >$30B by mid-2025 (Cat Q2 2025 earnings); Cummins, Kohler similarly stretched |
| Large UPS (>1 MW lithium) | 40–70 weeks | Vertiv, Schneider, Eaton, ABB |
| Chillers / cooling towers | 50–80 weeks | Trane, Carrier, Johnson Controls |
| Copper cable / bus duct | 30–50 weeks | Tightened by raw copper price and electrical-construction labor |
| Permit + utility interconnect | 24–60+ months | The dominant ground-up constraint in PJM, ERCOT-Dallas, Virginia |

Caterpillar's framing on the Q1 2025 earnings call is the cleanest single quote on the generator chain: "Demand for data center power solutions remains well above our ability to supply, with backlog continuing to extend into 2027" (Caterpillar Q1 2025 transcript). Generator orders for the AI cycle are largely placed before a site has a finalized utility timeline — operators are reserving capacity at the OEM independently of any single project.


9. Stacked Lead Times: Why Doubling One Thing Doesn't Help

The under-appreciated framing across this entire stack is that lead times do not pipeline — they stack. To bring one new 100 MW AI campus online:

  1. Utility interconnect: 24–60 months from application.
  2. Power transformers + switchgear: order 18+ months before energization.
  3. Generators + UPS: order 18–30 months ahead.
  4. CDUs + cold plates: order 9–15 months ahead.
  5. GPU + HBM + CoWoS allocation: secured 12–18 months ahead.
  6. Optical transceivers + switches: order 6–9 months ahead.
  7. Construction: 18–24 months in parallel with the long-lead orders.
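
One way to read the list above is as a set of parallel clocks where the project lands only when the slowest one finishes. The sketch below assumes every order is placed at project start, which is the best case. The upper bound for transformers (30 months) is taken from the 80–130 week figure in section 8, because the list itself gives only "18+".

```python
# (best case, worst case) lead times in months, from the list above.
# Transformer upper bound is an assumption derived from the 80-130 week figure.
LEAD_TIMES_MONTHS = {
    "Utility interconnect": (24, 60),
    "Transformers + switchgear": (18, 30),
    "Generators + UPS": (18, 30),
    "CDUs + cold plates": (9, 15),
    "GPU + HBM + CoWoS allocation": (12, 18),
    "Optical transceivers + switches": (6, 9),
    "Construction": (18, 24),
}


def gating_item(lead_times, case):
    """Return (name, months) of the longest lead-time item for 'best' or 'worst' case."""
    idx = 0 if case == "best" else 1
    name = max(lead_times, key=lambda item: lead_times[item][idx])
    return name, lead_times[name][idx]


best = gating_item(LEAD_TIMES_MONTHS, "best")
worst = gating_item(LEAD_TIMES_MONTHS, "worst")

# Halve the GPU allocation lead time and see whether time-to-online moves.
faster_gpus = dict(LEAD_TIMES_MONTHS)
faster_gpus["GPU + HBM + CoWoS allocation"] = (6, 9)
best_with_faster_gpus = gating_item(faster_gpus, "best")

print(f"Best case gate:  {best[0]} at {best[1]} months")
print(f"Worst case gate: {worst[0]} at {worst[1]} months")
print(f"After halving GPU lead time: {best_with_faster_gpus[1]} months")
```

Even in this optimistic parallel-ordering model, halving the GPU lead time leaves time-to-online unchanged, because the utility interconnect remains the gate. Any order placed late adds its delay on top.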

A bottleneck in any one of these gates the whole project. Doubling CoWoS (TSMC's actual 2024→2025 plan) does not double finished GPUs unless HBM doubles in parallel. As noted in section 2, 2025 HBM supply was nearly sold out by early 2024 and fully sold out by Q3 2024, with Nvidia, AMD, and the custom-silicon cohort all sourcing from the same three vendors. Doubling GPU shipments does not double finished AI clusters unless transformers, switchgear, generators, CDUs, transceivers, and switch ASICs all scale in step.
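
The "doubling one thing doesn't help" argument is a fixed-proportions (Leontief) constraint: output is set by the scarcest input. The sketch below uses normalized, hypothetical supply levels, with 1.0 meaning "enough for the baseline GPU volume". The numbers are illustrative and do not represent actual supply data.

```python
def finished_gpus(supply):
    """Fixed-proportions model: finished output equals the scarcest input."""
    return min(supply.values())


# Normalized supply in GPU-equivalents (1.0 = baseline volume). Hypothetical values.
baseline = {"CoWoS packages": 1.0, "HBM stack sets": 1.0}
cowos_doubled = {"CoWoS packages": 2.0, "HBM stack sets": 1.0}
both_doubled = {"CoWoS packages": 2.0, "HBM stack sets": 2.0}

out_baseline = finished_gpus(baseline)
out_cowos_doubled = finished_gpus(cowos_doubled)
out_both_doubled = finished_gpus(both_doubled)

print(f"Baseline:       {out_baseline:.1f}")
print(f"CoWoS doubled:  {out_cowos_doubled:.1f}")
print(f"Both doubled:   {out_both_doubled:.1f}")
```

The same function extends to the full stack by adding transformers, CDUs, transceivers, and switch ASICs as further inputs. Output moves only when the minimum moves.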

This is why hyperscaler capex growth, while genuinely large, has not translated into proportional MW landings. The 2025 disclosed development pipeline at the colo REITs (§3 of the colocation reference) is at record levels, but pre-leasing percentages — 75%+ at DLR — show that the binding constraint is delivered capacity, not demand.


10. The Custom-Silicon and China Reallocation

Two structural shifts in 2024–2025 redistribute the constraint, without easing it.

Custom silicon. Google TPU v5p/v6e/v7, AWS Trainium2/Trainium3, Meta MTIA v2, and Microsoft Maia all draw from the same TSMC N3/N4 + CoWoS-L + HBM3e pool as Nvidia and AMD. The hyperscalers building custom silicon are not adding capacity — they are reallocating from the merchant Nvidia/AMD pool. Broadcom and Marvell's combined CoWoS draw rose ~2x in 2024 (Semianalysis CoWoS). The total addressable pool is unchanged; the customer mix shifted.

China. Huawei Ascend 910B/910C draws from SMIC, CXMT (DRAM), and YMTC, plus stockpiled HBM that Hynix and Samsung shipped before U.S. export controls were extended to HBM in December 2024. China is effectively building a parallel, lower-yield, lower-capacity stack: domestically sufficient for internal demand from Baidu, Alibaba, ByteDance, Tencent, and the PLA, but not exportable. The marginal effect on the global supply-demand picture is modestly positive for Western buyers (a few percent of demand removed from the Nvidia queue by export bans) and substantially negative for Nvidia's China revenue (H20 controls in April 2025 cut the addressable China datacenter market significantly; see Nvidia FY26 Q1 8-K).


Cost Stack Summary

A rough single-cluster cost decomposition for a 100k-GPU GB200 NVL72 AI factory (2025 build, ~150 MW IT load):

| Layer | Approx share of cluster $ | Approx $/GPU |
|---|---|---|
| GPU silicon (B200 / GB200) | 55–60% | $30–40k |
| HBM (priced inside GPU above) | (within GPU) | (~$5–8k of GPU BOM) |
| Networking (switches + transceivers + cables) | 8–12% | $5–7k |
| Server platform (CPU, NIC, board, chassis, PSU) | 8–12% | $4–6k |
| Storage | 3–5% | $1.5–2.5k |
| Datacenter shell + M&E (allocated per GPU) | 12–18% | $7–10k |
| Power infrastructure (substation, transformers, generators) | 4–6% | $2–3k |
| Cooling (CDUs, plates, chillers) | 2–4% | $1.5–2.5k |

Sources: Semianalysis AI Cloud TCO; Dell'Oro AI Networking Report (subscription); cross-checked against Vertiv, Arista, and Coherent disclosed AI-revenue ratios.
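
The per-GPU column can be summed to an implied all-in cost and cross-checked against the share column. The sketch below leaves HBM out because the table prices it inside the GPU line. It pairs all low values together and all high values together, which is a simplification because the layers would not all hit their extremes at once.

```python
# Per-GPU cost ranges in thousands of USD, from the table (HBM is inside GPU silicon).
PER_GPU_USD_K = {
    "GPU silicon": (30.0, 40.0),
    "Networking": (5.0, 7.0),
    "Server platform": (4.0, 6.0),
    "Storage": (1.5, 2.5),
    "Datacenter shell + M&E": (7.0, 10.0),
    "Power infrastructure": (2.0, 3.0),
    "Cooling": (1.5, 2.5),
}
GPU_COUNT = 100_000

total_low_k = sum(low for low, _ in PER_GPU_USD_K.values())
total_high_k = sum(high for _, high in PER_GPU_USD_K.values())

# Thousands of USD per GPU x GPU count, converted to billions of USD.
cluster_usd_b = (total_low_k * GPU_COUNT / 1e6, total_high_k * GPU_COUNT / 1e6)

gpu_low, gpu_high = PER_GPU_USD_K["GPU silicon"]
gpu_share = (gpu_low / total_low_k, gpu_high / total_high_k)

print(f"All-in per GPU: ${total_low_k:.0f}k to ${total_high_k:.0f}k")
print(f"Cluster total:  ${cluster_usd_b[0]:.1f}B to ${cluster_usd_b[1]:.1f}B")
print(f"GPU share:      {gpu_share[1]:.0%} to {gpu_share[0]:.0%}")
```

The implied GPU share comes out at roughly 56–59%, inside the table's 55–60% band, so the two columns are internally consistent.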

The takeaway: the ~40% of the cluster that isn't GPU silicon is where the additional bottlenecks compound. Even at scale, those layers have multi-quarter lead times that respond slowly to capex. Closing the gap requires every supplier in the stack to invest in parallel — and they are, but on different clocks.


Primary Sources to Watch Going Forward

  • TSMC monthly revenue + quarterly earnings transcripts (investor.tsmc.com) — the single best read on advanced packaging and N3/N2 pull-through.
  • SK Hynix and Micron quarterly transcripts — HBM commentary, particularly forward-year sold-out framing.
  • Nvidia 10-Q and FY transcripts — supply commentary in MD&A and analyst Q&A.
  • Broadcom and Marvell quarterly transcripts — custom-ASIC AI revenue trajectory.
  • Caterpillar, Cummins, Vertiv quarterly transcripts — generator, UPS, and CDU backlog framing.
  • Semianalysis (Dylan Patel et al.) — most granular outside-in modeling of CoWoS allocation and HBM share.
  • Nikkei Asia, FT, Bloomberg — earliest disclosure on TSMC/Hynix/Samsung fab moves.
  • JEDEC, IEDM — HBM4 / N2 / advanced packaging technical roadmaps.