The Memory Wall: How HBM Shortages Will Define the AI Hierarchy in 2027

On 21 January 2026, Micron’s chief business officer told investors the company was “sold out” of high-bandwidth memory (HBM) for the entire 2026 calendar year, with the bulk of 2027 capacity already under negotiation. SK Hynix and Samsung have made nearly identical statements; Samsung confirmed in late October 2025 that its 2026 HBM allocations were closed, weeks after it finally cleared NVIDIA’s twelve-high HBM3E qualification in September. Hyperscalers, accelerator vendors, and a handful of strategic state buyers now control the entire 2026 HBM output of all three suppliers, and have begun pre-paying for 2027 wafers that no fab has yet produced. The market has stopped trading; it is allocating. The semiconductor story of the next two years is not about silicon area, lithography nodes, or the GPU itself. It is about a stack of bonded DRAM dies, manufactured by three firms in four cities and packaged with a single advanced packaging technology at a single foundry. That bottleneck, more than any export-control regime or capital-cost differential, will set the AI compute hierarchy through 2027.

The Suppliers and Their Share. Three firms manufacture HBM at scale. SK Hynix held roughly 62 percent of HBM unit shipments and 57 percent of HBM revenue as of the third quarter of 2025, and was the sole qualified supplier of twelve-high HBM3E to NVIDIA’s Blackwell B200 and GB200 lines through most of that year. UBS projects SK Hynix will hold approximately 70 percent of the HBM4 market on NVIDIA’s Rubin platform when volume shipments begin in 2026. Micron overtook Samsung for the second slot during 2025 on the back of its twelve-high HBM3E qualification with NVIDIA, and is now sampling HBM4 ahead of a ramp scheduled for the second quarter of 2026. Samsung, the historical DRAM leader, fell to third place after eighteen months of failed qualification attempts; it eventually cleared NVIDIA’s qualification in September 2025 but secured an initial allocation of only 10,000 units, a token volume relative to SK Hynix’s monthly output. Samsung’s larger 2026 prize is Google’s TPUv7, where it supplies more than 60 percent of the HBM3E. The competitive structure that matters in 2027 is therefore: SK Hynix as systemic supplier, Micron as swing producer with American capacity, and Samsung as captive supplier to Google’s TPU programme.

The Packaging Chokepoint. An HBM stack is useless without an advanced package, and the advanced package in question is TSMC’s Chip-on-Wafer-on-Substrate (CoWoS), specifically the CoWoS-L variant used by Blackwell, Rubin, and AMD’s MI400. TSMC is racing to expand monthly CoWoS capacity to 150,000 wafers by the end of 2026, a roughly fourfold increase from late 2024, backed by approximately $56 billion in capex earmarked for advanced packaging through 2027. Morgan Stanley estimates NVIDIA alone has booked around 60 percent of TSMC’s 2026 CoWoS output, equivalent to roughly 510,000 CoWoS-L wafers over the year, with AMD, Broadcom, AWS, and Google taking most of the remainder. The implication is brutal for anyone outside that list. There is no second source for CoWoS-L at scale; Samsung Foundry’s I-Cube and Intel Foveros remain functionally untested for production HBM4 accelerator volumes. A GPU die without a CoWoS-L slot is a paperweight, and a CoWoS-L slot without HBM stacks is no better. The two constraints must be satisfied jointly, and between them they are controlled by just four companies: the three memory makers and TSMC.
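As a rough consistency check on the packaging figures above, the Python sketch below back-solves two quantities the paragraph implies but does not state: TSMC’s total 2026 CoWoS output and the late-2024 monthly baseline. It assumes the 60 percent share and the 510,000-wafer figure describe the same annual total, and that “roughly fourfold” means exactly 4x. Both are simplifications, and the function and constant names are illustrative only.

```python
def implied_total_output(booked_wafers: float, booked_share: float) -> float:
    """Back-solve total output from one buyer's booking and its share of the total."""
    if not 0.0 < booked_share <= 1.0:
        raise ValueError("booked_share must be in (0, 1]")
    return booked_wafers / booked_share


def implied_baseline(target_monthly: float, growth_multiple: float) -> float:
    """Back-solve the starting monthly capacity from a target and a growth multiple."""
    if growth_multiple <= 0:
        raise ValueError("growth_multiple must be positive")
    return target_monthly / growth_multiple


# Figures quoted in the text.
NVIDIA_BOOKED_WAFERS = 510_000  # NVIDIA's 2026 CoWoS-L wafers
NVIDIA_SHARE = 0.60             # NVIDIA's share of TSMC's 2026 CoWoS output
TARGET_MONTHLY = 150_000        # wafers per month by the end of 2026
GROWTH_MULTIPLE = 4.0           # assumption: "roughly fourfold" taken as exactly 4x

total_2026 = implied_total_output(NVIDIA_BOOKED_WAFERS, NVIDIA_SHARE)
average_monthly_2026 = total_2026 / 12
left_for_others = total_2026 - NVIDIA_BOOKED_WAFERS
late_2024_monthly = implied_baseline(TARGET_MONTHLY, GROWTH_MULTIPLE)

if __name__ == "__main__":
    print(f"Implied 2026 total output:    {total_2026:,.0f} wafers")
    print(f"Implied 2026 monthly average: {average_monthly_2026:,.0f} wafers")
    print(f"Left for all other buyers:    {left_for_others:,.0f} wafers")
    print(f"Implied late-2024 capacity:   {late_2024_monthly:,.0f} wafers per month")
```

Under these assumptions the implied 2026 total is about 850,000 wafers, or roughly 71,000 per month on average. That is well below the 150,000 year-end target, which is what a capacity ramp through the year would look like, and it leaves about 340,000 wafers for every buyer other than NVIDIA.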

Demand From Three Buyer Tiers. The accelerators consuming this constrained output are easy to enumerate. NVIDIA’s Rubin R100 carries 288 GB of HBM4 across eight stacks, delivers 22 TB per second of memory bandwidth, and begins shipping to early-access hyperscalers in the second half of 2026, with volume in the first quarter of 2027. Rubin Ultra, slated for 2027, scales to 384 GB of HBM4E and 32 TB per second. AMD’s MI350, available since the third quarter of 2025, ships with 288 GB of HBM3E at 8 TB per second; its successor MI400, due in 2026, carries 432 GB of HBM4 at 19.6 TB per second. Google’s TPUv7 and Amazon’s Trainium3 both consume HBM3E in 2026 and pivot to HBM4 in 2027. The three buyer tiers are converging on the same parts list. Hyperscalers (Microsoft, Meta, Google, Amazon, Oracle) sign multi-year prepay contracts directly with SK Hynix and Samsung; second-tier clouds (CoreWeave, Lambda, Nebius, Nscale) receive allocations through NVIDIA’s order book; sovereign AI buyers in the EU, the United Kingdom, Japan, Korea, the Gulf, and India queue behind both. Reports that HBM3E prices are rising roughly 20 percent for 2026 contracts suggest that the suppliers have priced the queue rather than expanded it.
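To put the four parts on a common footing, the sketch below takes only the capacity and bandwidth figures quoted above and derives bandwidth per gigabyte of capacity and the generation-on-generation bandwidth uplift. The `Accelerator` class and its field names are illustrative, decimal units (1 TB = 1,000 GB) are an assumption, and the stack count is filled in only for Rubin R100 because it is the only one the text states.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Accelerator:
    name: str
    memory_type: str
    capacity_gb: float
    bandwidth_tb_s: float
    stacks: Optional[int] = None  # stated in the text for Rubin R100 only

    @property
    def gb_s_per_gb(self) -> float:
        """Bandwidth in GB/s per GB of capacity (decimal units assumed)."""
        return self.bandwidth_tb_s * 1000.0 / self.capacity_gb

    @property
    def tb_s_per_stack(self) -> Optional[float]:
        """Per-stack bandwidth, or None when the stack count is not given."""
        return None if self.stacks is None else self.bandwidth_tb_s / self.stacks


def uplift(successor: Accelerator, predecessor: Accelerator) -> float:
    """Ratio of successor bandwidth to predecessor bandwidth."""
    return successor.bandwidth_tb_s / predecessor.bandwidth_tb_s


# Figures as quoted in the text.
RUBIN = Accelerator("Rubin R100", "HBM4", 288, 22.0, stacks=8)
RUBIN_ULTRA = Accelerator("Rubin Ultra", "HBM4E", 384, 32.0)
MI350 = Accelerator("MI350", "HBM3E", 288, 8.0)
MI400 = Accelerator("MI400", "HBM4", 432, 19.6)

if __name__ == "__main__":
    for part in (RUBIN, RUBIN_ULTRA, MI350, MI400):
        print(f"{part.name:12s} {part.memory_type:6s} "
              f"{part.gb_s_per_gb:6.1f} GB/s per GB of capacity")
    print(f"Rubin R100 per stack: {RUBIN.tb_s_per_stack:.2f} TB/s")
    print(f"MI350 -> MI400 bandwidth uplift: {uplift(MI400, MI350):.2f}x")
    print(f"Rubin -> Rubin Ultra bandwidth uplift: {uplift(RUBIN_ULTRA, RUBIN):.2f}x")
```

On the quoted figures, the HBM3E-based MI350 offers about 28 GB/s per GB of capacity against about 76 GB/s for Rubin R100, and the move from MI350 to MI400 is a 2.45x bandwidth step. These are spec-sheet ratios only and say nothing about delivered performance.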

The Capex That Will Not Arrive in Time. Memory firms understand the shortage and are spending against it, but the lead time for HBM-capable DRAM capacity is structurally long. SK Hynix has committed roughly 600 trillion won, about $430 billion at current rates, to its Yongin cluster over the lifecycle of four fabs; the first Yongin fab carries a confirmed 9.4 trillion won investment and a clean-room opening accelerated to February 2027, with phase two following later in 2027 and phase three in 2028. Cheongju’s M15X fab, a 20 trillion won build, opens its first clean room in May 2026 but enters volume production only later in the year. Micron raised its fiscal 2026 capex guidance to $20 billion, up from $18 billion, almost entirely directed at HBM, and pulled its first Boise wafer output forward to mid-2027; a second Boise fab is funded through the CHIPS Act’s $6.1 billion award but does not produce wafers before late 2028. Samsung’s Pyeongtaek expansion follows a similar trajectory. None of this capacity meaningfully relieves 2026 or 2027. The supply curve through the end of 2027 is essentially fixed by decisions made in 2023 and 2024, when the industry was still planning around merchant DRAM cycles rather than AI accelerator economics.

The Hierarchy That Locks In. The consequence is a three-tier compute order through 2027, defined by HBM access rather than by capital alone. Tier one is the four American hyperscalers plus Oracle, which between them have secured the bulk of Blackwell, Rubin early access, MI400, and their own ASIC allocations through bilateral prepay with the memory firms and direct CoWoS-L bookings at TSMC. Tier two comprises specialist GPU clouds (CoreWeave, Lambda, Nebius, Nscale) and a small set of frontier-model laboratories with NVIDIA strategic-account status; they will receive Rubin in volume but on NVIDIA’s allocation terms, not their own. Tier three is everyone else. The EU’s AI Factories programme, Japan’s METI-backed compute build-out, Korea’s domestic AI infrastructure, Gulf sovereigns including G42 and HUMAIN, and the long tail of regional clouds and national champions are bidding against this stack with neither the prepay capital nor the relationship leverage of the tier-one buyers. They will receive HBM3E hardware in 2026 and 2027, not HBM4, and will train and serve at a structural bandwidth disadvantage. That disadvantage is not a marketing distinction. It is the difference between competitive frontier inference economics and second-rate ones.
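One way to make the bandwidth disadvantage concrete is a common back-of-envelope bound for autoregressive decoding: when a single request is memory-bound, every generated token requires reading the model weights once, so tokens per second cannot exceed memory bandwidth divided by weight size. The sketch below applies that bound to the 8 TB per second (HBM3E) and 22 TB per second (HBM4) figures quoted earlier in this report. The 70 GB model (for example, 70 billion parameters at one byte each) is purely hypothetical, and the bound ignores KV-cache traffic, batching, compute limits, and interconnect, so it is an upper limit on a simplified model and not a performance forecast.

```python
def decode_tokens_per_second_bound(bandwidth_tb_s: float, weight_gb: float) -> float:
    """Upper bound on single-request decode rate when weight reads dominate.

    Assumes each token reads all weights once and nothing else limits throughput.
    Uses decimal units: 1 TB/s = 1,000 GB/s.
    """
    if bandwidth_tb_s <= 0 or weight_gb <= 0:
        raise ValueError("bandwidth and weight size must be positive")
    return bandwidth_tb_s * 1000.0 / weight_gb


HBM3E_BANDWIDTH_TB_S = 8.0   # MI350 figure quoted in the report
HBM4_BANDWIDTH_TB_S = 22.0   # Rubin R100 figure quoted in the report
HYPOTHETICAL_WEIGHT_GB = 70.0  # assumption: illustrative model size, not from the report

hbm3e_bound = decode_tokens_per_second_bound(HBM3E_BANDWIDTH_TB_S, HYPOTHETICAL_WEIGHT_GB)
hbm4_bound = decode_tokens_per_second_bound(HBM4_BANDWIDTH_TB_S, HYPOTHETICAL_WEIGHT_GB)

if __name__ == "__main__":
    print(f"HBM3E part bound: {hbm3e_bound:.0f} tokens/s per request")
    print(f"HBM4 part bound:  {hbm4_bound:.0f} tokens/s per request")
    print(f"Ratio: {hbm4_bound / hbm3e_bound:.2f}x")
```

Whatever model size is chosen, the ratio between the two bounds equals the bandwidth ratio, 2.75x on these figures. Under this simplified model, an operator on the HBM3E part would need proportionally more accelerators to match the same per-request decode rate.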

The Verdict. The defining input to the 2027 AI hierarchy is not GPU silicon, not power, not data, and not regulation. It is the stack of bonded DRAM dies on a CoWoS-L substrate, and the four companies that control its production. The lithography wars of the 2010s and the early 2020s focused political attention on extreme ultraviolet scanners and on Taiwan as a geographic concentration risk, but the actual chokepoint of the AI build-out is now packaging and HBM, and it sits across two further concentrations: SK Hynix’s fabs in and around Cheongju in Korea, and TSMC’s advanced packaging lines in Hsinchu and Tainan. Capex announced in 2025 and 2026 does not translate into meaningful wafer output before late 2027 at the earliest, and into HBM4 volume probably not before 2028. By the time the supply curve bends, the hierarchy will have set. Hyperscalers and Rubin-class national-champion buyers will own frontier AI economics; everyone else will inherit a second-class inference cost structure they cannot price their way out of. The window to alter that outcome closed when 2026 supply sold out, which it did before most governments outside Washington and Seoul had begun to understand the variable.

Report Disclaimer

This report is provided for informational purposes only and does not constitute financial, legal, or investment advice. The views expressed are those of Bretalon Ltd and are based on information believed to be reliable at the time of publication. Past performance is not indicative of future results. Recipients should conduct their own due diligence before making any decisions based on this material.