
Why Agentic AI Will Change Memory Demand

Agentic AI keeps long-lived context instead of discarding state after each query. As explained in this article from The Register, KV (key-value) caches must persist across many steps, so memory residency grows from milliseconds to hours or days. GPUs stall when that context cannot be accessed quickly enough, which turns memory into the primary scaling constraint.
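To see why residency matters, a rough sizing sketch helps. The figures below assume a hypothetical 70B-class model with grouped-query attention (80 layers, 8 KV heads, head dimension 128, fp16 values); they are illustrative, not the specification of any particular model:

```python
def kv_cache_bytes(num_layers: int, num_kv_heads: int, head_dim: int,
                   seq_len: int, bytes_per_elem: int = 2) -> int:
    """Bytes needed to hold one sequence's KV cache (fp16 by default)."""
    # Factor of 2: one key tensor and one value tensor per layer.
    return 2 * num_layers * num_kv_heads * head_dim * seq_len * bytes_per_elem

# Hypothetical 70B-class model, 128K-token context:
print(kv_cache_bytes(80, 8, 128, 128 * 1024) / 2**30)  # → 40.0 (GiB)
```

At roughly 40 GiB per long-lived context under these assumptions, a single 80 GiB HBM GPU can keep only one or two such sessions resident, which is why the overflow tiers discussed below matter.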

Which Memory Components Will See Higher Demand?

High-Bandwidth Memory (HBM)

Demand will rise because KV caches ideally stay in HBM for speed, but agentic workloads require far more capacity than current HBM can provide. Even though HBM is expensive, it remains the fastest tier and will be saturated first. You can therefore expect strong demand for larger-capacity HBM stacks (HBM4, HBM4E).

System DRAM (DDR5 / DDR6)

Here demand will rise because DRAM becomes the overflow tier when HBM is insufficient. Agentic AI increases total memory footprint per GPU node. As a result, you require more DRAM per server, higher-bandwidth DIMMs, and memory-rich CPU hosts.

CXL-Attached Memory (CXL DRAM Expanders & Memory Pooling Devices)

Demand for this memory rises because CXL enables disaggregated, pooled memory at near-DRAM latency (on the order of hundreds of nanoseconds). Offloading KV cache to CXL memory has been reported to reduce GPU memory usage by up to 87%, and multiple agents can share the same context without duplication. Buyers should therefore expect rapid growth in CXL memory modules, memory pooling appliances, and coherent fabrics.
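The sharing-without-duplication claim can be illustrated with a toy content-addressed pool: two agents that submit the same context block (say, a shared system prompt) store it once and bump a reference count. The class, block granularity, and hashing scheme here are assumptions for illustration, not a real CXL pooling API:

```python
import hashlib

class SharedContextPool:
    """Toy content-addressed pool: agents sharing a block store it once."""
    def __init__(self):
        self.blocks = {}  # digest -> [data, refcount]

    def put(self, data: bytes) -> str:
        key = hashlib.sha256(data).hexdigest()
        if key in self.blocks:
            self.blocks[key][1] += 1   # another agent reuses the block
        else:
            self.blocks[key] = [data, 1]
        return key

    def release(self, key: str) -> None:
        self.blocks[key][1] -= 1
        if self.blocks[key][1] == 0:
            del self.blocks[key]       # last reference gone, free the block

pool = SharedContextPool()
a = pool.put(b"shared system prompt + tool definitions")
b = pool.put(b"shared system prompt + tool definitions")
print(a == b, len(pool.blocks))  # → True 1
```

Real pooled-memory deployments do this at page or cache-block granularity with hardware coherence rather than hashing, but the economics are the same: N agents, one physical copy.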

Near-Compute Flash Tiers (e.g., Nvidia ICMS, NVMe-backed fabrics)

Here, demand rises because new “G3.5” tiers bridge the gap between HBM and SSD. They are designed specifically for large, streaming KV-cache reads and enable scale-out context storage without crippling latency. Expect increased demand for high-performance NVMe, RDMA-attached flash, and specialized inference-context storage appliances.
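Taken together, the tiers above form a hierarchy, and the placement logic they imply can be sketched as greedy fastest-first assignment. The capacities and tier names below are illustrative assumptions; real systems would also weigh bandwidth and run eviction policies:

```python
# Hypothetical per-node tier capacities in GiB, ordered fastest-first.
TIERS = [
    ["HBM", 80, 0.0],      # on-package, saturated first
    ["DRAM", 512, 0.0],    # host overflow tier
    ["CXL", 2048, 0.0],    # pooled expander memory
    ["NVMe", 16384, 0.0],  # near-compute flash ("G3.5")
]

def place(block_gib: float, tiers=TIERS) -> str:
    """Greedily place a KV-cache block in the fastest tier with room."""
    for tier in tiers:
        name, capacity, used = tier
        if used + block_gib <= capacity:
            tier[2] = used + block_gib
            return name
    raise MemoryError("all tiers full; evict or recompute the context")

print(place(60))  # → HBM
print(place(40))  # → DRAM (HBM would overflow)
```

The sketch makes the article's point concrete: once HBM fills, every additional long-lived context lands in DRAM, then CXL, then flash, so demand rises across all four tiers at once.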

High-Bandwidth Interconnects (Ethernet, NVLink, RDMA fabrics)

Demand here rises because memory tiers only work if GPUs can access them with minimal jitter. Agentic AI increases cross-node memory traffic, which means you can expect to see growth in ultra-low-latency networking hardware and memory-centric fabrics.

Overall Market Impact

Agentic AI will shift the industry from “more compute” to “more memory, closer to compute.” The biggest winners will be:

  • HBM manufacturers (Micron, Samsung, SK Hynix)
  • DRAM vendors
  • CXL memory module and pooling system providers
  • NVMe flash and near-compute storage vendors
  • Networking companies enabling memory fabrics

Agentic AI shifts the bottleneck from compute to memory.

Hyperscalers are driving and will continue to drive the first and largest wave of demand—especially for HBM and CXL memory fabrics. Enterprises will follow with a more modest but steady increase in DRAM, CXL modules, and NVMe for local inference. Most companies will deploy:

  • Smaller on-prem inference servers
  • CXL memory expanders for local LLMs
  • High-capacity DRAM for agentic workflows that run internally
  • NVMe flash tiers for local context storage

Why? Because many agentic AI use cases involve:

  • Proprietary data
  • Compliance constraints
  • Low-latency internal workflows
  • Integration with ERP/CRM/SCM systems

So enterprises will increase memory per server, but not to hyperscaler levels.

Mapping Agentic AI memory demand to specific vendors

HBM (High-Bandwidth Memory) Vendors

HBM is the biggest winner because agentic AI keeps KV caches “alive” for long periods and needs extreme bandwidth. Demand here comes primarily from hyperscalers.

SK Hynix

  • Currently the market leader in HBM supply for Nvidia.
  • HBM3E and HBM4 demand will surge as agentic AI increases context windows and multi-agent orchestration.
  • Likely to see the largest absolute revenue lift.

Samsung

  • Strong HBM roadmap, aggressively pursuing HBM4 capacity.
  • Gains share as hyperscalers diversify supply chains.
  • Benefits from both HBM and DDR5/DDR6 demand.

Micron

  • Smaller HBM share today but rapidly expanding.
  • Strong position in HBM3E and future HBM4.
  • Also benefits from DDR5/DDR6 and CXL DRAM.

DRAM Vendors (DDR5 / DDR6)

Agentic AI pushes more memory off GPU into host DRAM and CXL pools. DRAM demand will rise across the board, especially for servers with 1–2 TB per node.

Samsung, SK Hynix, Micron
All three dominate DRAM. There will be demand increases for:

  • High-capacity DIMMs
  • High-bandwidth DDR5/DDR6
  • Low-latency DRAM for CXL expanders

CXL Memory Vendors

CXL is the biggest structural shift because agentic AI requires disaggregated, pooled memory. CXL demand is driven by hyperscalers first, and then by enterprise on-prem. The standout non-memory vendor in this segment is likely to be Astera Labs.

Samsung

  • Leading supplier of CXL DRAM modules.
  • Strong position in memory pooling appliances.

SK Hynix

  • CXL DRAM and early CXL memory expander designs.

Micron

  • CXL DRAM and future CXL-attached persistent memory.

Astera Labs

  • The key enabler for CXL memory expansion because it provides:
    • CXL memory controllers
    • Memory pooling switches
    • CXL fabric management

Marvell

  • CXL fabric controllers and switches.

Flash / NVMe Vendors (Near-Compute Storage)

Agentic AI creates a new “G3.5” tier: fast flash used as extended KV cache. Flash demand will increase for both near-compute tiers and bulk storage.

Western Digital

  • NVMe SSDs for AI inference clusters (for hyperscaler deployments)

Kioxia

  • High-endurance NVMe for AI workloads.

Samsung

  • Large share of hyperscaler NVMe supply.

Solidigm (SK Hynix subsidiary)

  • High-capacity QLC drives for context storage.

In summary

Current and future memory demand will mostly be driven by hyperscalers.

It will affect the following suppliers:

  • SK Hynix, Samsung, Micron (HBM + DRAM)
  • Astera Labs, Marvell (CXL)
  • Samsung, WD, Kioxia, Solidigm (NVMe)
  • Nvidia, Broadcom, Arista (networking)

There will be a secondary but growing demand driven by enterprise on-premises requirements.

It will affect suppliers including:

  • Samsung, Micron (DRAM + CXL)
  • Supermicro, Dell, HPE (servers)
  • Solidigm, WD (NVMe)

The shortage persists, making supply management increasingly complex. Contact us to check the actual availability of the components you are using and to identify reliable alternative sources, reducing risk to your supply chain.
