Part 8/11:
AMD has strategically positioned itself in the inference segment of AI workloads, which is growing rapidly. Unlike training, which requires massive models and enormous centralized compute, inference workloads, such as serving real-time AI interactions, can be distributed across smaller deployments closer to users. AMD's GPUs, with their higher memory capacity, are tailored for this purpose: fitting a model onto fewer GPUs enables smaller, energy-efficient clusters suitable for deployment at the edge.
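To see why per-GPU memory capacity translates into smaller clusters, here is a minimal back-of-envelope sketch; the capacity figures and the 1.2x overhead multiplier are illustrative assumptions, not vendor specifications:

```python
import math

# Back-of-envelope: how many GPUs are needed just to hold a model's
# weights for inference? All numbers below are illustrative assumptions.

def gpus_needed(params_billion: float, bytes_per_param: float,
                hbm_gb_per_gpu: float, overhead: float = 1.2) -> int:
    """Minimum GPU count to fit the weights in HBM, with a rough
    multiplier for KV cache and activations."""
    weight_gb = params_billion * bytes_per_param  # 1B params = 1 GB at 1 byte/param
    return math.ceil(weight_gb * overhead / hbm_gb_per_gpu)

# A 70B-parameter model served in FP16 (2 bytes per parameter):
print(gpus_needed(70, 2.0, hbm_gb_per_gpu=192))  # higher-capacity GPU -> 1
print(gpus_needed(70, 2.0, hbm_gb_per_gpu=80))   # lower-capacity GPU  -> 3
```

Under these assumed numbers, the higher-capacity part serves the same model from a single GPU where the lower-capacity one needs three, which is the cluster-size argument in miniature.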
The recent design of AMD's MI300 and MI355X GPUs incorporates higher memory bandwidth, which is essential for streaming model weights and data through the chip quickly during inference. These advancements may allow AMD to capture a significant share of the inference market, especially as AI applications become more distributed and latency-sensitive.
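For a sense of why bandwidth is the binding constraint: in the token-by-token decode phase of LLM inference, generating each new token requires reading roughly all of the model's weights from memory. A hedged sketch of the resulting throughput ceiling follows; the ~5 TB/s bandwidth figure is an illustrative assumption, not a quoted spec:

```python
def decode_tokens_per_sec(params_billion: float, bytes_per_param: float,
                          hbm_bandwidth_tb_s: float) -> float:
    """Rough upper bound on single-stream decode tokens/sec, assuming the
    step is purely memory-bound (ignores compute, KV-cache traffic, overlap)."""
    weight_bytes = params_billion * 1e9 * bytes_per_param
    return hbm_bandwidth_tb_s * 1e12 / weight_bytes

# 70B FP16 model on a GPU with an assumed ~5 TB/s of HBM bandwidth:
print(f"~{decode_tokens_per_sec(70, 2.0, 5.0):.0f} tokens/s ceiling")
```

Doubling the bandwidth roughly doubles this ceiling, which is why the spec matters so much for latency-sensitive serving.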