Zyphra trains large-scale MoE AI model on AMD hardware, signalling a shift in AI training landscape

Friday, 28 November 2025 10:35AM UTC

Zyphra achieves a milestone in AI development by training the first known large-scale Mixture-of-Experts model on AMD’s Instinct MI300X GPUs, challenging the dominance of incumbent GPU platforms and highlighting new industry collaborations.

AMD has announced a significant milestone in artificial intelligence development with Zyphra’s successful training of ZAYA1, a Mixture-of-Experts (MoE) foundation model built entirely using AMD’s hardware and software platform. This marks the first known instance of a large-scale MoE model being trained on AMD Instinct MI300X GPUs, AMD Pensando networking technology, and the ROCm open software stack, demonstrating that AMD is emerging as a viable alternative to more traditionally dominant GPU platforms in AI training.

ZAYA1-Base contains 8.3 billion parameters in total, of which only 760 million are active during inference. Despite this low active parameter count, the model delivers competitive or superior performance across a range of benchmarks, including reasoning, mathematics, and coding tasks: according to AMD, it rivals Alibaba’s Qwen3-4B and Google’s Gemma3-12B and outperforms Meta’s Llama-3-8B and OLMoE. Zyphra’s technical report, published alongside AMD’s 24 November announcement, highlights these achievements and underscores the efficiency gains made possible by AMD’s platform.
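To put the parameter figures in perspective, the short Python sketch below works out what share of ZAYA1-Base is active per token and how that compares with the other models. The comparison sizes are assumptions read off the model names (4B, 8B, 12B), not figures from Zyphra or AMD, and OLMoE is left out because its name does not state a size.

```python
# Figures quoted in the article for ZAYA1-Base.
TOTAL_PARAMS = 8.3e9    # total parameters
ACTIVE_PARAMS = 760e6   # parameters active during inference

# ASSUMPTION: nominal sizes taken from the model names only.
NOMINAL_PARAMS = {
    "Qwen3-4B": 4e9,
    "Llama-3-8B": 8e9,
    "Gemma3-12B": 12e9,
}

def active_fraction(total: float, active: float) -> float:
    """Share of the model's parameters used for each token."""
    return active / total

def size_ratio(other_params: float, active: float) -> float:
    """How many times larger another model is than ZAYA1's active set."""
    return other_params / active

fraction = active_fraction(TOTAL_PARAMS, ACTIVE_PARAMS)
ratios = {name: size_ratio(p, ACTIVE_PARAMS) for name, p in NOMINAL_PARAMS.items()}

print(f"Active share of ZAYA1-Base parameters: {fraction:.1%}")
for name, ratio in ratios.items():
    print(f"{name}: roughly {ratio:.1f}x ZAYA1-Base's active parameters")
```

On these numbers, a little over 9% of the model's parameters are in use at any one time, which is why the benchmark comparison with much larger models is noteworthy.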

A key enabler for the model’s efficient training was the memory capacity of the MI300X GPUs, which feature 192 GB of high-bandwidth memory. This large memory pool allowed Zyphra to avoid complex tensor sharding techniques that can introduce latency and operational complexity in large-scale AI model training. Furthermore, Zyphra reported a tenfold improvement in model save times, attributing this to AMD’s optimised distributed I/O capabilities, which also enhance reliability during extensive training runs.
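A rough back-of-the-envelope calculation shows why 192 GB per GPU matters. The Python sketch below estimates the memory needed to hold the full model state on a single device. The byte counts are assumptions based on a common rule of thumb for mixed-precision training with an Adam-style optimiser, not Zyphra's actual configuration, and activation memory is ignored.

```python
# Figures quoted in the article.
HBM_GB = 192            # MI300X high-bandwidth memory per GPU
TOTAL_PARAMS = 8.3e9    # ZAYA1-Base total parameters

# ASSUMPTION: typical mixed-precision training state, in bytes per parameter.
# This is a rule of thumb, not Zyphra's reported setup.
BYTES_PER_PARAM = {
    "bf16 weights": 2,
    "bf16 gradients": 2,
    "fp32 master weights": 4,
    "fp32 Adam first moment": 4,
    "fp32 Adam second moment": 4,
}

def state_gb(num_params: float, bytes_per_param: int) -> float:
    """Memory in decimal gigabytes for a given per-parameter byte cost."""
    return num_params * bytes_per_param / 1e9

weights_gb = state_gb(TOTAL_PARAMS, BYTES_PER_PARAM["bf16 weights"])
training_gb = state_gb(TOTAL_PARAMS, sum(BYTES_PER_PARAM.values()))

print(f"Weights only (bf16): {weights_gb:.1f} GB")
print(f"Weights + gradients + optimiser state: {training_gb:.1f} GB")
print(f"Fits in one {HBM_GB} GB GPU before activations: {training_gb < HBM_GB}")
```

Under these assumptions the full training state comes to roughly 133 GB, which fits within a single MI300X before activations are counted. On an accelerator with less memory, the same state would have to be split across devices.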

The training infrastructure supporting ZAYA1 was a product of a close collaboration between Zyphra, AMD, and IBM. Together, they developed a high-performance training cluster integrating AMD GPUs with IBM Cloud’s advanced fabric and storage architecture. This infrastructure was integral in allowing Zyphra to scale its training efficiently while maintaining performance and reliability. IBM and AMD’s partnership to deliver this cluster reflects a broader industry trend towards combining hardware and cloud-native software solutions to meet the demanding needs of next-generation AI model development.

Executives from AMD and Zyphra emphasised the milestone’s significance. Emad Barsoum, AMD’s corporate vice president of AI and engineering, pointed to the achievement as a testament to AMD’s growing leadership in accelerated computing, which is empowering innovators like Zyphra to push the boundaries of AI. Zyphra’s CEO, Krithik Puthalath, highlighted the importance of efficiency in the company’s approach, noting that the model’s design, algorithmic development, and hardware choices were all guided by a principle of optimising price-performance to deliver advanced intelligence to customers.

This advancement highlights AMD’s increasing footprint in a domain long dominated by competitors with established GPU solutions. The successful demonstration of large-scale MoE model training on AMD’s integrated platform could signal a shift in the AI hardware landscape, particularly with its combination of extensive memory capacity, high-speed networking, and an open software ecosystem tailored to demanding AI workloads.

Looking ahead, Zyphra intends to continue its collaboration with AMD and IBM to develop next-generation multimodal foundation models, which integrate diverse data types such as text, images, and video. This ongoing partnership aligns with broader industry movements towards more versatile and efficient AI architectures driven by scalable, high-performance hardware-software platforms.

📌 Reference Map:

^[1] (techedt.com) - Paragraphs 1, 2, 3, 4, 5, 6, 7, 8
^[2] (amd.com blog) - Paragraph 1, 2
^[3] (amd.com newsroom) - Paragraph 2, 3
^[4] (amd.com newsroom) - Paragraph 5
^[5] (zyphra.com) - Paragraph 2, 3
^[6] (prnewswire.com) - Paragraph 2, 3
^[7] (investing.com) - Paragraph 2

Source: Noah Wire Services

More on this

https://www.techedt.com/amd-powers-zyphras-large-scale-ai-training-milestone - Please view link - unable to access data
https://www.amd.com/en/blogs/2025/zyphra-demonstrates-large-scale-training-on-amd-with-zaya1.html - Zyphra has successfully trained ZAYA1-base, the first large-scale Mixture-of-Experts (MoE) foundation model entirely on an AMD platform comprising AMD Instinct™ GPUs and AMD Pensando™ Pollara AI NICs. This achievement demonstrates that AMD's hardware and software stack is a viable, high-performance alternative for frontier-scale AI training.
https://www.amd.com/en/newsroom/press-releases/2025-11-24-amd-powers-frontier-ai-training-for-zyphra.html - AMD announced that Zyphra has developed ZAYA1, the first large-scale MoE foundation model trained using AMD Instinct™ MI300X GPUs, AMD Pensando™ networking, and the ROCm open software stack. ZAYA1-base outperforms models like Llama-3-8B and OLMoE across multiple benchmarks and rivals Qwen3-4B and Gemma3-12B.
https://www.amd.com/en/newsroom/press-releases/2025-10-01-ibm-and-amd-collaborate-with-zyphra-on-next-generation.html - IBM and AMD announced a collaboration to provide advanced AI infrastructure to Zyphra, delivering a large cluster of AMD Instinct™ MI300X GPUs on IBM Cloud for training frontier multimodal foundation models. This partnership aims to offer among the largest advanced generative AI training capabilities to date.
https://www.zyphra.com/post/zyphra-demonstrates-large-scale-training-on-full-stack-amd-platform-powered-by-ibm-cloud - Zyphra, in collaboration with AMD and IBM, has demonstrated large-scale training on an integrated AMD platform using AMD Instinct™ GPUs, AMD Pensando™ networking, and the ROCm software stack. The ZAYA1 model, with 8.3 billion total parameters and 760 million active parameters, achieves performance comparable to leading models such as Qwen3-4B and Gemma3-12B.
https://www.prnewswire.com/news-releases/zyphra-demonstrates-first-large-scale-training-on-integrated-amd-compute-and-networking-powered-by-ibm-cloud-302624291.html - Zyphra has demonstrated large-scale training on an integrated AMD platform, introducing ZAYA1, the first large-scale MoE foundation model trained entirely on AMD Instinct™ GPUs, AMD Pensando™ networking, and the ROCm software stack. Despite operating with a fraction of the active parameter count, ZAYA1-base achieves performance comparable to leading models such as Qwen3-4B and Gemma3-12B.
https://www.investing.com/news/company-news/zyphra-trains-first-largescale-moe-model-on-amd-hardware-93CH-4375333 - Zyphra has developed ZAYA1, the first large-scale MoE foundation model trained entirely on AMD hardware, including AMD Instinct MI300X GPUs, AMD Pensando networking, and the AMD ROCm open software stack. Benchmark results indicate that ZAYA1-Base matches or exceeds the performance of several prominent models across reasoning, mathematics, and coding tasks.

Noah Fact Check Pro

The draft above was created using the information available at the time the story first emerged. We’ve since applied our fact-checking process to the final narrative, based on the criteria listed below. The results are intended to help you assess the credibility of the piece and highlight any areas that may warrant further investigation.

Freshness check

Score: 8

Notes: The narrative was first published on 24 November 2025, which is also the earliest known publication date. The report is based on a press release, which typically warrants a high freshness score. The content has been republished across multiple reputable outlets, including AMD's official blog and Zyphra's official website. No discrepancies in figures, dates, or quotes were found. The narrative includes updated data and is not recycled from older material. No similar content appeared more than 7 days earlier.

Quotes check

Score: 9

Notes: Direct quotes from Emad Barsoum, AMD’s corporate vice president of AI and engineering, and Krithik Puthalath, Zyphra’s CEO, were found in the official press release. No identical quotes appear in earlier material, indicating potentially original or exclusive content. No variations in quote wording were noted.

Source reliability

Score: 10

Notes: The narrative originates from reputable organisations: AMD and Zyphra. AMD is a well-established company with a strong public presence, and Zyphra's official website provides detailed information about its operations. The press release is accessible on both AMD's and Zyphra's official websites, confirming the authenticity of the information.

Plausibility check

Score: 10

Notes: The claims about Zyphra's successful training of the ZAYA1 model using AMD's hardware and software platform are corroborated by multiple reputable sources, including AMD's official press release and Zyphra's official website. The technical details, such as the use of AMD Instinct MI300X GPUs and the ROCm open software stack, are consistent across sources. The narrative's language and tone are consistent with industry standards, and the structure is focused on the main claim without excessive or off-topic detail.

Overall assessment

Verdict (FAIL, OPEN, PASS): PASS

Confidence (LOW, MEDIUM, HIGH): HIGH

Summary: The narrative is fresh, originating from a recent press release by AMD and Zyphra. The quotes are original and not reused from earlier material. The sources are reputable, and the claims are plausible, supported by multiple reputable outlets. No significant credibility risks were identified.
