Small AI model takes on GPT-5 and Claude Sonnet 4.5


Web Desk: Zyphra, a Palo Alto-based startup, has released ZAYA1-8B, a new reasoning-focused language model that aims to challenge the industry’s largest players with a fraction of their computing requirements.

The model uses a mixture-of-experts (MoE) architecture, with just over 8 billion total parameters, of which only 760 million are active for any given token. Despite this lightweight design, Zyphra reports that the model delivers benchmark results competitive with high-tier models such as GPT-5-High, DeepSeek-V3.2, and Claude Sonnet 4.5.
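To illustrate what "active" means in a mixture-of-experts model, here is a minimal sketch of generic top-k routing, in which a router scores every expert but only the highest-scoring ones run for a given token. The expert count, the value of k, and the function name `top_k_route` are hypothetical choices for illustration; this is not Zyphra's router, whose details the article does not describe. Only the two parameter counts come from the article.

```python
import math
import random


def top_k_route(router_logits, k):
    """Return (expert_index, weight) pairs for the k highest-scoring experts."""
    ranked = sorted(range(len(router_logits)),
                    key=lambda i: router_logits[i], reverse=True)
    top = ranked[:k]
    # Softmax over the selected logits only, so the weights sum to 1.
    peak = max(router_logits[i] for i in top)
    exps = [math.exp(router_logits[i] - peak) for i in top]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(top, exps)]


# Hypothetical sizes for illustration; not ZAYA1-8B's real configuration.
NUM_EXPERTS = 16
TOP_K = 1

rng = random.Random(0)
logits = [rng.gauss(0.0, 1.0) for _ in range(NUM_EXPERTS)]
chosen = top_k_route(logits, TOP_K)

# Figures from the article: about 8 billion total, 760 million active.
TOTAL_PARAMS = 8.0e9
ACTIVE_PARAMS = 760e6
active_fraction = ACTIVE_PARAMS / TOTAL_PARAMS

print(f"experts used for this token: {len(chosen)} of {NUM_EXPERTS}")
print(f"active share of parameters: {active_fraction:.1%}")
```

With the article's figures, roughly 9.5% of the model's parameters take part in processing each token, which is where the compute savings come from.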

Zyphra built the model on its proprietary MoE++ architecture, which incorporates Compressed Convolutional Attention and a custom router to improve efficiency and reduce memory consumption. Unlike many existing models that add reasoning abilities during fine-tuning, Zyphra says it integrated these capabilities directly into the initial pre-training phase.

The company also introduced a method called Markovian RSA, which it says lets the model explore multiple reasoning paths without the memory growth that typically slows inference. Zyphra reports that the model scored 91.9% on the AIME ’25 benchmark while requiring a significantly smaller hardware footprint than its competitors.

Zyphra trained the model entirely on AMD Instinct MI300 GPUs. This choice highlights the growing viability of AMD hardware as a primary alternative to Nvidia for high-end AI development.

Because of its low active parameter count, ZAYA1-8B is designed for local deployment on enterprise hardware and edge devices. This capability allows businesses to reduce their reliance on cloud providers and lower latency for real-time applications.
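For a rough sense of what local deployment involves, the sketch below estimates how much memory the weights alone would occupy at a few common numeric precisions. It assumes a round 8 billion parameters, and the precisions listed are generic options rather than formats Zyphra has confirmed. In a typical mixture-of-experts setup all experts must be kept in memory, so the estimate uses the total parameter count; the low active count mainly reduces the compute needed per token. Memory for the attention cache and activations is not included.

```python
# Rough weight-memory estimate. The parameter count comes from the article;
# the precisions are common choices, not formats Zyphra has confirmed.
TOTAL_PARAMS = 8.0e9  # "just over 8 billion" total, rounded down

# Bytes needed to store one parameter at each precision (assumed options).
BYTES_PER_PARAM = {"bf16": 2.0, "int8": 1.0, "int4": 0.5}


def weight_gb(num_params, bytes_per_param):
    """Memory for the weights alone, in gigabytes (10**9 bytes)."""
    return num_params * bytes_per_param / 1e9


estimates = {name: weight_gb(TOTAL_PARAMS, size)
             for name, size in BYTES_PER_PARAM.items()}

for name, gb in estimates.items():
    print(f"{name}: ~{gb:.0f} GB of weights")
```

Under these assumptions the weights take about 16 GB at 16-bit precision and about 4 GB at 4-bit precision, which is within reach of a single workstation GPU or a well-equipped edge device.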

The model is now available on Hugging Face under an Apache 2.0 open-source license. This licensing allows developers and corporations to customize and use the technology for commercial purposes.

Founded in 2021, Zyphra focuses on increasing intelligence density in AI systems. The company reached unicorn status in June 2025 after a $110 million Series A funding round, with backing from industry leaders including AMD and IBM.