Cerebras Systems, a leader in generative AI technology, has once again broken its own record for the fastest AI chip with the unveiling of the Wafer Scale Engine 3 (WSE-3). The new engine delivers twice the performance of its predecessor, the WSE-2, at the same power consumption and price. Purpose-built for training the industry's most advanced AI models, the WSE-3 is fabricated on a 5nm process, packs 4 trillion transistors, and powers the Cerebras CS-3 AI supercomputer, which achieves a staggering 125 petaflops of peak AI performance through 900,000 AI-optimized compute cores.

Key Features Include:

  • 4 trillion transistors
  • 900,000 AI-specific cores
  • Peak performance of 125 petaflops
  • 44GB of onboard SRAM
  • Manufactured using a 5nm TSMC process
  • External memory options of 1.5TB, 12TB, or up to 1.2PB
  • Capable of training AI models with up to 24 trillion parameters
  • Supports clustering of up to 2048 CS-3 systems
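
The memory and parameter figures above can be cross-checked with simple arithmetic. The sketch below (an estimate using common assumptions about mixed-precision Adam optimizer state, not Cerebras specifications) divides the 1.2 PB external-memory option across 24 trillion parameters:

```python
# Back-of-envelope check (assumptions, not Cerebras specs): how much
# external memory is available per parameter for a 24T-parameter model?
PARAMS = 24e12          # 24 trillion parameters (from the spec list)
EXT_MEM = 1.2e15        # 1.2 PB external memory, decimal petabytes assumed

bytes_per_param = EXT_MEM / PARAMS
print(f"Memory budget: {bytes_per_param:.0f} bytes/parameter")  # 50 bytes

# Typical mixed-precision Adam state (an assumption about the recipe):
# fp16 weights (2) + fp16 grads (2) + fp32 master weights (4)
# + fp32 momentum (4) + fp32 variance (4) = 16 bytes/parameter
adam_state = 2 + 2 + 4 + 4 + 4
print(f"Adam training state: {adam_state} bytes/parameter")
print(f"Fits with headroom: {bytes_per_param >= adam_state}")
```

Roughly 50 bytes per parameter comfortably covers a typical mixed-precision optimizer state of about 16 bytes per parameter, leaving headroom for checkpoints and activations.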

With up to 1.2 petabytes of external memory, the CS-3 can train frontier models, including models ten times larger than GPT-4 and Gemini. It streamlines training for models of up to 24 trillion parameters by eliminating the need for partitioning or refactoring, significantly improving workflow efficiency and developer productivity: training a one-trillion-parameter model on the CS-3 is as straightforward as training a one-billion-parameter model is on traditional GPU setups.

The CS-3 is designed for both enterprise and hyperscale needs. A compact four-system configuration can fine-tune 70-billion-parameter models within a day, while at full scale, with 2048 systems, the CS-3 can train a Llama 70B model from scratch in just 24 hours, a monumental achievement for generative AI.
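
The headline cluster numbers lend themselves to a back-of-envelope sanity check. The sketch below (a hypothetical estimate; the 2-trillion-token count and the 6·N·D training-FLOP rule of thumb are assumptions, not Cerebras figures) computes the aggregate peak compute of a 2048-system cluster and the total FLOPs to train a 70B model:

```python
# Scaling sketch (assumptions flagged inline): aggregate peak compute of
# a 2048-system CS-3 cluster, and the common 6*N*D training-FLOP estimate.
PEAK_PER_CS3 = 125e15            # 125 petaflops per CS-3 (from the article)
SYSTEMS = 2048
cluster_peak = PEAK_PER_CS3 * SYSTEMS
print(f"Cluster peak: {cluster_peak / 1e18:.0f} exaflops")  # 256 exaflops

# Rule-of-thumb estimate: training FLOPs ~ 6 * parameters * tokens.
# The token count is an assumption (Llama 2 70B used roughly 2T tokens).
N, D = 70e9, 2e12
train_flops = 6 * N * D
hours_at_peak = train_flops / cluster_peak / 3600
print(f"Hours at theoretical peak: {hours_at_peak:.1f}")
```

Peak figures are never reached in practice; sustained throughput on real training runs is a fraction of peak, which helps explain why end-to-end training still takes on the order of a day.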

Cerebras offers an enhanced software framework with out-of-the-box support for PyTorch 2.0 and the latest AI techniques, including multi-modal and diffusion models. Cerebras hardware is also unique in its support for dynamic and unstructured sparsity, which can accelerate training by up to eight times.
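
Cerebras's software stack is proprietary, but the arithmetic behind sparsity speedups is generic. The sketch below (an illustration, not Cerebras code) masks a weight matrix down to roughly 12.5% density and counts the multiply-accumulates that an architecture able to skip zero weights would actually perform:

```python
import numpy as np

# Illustration (not Cerebras code): with unstructured sparsity at density
# d, the multiply-accumulates in a matmul scale by d, so hardware that
# skips zeros perfectly gets an ideal speedup of 1/d.
rng = np.random.default_rng(0)
W = rng.standard_normal((512, 512))
mask = rng.random(W.shape) < 0.125           # keep ~12.5% of weights
W_sparse = W * mask

dense_macs = W.size                           # MACs per input vector, dense
sparse_macs = int(np.count_nonzero(W_sparse)) # MACs with zero-skipping
print(f"Ideal speedup: {dense_macs / sparse_macs:.1f}x")
```

In this idealized model, an 8x speedup corresponds to keeping about one-eighth of the weights; real gains depend on the sparsity pattern and on how aggressively weights can be pruned without hurting accuracy.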

Andrew Feldman, CEO and co-founder of Cerebras, remarked on the journey from the initial skepticism surrounding wafer-scale processors to the introduction of the industry-leading, third-generation AI chip, WSE-3, highlighting its role in pushing the boundaries of AI potential.

In terms of efficiency and ease of use, the CS-3 delivers more compute in less space and with lower energy usage than competing systems. Unlike GPUs, whose power consumption rises with each generation, the CS-3 doubles performance without increasing power draw. It also simplifies programming substantially, requiring 97% less code for large language models (LLMs) and enabling models of up to 24 trillion parameters to be trained in data-parallel mode alone.

The demand for the CS-3 spans enterprises, government, and international cloud services, indicating strong market momentum. Testimonials from longstanding partners like Argonne National Laboratory and the Mayo Clinic emphasize the transformative potential of Cerebras' wafer-scale engineering for exploring AI frontiers and enhancing patient care.

The strategic partnership between Cerebras and G42 is particularly noteworthy, having already resulted in the creation of some of the world’s largest AI supercomputers through the Condor Galaxy installations. With Condor Galaxy 3 underway, this collaboration is poised to further extend the global AI compute capabilities, underscoring the pivotal role of Cerebras technology in advancing AI innovation.