Cerebras Systems develops computing chips with the sole purpose of accelerating AI. The Santa Clara, California-based company's AI chips compete with the advanced hardware produced by Nvidia (NVDA.O) that helps OpenAI develop the underlying software powering apps such as ChatGPT. Cerebras builds computer systems for complex artificial intelligence deep-learning applications. Instead of stitching together thousands of chips to build and run AI applications, Cerebras has bet that its roughly foot-wide chip can outperform Nvidia's clusters of chips.
The new CS-3 system, the third generation of the industry's only commercial wafer-scale AI processor, is built on TSMC's 5nm process and is immediately available. The compute and memory density is off the charts, with 900,000 AI cores and 44 GB of fast on-wafer memory (roughly 10x the speed of HBM). Additional memory for large AI problems is supplied in a separate MemoryX parameter server. And the Cerebras software stack enables AI problems to scale efficiently across a cluster of CS-3s (each CS-3 being the complete system housing a wafer) at a fraction of the development effort needed to distribute a problem across a cluster of accelerators; the sketch below illustrates the contrast. A faster chip, a faster cluster, and much faster time to deploy AI have helped Cerebras earn the support of organizations such as the Mayo Clinic and GlaxoSmithKline. The company is also partnering with Qualcomm, optimizing the output from the CS-3 to cut inference costs with co-developed technology.
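The development-effort claim is easiest to see in code. The sketch below is purely illustrative Python with hypothetical function names (train_step_gpu_cluster, train_step_wafer_scale); it is not Cerebras's or Nvidia's actual API. It contrasts the tensor-partitioning bookkeeping that distributing a model across many small accelerators pushes onto the developer with the single-logical-tensor view the wafer-scale approach advertises.

```python
# Hypothetical sketch of the developer-effort contrast described above.
# None of these names are real Cerebras or Nvidia APIs; the point is only
# that a cluster of small accelerators forces the programmer to partition
# tensors, while a single wafer-scale device (plus a MemoryX-style parameter
# server) can present one logical parameter space.

import numpy as np

def train_step_gpu_cluster(weights, grad, n_devices, lr=1e-3):
    """Typical accelerator-cluster flow: shard parameters, update each shard
    where it lives, then reassemble -- bookkeeping the developer owns."""
    shards = np.array_split(weights, n_devices)        # explicit partitioning
    grad_shards = np.array_split(grad, n_devices)      # grads must match the sharding
    updated = [w - lr * g for w, g in zip(shards, grad_shards)]  # per-device update
    return np.concatenate(updated)                     # gather back into one tensor

def train_step_wafer_scale(weights, grad, lr=1e-3):
    """Single-logical-device flow: the full parameter tensor is one object;
    any streaming from external memory is the stack's job, not the user's."""
    return weights - lr * grad

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    w = rng.standard_normal(1_000_000).astype(np.float32)
    g = rng.standard_normal(1_000_000).astype(np.float32)
    # Both paths compute the same update; only the bookkeeping differs.
    assert np.allclose(train_step_gpu_cluster(w, g, n_devices=8),
                       train_step_wafer_scale(w, g))
```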
Cerebras Systems has unveiled its Wafer Scale Engine 3 (WSE-3), a breakthrough wafer-scale AI chip with double the performance of its predecessor, the WSE-2. The new device packs 4 trillion transistors made on TSMC's 5nm-class fabrication process, 900,000 AI cores, and 44 GB of on-chip SRAM, and delivers a peak performance of 125 FP16 PetaFLOPS. Cerebras's WSE-3 will be used to train some of the industry's largest AI models.
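For a sense of scale, a quick back-of-envelope from the figures quoted above: these are simple averages across the wafer, not Cerebras-published per-core numbers.

```python
# Averages derived from the specs quoted above (assumption: treating
# GB and PetaFLOPS as decimal units).

PEAK_FLOPS_FP16 = 125e15        # 125 FP16 PetaFLOPS (peak)
AI_CORES = 900_000
ON_CHIP_SRAM_BYTES = 44e9       # 44 GB on-chip SRAM
TRANSISTORS = 4e12

print(f"Avg peak per core:    {PEAK_FLOPS_FP16 / AI_CORES / 1e9:.1f} GFLOPS (FP16)")  # ~138.9
print(f"Avg SRAM per core:    {ON_CHIP_SRAM_BYTES / AI_CORES / 1024:.1f} KiB")        # ~47.7
print(f"Avg transistors/core: {TRANSISTORS / AI_CORES / 1e6:.2f} million")            # ~4.44
```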
The WSE-3 powers Cerebras's CS-3 supercomputer, which can be used to train AI models with up to 24 trillion parameters, a significant leap over supercomputers powered by the WSE-2 and other modern AI processors. The CS-3 can be configured with 1.5 TB, 12 TB, or 1.2 PB of external memory, which allows it to store massive models in a single logical space without partitioning or refactoring, streamlining the training process and enhancing developer efficiency. A rough sizing exercise below shows why the petabyte tier matters at that parameter count.
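The sketch below sizes a 24-trillion-parameter training run. The 16-bytes-per-parameter figure is a common mixed-precision Adam accounting (fp16 weights and gradients, fp32 master weights, and two fp32 optimizer moments); it is an assumption for illustration, not a Cerebras-published breakdown.

```python
# Rough sizing of why petabyte-scale external memory matters at 24T parameters.
# 16 bytes/param is a common mixed-precision Adam estimate (assumption, not a
# Cerebras figure): 2 (fp16 weights) + 2 (fp16 grads) + 4 (fp32 master weights)
# + 4 + 4 (fp32 Adam moments).

PARAMS = 24e12

weights_fp16_tb = PARAMS * 2 / 1e12     # fp16 weights alone
train_state_tb = PARAMS * 16 / 1e12     # weights + grads + optimizer state

print(f"fp16 weights:        {weights_fp16_tb:,.0f} TB")   # ~48 TB
print(f"full training state: {train_state_tb:,.0f} TB")    # ~384 TB
print(f"fits in 1.2 PB tier: {train_state_tb < 1200}")     # True
```

Even the bare fp16 weights of such a model (about 48 TB) dwarf the on-chip SRAM, which is why the external MemoryX tiers, rather than the wafer itself, determine the largest trainable model.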