FuriosaAI, a start-up with offices in Santa Clara, California, and Seoul, South Korea, emerged from stealth to introduce its new AI accelerator, RNGD, designed to improve the performance and efficiency of large language model (LLM) and multimodal model inference in data centers. The RNGD chip features a novel architecture, HBM3 high-bandwidth memory, and a thermal design power (TDP) of 150 watts, aiming to deliver high performance, power efficiency, and programmability in a single product.
At Hot Chips 2024, FuriosaAI said its RNGD chip, fabricated by TSMC, underwent rapid development: silicon samples arrived in May, and the first industry-standard Llama 3.1 models were running by early June. FuriosaAI has begun delivering early access units to customers and is now focused on refining the software stack as it prepares for full production. The chip, expected to be available in data centers by early 2025, has shown promising initial performance, with further improvements anticipated through software enhancements.
• RNGD Chip Specifications:
  • TDP: 150 watts.
  • Memory Technology: HBM3 for high-bandwidth data handling.
  • Performance: Initial tests show 12 queries per second on the GPT-J 6B model, with improvements expected.
  • Architecture: A novel design optimized for LLM and multimodal model inference.
• Development Timeline:
  • First Silicon: Delivered by TSMC in May 2024.
  • Initial Testing: Running Llama 3.1 models by early June 2024.
  • Early Access: Began delivering units to customers in July 2024.
  • Full Availability: Targeted for early 2025.
“Our priority now is refining our software stack as we ramp up RNGD production,” said June Paik, co-founder and CEO of FuriosaAI. “We’re eager to hear what the AI community thinks of RNGD as we work to make the chip widely available.”