Start-up Unveils “Prometheus” AI Server to Tackle Memory Bottlenecks in Large-Scale Models
Majestic Labs introduced Prometheus, an AI server engineered to address the “memory wall,” a growing constraint in scaling modern AI workloads. The system adopts a memory-first architecture that connects significantly larger pools of high-speed memory directly to processing elements, aiming to reduce latency and improve utilization. The company positions Prometheus as an alternative to conventional GPU-centric designs, which often leave processors underutilized while waiting on data movement across fragmented memory hierarchies.
The Prometheus platform integrates up to 128 TB of shared, contiguous memory within a single standard-sized server, enabling execution of large-scale AI models that typically require distributed infrastructure. Majestic Labs claims the system can deliver performance comparable to multiple racks of traditional servers while reducing power consumption and total cost of ownership. The design reflects broader industry pressures, as hyperscaler capital expenditures on AI infrastructure continue to rise sharply, with a growing share allocated to memory and data movement challenges rather than raw compute.
At the silicon level, Prometheus incorporates proprietary AI Processing Units (AIUs) called Ignite, combining ARM-based CPU cores with RISC-V vector and tensor engines in a unified memory space. The system supports widely used frameworks such as PyTorch, vLLM, and OpenAI Triton, allowing developers to run existing workloads without code modification. Majestic Labs states that the architecture can support multi-trillion-parameter models, large context windows, and emerging AI workloads such as mixture-of-experts and agentic systems within a single node.
- Memory-first architecture designed to overcome AI “memory wall” constraints
- Up to 128 TB of high-speed, shared memory per server
- Claims of performance equivalent to multiple racks in a single system
- Ignite AIUs combine ARM CPUs with RISC-V vector/tensor cores
- Supports PyTorch, vLLM, and Triton without code changes
- Targets multi-trillion-parameter models and large context windows
- Early deployments underway; broader availability expected next year
“Prometheus represents the first ground-up reimagination of AI infrastructure with memory as a first-class citizen,” said Ofer Shacham, Co-Founder and CEO of Majestic Labs.
| Profile: Majestic AI | |
|---|---|
| Company | Majestic Labs |
| Founded | 2023 |
| Headquarters | San Francisco, California, USA and Tel Aviv, Israel |
| CEO / Co-Founder | Ofer Shacham |
| Other Founders | Sha Rabii (President), Masumi Reynders (COO) |
| Founder Backgrounds | Engineering and product leadership experience at Google and Meta, with focus on custom silicon, AI infrastructure, and large-scale systems |
| Recent Funding | $100 million Series A (September 2025) |
| Key Investors | Lux Capital, Bow Wave Capital Management, Upfront Ventures, SBI Investment |
| Core Product | Prometheus AI Server (memory-first architecture) |
| Custom Silicon | Ignite AI Processing Units (AIUs) |
| Architecture | Memory-first design with large, shared, contiguous memory pool directly attached to compute |
| Maximum Memory | Up to 128 TB per server |
| Software Compatibility | Supports PyTorch, vLLM, and Triton without code modification |
| Target Workloads | Large language models, multi-trillion-parameter models, mixture-of-experts, agentic AI systems, graph neural networks |
| Mission | Expand access to advanced AI while improving power efficiency and reducing data center footprint |







