SuperNODE™ platform, capable of supporting dozens of Corsair™ AI inference accelerators in a single node, delivers unprecedented scale and efficiency for next-generation AI inference workloads.
GigaIO, a pioneer in scalable, easy-to-deploy and easy-to-manage edge-to-core AI platforms for all accelerators, today announced the next phase of its strategic partnership with d-Matrix to deliver the world’s most expansive inference solution for enterprises deploying AI at scale. Integrating d-Matrix’s revolutionary Corsair inference platform into GigaIO’s SuperNODE architecture creates an unparalleled solution that eliminates the complexity and performance bottlenecks traditionally associated with large-scale AI inference deployment.
This joint solution addresses the growing demand from enterprises for high-performance, energy-efficient AI inference capabilities that can scale seamlessly without the typical limitations of multi-node configurations. Combining GigaIO’s industry-leading scale-up AI architecture with d-Matrix’s purpose-built inference acceleration technology produces a solution that delivers unprecedented token generation speeds and memory bandwidth, while significantly reducing power consumption and total cost of ownership.
Revolutionary Performance Through Technological Integration
The new GigaIO SuperNODE platform, capable of supporting dozens of d-Matrix Corsair accelerators in a single node, is now the industry’s most scalable AI inference platform. This integration enables enterprises to deploy ultra-low-latency batched inference workloads at scale without the complexity of traditional distributed computing approaches.
“By combining d-Matrix’s Corsair PCIe cards with the industry-leading scale-up architecture of GigaIO’s SuperNODE, we’ve created a transformative solution for enterprises deploying next-generation AI inference at scale,” said Alan Benjamin, CEO of GigaIO. “Our single-node server eliminates complex multi-node configurations and simplifies deployment, enabling enterprises to quickly adapt to evolving AI workloads while significantly improving their TCO and operational efficiency.”
The combined solution delivers exceptional performance metrics that redefine what’s possible for enterprise AI inference:
- Processing capability of 30,000 tokens per second at just 2 milliseconds per token for models like Llama3 70B (see the back-of-envelope sketch after this list)
- Up to 10x faster interactive speed compared with GPU-based solutions
- 3x better performance at a similar total cost of ownership
- 3x greater energy efficiency for more sustainable AI deployments
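For context, a quick back-of-envelope check (our illustration; the figures above come from the announcement, while the interpretation is an assumption) shows how the first two numbers relate: at 2 milliseconds per token, a single stream generates 500 tokens per second, so an aggregate of 30,000 tokens per second implies roughly 60 concurrently batched streams.

```python
# Illustrative arithmetic only: relates the claimed latency and throughput figures.
ms_per_token = 2.0                 # claimed per-token latency
aggregate_tokens_per_sec = 30_000  # claimed aggregate throughput

per_stream_tps = 1_000.0 / ms_per_token                      # 500 tokens/s per stream
implied_streams = aggregate_tokens_per_sec / per_stream_tps  # ~60 batched streams

print(f"Per-stream rate: {per_stream_tps:.0f} tokens/s")
print(f"Implied concurrent streams: ~{implied_streams:.0f}")
```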
“When we started d-Matrix in 2019, we looked at the landscape of AI compute and made a bet that inference would be the largest computing opportunity of our lifetime,” said Sid Sheth, founder and CEO of d-Matrix. “Our collaboration with GigaIO brings together our ultra-efficient in-memory compute architecture with the industry’s most powerful scale-up platform, delivering a solution that makes enterprise-scale generative AI commercially viable and accessible.”
This integration leverages GigaIO’s cutting-edge PCIe Gen 5-based AI fabric, which delivers near-zero-latency communication between multiple d-Matrix Corsair accelerators. This architectural approach eliminates the traditional bottlenecks associated with distributed inference workloads while maximizing the efficiency of d-Matrix’s Digital In-Memory Compute (DIMC) architecture, which delivers an industry-leading 150 TB/s of memory bandwidth.
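To see why memory bandwidth is the headline figure for inference, here is a minimal sketch of the memory-bound decode ceiling. The 150 TB/s bandwidth and the 70B parameter count come from the text above; the weight precision and achievable bandwidth fraction are assumptions for illustration, not d-Matrix specifications.

```python
# Rough memory-bound ceiling for autoregressive decode (illustrative assumptions).
aggregate_bw_bytes_per_sec = 150e12  # 150 TB/s DIMC bandwidth, from the release
params = 70e9                        # Llama3 70B parameter count
bytes_per_param = 1.0                # ASSUMPTION: ~8-bit weights; actual format may differ
utilization = 0.5                    # ASSUMPTION: achievable fraction of peak bandwidth

# One decode step must stream all weights once; each step emits one token per batched stream.
bytes_per_step = params * bytes_per_param
max_steps_per_sec = aggregate_bw_bytes_per_sec * utilization / bytes_per_step

print(f"Memory-bound ceiling: ~{max_steps_per_sec:,.0f} decode steps/s")
```

Under these assumptions the ceiling is on the order of 1,000 decode steps per second, comfortably above the roughly 500 steps per second implied by the 30,000 tokens-per-second figure at ~60 batched streams, which is why high memory bandwidth is the enabling property for this class of workload.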
Industry Recognition and Performance Validation
This partnership builds on GigaIO’s recent achievement of recording the highest tokens per second for a single node in the MLPerf Inference: Datacenter benchmark database, further validating the company’s leadership in scale-up AI infrastructure.
“The market has been demanding more efficient, scalable solutions for AI inference workloads that don’t compromise performance,” added Benjamin. “Our partnership with d-Matrix brings together the tremendous engineering innovation of both companies, resulting in a solution that redefines what’s possible for enterprise AI deployment.”
Those interested in early access to SuperNODEs running Corsair accelerators can indicate interest here.
About GigaIO
GigaIO redefines scalable AI infrastructure, seamlessly bridging from edge to core with a dynamic, open platform built for every accelerator. Reduce power draw with GigaIO’s SuperNODE, the world’s most powerful and energy-efficient scale-up AI computing platform. Run AI jobs anywhere with Gryf, the world’s first suitcase-sized AI supercomputer that brings datacenter-class computing power directly to the edge. Both are easy to deploy and manage, utilizing GigaIO’s patented AI fabric that provides ultra-low latency and direct memory-to-memory communication between GPUs for near-perfect scaling for AI workloads. Visit www.gigaio.com, or follow on Twitter (X) and LinkedIn.
About d-Matrix
d-Matrix is transforming the economics of large-scale inference with the world’s most efficient AI computing platform for inference in data centers. The company’s Corsair platform leverages innovative Digital In-Memory Compute (DIMC) architecture to accelerate AI inference workloads with industry-leading real-time performance, energy efficiency, and cost savings compared to GPUs and other alternatives. d-Matrix delivers ultra-low latency without compromising throughput, unlocking the next wave of Generative AI use cases while enabling commercially viable AI computing that scales with model size to empower companies of all sizes and budgets. For more information, visit www.d-matrix.ai.
"Combining GigaIO’s scale-up AI architecture with d-Matrix’s purpose-built inference acceleration technology delivers unprecedented token generation speeds and memory bandwidth, while significantly reducing power consumption and total cost of ownership."
Contacts
Shannon Biggs | 760-487-8395 | shannon@xandmarketing.com