Charles Srisuwananukorn Discusses Scaling Artificial Intelligence Infrastructure at Together AI

Charles Srisuwananukorn, VP of Engineering at Together AI, reveals the complexity and demands of building and operating physical infrastructure for cutting-edge Artificial Intelligence development.

Charles Srisuwananukorn, Founding Vice President of Engineering at Together AI, shared insights during a Chat8VC fireside chat about navigating the demands of scaling physical infrastructure for advanced Artificial Intelligence applications. Tracing his career from Snorkel AI and Apple to building core infrastructure at Together AI, he emphasized the unique challenges of managing large physical GPU clusters, as opposed to virtualized environments. This hands-on approach has become integral to Together AI’s growth and its mission to provide robust compute resources and infrastructure for foundational model development.

Srisuwananukorn discussed the major gaps in the open-source ecosystem, particularly the scarcity of clean, high-quality datasets, which led Together AI to launch the RedPajama initiative. He also highlighted the need for improved reinforcement learning tools as models become more sophisticated. Together AI’s clusters, equipped with the latest GPUs like H100s and H200s, are used for both internal research and external client workloads, offering customized orchestration and optimized system performance via proprietary software like the Together Kernel Collection. This focus on deep technical optimization—spanning networking, kernel design, and systems reliability—enables clients to achieve faster, more efficient model training, often delivering a notable performance boost out of the box.

As the company scales to tens of thousands of GPUs, Srisuwananukorn described tackling unexpected low-level operational challenges such as hardware reliability, overheating, and maintaining consistent performance. Automation is key, yet physical interventions, such as resolving hardware failures, remain necessary. On infrastructure flexibility, he addressed the evolving demand for both giant models and smaller, faster ones, noting Together AI’s investments in edge computing to reduce latency for real-world Artificial Intelligence applications. Despite the operational pressures, Srisuwananukorn expressed optimism about recent breakthroughs in model accessibility, which allow increasingly sophisticated models to run on consumer hardware, and forecast a wave of innovation in the Artificial Intelligence ecosystem.
