Charles Srisuwananukorn Discusses Scaling Artificial Intelligence Infrastructure at Together AI

Charles Srisuwananukorn, VP of Engineering at Together AI, reveals the complexity and demands of building and operating physical infrastructure for cutting-edge Artificial Intelligence development.

Charles Srisuwananukorn, Founding Vice President of Engineering at Together AI, shared insights during a Chat8VC fireside chat about navigating the demands of scaling physical infrastructure for advanced Artificial Intelligence applications. Detailing his career journey—from impactful work at Snorkel AI and Apple to building core infrastructure at Together AI—he emphasized the unique challenges of managing large, physical GPU clusters, in contrast to virtualized environments. This hands-on approach has become integral to Together AI’s growth and mission to provide robust compute resources and infrastructure for foundational model development.

Srisuwananukorn discussed the major gaps in the open-source ecosystem, particularly the scarcity of clean, high-quality datasets, which led Together AI to launch the RedPajama initiative. He also highlighted the need for improved reinforcement learning tools as models become more sophisticated. Together AI’s clusters, equipped with the latest GPUs like H100s and H200s, are used for both internal research and external client workloads, offering customized orchestration and optimized system performance via proprietary software like the Together Kernel Collection. This focus on deep technical optimization—spanning networking, kernel design, and systems reliability—enables clients to achieve faster, more efficient model training, often delivering a notable performance boost out of the box.

As the company scales to tens of thousands of GPUs, Srisuwananukorn described tackling unexpected low-level operational challenges such as hardware reliability, overheating, and maintaining consistent performance. Automation is key, yet physical interventions, such as resolving hardware failures, remain necessary. On infrastructure flexibility, he addressed the evolving demand for both giant models and smaller, faster ones, noting Together AI’s investments in edge computing to reduce latency for real-world Artificial Intelligence applications. Despite the operational pressures, Srisuwananukorn expressed optimism about recent breakthroughs in model accessibility, which allow increasingly sophisticated models to run on consumer hardware, and he forecast a wave of innovation in the Artificial Intelligence ecosystem.
