Synthesia’s artificial intelligence clones are more expressive and will soon talk back

Synthesia’s latest Express-2 model produces more lifelike avatars with improved gestures, voice cloning, and faster rendering. The company says future avatars will understand conversations and respond in real time.

During a visit to Synthesia’s London studio, the author recorded a short scripted session to generate a hyperrealistic avatar and compared outputs from the older Express-1 model and the new Express-2 model. Synthesia began in 2017 with dubbed, face-matching avatars and later offered businesses the ability to create presentation videos featuring AI versions of staff or consenting actors. The Express-2 avatars show notably smoother facial movements, more expressive hand gestures, and a voice model that better preserves accents and intonation. The author received two avatars a few weeks after filming and found the Express-2 version significantly closer to a realistic presentation, though it still betrayed small telltale artifacts such as plasticky skin tones, stiff hair strands, and glassy eyes.

Technically, Synthesia combined a new voice cloning model with multiple video models to improve realism. A speech analysis stage feeds an Express-Voice model that preserves accent and expressiveness. Express-2 then uses a gesture generator, an alignment evaluator that selects the motion best matching the audio, and a much larger rendering model that replaces Express-1’s renderer. Where Express-1 used models with a few hundred million parameters, the Express-2 rendering model has parameters in the billions, which the company says reduces creation time and lets the system learn associations from much more diverse data. Synthesia staff described how the pipeline lets the system infer appropriate micro gestures and intonation without needing the same extensive emotion-specific footage required by older versions.
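The staged pipeline described above can be sketched as a simple dataflow. Synthesia has not published an API, so every class and function name here is a hypothetical illustration of the reported stages (speech analysis, gesture generation, alignment scoring, rendering), not the company's actual implementation:

```python
# Hypothetical sketch of the reported Express-2 pipeline stages.
# All names are illustrative; the scoring and rendering are placeholders.
from dataclasses import dataclass, field


@dataclass
class AudioFeatures:
    """Output of the speech-analysis stage that feeds the voice model."""
    text: str
    accent_embedding: list = field(default_factory=lambda: [0.1, 0.2])


@dataclass
class MotionCandidate:
    """One proposed gesture sequence from the gesture generator."""
    gestures: list
    score: float = 0.0


def analyze_speech(script: str) -> AudioFeatures:
    # Stage 1: extract accent and intonation cues from the script/audio.
    return AudioFeatures(text=script)


def generate_gestures(audio: AudioFeatures, n: int = 3) -> list:
    # Stage 2: propose several candidate motion sequences.
    return [MotionCandidate(gestures=[f"motion_{i}"]) for i in range(n)]


def align(audio: AudioFeatures, candidates: list) -> MotionCandidate:
    # Stage 3: the alignment evaluator scores each candidate against
    # the audio and keeps the best match (placeholder scoring here).
    for i, c in enumerate(candidates):
        c.score = 1.0 / (i + 1)
    return max(candidates, key=lambda c: c.score)


def render(audio: AudioFeatures, motion: MotionCandidate) -> str:
    # Stage 4: the large rendering model produces the final video.
    return f"video({audio.text!r}, {motion.gestures})"


audio = analyze_speech("Hello from my avatar")
best = align(audio, generate_gestures(audio))
video = render(audio, best)
```

The point of the separation is the one the article reports: the evaluator can pick plausible micro gestures from generated candidates rather than requiring emotion-specific training footage for every expression.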

Synthesia is focused on corporate uses such as internal communications, training and financial presentations, and it has begun integrating Google’s Veo 3 generative video model to embed new clips directly. The company also plans to make avatars interactive so they can pause, expand or answer questions in real time, effectively combining conversational systems with a lifelike digital human. Researchers quoted in the reporting warned that increased realism risks deepening the uncanny valley and enabling new forms of attachment or manipulation. Observers pointed to existing examples of AI clones used commercially, the potential for embarrassing misuse, and broader concerns that highly charismatic synthetic presenters could alter human-to-human connection and encourage unhealthy emotional bonds.

Impact Score: 78

Modular artificial intelligence agents outperform fine-tuned monoliths

New multi-institution research suggests that small specialized tools wrapped around a frozen large language model can match the accuracy of heavily fine-tuned agents while using 70x less training data, validating a modular approach one developer discovered through trial and error.

Best workplace artificial intelligence tools for teams in 2026

The article outlines 10 workplace artificial intelligence tools that help teams cut busywork, improve communication, and standardize workflows across hiring, HR, projects, and operations in 2026. It explains which platforms fit different environments, from productivity suites and messaging to HR systems and service management.
