MIT chemical engineers have developed a large language model that learns the codon "grammar" of the industrial yeast Komagataella phaffii and uses it to design gene sequences that the organism translates more efficiently. Codons are three-letter DNA units that encode amino acids; because the genetic code maps 64 possible codons onto only 20 amino acids, organisms evolve strong preferences for particular synonymous codons and local sequence patterns. Standard codon-optimization tools typically emphasize the most frequent codons, but they often overlook how codon context, transfer RNA availability, and regulatory motifs shape real-world protein expression. The new model, built as a GRU-based encoder-decoder network, was trained on amino acid sequences and their matching coding DNA from roughly 5,000 native K. phaffii proteins drawn from a public National Center for Biotechnology Information dataset, allowing it to infer species-specific usage patterns without hand-coded rules.
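The degeneracy described above, and the frequency-only baseline that conventional tools rely on, can be sketched in a few lines of Python. The codon lists below follow the standard genetic code, but the per-codon frequencies are hypothetical illustration values, not measured K. phaffii data:

```python
# Synonymous codons for a few amino acids (standard genetic code):
# 64 codons encode 20 amino acids, so most amino acids have several codons.
SYNONYMS = {
    "Leu": ["TTA", "TTG", "CTT", "CTC", "CTA", "CTG"],  # six-fold degenerate
    "Ser": ["TCT", "TCC", "TCA", "TCG", "AGT", "AGC"],  # six-fold degenerate
    "Met": ["ATG"],                                     # single codon
}

# HYPOTHETICAL per-codon usage frequencies, standing in for a real
# host-specific codon usage table.
FREQ = {
    "TTA": 0.15, "TTG": 0.30, "CTT": 0.17, "CTC": 0.08, "CTA": 0.12, "CTG": 0.18,
    "TCT": 0.30, "TCC": 0.16, "TCA": 0.19, "TCG": 0.09, "AGT": 0.15, "AGC": 0.11,
    "ATG": 1.00,
}

def frequency_only_design(peptide):
    """Frequency-only codon optimization: always pick the most common
    synonymous codon, ignoring codon context, tRNA pools, and motifs."""
    return "".join(max(SYNONYMS[aa], key=FREQ.get) for aa in peptide)

print(frequency_only_design(["Met", "Leu", "Ser"]))  # ATGTTGTCT
```

The learned model differs from this baseline precisely in that its codon choice is conditioned on surrounding sequence context rather than on a single frequency table.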
After training, the researchers used the model to generate codon-optimized DNA sequences for six recombinant proteins of varying size and complexity: human growth hormone, human granulocyte colony-stimulating factor, a VHH nanobody called 3B2, an engineered SARS-CoV-2 receptor-binding domain, human serum albumin, and the IgG1 monoclonal antibody trastuzumab. They compared these sequences against versions produced by four commercial tools from Azenta, IDT, GenScript, and Thermo Fisher by inserting each construct into K. phaffii and measuring the resulting protein titers. Across the six proteins, the MIT model produced the highest titer for five and ranked second for the remaining one. For human growth hormone and human granulocyte colony-stimulating factor, the team observed about a 25% improvement, while human serum albumin showed about a threefold improvement when comparing optimized constructs to the native coding sequence. Expressed from their native coding sequences, human, bovine, and mouse serum albumin reached titers of 45, 60, and 100 mg/L, respectively; codon optimization raised the bovine and mouse titers by a further 25% and 35%, to 75 mg/L and 135 mg/L.
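As a sanity check, the relative gains implied by the reported serum albumin titers can be recomputed directly from the mg/L values given above:

```python
# Native-sequence and codon-optimized titers for serum albumin, in mg/L,
# as reported in the study.
native = {"bovine": 60, "mouse": 100}
optimized = {"bovine": 75, "mouse": 135}

for species in native:
    gain = optimized[species] / native[species] - 1
    print(f"{species}: {native[species]} -> {optimized[species]} mg/L (+{gain:.0%})")
```

This reproduces a 25% gain for bovine serum albumin and a 35% gain for mouse serum albumin.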
The study also dissected what the model learned internally and how its designs differ from conventional metrics. Visualizations of the learned amino acid embeddings showed clusters organized by physicochemical traits, including aliphatic, aromatic, basic, acid/amide, and alcohol groups, with hydrophobic residues grouping together and polar residues doing likewise. Constructs the model designed for the six tested proteins contained no negative cis-regulatory elements in the analysis described and also avoided negative repeat elements, despite never being explicitly trained to filter out these features. In contrast, global codon usage measures such as the Codon Adaptation Index and codon-pair metrics did not consistently correlate with final titers, and in some cases higher Codon Adaptation Index scores were associated with lower yields. The researchers note that the model is trained for a single host species, and that models trained on other organisms, including humans and cows, produce different predictions, underscoring the need for species-specific approaches. They position the work as one lever among many in biomanufacturing: more predictive codon design can cut process-development uncertainty and help move new protein-based drugs into production more quickly, while cellular engineering, media formulation, and process optimization remain critical components.
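For reference, the Codon Adaptation Index mentioned above is defined as the geometric mean of each codon's relative adaptiveness, its usage frequency divided by that of the most-used synonymous codon. A minimal sketch, using a tiny hypothetical usage table rather than real K. phaffii counts:

```python
import math

# HYPOTHETICAL codon usage counts per synonymous family (not real data).
USAGE = {
    "Leu": {"TTG": 30, "CTG": 60},
    "Lys": {"AAA": 20, "AAG": 80},
}

# Relative adaptiveness w = count / max count within each synonymous family.
WEIGHT = {codon: n / max(family.values())
          for family in USAGE.values() for codon, n in family.items()}

def cai(codons):
    """Codon Adaptation Index: geometric mean of relative adaptiveness."""
    return math.exp(sum(math.log(WEIGHT[c]) for c in codons) / len(codons))

print(cai(["CTG", "AAA"]))  # sqrt(1.0 * 0.25), about 0.5
```

A high CAI rewards globally frequent codons but says nothing about local sequence context, which is consistent with the study's observation that it did not track final titers reliably.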
