Artificial intelligence model boosts yeast-based protein drug production

MIT chemical engineers trained a species-specific language model to design yeast-friendly gene sequences, improving production of several therapeutic proteins and highlighting the limits of traditional codon optimization metrics.

MIT chemical engineers have developed a large language model that learns the codon "grammar" of the industrial yeast Komagataella phaffii and uses it to design gene sequences that the organism translates more efficiently. Codons are three-letter DNA units that encode amino acids, and because the genetic code includes 64 possible codons for only 20 amino acids, organisms evolve strong preferences for particular synonymous codons and local sequence patterns. Standard codon optimization tools typically emphasize the most frequent codons, but they often overlook how codon context, transfer RNA availability, and regulatory motifs shape real-world protein expression. The new model, built as a GRU-based encoder-decoder network, was trained on amino acid sequences and matching coding DNA from roughly 5,000 native K. phaffii proteins sourced from a public National Center for Biotechnology Information dataset, allowing it to infer species-specific usage patterns without hand-coded rules.
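The codon redundancy described above is what gives optimizers room to work: every protein has many synonymous encodings, and each organism favors some over others. The sketch below, a hypothetical illustration rather than the paper's method, shows the kind of per-amino-acid codon frequency table that simple frequency-based tools rely on (the learned model goes further by also capturing codon context). The tiny codon table and example sequences are placeholders.

```python
from collections import Counter, defaultdict

# Fragment of the standard codon table (the full table has 64 entries);
# only a few amino acids are included here, for illustration.
CODON_TO_AA = {
    "TTA": "L", "TTG": "L", "CTT": "L", "CTC": "L", "CTA": "L", "CTG": "L",
    "GCT": "A", "GCC": "A", "GCA": "A", "GCG": "A",
    "ATG": "M",
}

def synonymous_usage(coding_sequences):
    """Count how often each synonymous codon is used for each amino acid,
    then normalize to per-amino-acid frequencies."""
    counts = defaultdict(Counter)
    for seq in coding_sequences:
        # Walk the sequence codon by codon (3 bases at a time).
        for i in range(0, len(seq) - len(seq) % 3, 3):
            codon = seq[i:i + 3]
            aa = CODON_TO_AA.get(codon)
            if aa is not None:
                counts[aa][codon] += 1
    return {aa: {c: n / sum(ctr.values()) for c, n in ctr.items()}
            for aa, ctr in counts.items()}

# Toy "genome": two short coding sequences.
usage = synonymous_usage(["ATGCTGGCTTTA", "ATGCTGGCC"])
```

In this toy example, leucine is encoded by CTG twice and TTA once, so a frequency-based optimizer would pick CTG for every leucine; the article's point is that such global preferences miss context effects the language model can learn.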

After training, the researchers used the model to generate codon-optimized DNA sequences for six recombinant proteins of varying size and complexity: human growth hormone, human granulocyte colony-stimulating factor, a VHH nanobody called 3B2, an engineered SARS-CoV-2 receptor-binding domain, human serum albumin, and the IgG1 monoclonal antibody trastuzumab. They compared these sequences against versions produced by four commercial tools from Azenta, IDT, GenScript, and Thermo Fisher by inserting each construct into K. phaffii and measuring the resulting protein titers. Across the six proteins, the MIT model produced the highest titer for five and ranked second for the remaining one. For human growth hormone and human granulocyte colony-stimulating factor, the team observed about a 25% improvement, while human serum albumin showed about a threefold improvement when comparing optimized constructs to the native coding sequence. With native coding sequences, human serum albumin reached a titer of 45 mg/L, while bovine serum albumin and mouse serum albumin reached 60 mg/L and 100 mg/L, respectively; codon optimization raised bovine and mouse serum albumin titers further, to 75 mg/L and 135 mg/L.

The study also dissected what the model learned internally and how its designs differ from conventional metrics. Visualizations of learned amino acid embeddings showed clusters organized by physicochemical traits, including aliphatic, aromatic, basic, acid/amide, and alcohol groups, with hydrophobic residues grouping together and polar residues grouping together. Constructs designed by the model for the six tested proteins contained no negative cis-regulatory elements in the analysis described and also avoided negative repeat elements, despite not being explicitly trained to filter these features. In contrast, global codon usage measures such as the Codon Adaptation Index and codon pair metrics did not consistently correlate with final titers, and in some cases higher Codon Adaptation Index scores were associated with lower yields. The researchers note that the model is trained for a single host species and that models trained on other organisms, including humans and cows, produce different predictions, underscoring the need for species-specific approaches. They position the work as one lever among many in biomanufacturing, arguing that more predictive codon design can cut process development uncertainty and help move new protein-based drugs into production more quickly, while acknowledging that cellular engineering, media formulation, and process optimization remain critical components.
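For readers unfamiliar with the Codon Adaptation Index mentioned above: CAI scores a gene as the geometric mean of per-codon "relative adaptiveness" weights, where each codon's weight is its frequency divided by the frequency of the most common synonymous codon in a reference set of highly expressed genes. The sketch below uses a made-up weight table purely to show the calculation; real weights are organism-specific and derived from reference genes.

```python
import math

# Hypothetical relative-adaptiveness weights w = f(codon) / f(best synonym);
# the values here are illustrative, not real K. phaffii weights.
WEIGHTS = {
    "CTG": 1.0, "TTA": 0.4,   # leucine: CTG assumed preferred
    "GCT": 1.0, "GCC": 0.7,   # alanine: GCT assumed preferred
    "ATG": 1.0,               # methionine has no synonyms
}

def cai(sequence):
    """Codon Adaptation Index: geometric mean of per-codon weights,
    computed in log space for numerical stability."""
    ws = [WEIGHTS[sequence[i:i + 3]]
          for i in range(0, len(sequence) - len(sequence) % 3, 3)]
    return math.exp(sum(math.log(w) for w in ws) / len(ws))

best = cai("ATGCTGGCT")    # all preferred codons -> CAI of 1.0
mixed = cai("ATGTTAGCC")   # rarer synonyms lower the score
```

A sequence built entirely from preferred codons scores 1.0 by construction, which is why CAI is a popular optimization target; the study's observation is that maximizing this single global score does not reliably maximize expressed protein titer.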


