
Void-X AI model designs protein interfaces atom by atom

Chinese researchers take a bottom-up approach to predicting how proteins bind, with implications for drug discovery and synthetic biology.

Omega Editorial · June 20, 2026 · 3 min read

A new approach to protein engineering

Researchers at the Shanghai Institute of Organic Chemistry have developed Void-X, a generative AI model that predicts protein-protein interactions by filling atomic voids rather than designing entire protein structures from the top down. The work, first reported in Proceedings of the National Academy of Sciences on June 9, represents a fundamentally different strategy for computational protein design.

Proteins perform nearly every critical function in living cells, and many modern medicines work by targeting specific protein interactions. Antibody therapies for cancer and insulin replacement for diabetes both depend on precise protein binding. The ability to predict and engineer these interactions could accelerate therapeutic development, particularly as delivery technologies like mRNA lipid nanoparticles and adeno-associated virus vectors continue to advance.

How the model works

Most existing AI protein design tools follow a top-down strategy: they generate an overall protein scaffold that fits a target binding site, then optimize the amino acid sequence. Void-X inverts this logic.

The model operates on the principle that stable protein complexes achieve optimal atomic packing through local interactions between neighboring atoms and higher-order couplings with more distant atoms. Rather than designing complete protein shapes, Void-X directly generates atomic clusters optimized for tight packing within specified structural regions.

Researchers Yang Jing, Yuan Junying, and James J. Chou trained the model on more than 8 million spherical atomic clusters extracted from experimentally determined protein structures in the Protein Data Bank. During training, approximately 30 percent of peripheral atoms in each cluster were masked, and the model learned to predict these missing atoms based on the remaining context.
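The masking step described above can be illustrated with a minimal Python sketch. It is not the authors' code. The function names, the 8-angstrom radius, the definition of "peripheral" as the outer half of the sphere, and the random synthetic coordinates are all assumptions made for illustration; only the roughly 30 percent masking rate comes from the reported work.

```python
import numpy as np

def extract_cluster(coords, center, radius):
    """Return indices of atoms within `radius` of `center` (a spherical cluster)."""
    dist = np.linalg.norm(coords - center, axis=1)
    return np.flatnonzero(dist <= radius)

def mask_peripheral(coords, center, radius, shell_fraction=0.5, mask_rate=0.3, rng=None):
    """Split a cluster into visible context atoms and masked peripheral atoms.

    Assumption: "peripheral" means farther than shell_fraction * radius
    from the cluster center. The source does not define the term.
    """
    rng = np.random.default_rng() if rng is None else rng
    idx = extract_cluster(coords, center, radius)
    dist = np.linalg.norm(coords[idx] - center, axis=1)
    peripheral = idx[dist > shell_fraction * radius]
    # Hide about 30 percent of the peripheral atoms, chosen at random.
    n_mask = int(round(mask_rate * len(peripheral)))
    masked = rng.permutation(peripheral)[:n_mask]
    visible = np.setdiff1d(idx, masked)
    return visible, masked

# Synthetic stand-in for atomic coordinates (hypothetical data, in angstroms).
rng = np.random.default_rng(0)
coords = rng.uniform(-10.0, 10.0, size=(500, 3))
center = np.zeros(3)

visible, masked = mask_peripheral(coords, center, radius=8.0, rng=rng)
print(len(visible), "context atoms;", len(masked), "atoms to predict")
```

In a training setup along these lines, a model would see only the visible atoms and be scored on how well it recovers the masked ones.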

The resulting 172-million-parameter model achieved 78.3 percent accuracy for atomic clusters within a single protein chain and 68.2 percent accuracy for clusters spanning two interacting proteins.

Why it matters

Protein interface design has historically required extensive experimental trial and error. Computational methods have accelerated this process, but most still work at the level of overall protein architecture rather than atomic detail. Void-X's atom-by-atom approach provides a physically grounded foundation that could complement existing design frameworks.

The model's ability to capture atomic-scale interaction patterns offers a new route for rational biomolecular design. Applications span drug discovery, where precise protein binding determines therapeutic efficacy, and synthetic biology, where engineered protein interactions enable novel cellular functions. As protein delivery technologies mature, tools like Void-X that can predict binding with atomic precision become increasingly valuable for translating computational designs into clinical candidates.

The research was conducted at the Shanghai Institute of Organic Chemistry, part of the Chinese Academy of Sciences, and published in Proceedings of the National Academy of Sciences (DOI: 10.1073/pnas.2607035123).


This is an original analysis by the Omega editorial team. Source reporting: AI Watch.
