[Paper Review #19] Accurate structure prediction of cyclic peptides containing unnatural amino acids using HighFold3

  • Post category:Knowledge
  • Post last modified:May 15, 2026

https://academic.oup.com/bib/article/26/5/bbaf488/8259888

Accurate structure prediction of cyclic peptides containing unnatural amino acids using HighFold3

Key figures

  • Figure 1: Explains the Cyclization Switch that routes linear versus cyclic peptide inputs into AlphaFold 3 using distinct positional matrices.
  • Figure 2: Shows how HighFold3 handles disulfide-bond pairing and multi-cyclic-peptide ligand complexes with a shared protein receptor.
  • Figure 3: Quantifies cyclic peptide monomer accuracy across HighFold, HighFold3, CABS-flex, ESMFold, and HelixFold, including disulfide-bridge cases and pLDDT correlations.
  • Figure 4: Shows cyclic peptide-protein complex prediction accuracy, external benchmarking against CyclicBoltz1, and ZDOCK redocking validation.
  • Figure 5: Quantifies improved modeling of unAA-containing cyclic peptides using RMSD_Cα, RMSD_all-atom, RMSD_unAA, and pLDDT correlations.
  • Figure 6: Demonstrates that removing CycPOEM degrades cyclic peptide complex prediction and can make a cyclic peptide appear linear.
  • Figure 7 and Table 1: Show that AlphaFold 3 failed to enforce head-to-tail closure for an unseen cyclic peptide, whereas HighFold3-Cyclic achieved 100% cyclization success versus AlphaFold 3’s 21%.
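
The disulfide-bond pairing in Figure 2 implies choosing among candidate cysteine pairings. As a rough illustration of the size of that search space, the sketch below enumerates every way to pair up the cysteines in a sequence. It is not the paper's implementation: the function names, the example sequence, and the assumption that every cysteine is paired (an even cysteine count, no free thiols) are all hypothetical choices made for this note.

```python
from typing import Iterator, List, Tuple

Pairing = Tuple[Tuple[int, int], ...]


def cysteine_positions(sequence: str) -> List[int]:
    """Return 0-based indices of cysteine residues in a one-letter sequence."""
    return [i for i, aa in enumerate(sequence) if aa == "C"]


def enumerate_pairings(positions: List[int]) -> Iterator[Pairing]:
    """Yield every complete pairing of the given positions.

    Assumption for this sketch: all cysteines are disulfide-bonded, so an
    odd number of positions yields no pairings at all.
    """
    if not positions:
        yield ()
        return
    first, rest = positions[0], positions[1:]
    for k, partner in enumerate(rest):
        remaining = rest[:k] + rest[k + 1:]
        for tail in enumerate_pairings(remaining):
            yield ((first, partner),) + tail


# Hypothetical four-cysteine sequence, used only for illustration.
example = "GCCSDPRCAWRC"
cys = cysteine_positions(example)          # [1, 2, 7, 11]
for pairing in enumerate_pairings(cys):
    print(pairing)
# Four cysteines give 3 candidate pairings; six would give 15.
```

Each candidate pairing would then correspond to one disulfide-bond topology to encode and predict; how HighFold3 ranks or selects among them is not covered by this sketch.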

1) Thesis (one sentence)

HighFold3 addresses inaccurate AlphaFold 3-based modeling of cyclic peptides containing unnatural amino acids (unAAs), in both peptide monomer and peptide-protein complex structure prediction, by selecting linear or cyclic positional encodings through a Cyclization Switch, encoding cyclic closure with CycPOEM, handling disulfide-bond topologies, and defining unAAs through the Chemical Component Dictionary (CCD); the resulting gains in topology control and coordinate accuracy are supported by RMSD benchmarking, pLDDT correlation, ablation, docking, and cyclization-success analyses.

2) Evidence card (three bullets only)

  • Strongest result: (Fig. 3A–F; Fig. 4A–B; Fig. 5A–C; Table 1) HighFold3 achieved cyclic monomer median/average RMSD_Cα of 0.918/1.338 Å; improved internal cyclic peptide complex median/average RMSD_Cα to 0.266/0.305 Å; outperformed CyclicBoltz1 on 14/15 external complexes; improved RMSD_Cα, RMSD_all-atom, and RMSD_unAA for unAA-containing cyclic peptides versus HighFold2; and reached 100% head-to-tail cyclization success on 100 external cyclic peptide sequences.
  • Method enabler: (Fig. 1; Fig. 2; computational structure-prediction model engineering + AlphaFold 3 + CycPOEM/FCP + disulfide-bond combination matrix + CCD + ZDOCK) HighFold3 keeps AlphaFold 3 pretrained parameters frozen, uses a Boolean “head_to_tail” switch to select HighFold3-Linear or HighFold3-Cyclic, computes cyclic shortest-path residue distances with an improved Floyd-Warshall algorithm, supports disulfide-bond topology enumeration, and validates representative complexes by ZDOCK 3.0.2 rigid-body FFT docking.
  • Critical limitation: (Fig. 7; Table 1; Conclusion) The topology constraint can force a closed macrocycle and fix AlphaFold 3’s >20 Å N-to-C failure, but the output remains a static Top-1 structure: it does not model physiological conformational dynamics or binding energetics, and the predicted peptide-target contacts are not experimentally validated.
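
To make the "shortest-path residue distances" in the method-enabler bullet concrete, here is a minimal sketch that treats residues as graph nodes, covalent links as unit-weight edges, and runs the textbook Floyd-Warshall algorithm. The paper describes an improved variant whose details are not reproduced here; the function name `residue_graph_distances` and the `extra_bonds` parameter (used to add, for example, a disulfide bridge) are hypothetical.

```python
from typing import List, Sequence, Tuple

INF = float("inf")


def residue_graph_distances(
    n_residues: int,
    head_to_tail: bool,
    extra_bonds: Sequence[Tuple[int, int]] = (),
) -> List[List[float]]:
    """All-pairs shortest bond-path lengths between residues.

    Plain Floyd-Warshall on an unweighted residue graph; this is an
    illustrative stand-in, not the paper's improved algorithm.
    """
    n = n_residues
    dist = [[0 if i == j else INF for j in range(n)] for i in range(n)]

    def connect(i: int, j: int) -> None:
        if i != j:
            dist[i][j] = dist[j][i] = 1

    for i in range(n - 1):            # backbone peptide bonds
        connect(i, i + 1)
    if head_to_tail and n > 2:        # ring closure: N-terminus to C-terminus
        connect(0, n - 1)
    for i, j in extra_bonds:          # e.g. disulfide bridges (assumed input)
        connect(i, j)

    for k in range(n):
        for i in range(n):
            for j in range(n):
                via_k = dist[i][k] + dist[k][j]
                if via_k < dist[i][j]:
                    dist[i][j] = via_k
    return dist


linear = residue_graph_distances(8, head_to_tail=False)
cyclic = residue_graph_distances(8, head_to_tail=True)
print(linear[0])  # [0, 1, 2, 3, 4, 5, 6, 7]
print(cyclic[0])  # [0, 1, 2, 3, 4, 3, 2, 1]
```

In the cyclic case the first and last residues sit one step apart, which is the closed-loop behavior the Cyclization Switch is meant to select. Passing `extra_bonds=[(1, 6)]` to the linear case shortens the end-to-end path from 7 to 3 in the same way.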

Optional

Quote bank (2–4 short excerpts)

  • Quote 1: “For cyclic peptides, CycPOEM directly connects the N-terminus and C-terminus of the peptide chain, constructing a closed-loop distance matrix.” (HighFold3 framework / Fig. 1C, p. 3)
  • Quote 2: “static structure modeling” (Conclusion, p. 10)

Key comparisons (1–3 lines)

  • Compared to: HighFold, HighFold2, CyclicBoltz1, NCPepFold, CABS-flex, ESMFold, HelixFold, and baseline AlphaFold 3.
  • Win: Lower RMSD across cyclic monomers, cyclic peptide-protein complexes, unAA-containing cyclic peptides, and linear unAA peptides; 100% external head-to-tail cyclization success versus AlphaFold 3’s 21%.
  • Tradeoff: Strong topology control improves closure and benchmarking metrics, but does not prove dynamic binding mechanisms, physiological-state ensembles, or experimental correctness of all side-chain/interface details.
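
The cyclization-success comparison above can be reproduced in spirit by measuring the distance between the N-terminal backbone nitrogen and the C-terminal carbonyl carbon of each predicted model. The sketch below assumes those two atom coordinates have already been extracted. The 2.0 Å cutoff is a hypothetical value chosen for this note (a peptide C–N bond is roughly 1.33 Å); the paper's own success criterion is not restated here, and the coordinates are made up.

```python
import math
from typing import Iterable, Tuple

Coord = Tuple[float, float, float]

# Hypothetical closure cutoff in angstroms; not taken from the paper.
CLOSURE_CUTOFF = 2.0


def is_closed(n_atom: Coord, c_atom: Coord, cutoff: float = CLOSURE_CUTOFF) -> bool:
    """True if the terminal N and C atoms are within bonding range."""
    return math.dist(n_atom, c_atom) <= cutoff


def cyclization_success_rate(
    termini: Iterable[Tuple[Coord, Coord]], cutoff: float = CLOSURE_CUTOFF
) -> float:
    """Fraction of models whose N- and C-termini are closed."""
    pairs = list(termini)
    if not pairs:
        raise ValueError("need at least one predicted model")
    closed = sum(is_closed(n, c, cutoff) for n, c in pairs)
    return closed / len(pairs)


# Toy (N-terminal N, C-terminal C) coordinates, invented for illustration.
models = [
    ((0.0, 0.0, 0.0), (1.33, 0.0, 0.0)),   # closed
    ((0.0, 0.0, 0.0), (0.0, 1.40, 0.0)),   # closed
    ((0.0, 0.0, 0.0), (21.0, 0.0, 0.0)),   # open, >20 Å apart
    ((0.0, 0.0, 0.0), (0.0, 0.0, 5.2)),    # open
]
print(cyclization_success_rate(models))  # 0.5
```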

Methods I might copy (protocol hooks)

  • Construct design / Models: Use two explicit modes, HighFold3-Linear and HighFold3-Cyclic, toggled by the Boolean “head_to_tail”; for linear peptides, set adjacent residue interval to 1 and N-to-C distance to peptide length minus 1; for cyclic peptides, directly connect N- and C-termini in CycPOEM; for cyclic peptide-protein complexes, combine a linear positional matrix for the protein with CycPOEM for the cyclic peptide ligand; encode unAAs through CCD.
  • Conditions / Instruments: Run on Ubuntu 20.04.2 with Intel Xeon Gold 6430 CPU, 64 cores, four NVIDIA A800 80 GB GPUs, 500 GB RAM, and 40 TB storage; evaluate Top-1/highest-confidence predictions; reported random seed values were 1, 4, 16, and 64 for cyclic peptide monomer stability testing.
  • Readout / Analysis: Calculate RMSD_Cα, RMSD_all-atom, and RMSD_unAA using Kabsch rigid superposition on Cα coordinates; define complex chains under 40 residues as peptide chains; extract unAA atomic coordinates within 10 Å for local superposition; report pLDDT-RMSD Pearson correlations, Wilcoxon signed-rank tests, cyclization success rate, and ZDOCK redocking concordance.
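
The construct-design hook above can be sketched as a small function that switches between linear and cyclic residue offsets, plus a helper that combines a linear protein block with a cyclic peptide block. This is a simplified illustration under stated assumptions, not HighFold3 code: the function names are hypothetical, offsets are unsigned, and cross-chain entries are left as `None` because how inter-chain positions are encoded is not described in these notes.

```python
from typing import List, Optional


def peptide_offsets(length: int, head_to_tail: bool) -> List[List[int]]:
    """Residue-index offsets for one chain.

    Linear: |i - j|, so the N-to-C offset is length - 1.
    Cyclic: the shorter way around the ring, so the termini are 1 apart.
    """
    offsets = []
    for i in range(length):
        row = []
        for j in range(length):
            gap = abs(i - j)
            if head_to_tail:
                gap = min(gap, length - gap)
            row.append(gap)
        offsets.append(row)
    return offsets


def complex_offsets(protein_len: int, peptide_len: int) -> List[List[Optional[int]]]:
    """Linear block for the protein, cyclic block for the peptide ligand.

    Cross-chain entries stay None: a placeholder assumption of this sketch.
    """
    total = protein_len + peptide_len
    combined: List[List[Optional[int]]] = [[None] * total for _ in range(total)]
    protein = peptide_offsets(protein_len, head_to_tail=False)
    peptide = peptide_offsets(peptide_len, head_to_tail=True)
    for i in range(protein_len):
        for j in range(protein_len):
            combined[i][j] = protein[i][j]
    for i in range(peptide_len):
        for j in range(peptide_len):
            combined[protein_len + i][protein_len + j] = peptide[i][j]
    return combined


print(peptide_offsets(6, head_to_tail=False)[0])  # [0, 1, 2, 3, 4, 5]
print(peptide_offsets(6, head_to_tail=True)[0])   # [0, 1, 2, 3, 2, 1]
```

For a six-residue peptide, flipping `head_to_tail` changes the N-to-C offset from 5 (length minus 1) to 1, which is the whole effect of the switch at the level of the positional matrix.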

Open questions / Theoretical implications (2–5 bullets)

  • Can CycPOEM-style topology encoding generalize from head-to-tail and disulfide-cyclized peptides to stapled peptides, side-chain-to-tail cyclization, mixed macrocycles, or covalent warhead-bearing peptide ligands?
  • Does improved static RMSD predict usable binding free energy or target engagement when unAA side chains dominate the peptide-protein interface?
  • Should confidence thresholds be separately calibrated for backbone topology, unAA side-chain placement, and protein-peptide interface geometry?
  • Can MD refinement identify cases where HighFold3 closes the macrocycle correctly but misrepresents conformational heterogeneity or ligand-receptor residence states?
  • How much of the improvement comes from explicit topology constraints versus AlphaFold 3’s learned structural priors and CCD-based chemical definitions?