
https://www.biorxiv.org/content/10.64898/2026.03.03.707476v1
Explainable Physicochemical Determinants of Protein–Ligand Binding via Non-Covalent Interactions
Key figures
- [Figure 1]: Establishes the main contribution by pairing InteractBind dataset construction with ExplainBind’s attention-supervised sequence model, making the physical supervision strategy explicit.
- [Table 1]: Shows the core benchmark gain, with ExplainBind reaching AUROC 0.993 and AUPRC 0.954 in-distribution against eight baselines.
- [Figure 2]: Shows that the interpretability is functional rather than decorative, with BRHR rising from 55.6% at Top-1 to 74.6% at Top-5 and case studies recovering crystallographic pocket contacts.
- [Figure 4]: Provides the strongest translational validation by prospectively ranking compounds for unseen L2HGDH and recovering both inhibitors and activators experimentally.
1) Thesis (one sentence)
In sequence-based protein–ligand binding prediction, which is typically black-box and weakly interpretable, supervising cross-attention with curated non-covalent interaction maps yields higher predictive accuracy, stronger out-of-distribution robustness, and residue-level pocket localization across diverse targets and ligands, because token-level protein–ligand attention is aligned with physically grounded interaction patterns; this is supported by benchmark comparisons, structural case studies, and prospective wet-lab validation.
2) Evidence card (three bullets only)
- Strongest result: ExplainBind prospectively enriched functional L2HGDH modulators on an unseen target, with a Top-5 mean |Relative Activity| of 22.40 and four Top-50 ranked compounds showing |Relative Activity| ≥ 25, including three activators (1239, 1522, 3204) and one inhibitor (1303) (Fig. 4c-d; Table S6).
- Method enabler: InteractBind plus ExplainBind converts PDB complexes into six non-covalent interaction maps using docking, PLIP, and GetContacts, then aligns an 8-head cross-attention model to those maps with BCE + KL loss, enabling interpretable binding prediction without explicit 3D input at inference time (Fig. 1a,e; Fig. 5; computational ML + docking/PDB curation + rule-based interaction annotation).
- Critical limitation: The supervision is not purely experimental—positives are defined by docking affinities ≤ −7.0 kcal/mol, negatives by ≥ −5.0 kcal/mol, and interaction strengths by rule-based geometric cutoffs—so both labels and explanations inherit docking/annotation bias (Fig. 1a-b; Table S1).
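The BCE + KL supervision named in the method-enabler bullet can be sketched as a combined objective: binary cross-entropy on the binding label plus a KL term pulling each attention head's distribution toward its curated interaction map. This is a minimal numpy illustration under assumptions, not the paper's implementation; the function name, tensor shapes, and per-head normalization are my own, and only the λ = 0.3 weighting is taken from the notes above.

```python
import numpy as np

def interaction_supervised_loss(p_bind, label, attn, target, lam=0.3):
    """Sketch of a BCE + KL objective for attention supervision.

    p_bind : predicted binding probability in (0, 1)
    label  : binary binding label (0 or 1)
    attn   : (H, Lp, Ll) cross-attention weights, one map per head,
             each head normalized to sum to 1 (assumed convention)
    target : (H, Lp, Ll) curated interaction maps, same normalization
    lam    : weight on the attention-alignment term (0.3 in the notes)
    """
    eps = 1e-8
    # Classification term: standard binary cross-entropy on the label.
    bce = -(label * np.log(p_bind + eps)
            + (1 - label) * np.log(1 - p_bind + eps))
    # Alignment term: KL(target || attn) per head, flattened over
    # residue-ligand-token pairs, then averaged across heads.
    a = attn.reshape(attn.shape[0], -1) + eps
    t = target.reshape(target.shape[0], -1) + eps
    kl = np.mean(np.sum(t * (np.log(t) - np.log(a)), axis=-1))
    return bce + lam * kl
```

When the attention maps exactly match the interaction maps the KL term vanishes and the loss reduces to the BCE term alone, which is one way to sanity-check an implementation.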
Optional
Quote bank (2–4 short excerpts)
- Quote 1: “Protein sequence similarity is the primary determinant of OOD generalization.” (Results 2.3, p.5)
- Quote 2: “The top 20 residues identified by ExplainBind covered all interaction sites observed in the crystal structures.” (Results 2.5, p.7)
- Quote 3: “These progressive deviations indicate systematic changes in residue-level interaction patterns associated with reduced binding potency.” (Results 2.6, p.9)
Key comparisons (1–3 lines)
- Compared to: sequence-based PLB baselines such as SVM, RF, DeepConv-DTI, GraphDTA, MolTrans, DrugBAN, LANTERN, and GraphBAN that do not explicitly supervise attention with physical interaction maps.
- Win: ExplainBind keeps the scalability of sequence-based inference while adding physically grounded interaction supervision, giving the best ID/OOD performance and usable pocket-localization maps.
- Tradeoff: the best-performing configuration benefits from curated structural/docking annotations and structure-aware protein sequences, and performance still falls most sharply with protein sequence divergence.
Methods I might copy (protocol hooks)
- Construct design / Models: Curate InteractBind from PDB complexes (99,382 protein–ligand complexes) and train an 8-head interaction module with six chemistry-specific heads (hydrogen bonding, salt bridge, van der Waals, hydrophobic contact, π–π stacking, cation–π) plus two overall heads; ligand input can be SELFIES/SMILES and protein input FASTA or Foldseek-derived structure-aware sequence.
- Conditions / Instruments: Focused docking positives used a search box centered on the original ligand position with ligand size + 0.5 nm buffer and affinity cutoff ≤ −7.0 kcal/mol; negatives used global docking centered on the protein centroid with protein dimensions + 1.0 nm buffer and cutoff ≥ −5.0 kcal/mol; training used batch size 64, hidden dimension 512, AdamW, CosineAnnealingLR, initial learning rate 1 × 10⁻⁴, λ = 0.3, and up to 200 epochs on an NVIDIA RTX A6000.
- Readout / Analysis: Evaluate with AUROC/AUPRC/accuracy plus Binding Residue Hit Rate using strict exact-match (δ = 0) or relaxed ±1 residue (δ = 1); prospective wet-lab validation used 40 nM dmL2HGDH, 20 µM compounds, 30 min preincubation, then 120 µM L-2-hydroxyglutarate + 200 µM resazurin for 90 min, read on a PHERAstar FSX at 540/590 nm.
Open questions / Theoretical implications (2–5 bullets)
- Can an interaction-supervised sequence model be extended from binary protein–ligand binding to true ternary-state prediction, where only a condition-specific composite pocket is functional?
- If protein sequence divergence is the dominant OOD bottleneck, what representation best captures local binding-surface physics independently of global homology?
- How much of the model’s “explainability” is transferable chemistry versus an imprint of docking-derived labels and rule-based contact definitions?
- Could residue-level interaction maps be augmented with photophysical labels to predict not only binding, but state-dependent outputs such as quenching, opening, or fluorogenic turn-on?