ML-Based Prediction of Pairwise Antibody Epitope Binning

May 30, 2024

Epitope binning is essential for ensuring diversity in therapeutic antibody panels, but exhaustive pairwise testing drives up time, instrument use, and reagent consumption. Our study introduces a machine learning–based workflow that combines Fv-sequence–derived features with a small experimental subset (≈5–10% of all pairs) to predict the remaining pairwise competition interactions in large antibody panels. Results from a blind test on 69 IgGs support practical deployment as a resource-saving alternative to standard assays.

Approach and outcomes

  • Each antibody pair is encoded as a 268-amino-acid, structure-aware vector; this representation outperformed germline-only features in most campaigns.
  • Training on a random N × m subset measured by Octet BLI and predicting the remaining N × (N – m) pairs outperformed cluster-center–only training strategies.
  • Across nine campaign datasets (8 IgG, 1 HCAb), simulated benchmarks typically yielded AU-ROC >0.8, indicating robustness across antibody panels.
  • In a real-world blind evaluation (N=69 IgGs; 4,761 interactions), the model achieved an AU-ROC of 0.83 and reproduced Carterra-defined epitope bins.
  • For a 96 × 96 study, the approach reduced antibody use by ~70%, antigen by ~93%, instrument time by ~66%, and sensor cost by ~42%, while keeping misprediction/attrition (~10–11%) comparable to full experimental binning.
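
The N × m training design and the AU-ROC metric in the points above can be illustrated with a minimal Python sketch. It is not the study's code. It assumes that "N × m" means all N antibodies measured against m randomly chosen partners, and the value m = 5 is a hypothetical choice that happens to fall in the ≈5–10% range quoted above. The function names `split_pairs` and `auroc` are invented for illustration.

```python
import random


def split_pairs(n, m, seed=0):
    """Split an n x n panel into measured and predicted ordered pairs.

    Assumption: m antibodies are picked at random as partners, every
    antibody is measured against them (n * m pairs), and the remaining
    n * (n - m) pairs are left for the model to predict.
    """
    rng = random.Random(seed)
    partners = set(rng.sample(range(n), m))
    measured = [(i, j) for i in range(n) for j in range(n) if j in partners]
    predicted = [(i, j) for i in range(n) for j in range(n) if j not in partners]
    return measured, predicted


def auroc(labels, scores):
    """Rank-based AU-ROC: chance that a positive outscores a negative.

    Ties between a positive and a negative score count as 0.5.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AU-ROC needs at least one positive and one negative")
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


# Panel size from the blind evaluation (N = 69); m = 5 is hypothetical.
measured, predicted = split_pairs(n=69, m=5)
total = len(measured) + len(predicted)
print(len(measured), len(predicted), total)        # 345 4416 4761
print(f"measured fraction: {len(measured) / total:.1%}")

# Toy labels (1 = competes) and model scores, purely illustrative.
print(auroc([1, 0, 1, 0], [0.9, 0.1, 0.4, 0.6]))   # 0.75
```

With these assumed values, 345 of the 4,761 ordered pairs (about 7%) would be measured and the other 4,416 predicted. In practice the scores passed to `auroc` would come from the trained model, evaluated against held-out experimental competition calls.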

The results support a practical hybrid approach in which a small, well-chosen subset can be measured experimentally, while a sequence-based ML model can infer the remaining interactions. The resulting clusters reproduce those from surface-based binning, cutting consumables and cycle time while preserving the biological insight needed to select epitope-diverse leads.
