PDBBind

Trust: C
Updated 2 months ago
NextIn take

The most widely used binding affinity benchmark, and also one of the most misleading. Data quality is inconsistent, train/test splits are often contaminated by homology, and good PDBBind performance has historically been a poor predictor of real-world virtual screening success. Use it as a sanity check, not a decision criterion.

What It Measures

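Protein-ligand binding affinity prediction. A curated set of experimentally measured affinity values (Kd, Ki, or IC50) paired with crystal structures from the PDB. Note that IC50 is an assay-dependent potency measure rather than a true binding constant, so the three label types are not strictly interchangeable.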
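Because the labels mix Kd, Ki, and IC50, it usually helps to put them on a common negative-log scale while keeping track of which measurement each value came from. The sketch below assumes labels arrive as strings such as "Kd=49uM" or "IC50<10uM"; that string layout, the unit table, and the function name parse_affinity are assumptions made for illustration, not a documented PDBBind interface.

```python
import math
import re

# Unit multipliers to molar. The set of units handled here is an assumption.
_UNIT_TO_MOLAR = {
    "M": 1.0, "mM": 1e-3, "uM": 1e-6,
    "nM": 1e-9, "pM": 1e-12, "fM": 1e-15,
}

# Assumed label layout: <kind><relation><number><unit>, e.g. "Ki=0.43nM".
_LABEL_RE = re.compile(
    r"^(Kd|Ki|IC50)(=|<=|>=|<|>|~)"
    r"([0-9.]+(?:[eE][-+]?[0-9]+)?)"
    r"(fM|pM|nM|uM|mM|M)$"
)


def parse_affinity(label):
    """Return (kind, relation, pK) where pK = -log10(value in molar)."""
    match = _LABEL_RE.match(label.strip())
    if match is None:
        raise ValueError(f"unrecognized affinity label: {label!r}")
    kind, relation, value, unit = match.groups()
    molar = float(value) * _UNIT_TO_MOLAR[unit]
    if molar <= 0:
        raise ValueError(f"non-positive affinity value: {label!r}")
    return kind, relation, -math.log10(molar)


if __name__ == "__main__":
    for label in ["Kd=49uM", "Ki=0.43nM", "IC50<10uM"]:
        kind, relation, pk = parse_affinity(label)
        print(f"{label:>10}  kind={kind:<4} relation={relation:<2} pK={pk:.2f}")
```

Returning the measurement kind and the relation alongside the pK value makes it easy to report metrics separately for Kd/Ki and IC50 entries, or to drop inexact labels (those with "<", ">", or "~") before training.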

What It Doesn't Measure

Docking pose accuracy, selectivity, ADMET properties, or anything beyond the specific affinity value for a known co-crystal structure.

Maintainer

Shanghai Institute of Organic Chemistry

http://www.pdbbind.org.cn/

Known Limitations

Affinity labels are heterogeneous (Kd vs Ki vs IC50) and measured under varying conditions. The "refined set" is more reliable but much smaller than the general set. There is a heavy bias toward druggable targets: kinases and proteases are massively overrepresented. Train/test leakage is a persistent concern because homologous proteins appear in both sets.
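One common way to reduce homology leakage is to split by protein cluster rather than by individual complex. The sketch below assumes cluster assignments have already been computed externally (for example by sequence clustering at an identity threshold of your choosing) and are supplied as a dict from complex ID to cluster ID. The function names and the example IDs are hypothetical, and this is an illustration of the idea rather than any official PDBBind split.

```python
import random
from collections import defaultdict


def cluster_split(cluster_of, test_fraction=0.2, seed=0):
    """Split complex IDs so that no protein cluster lands in both sets.

    cluster_of maps complex ID -> cluster ID (assumed precomputed).
    Whole clusters are assigned to the test set until it holds roughly
    test_fraction of all complexes; the remainder goes to train.
    """
    members = defaultdict(list)
    for complex_id, cluster_id in cluster_of.items():
        members[cluster_id].append(complex_id)

    cluster_ids = sorted(members)
    random.Random(seed).shuffle(cluster_ids)

    target = test_fraction * len(cluster_of)
    train, test = [], []
    for cluster_id in cluster_ids:
        bucket = test if len(test) < target else train
        bucket.extend(members[cluster_id])
    return sorted(train), sorted(test)


def leaked_clusters(train_ids, test_ids, cluster_of):
    """Return the cluster IDs that occur in both train and test."""
    train_clusters = {cluster_of[i] for i in train_ids}
    test_clusters = {cluster_of[i] for i in test_ids}
    return train_clusters & test_clusters


if __name__ == "__main__":
    # Hypothetical complex IDs and cluster labels, for illustration only.
    cluster_of = {
        "cplx_a": "kinase_1", "cplx_b": "kinase_1", "cplx_c": "kinase_1",
        "cplx_d": "protease_1", "cplx_e": "protease_1",
        "cplx_f": "other_1",
    }
    train, test = cluster_split(cluster_of, test_fraction=0.3)
    print("train:", train)
    print("test: ", test)
    print("leaked clusters:", leaked_clusters(train, test, cluster_of))
```

The same leaked_clusters check can be run against an existing split to see how many test-set proteins have close relatives in the training data. Because whole clusters are moved at once, the realized test fraction is only approximate, and large clusters such as kinases can skew it noticeably.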