EWCL Performance Across Disorder Prediction Benchmarks
The Entropy-Weighted Collapse Likelihood (EWCL) model is benchmarked on curated MobiDB and CAID disorder sets. On the MobiDB sets reported here, ROC–AUC exceeds 0.90 for the full DisProt, Merge, and UniProt sets (0.904 to 0.936) and ranges from 0.625 to 0.805 on the IDEAL, external, and binding subsets; DisProt is the training reference, so the external subsets are the stricter test. EWCL quantifies disorder, linkers, and binding-prone residues directly from sequence-derived entropy.
MobiDB Sequence-Only Benchmarks
EWCL highlights canonical intrinsically disordered region (IDR) biology using only sequence-derived entropy. We prioritize proteins with multiple annotation modalities (disorder, binding, low-complexity, motifs) to check that EWCL scores agree with independent annotation types and place region boundaries accurately.
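To make "sequence-derived entropy" concrete, the sketch below computes a generic sliding-window Shannon entropy over amino-acid composition. It is an illustration only: the source does not specify how EWCL derives or weights its entropy signal, and the function name `window_entropy` and the default window of 15 residues are assumptions made for this example.

```python
import math
from collections import Counter


def window_entropy(sequence, window=15):
    """Per-residue Shannon entropy (bits) of residue composition.

    Hypothetical helper, NOT the EWCL algorithm: each position is scored
    by the entropy of the residues in a centered window, truncated at
    the sequence ends.
    """
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    half = window // 2
    scores = []
    for i in range(len(sequence)):
        segment = sequence[max(0, i - half): i + half + 1]
        total = len(segment)
        counts = Counter(segment)
        # H = -sum(p * log2(p)) over the residue types seen in the window
        entropy = -sum((c / total) * math.log2(c / total) for c in counts.values())
        scores.append(entropy)
    return scores


if __name__ == "__main__":
    # A homopolymer stretch has zero compositional entropy.
    print(window_entropy("AAAAAAA", window=3))
    # A mixed stretch scores higher.
    print(window_entropy("MEEPQSDPSV", window=5))
```

A window containing a single residue type scores 0 bits, and a window of `k` distinct, equally frequent residue types scores `log2(k)` bits. How such a profile maps to a collapse likelihood is left open here.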
| Dataset | Subset | Proteins | Residues | ROC–AUC | PR–AUC | Notes |
|---|---|---|---|---|---|---|
| DisProt (curated) | Full | 3,121 | — | 0.928 | 0.803 | Training reference; curated IDRs |
| IDEAL | All | 731 | — | 0.724 | 0.327 | Low-prevalence curated set |
| IDEAL excl. DisProt | External | 368 | — | 0.625 | 0.164 | True external subset |
| Merge (all) | Full | 3,670 | 2,257,526 | 0.904 | 0.750 | Broad curation |
| Merge (excl. DisProt) | External | 549 | 425,180 | 0.805 | 0.342 | External subset |
| UniProt (all) | Full | 217 | 237,097 | 0.936 | 0.727 | Curated mapping |
| Binding (D→D) | Non-homology snapshot | 1,909 | — | 0.774 | 0.405 | ~3.16× AP lift vs random |
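The ROC–AUC, PR–AUC, and "AP lift vs random" columns can be reproduced from per-residue labels and scores. The sketch below is a minimal pure-Python version, assuming binary labels (1 = annotated residue), PR–AUC computed as average precision, and lift defined as average precision divided by positive prevalence (the expected average precision of a random ranking). The function names are hypothetical, and the source does not state which implementation produced the table.

```python
def roc_auc(labels, scores):
    """ROC-AUC via the rank-sum (Mann-Whitney U) identity; ties get average ranks."""
    pairs = sorted(zip(scores, labels))
    n = len(pairs)
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and pairs[j + 1][0] == pairs[i][0]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # 1-based average rank of the tied group
        for k in range(i, j + 1):
            ranks[k] = avg_rank
        i = j + 1
    n_pos = sum(label for _, label in pairs)
    n_neg = n - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one positive and one negative label")
    rank_sum_pos = sum(r for r, (_, label) in zip(ranks, pairs) if label == 1)
    return (rank_sum_pos - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


def average_precision(labels, scores):
    """Mean of precision at each positive, ranking by descending score.

    Tied scores are ordered by input position (a simplification).
    """
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    n_pos = sum(labels)
    if n_pos == 0:
        raise ValueError("need at least one positive label")
    hits, total = 0, 0.0
    for rank, idx in enumerate(order, start=1):
        if labels[idx] == 1:
            hits += 1
            total += hits / rank
    return total / n_pos


def ap_lift(labels, scores):
    """Average precision relative to the random baseline (positive prevalence)."""
    prevalence = sum(labels) / len(labels)
    return average_precision(labels, scores) / prevalence


if __name__ == "__main__":
    y = [1, 0, 1, 0]
    s = [0.9, 0.8, 0.7, 0.1]
    print(roc_auc(y, s), average_precision(y, s), ap_lift(y, s))
```

Under this definition of lift, the Binding row's PR–AUC of 0.405 and ~3.16× lift would correspond to a positive prevalence of roughly 0.405 / 3.16 ≈ 0.128.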
Dataset roles: DisProt (curated), training; IDEAL, external; IDEAL excl. DisProt, external; MobiDB, functional.
Highlighted Case Studies
Flagship proteins from cancer biology (p53, BRCA1), molecular chaperones (HSP70), calcium signaling (Calmodulin), and bacterial regulation (SilE) illustrate EWCL predictions across diverse functional contexts.
Tumor protein p53
p53 tumor suppressor protein with structured DNA-binding domain and intrinsically disordered transactivation domains. EWCL precisely maps the flexible N- and C-terminal regions critical for transcriptional regulation and protein interactions.
Figure: UniProt feature summary and EWCL heatmap for tumor protein p53. Tracks: Pfam/domain, motif/ELM, disordered, binding, low complexity, secondary structure, natural variant, experimental disorder.
BRCA1 C-terminal domain
BRCA1 breast cancer susceptibility protein containing structured RING and BRCT domains connected by intrinsically disordered linker regions. EWCL identifies the flexible regions involved in protein-protein interactions and DNA damage response.
Figure: UniProt feature summary and EWCL heatmap for BRCA1. Tracks: Pfam/domain, motif/ELM, disordered, binding, low complexity, secondary structure, natural variant, experimental disorder.
RNA polymerase II subunit A
Large RNA polymerase II subunit with structured catalytic domains and intrinsically disordered C-terminal domain (CTD). EWCL accurately maps the flexible CTD critical for transcriptional regulation.
Figure: UniProt feature summary and EWCL heatmap for RNA polymerase II subunit A. Tracks: Pfam/domain, motif/ELM, disordered, binding, low complexity, secondary structure, natural variant, experimental disorder.
Phosphoprotein p53
Another p53 family member with similar disordered transactivation and regulatory domains. EWCL demonstrates exceptional performance on this highly disordered tumor suppressor variant.
Figure: UniProt feature summary and EWCL heatmap for phosphoprotein p53. Tracks: Pfam/domain, motif/ELM, disordered, binding, low complexity, secondary structure, natural variant, experimental disorder.
Transcription elongation factor B
Transcription elongation factor with structured domains and flexible regulatory regions. EWCL identifies the disordered regions involved in transcriptional control and protein-protein interactions.
Figure: UniProt feature summary and EWCL heatmap for transcription elongation factor B. Tracks: Pfam/domain, motif/ELM, disordered, binding, low complexity, secondary structure, natural variant, experimental disorder.