IntAct

IntAct is a curated, open-source molecular interaction database maintained by the European Bioinformatics Institute (EMBL-EBI), focusing primarily on experimentally determined protein-protein, protein-DNA, and protein-small molecule interactions. IntAct is distinguished from larger, integrative databases like STRING by its strict emphasis on manual curation from peer-reviewed literature and its detailed annotation of experimental methods, interaction types, and biological roles — making it a higher-precision but lower-coverage resource.

The curation process in IntAct follows the PSI-MI (Proteomics Standards Initiative, Molecular Interactions) standard, ensuring that interaction records are structured, comparable, and machine-readable. Each entry includes the participants, the experimental system (yeast two-hybrid, co-immunoprecipitation, etc.), the interaction type (physical association, direct interaction, colocalization), and the publication source. This granularity makes IntAct particularly valuable for training and validating computational interaction prediction methods, which require high-quality positive and negative examples. Curated records chiefly supply the positives; experimentally supported negatives are far scarcer, so negative sets usually have to be constructed separately.
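To make the record structure concrete, here is a minimal sketch of reading one interaction from the tab-delimited PSI-MITAB 2.5 layout, one of the PSI-MI exchange formats. The column positions follow the MITAB 2.5 column order; the class name, the helper names, and every identifier in the example row are made-up placeholders rather than real IntAct data, and the regular expression assumes term names without nested parentheses.

```python
import re
from dataclasses import dataclass

# Controlled-vocabulary fields in MITAB look like: psi-mi:"MI:0018"(two hybrid)
_CV_TERM = re.compile(r'psi-mi:"(MI:\d+)"\((.*?)\)')


@dataclass(frozen=True)
class InteractionRecord:
    """Hypothetical container for the fields discussed in the text."""
    participant_a: str
    participant_b: str
    detection_method: str
    interaction_type: str
    publication: str


def _cv_name(field: str) -> str:
    """Return the readable name of the first PSI-MI term, or the raw field."""
    match = _CV_TERM.search(field)
    return match.group(2) if match else field


def parse_mitab_line(line: str) -> InteractionRecord:
    """Parse one MITAB 2.5 row (15 tab-separated columns)."""
    cols = line.rstrip("\n").split("\t")
    if len(cols) < 15:
        raise ValueError(f"expected at least 15 columns, got {len(cols)}")
    return InteractionRecord(
        participant_a=cols[0],                 # column 1: identifier of interactor A
        participant_b=cols[1],                 # column 2: identifier of interactor B
        detection_method=_cv_name(cols[6]),    # column 7: interaction detection method
        publication=cols[8].split("|")[0],     # column 9: publication identifier(s)
        interaction_type=_cv_name(cols[11]),   # column 12: interaction type
    )


# Placeholder row: the identifiers below are invented for illustration only.
example_row = "\t".join([
    "uniprotkb:PLACEHOLDER_A", "uniprotkb:PLACEHOLDER_B", "-", "-", "-", "-",
    'psi-mi:"MI:0018"(two hybrid)', "Doe et al. (2020)", "pubmed:00000000",
    "taxid:9606(human)", "taxid:9606(human)",
    'psi-mi:"MI:0915"(physical association)',
    'psi-mi:"MI:0469"(IntAct)', "intact:EBI-0000000", "intact-miscore:0.40",
])

record = parse_mitab_line(example_row)
print(record)
```

Keeping the detection method and interaction type as separate fields is what lets a downstream user filter on them, for example retaining only direct interactions or excluding a particular experimental system.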

IntAct is one of the founding members of the IMEx consortium, an international collaboration that coordinates curation efforts across major interaction databases to avoid duplication and ensure comprehensive literature coverage. The consortium's shared curation policy means that an interaction curated in one IMEx database is accessible across all participating resources, creating a distributed but unified interaction data ecosystem.

IntAct's precision is its pride and its prison. By restricting itself to experimentally reported interactions drawn from the literature, it captures only a tiny fraction of the true interactome, perhaps one percent or less, although the size of the full interactome is itself uncertain. The result is a network that is reliable but unrepresentative, biased toward well-studied proteins and well-funded disease areas. The field's challenge is not to choose between IntAct's precision and STRING's coverage, but to build models that integrate both, treating high-confidence curated interactions as anchors and predicted associations as softer statistical constraints. Any network analysis that relies on only one database is building on a foundation that is either too narrow or too noisy to stand.
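One simple reading of this integration idea can be sketched as a weighted edge merge: curated interactions enter at full weight, and predicted associations enter only above a score cutoff, keeping their score as the edge weight. Everything here is an assumption made for illustration, including the function name, the 0.7 cutoff, the fixed weight of 1.0 for curated edges, and the toy protein labels. It is not a method prescribed by IntAct or STRING.

```python
from typing import Dict, FrozenSet, Iterable, Tuple

Edge = FrozenSet[str]  # undirected edge: {a, b} is the same as {b, a}


def integrate_networks(
    curated: Iterable[Tuple[str, str]],
    predicted: Iterable[Tuple[str, str, float]],
    min_score: float = 0.7,  # hypothetical cutoff, not a recommended value
) -> Dict[Edge, Tuple[float, str]]:
    """Merge curated and predicted edges into one weighted network.

    Returns a mapping from edge to (weight, source).
    """
    merged: Dict[Edge, Tuple[float, str]] = {}

    # Predicted associations: keep only those above the cutoff,
    # and keep the best score if an edge appears more than once.
    for a, b, score in predicted:
        if score < min_score:
            continue
        edge = frozenset((a, b))
        previous = merged.get(edge)
        if previous is None or score > previous[0]:
            merged[edge] = (score, "predicted")

    # Curated interactions act as anchors: they override any prediction.
    for a, b in curated:
        merged[frozenset((a, b))] = (1.0, "curated")

    return merged


# Toy inputs with invented protein labels.
curated_edges = [("A", "B"), ("B", "C")]
predicted_edges = [("B", "A", 0.9), ("C", "D", 0.8), ("D", "E", 0.3)]

merged = integrate_networks(curated_edges, predicted_edges)
for edge, (weight, source) in merged.items():
    print(sorted(edge), weight, source)
```

In this toy case the A-B edge is kept as curated even though a prediction also exists for it, C-D survives as a predicted edge, and D-E is dropped for falling below the cutoff. A fuller model would treat the predicted scores probabilistically rather than thresholding them.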