Our preprint describing CoCoScore was posted on bioRxiv yesterday. CoCoScore is a novel, context-aware co-occurrence scoring scheme for text mining applications. Our method can be used to extract biomedical relations, such as protein-protein interactions, from the scientific literature.
CoCoScore unifies earlier methods based on machine learning and on statistical co-occurrence counting in a single approach. We use distant supervision to train a sentence-level scoring model for eight different datasets and show improved performance over previous approaches. CoCoScore is joint work with Lars Juhl Jensen, and we, of course, welcome any feedback. An open-source, MIT-licensed implementation of CoCoScore is also available.
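For readers unfamiliar with distant supervision, the general idea can be sketched in a few lines of Python. This is a generic illustration and is not taken from the CoCoScore code base: the function name `label_sentences`, the toy sentences, and the tiny knowledge base are all made up for this example. Each sentence that co-mentions two entities is labeled positive if the pair appears in a curated set of known relations, and negative otherwise. The resulting noisy labels could then be used to train a sentence-level classifier.

```python
from itertools import combinations


def label_sentences(sentences, known_pairs):
    """Assign distant-supervision labels to co-mentioning sentences.

    sentences: iterable of (text, entities) tuples, where entities is a
        list of entity names already tagged in the sentence.
    known_pairs: set of frozensets, each holding one entity pair from a
        hypothetical curated knowledge base.
    """
    labeled = []
    for text, entities in sentences:
        # Consider every unordered pair of distinct entities in the sentence.
        for a, b in combinations(sorted(set(entities)), 2):
            # Label 1 if the knowledge base lists the pair, else 0.
            label = 1 if frozenset((a, b)) in known_pairs else 0
            labeled.append((text, a, b, label))
    return labeled


# Toy knowledge base and sentences, assumed purely for illustration.
known_pairs = {frozenset(("TP53", "MDM2"))}
sentences = [
    ("MDM2 binds TP53 and promotes its degradation.", ["MDM2", "TP53"]),
    ("TP53 and BRCA1 were both sequenced.", ["TP53", "BRCA1"]),
]

examples = label_sentences(sentences, known_pairs)
for text, a, b, label in examples:
    print(label, a, b, text)
```

The labels are noisy by construction: a sentence can mention a known pair without describing the relation, and a true relation can be missing from the knowledge base.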
Our preprint is available here and the corresponding GitHub repository here.