Functional protein representations from biological networks enable diverse cross-species inference
Date
2019-03-08
Authors
Fan, Jason
Cannistra, Anthony
Fried, Inbar
Lim, Tim
Schaffner, Thomas
Crovella, Mark
Hescott, Benjamin
Leiserson, Mark D.M.
Citation
Jason Fan, Anthony Cannistra, Inbar Fried, Tim Lim, Thomas Schaffner, Mark Crovella, Benjamin Hescott, Mark D M Leiserson, Functional protein representations from biological networks enable diverse cross-species inference, Nucleic Acids Research, Volume 47, Issue 9, 21 May 2019, Page e51, https://doi.org/10.1093/nar/gkz132
Abstract
Transferring knowledge between species is key for many biological applications, but is complicated by divergent and convergent evolution. Many current approaches for this problem leverage sequence and interaction network data to transfer knowledge across species, exemplified by network alignment methods. While these techniques do well, they are limited in scope, creating metrics to address one specific problem or task. We take a different approach by creating an environment where multiple knowledge transfer tasks can be performed using the same protein representations. Specifically, our kernel-based method, MUNK, integrates sequence and network structure to create functional protein representations, embedding proteins from different species in the same vector space. First, we show that proteins in different species that are close in MUNK-space are functionally similar. Next, we use these representations to share knowledge of synthetic lethal interactions between species. Importantly, we find that the results using MUNK-representations are at least as accurate as existing algorithms for these tasks. Finally, we generalize the notion of a phenolog (‘orthologous phenotype’) to use functionally similar proteins (i.e. those with similar representations). We demonstrate the utility of this broadened notion by using it to identify known phenologs and novel non-obvious ones supported by current research.
Notes
Partial funding for Open Access provided by the UMD Libraries' Open Access Publishing Fund.