eprintid: 1310
rev_number: 7
eprint_status: archive
userid: 6
dir: disk0/00/00/13/10
datestamp: 2012-07-03 12:13:49
lastmod: 2013-11-21 11:59:27
status_changed: 2012-07-03 12:13:49
type: article
metadata_visibility: show
creators_name: Penner, Orion
creators_name: Sood, Vishal
creators_name: Musso, Gabriel
creators_name: Baskerville, Kim
creators_name: Grassberger, Peter
creators_name: Paczuski, Maya
creators_id: orion.penner@imtlucca.it
creators_id:
creators_id:
creators_id:
creators_id:
creators_id:
title: Node similarity within subgraphs of protein interaction networks
ispublished: pub
subjects: QA
subjects: QH301
divisions: EIC
full_text_status: none
keywords: PACS: 87.14.Ee; 02.70.Uu; 87.10.+e; 89.75.Fb; 89.75.Hc
abstract: We propose a biologically motivated quantity, twinness, to evaluate local similarity between nodes in a network. The twinness of a pair of nodes is the number of connected, labeled subgraphs of size n in which the two nodes possess identical neighbours. The graph animal algorithm is used to estimate twinness for each pair of nodes (for subgraph sizes n=4 to n=12) in four different protein interaction networks (PINs). These include an Escherichia coli PIN and three Saccharomyces cerevisiae PINs -- each obtained using state-of-the-art high throughput methods. In almost all cases, the average twinness of node pairs is vastly higher than expected from a null model obtained by switching links. For all n, we observe a difference in the ratio of type A twins (which are unlinked pairs) to type B twins (which are linked pairs) distinguishing the prokaryote E. coli from the eukaryote S. cerevisiae. Interaction similarity is expected due to gene duplication, and whole genome duplication paralogues in S. cerevisiae have been reported to co-cluster into the same complexes. Indeed, we find that these paralogous proteins are over-represented as twins compared to pairs chosen at random. These results indicate that twinness can detect ancestral relationships from currently available PIN data.
date: 2008-06
date_type: published
publication: Physica A: Statistical Mechanics and its Applications
volume: 387
number: 14
publisher: Elsevier
pagerange: 3801-3810
id_number: 10.1016/j.physa.2008.02.043
refereed: TRUE
issn: 0378-4371
official_url: http://www.sciencedirect.com/science/article/pii/S0378437108002379
related_url_url: http://arxiv.org/abs/0707.2076
citation: Penner, Orion and Sood, Vishal and Musso, Gabriel and Baskerville, Kim and Grassberger, Peter and Paczuski, Maya Node similarity within subgraphs of protein interaction networks. Physica A: Statistical Mechanics and its Applications, 387 (14). pp. 3801-3810. ISSN 0378-4371 (2008)