
Network Topology

Luis Francisco Hernández Sánchez edited this page May 7, 2020 · 3 revisions

Study of the topological characteristics of biological interaction networks across all granularity levels

Replication

  1. Set up Neo4j and the Reactome Graph database.

  2. Create gene list

In the Neo4j console, run the following query to get the genes that participate in the pathways of the Reactome[1] graph database:

MATCH (ewas:EntityWithAccessionedSequence{speciesName:'Homo sapiens'})-[:referenceEntity]->(re:ReferenceEntity{databaseName:'UniProt'})
WITH re.identifier as protein, re.geneName as genes
WHERE size(genes) > 0  
UNWIND genes as gene
RETURN DISTINCT gene

Save the result to a CSV file and delete its header line.
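The header removal can also be scripted, which helps when the same clean-up is repeated for the protein and proteoform lists. The following is a minimal sketch, not part of the original repository: the function name `drop_header` and the example path are hypothetical, and it assumes the exported file is UTF-8 text with the column name on the first line.

```python
from pathlib import Path


def drop_header(path):
    """Remove the first line of a text file in place and return that line."""
    file_path = Path(path)
    lines = file_path.read_text(encoding="utf-8").splitlines(keepends=True)
    if not lines:
        return ""  # empty file: nothing to remove
    # Write back everything except the first (header) line.
    file_path.write_text("".join(lines[1:]), encoding="utf-8")
    return lines[0].rstrip("\r\n")


# Hypothetical usage; adjust the path to wherever the query result was saved:
# removed = drop_header("Reactome/v72/genes/all_genes_v72.csv")
```

Returning the removed line makes it easy to check that the line dropped really was the column header and not a data row.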

  3. Create protein list

Run the following query in the Neo4j console:

MATCH (pe:PhysicalEntity{speciesName:"Homo sapiens"})-[:referenceEntity]->(re:ReferenceEntity{databaseName:"UniProt"})
RETURN DISTINCT re.identifier as PROTEIN

Delete the header line of the file.

  4. Create proteoform list

Run the following query in the Neo4j console:

MATCH (pe:PhysicalEntity{speciesName:'Homo sapiens'})-[:referenceEntity]->(re:ReferenceEntity{databaseName:'UniProt'})
WITH DISTINCT pe, re
OPTIONAL MATCH (pe)-[:hasModifiedResidue]->(tm:TranslationalModification)-[:psiMod]->(mod:PsiMod)
WITH DISTINCT pe,
                re.identifier AS PROTEIN,
                CASE WHEN re.variantIdentifier IS NOT NULL THEN re.variantIdentifier ELSE re.identifier END AS ISOFORM,
                tm.coordinate as COORDINATE, 
                mod.identifier as TYPE 
ORDER BY TYPE, COORDINATE
WITH DISTINCT pe, PROTEIN, ISOFORM,
                COLLECT(TYPE + ":" + CASE WHEN COORDINATE IS NOT NULL THEN COORDINATE ELSE "null" END) AS PTMS
RETURN DISTINCT ISOFORM, PTMS

Delete the header line of the file.

Then convert the proteoforms from the NEO4J format to the SIMPLE format with the PathwayMatcher class ProteoformFormatConverter:

java -cp PathwayMatcher.jar matcher.tools.ProteoformFormatConverter Reactome/v72/Proteoforms/ all_proteoforms_neo4j.csv all_proteoforms.csv

  5. Follow the Python notebook: src/Python/network_topology.ipynb

Before running the notebook, update the file config.py with the appropriate paths and file names.

The notebook contains functions that perform these steps:

  1. Download PathwayMatcher from: https://github.com/PathwayAnalysisPlatform/PathwayMatcher/releases/latest/download/PathwayMatcher.jar
  2. Create PathwayMatcher files
  3. Create interaction networks derived from the Pathways in Reactome:
  • Genes:
java -jar PathwayMatcher.jar match-genes -i Reactome/v72/genes/all_genes_v72.csv -o reactome/v72/genes/ -g
  • Proteins:
java -jar PathwayMatcher.jar match-uniprot -i Reactome/v72/proteins/all_proteins_v72.csv -o reactome/v72/proteins/ -g
  • Proteoforms:
java -jar PathwayMatcher.jar match-proteoforms -i Reactome/v72/proteoforms/all_proteoforms_v72.csv -o reactome/v72/proteoforms/ -g
  4. Calculate the minimum, maximum, and average number of connections per node (the node degree) of each interaction network.

  5. Find the articulation points and bridges in each network.
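The last two steps can be illustrated with a small pure-Python sketch. It is not the notebook's implementation, which is not shown on this page: the function names and the toy edge list are hypothetical and chosen only for illustration. The sketch assumes a simple undirected network (no parallel edges) and uses an iterative depth-first search with low-link values, which is one standard way to find articulation points and bridges. Libraries such as NetworkX offer equivalent `articulation_points` and `bridges` functions.

```python
def build_adjacency(edges):
    """Build an undirected adjacency map {node: set(neighbours)}."""
    adjacency = {}
    for a, b in edges:
        adjacency.setdefault(a, set())
        adjacency.setdefault(b, set())
        if a != b:  # self-loops do not affect connectivity
            adjacency[a].add(b)
            adjacency[b].add(a)
    return adjacency


def degree_summary(adjacency):
    """Return (min, max, average) number of connections per node."""
    degrees = [len(neighbours) for neighbours in adjacency.values()]
    return min(degrees), max(degrees), sum(degrees) / len(degrees)


def articulation_points_and_bridges(adjacency):
    """Iterative DFS with low-link values; returns (points, bridges)."""
    disc, low = {}, {}  # discovery time and lowest reachable discovery time
    points, bridges = set(), set()
    counter = 0
    for root in adjacency:
        if root in disc:
            continue
        disc[root] = low[root] = counter
        counter += 1
        root_children = 0
        stack = [(root, None, iter(adjacency[root]))]
        while stack:
            node, parent, neighbours = stack[-1]
            descended = False
            for nxt in neighbours:
                if nxt == parent:
                    continue  # do not walk back along the tree edge
                if nxt in disc:
                    low[node] = min(low[node], disc[nxt])  # back edge
                else:
                    disc[nxt] = low[nxt] = counter
                    counter += 1
                    if node == root:
                        root_children += 1
                    stack.append((nxt, node, iter(adjacency[nxt])))
                    descended = True
                    break
            if descended:
                continue
            stack.pop()  # all neighbours of node have been explored
            if parent is not None:
                low[parent] = min(low[parent], low[node])
                if low[node] > disc[parent]:
                    bridges.add(frozenset((parent, node)))
                if parent != root and low[node] >= disc[parent]:
                    points.add(parent)
        if root_children > 1:  # a DFS root is a cut vertex only with 2+ subtrees
            points.add(root)
    return points, bridges


# Toy network (not Reactome data): a chain A-B-C attached to a triangle C-D-E.
toy_edges = [("A", "B"), ("B", "C"), ("C", "D"), ("D", "E"), ("E", "C")]
toy_network = build_adjacency(toy_edges)
print(degree_summary(toy_network))  # (1, 3, 2.0)
toy_points, toy_bridges = articulation_points_and_bridges(toy_network)
print(sorted(toy_points))  # ['B', 'C']
print(sorted(sorted(edge) for edge in toy_bridges))  # [['A', 'B'], ['B', 'C']]
```

The search is written iteratively rather than recursively because interaction networks can contain long paths that would exceed Python's default recursion limit.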

References