server

@tddschn tddschn released this 31 Oct 15:37
· 189 commits to master since this release
b5d8fbc

Server benchmarking results for the paper

General Info

  • Machine: AWS EC2 c5.2xlarge instance
  • CPU: Intel Xeon Platinum 8275CL (8) @ 3.599GHz
  • RAM: 16 GB
  • OS: Amazon Linux 2 x86_64
  • Kernel: 5.10.144-127.601.amzn2.x86_64
  • Python: 3.10.5 | packaged by conda-forge | (main, Jun 14 2022, 07:04:59) [GCC 10.3.0] on linux

Results were generated by downloading all the datasets and running ./paper.sh.

The file bench-results-server.db contains information about the graph datasets used in this benchmark, along with the benchmarking results.

In the bench_results table of the SQLite database, records with id 1-96 were generated by ./mentrypoint_paper.sh,
while records with id 97-300 were generated by ./entrypoint_paper.sh. Both scripts are called by ./paper.sh.
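
The id split can be checked directly against the database. The following is a minimal sketch that assumes only what is stated here: a table named bench_results with an integer id column. The helper name count_records_by_script and the ID_RANGES mapping are illustrative and not part of the benchmark code.

```python
import sqlite3

# id ranges as described in the release notes, keyed by the generating script
ID_RANGES = {
    "mentrypoint_paper.sh": (1, 96),
    "entrypoint_paper.sh": (97, 300),
}


def count_records_by_script(conn: sqlite3.Connection) -> dict[str, int]:
    """Count bench_results rows that fall in each script's id range."""
    counts = {}
    for script, (lo, hi) in ID_RANGES.items():
        row = conn.execute(
            "SELECT COUNT(*) FROM bench_results WHERE id BETWEEN ? AND ?",
            (lo, hi),
        ).fetchone()
        counts[script] = row[0]
    return counts
```

To use it, pass a connection opened with sqlite3.connect("bench-results-server.db"). If no ids are missing, the counts should be 96 and 204.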

Benchmarking easygraph multiprocessing

The ./mentrypoint_paper.sh script runs scripts that measure the performance of easygraph multiprocessing methods.

Methods benchmarked:

# in config.py
easygraph_multipcoessing_methods_for_paper = {
    'betweenness_centrality',
    'closeness_centrality',
    'constraint',
    'hierarchy',
}

On these datasets:

# real-world networks
dataset_names_for_paper_multiprocessing = [
    'bio',
    'uspowergrid',
    'enron',
    'coauthorship',
]

# and Erdős-Rényi random networks (undirected)
er_dataset_names_for_paper_multiprocessing = [
    f'er_{x}' for x in (500, 1000, 5000, 10000)
]

# additions from the 20221213 runs: 
er_dataset_names_for_paper_20221213 = [
    f'er_{x}'
    for x in (50000, 500000, 1000000)
]

Benchmarking easygraph with and without C++ binding, and networkx

This is done by ./entrypoint_paper.sh.
An average_time of -1.0 means that the method does not support the graph type of that dataset (e.g., an undirected graph).
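
Because -1.0 is a sentinel rather than a measurement, it should be dropped before any timings are averaged or plotted. The sketch below shows one way to do that. The names UNSUPPORTED, to_optional_time and mean_supported are hypothetical helpers, not part of the benchmark code.

```python
from typing import Optional

UNSUPPORTED = -1.0  # sentinel stored in average_time for unsupported graph types


def to_optional_time(average_time: float) -> Optional[float]:
    """Map the -1.0 sentinel to None so it is never treated as a timing."""
    return None if average_time == UNSUPPORTED else average_time


def mean_supported(times: list[float]) -> Optional[float]:
    """Average only real measurements; return None if none are left."""
    valid = [t for t in times if to_optional_time(t) is not None]
    return sum(valid) / len(valid) if valid else None
```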

Methods benchmarked:

clustering_methods = ["clustering"]
shortest_path_methods = [('Dijkstra', 'single_source_dijkstra_path')]  # (easygraph name, networkx name), i.e. eg.Dijkstra() vs. nx.single_source_dijkstra_path()

connected_components_methods_G = [
    "connected_components",
    "biconnected_components",
    'strongly_connected_components',
]

mst_methods = ['minimum_spanning_tree']
other_methods = ['density', 'constraint']
new_methods = ['effective_size']

With these multiprocessing options:

# once with n_workers not specified,
# and once each with n_workers = 2 and 4
easygraph_multiprocessing_n_workers_options_for_paper = [2, 4]
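
Read together, the comment and the list describe three configurations per method: the default, 2 workers and 4 workers. The sketch below shows that expansion. Using None for "n_workers not specified", and the names n_workers_configs and expand_runs, are assumptions for illustration and are not taken from the benchmark scripts.

```python
from itertools import product

easygraph_multiprocessing_n_workers_options_for_paper = [2, 4]

# None stands for "n_workers not specified" (an assumption for illustration)
n_workers_configs = [None] + easygraph_multiprocessing_n_workers_options_for_paper


def expand_runs(methods, datasets):
    """Return every (method, dataset, n_workers) combination to benchmark."""
    return list(product(methods, datasets, n_workers_configs))


runs = expand_runs(["constraint"], ["bio", "er_500"])
```

For one method and two datasets this gives six runs, three per dataset.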

On these datasets:

[
    "cheminformatics", # real-world networks
    "bio",
    "eco",
    'pgp',
    'pgp_undirected',
    'road',
    'uspowergrid',
    'enron',
    'coauthorship',
    'stub',
    'stub_with_underscore',
    'stub_directed',
    'stub_nx',
] + [f'er_{x}' for x in (500, 1000, 5000, 10000)] # Erdős-Rényi random networks (undirected)

Bench results REST API