Add pagerank example #673

Open: wants to merge 4 commits into base: main
2 changes: 1 addition & 1 deletion .github/workflows/ci.yml
@@ -100,7 +100,7 @@ jobs:
       - name: Build and install Sparse
         run: |
           pip install -U setuptools wheel
-          python -m pip install '.[finch]' scipy
+          python -m pip install '.[finch]' scipy networkx
       - name: Run examples
         run: |
           source ci/test_examples.sh
84 changes: 84 additions & 0 deletions examples/pagerank_example.py
@@ -0,0 +1,84 @@
import importlib
import os
import time

import sparse

import networkx as nx
from networkx.algorithms.link_analysis.pagerank_alg import _pagerank_scipy

import numpy as np
import scipy.sparse as sp


def pagerank(G, alpha=0.85, max_iter=100, tol=1e-6) -> dict:
    N = len(G)
    if N == 0:
        return {}

    alpha = sparse.asarray(alpha)
    nodelist = list(G)
    A = nx.to_scipy_sparse_array(G, dtype=float, format="csc")
    A = sparse.asarray(A)
    S = sparse.sum(A, axis=1)
    S = sparse.where(sparse.asarray(0.0) != S, sparse.asarray(1.0) / S, S)

    # TODO: spdiags https://github.com/willow-ahrens/Finch.jl/issues/499
    Q = sparse.asarray(sp.csc_array(sp.spdiags(S.todense(), 0, *A.shape)))
    A = Q @ A
Comment on lines +27 to +28
Collaborator

IIUC, this is equivalent to https://data-apis.org/array-api/latest/extensions/generated/array_api.linalg.diagonal.html#diagonal, which is supported by the Numba backend. Let's use that instead.

Suggested change
- Q = sparse.asarray(sp.csc_array(sp.spdiags(S.todense(), 0, *A.shape)))
- A = Q @ A
+ Q = sparse.diagonal(S)
+ A *= Q

Collaborator

For Finch, we can add support for it later, as the example only needs to work on Numba for now.

Collaborator

You might need Q[None, :] to match the results, but otherwise this is equivalent.
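
The equivalence discussed in this thread can be checked on a toy matrix. The following is only a sketch in dense NumPy, not the `sparse` API under review: the 3-node adjacency matrix and all variable names are made up for illustration. It assumes, as in the PR, that `A[i, j]` is the weight of edge i -> j and that `S` holds inverted row sums. Under those assumptions, left-multiplying by the diagonal matrix scales rows, so the broadcast form indexes the row axis; which axis is needed in the real code depends on how `A` is oriented there.

```python
import numpy as np

# Hypothetical 3-node adjacency matrix; node 2 has no out-edges (dangling).
A = np.array([
    [0.0, 1.0, 1.0],
    [1.0, 0.0, 0.0],
    [0.0, 0.0, 0.0],
])

out_degree = A.sum(axis=1)
# Invert only the non-zero degrees; zeros stay zero, mirroring the
# sparse.where(...) line in the PR.
inv_degree = np.divide(
    1.0, out_degree, out=np.zeros_like(out_degree), where=out_degree != 0
)

# Explicit diagonal matrix followed by a matrix product (the original lines).
via_matmul = np.diag(inv_degree) @ A

# Broadcasting over rows: no N x N diagonal matrix is materialized.
via_broadcast = A * inv_degree[:, None]

# Broadcasting over columns instead scales A[i, j] by inv_degree[j],
# which is a different operation in general.
via_columns = A * inv_degree[None, :]

print(np.allclose(via_matmul, via_broadcast))
print(np.allclose(via_matmul, via_columns))
```

Whichever form is used, the row sums of the scaled matrix are 1 for nodes with out-edges and 0 for dangling nodes, which is a quick sanity check to run after the normalization step.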

Collaborator Author

But the pagerank example is run with either the Finch or the Numba backend. When we run it with the Finch backend (and I think it should be runnable with Finch), we can't execute a single line with Numba.

Collaborator Author

I would prefer to keep these examples Finch and Numba ready, and update implementation once both backends support something new.

Collaborator

@hameerabbasi, May 10, 2024

> I would prefer to keep these examples Finch and Numba ready, and update implementation once both backends support something new.

According to the deliverable, we just need a script that will demonstrate a speedup against the "old" code. If you would prefer, we can always keep this PR open until xp.diagonal is supported by Finch, but I wouldn't densify.
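
Since the deliverable is a script that shows a speedup, here is one possible way to factor out the timing that the example does inline (a compile run followed by `ITERS` timed runs). This is a sketch only: `average_runtime` is a hypothetical helper, not part of `sparse` or of this PR, and it uses `time.perf_counter` from the standard library rather than `time.time`.

```python
import time


def average_runtime(fn, *, repeats=3, warmup=1):
    """Return the mean wall-clock seconds per call of fn().

    Hypothetical helper for illustration. The warm-up calls run first and
    are excluded from the measurement, e.g. to keep one-off compilation
    cost out of the reported average.
    """
    for _ in range(warmup):
        fn()
    start = time.perf_counter()
    for _ in range(repeats):
        fn()
    return (time.perf_counter() - start) / repeats


if __name__ == "__main__":
    # Stand-in workload; in the example script this would be something
    # like `lambda: pagerank(G)` and `lambda: _pagerank_scipy(G)`.
    elapsed = average_runtime(lambda: sum(range(100_000)))
    print(f"workload took {elapsed} s per call")
```

Passing both implementations through the same helper keeps the warm-up and repeat counts identical, so the two averages are measured the same way.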


    # initial vector
    x = sparse.full((1, N), fill_value=1.0 / N)

    # personalization vector
    p = sparse.full((1, N), fill_value=1.0 / N)

    # Dangling nodes
    dangling_weights = p

    # power iteration: make up to max_iter iterations
    for _ in range(max_iter):
        xlast = x
        x_dangling = sparse.where(S[None, :] == sparse.asarray(0.0), x, sparse.asarray(0.0))
        x = (
            alpha * (x @ A + sparse.asarray(sparse.sum(x_dangling)) * dangling_weights)
            + (sparse.asarray(1) - alpha) * p
        )
        # check convergence, l1 norm
        err = sparse.sum(sparse.abs(x - xlast))
        if err < N * tol:
            return dict(zip(nodelist, map(float, x[0, :]), strict=False))

    raise nx.PowerIterationFailedConvergence(max_iter)


if __name__ == "__main__":
    G = nx.DiGraph(nx.path_graph(4))
    ITERS = 3

    os.environ[sparse._ENV_VAR_NAME] = "Finch"
    importlib.reload(sparse)

    # compile
    pagerank(G)
    print("compiled")

    # finch
    start = time.time()
    for i in range(ITERS):
        print(f"finch iter: {i}")
        pr = pagerank(G)
    elapsed = time.time() - start
    print(f"Finch took {elapsed / ITERS} s.")

    # scipy
    start = time.time()
    for i in range(ITERS):
        print(f"scipy iter: {i}")
        scipy_pr = _pagerank_scipy(G)
    elapsed = time.time() - start
    print(f"SciPy took {elapsed / ITERS} s.")

    np.testing.assert_almost_equal(list(pr.values()), list(scipy_pr.values()))
    print(f"finch: {pr}")
    print(f"scipy: {scipy_pr}")
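
For readers who want to follow the update rule in `pagerank` without either backend installed, here is a dense NumPy sketch of the same power iteration. It is a reference under stated assumptions, not the PR's implementation: `pagerank_dense` is a hypothetical name, the input is a dense adjacency matrix with `A[i, j]` as the weight of edge i -> j, and the personalization and dangling vectors are both uniform, as in the example above.

```python
import numpy as np


def pagerank_dense(A, alpha=0.85, max_iter=100, tol=1e-6):
    """Dense reference power iteration (hypothetical helper).

    A[i, j] is the weight of edge i -> j. Returns a 1-D array of ranks.
    """
    n = A.shape[0]
    out_degree = A.sum(axis=1)
    is_dangling = out_degree == 0
    # Invert non-zero out-degrees only; dangling rows keep a factor of 0.
    inv_degree = np.divide(1.0, out_degree, out=np.zeros(n), where=~is_dangling)
    P = A * inv_degree[:, None]  # rows sum to 1, except dangling rows (all 0)

    x = np.full(n, 1.0 / n)  # initial vector
    p = np.full(n, 1.0 / n)  # uniform personalization / dangling weights

    for _ in range(max_iter):
        x_last = x
        # Rank currently held by dangling nodes is redistributed according to p.
        dangling_mass = x[is_dangling].sum()
        x = alpha * (x @ P + dangling_mass * p) + (1.0 - alpha) * p
        # Same stopping rule as the example: L1 change below n * tol.
        if np.abs(x - x_last).sum() < n * tol:
            return x
    raise RuntimeError(f"power iteration did not converge in {max_iter} iterations")


if __name__ == "__main__":
    # Bidirectional path graph on 4 nodes, the same shape of graph as
    # nx.DiGraph(nx.path_graph(4)) in the example script.
    path = np.array([
        [0, 1, 0, 0],
        [1, 0, 1, 0],
        [0, 1, 0, 1],
        [0, 0, 1, 0],
    ], dtype=float)
    print(pagerank_dense(path))
```

Each iteration keeps the ranks summing to 1, and on the symmetric path graph the two end nodes receive equal rank, as do the two middle nodes. Both properties are cheap checks to run against any backend's output.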