
Better mask_test_edges function #55

Open
stefanosantaris wants to merge 1 commit into master

Conversation

stefanosantaris commented

@tkipf This is my fix for a slightly more performant mask_test_edges function. I managed to reproduce the results of the paper, and it works much better even with large graphs.
The only time-consuming parts of this function are the assertions at the end.
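For reference, a minimal sketch of the kind of check these assertions perform (illustrative names, not the exact code from this PR): verifying that no sampled negative edge collides with a true edge. Hashing the true edges into a set keeps each membership test O(1) on average:

import numpy as np
import scipy.sparse as sp

def assert_no_overlap(false_edges, adj):
    # false_edges: iterable of (i, j) pairs; adj: sparse adjacency matrix.
    # Hash all true edges once, so each lookup below is O(1) on average
    # instead of a scan over the full edge list per sampled pair.
    true_edges = set(zip(*adj.nonzero()))
    for i, j in false_edges:
        # The graph is undirected, so check both orientations.
        assert (i, j) not in true_edges and (j, i) not in true_edges

adj = sp.csr_matrix(np.array([[0, 1], [1, 0]]))
assert_no_overlap([(0, 0), (1, 1)], adj)  # passes: these pairs are non-edges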

tkipf referenced this pull request Jan 3, 2020
tkipf (Owner) commented Jan 3, 2020

Thanks -- I'll leave this open for now in case someone is interested in a (working) example for a more efficient implementation. I rolled the master branch back to the original version before @philipjackson's PR to keep it in line with the paper.

GuillaumeSalhaGalvan commented Jan 7, 2020

Dear all,

Contrary to previous comments (here + #54), I was able to reproduce all results from @tkipf's original paper using @philipjackson's implementation (see #25) of the mask_test_edges function.

I suspect that the previous issues simply come from different train/validation/test splits. Indeed, @philipjackson set the default parameter values to test_percent=30. and val_percent=20., whereas @tkipf used test_percent=10. and val_percent=5. in his experiments. As a result, @philipjackson's version masks more edges from the training graph than the original experiments did, which leads to lower performance. With the corrected parameters, I obtain results consistent with the paper.
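Concretely, assuming the keyword names from #25 and that it keeps the return values of the original function (adj being the scipy.sparse adjacency matrix of your graph), the corrected call would look like:

from gae.preprocessing import mask_test_edges  # module path as in tkipf/gae

# Splits matching the paper: 10% test edges, 5% validation edges,
# instead of the 30%/20% defaults introduced in #25.
adj_train, train_edges, val_edges, val_edges_false, \
    test_edges, test_edges_false = mask_test_edges(
        adj, test_percent=10., val_percent=5.)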

Moreover, on the PubMed dataset, i.e. the largest one, @stefanosantaris's implementation runs in 3+ minutes on my laptop. @philipjackson's implementation runs in 0.03 seconds, and in a few seconds for a graph with 1 million nodes (I removed all assert lines from both functions).
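If you want to sanity-check such timings yourself, a rough sketch (continuing from the call above; exact numbers are of course machine-dependent):

import time

start = time.time()
mask_test_edges(adj, test_percent=10., val_percent=5.)
print("mask_test_edges took %.3f s" % (time.time() - start))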

As a consequence, I would recommend using #25 with updated default parameters. :)

haorannlp commented


Hi @GuillaumeSalha! Can you reproduce the results in the paper with test_percent=10. and val_percent=5. using the original implementation or the updated one? I still cannot reproduce them with the original. Sad...

GuillaumeSalhaGalvan commented Jan 9, 2020

Hi @haorannlp!
After changing test_percent=10. and val_percent=5. in preprocessing.py, you need to re-run:

cd .. 
python setup.py install

Then, indeed, you should be able to reproduce results from the paper.

philipjackson (Contributor) commented

Hi everyone,

I think what happened here is that I wrote this code along with @sbonner0 for use in a paper of our own, in which we used different-sized val and test splits, and only submitted it as a pull request here as an afterthought. That's why my default val_percent and test_percent don't match @tkipf's originals; I didn't think to revert them when I made the pull request. Apologies for the inconvenience caused, and thanks to @GuillaumeSalha for spotting the issue!

haorannlp commented

Thank you buddy, @GuillaumeSalha! I can reproduce the results now.
