Node Classification on the Cora Dataset.
image source: https://arxiv.org/abs/1611.08402
The Cora dataset consists of a single citation network of 2708 publications, each classified into one of seven classes, connected by 5429 links.
It was originally prepared by McCallum et al. 2000.
Here we use the Planetoid version from Revisiting Semi-Supervised Learning with Graph Embeddings, Yang et al. 2016.
The model is a GCN (Graph Convolutional Network), first introduced in Semi-Supervised Classification with Graph Convolutional Networks, Kipf & Welling 2017.
(image source: https://tkipf.github.io/graph-convolutional-networks/)
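As a rough illustration of what a single GCN layer computes, the sketch below implements the propagation rule from Kipf & Welling, ReLU(D^-1/2 (A + I) D^-1/2 H W), in plain Python on a toy graph. The function names and the three-node graph are assumptions made for this example; this is not the code used for the results reported here, and the final layer of a real classifier would apply a softmax instead of a ReLU.

```python
import math


def normalized_adjacency(num_nodes, edges):
    """Return D^-1/2 (A + I) D^-1/2 as a dense list-of-lists matrix."""
    # Start from the identity so that every node has a self-loop.
    a = [[1.0 if i == j else 0.0 for j in range(num_nodes)]
         for i in range(num_nodes)]
    for u, v in edges:  # undirected edges: fill both directions
        a[u][v] = 1.0
        a[v][u] = 1.0
    deg = [sum(row) for row in a]  # node degrees, self-loop included
    return [[a[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(num_nodes)]
            for i in range(num_nodes)]


def matmul(x, y):
    """Plain dense matrix product of two list-of-lists matrices."""
    return [[sum(x[i][k] * y[k][j] for k in range(len(y)))
             for j in range(len(y[0]))] for i in range(len(x))]


def gcn_layer(a_hat, h, w):
    """One propagation step: ReLU(a_hat @ h @ w)."""
    z = matmul(a_hat, matmul(h, w))
    return [[max(0.0, v) for v in row] for row in z]


# Toy path graph 0 - 1 - 2 (hypothetical data, not Cora).
a_hat = normalized_adjacency(3, [(0, 1), (1, 2)])
h = [[1.0], [1.0], [1.0]]  # one scalar feature per node
w = [[1.0]]                # a 1x1 weight matrix
out = gcn_layer(a_hat, h, w)
```

Each output row is a degree-weighted average of the node's own features and those of its neighbours, followed by the linear map and the non-linearity.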
We compare models by classification accuracy. The search grid used to find the optimal hyperparameters can be found here.
Our model achieves an accuracy of 81.9 ± 0.6% (using the best checkpoint).
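The text does not say how the ± figure is obtained. A common convention, assumed here, is to report the mean and sample standard deviation of test accuracy over several runs with different random seeds. The accuracies in this sketch are hypothetical placeholders, not the runs behind the number above.

```python
import statistics


def summarize(accuracies):
    """Format accuracies (fractions in [0, 1]) as 'mean ± std' in percent."""
    mean = statistics.mean(accuracies)
    std = statistics.stdev(accuracies)  # sample standard deviation (n - 1)
    return f"{100 * mean:.1f} ± {100 * std:.1f}%"


# Hypothetical per-seed test accuracies, for illustration only.
run_accuracies = [0.80, 0.82, 0.81]
summary = summarize(run_accuracies)
```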
Performance using a GCN was reported in Semi-Supervised Classification with Graph Convolutional Networks, Kipf & Welling 2017, where the authors report a classification accuracy of 81.5%.
Their GCN configuration was found by training models as follows:
- 200 epochs (training iterations)
- early stopping: window size 10 (stop training if val loss does not decrease for 10 consecutive epochs)
- optimizer: Adam
- weights are initialized as described in Glorot & Bengio (2010) and input feature vectors are (row-)normalized accordingly.
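The early-stopping rule and the row normalization listed above can be sketched in a few lines of plain Python. This follows the rule as worded here (stop after 10 consecutive epochs without a lower validation loss); reference implementations may differ in detail. The names `EarlyStopping`, `epochs_run`, and `row_normalize` are made up for this example, and treating all-zero rows as unchanged is an assumption.

```python
class EarlyStopping:
    """Signal a stop after `patience` consecutive epochs without improvement."""

    def __init__(self, patience=10):
        self.patience = patience
        self.best_loss = float("inf")
        self.epochs_without_improvement = 0

    def step(self, val_loss):
        """Record one epoch's validation loss; return True if training should stop."""
        if val_loss < self.best_loss:
            self.best_loss = val_loss
            self.epochs_without_improvement = 0
        else:
            self.epochs_without_improvement += 1
        return self.epochs_without_improvement >= self.patience


def epochs_run(val_losses, max_epochs=200, patience=10):
    """Return how many epochs would run for a given validation-loss sequence."""
    stopper = EarlyStopping(patience)
    for epoch, loss in enumerate(val_losses[:max_epochs], start=1):
        if stopper.step(loss):
            return epoch
    return min(len(val_losses), max_epochs)


def row_normalize(features):
    """Scale each feature row to sum to 1; all-zero rows are left unchanged."""
    normalized = []
    for row in features:
        total = sum(row)
        normalized.append([x / total for x in row] if total != 0 else list(row))
    return normalized
```

In a real training loop, `step` would be called once per epoch with the freshly computed validation loss, and the loop would break as soon as it returns True.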
Their model:
- num layer: 2
- hidden_channel: 16
- dropout: 0.5
- learning rate: 0.01
- L2 regularization: 5 · 10⁻⁴ (first GCN layer)
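For reference, the settings above can be collected in a small configuration object, which also makes it easy to work out the size of the resulting model. This is a sketch: the class and field names are hypothetical, the input dimension of 1433 is Cora's bag-of-words feature size (not stated above), and whether bias terms are used depends on the implementation, so both counts are shown.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GCNConfig:
    num_layers: int = 2
    hidden_channels: int = 16
    dropout: float = 0.5
    learning_rate: float = 0.01
    weight_decay: float = 5e-4  # L2 regularization, first GCN layer only
    max_epochs: int = 200
    early_stopping_patience: int = 10


def layer_shapes(config, in_channels, out_channels):
    """Return the (fan_in, fan_out) shape of each layer's weight matrix."""
    dims = ([in_channels]
            + [config.hidden_channels] * (config.num_layers - 1)
            + [out_channels])
    return list(zip(dims[:-1], dims[1:]))


def count_parameters(config, in_channels, out_channels, bias=False):
    """Count trainable weights, optionally including one bias per output unit."""
    return sum(fan_in * fan_out + (fan_out if bias else 0)
               for fan_in, fan_out in layer_shapes(config, in_channels, out_channels))


config = GCNConfig()
shapes = layer_shapes(config, in_channels=1433, out_channels=7)
n_weights = count_parameters(config, 1433, 7)                # 1433*16 + 16*7
n_with_bias = count_parameters(config, 1433, 7, bias=True)   # plus 16 + 7 biases
```

With these settings the model has on the order of 23,000 parameters, almost all of them in the first layer, which is the layer the L2 penalty is applied to.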