Add support for dynamic graphs and code for ROLAND #23
base: dynamic
@@ -1,5 +1,8 @@
**/data_dir/
run/datasets/data/
run/results/
run/runs_*/
**/__pycache__/
**/.ipynb_checkpoints
.idea/
.vscode/settings.json
@@ -0,0 +1,106 @@
# ROLAND: Graph Neural Networks for Dynamic Graphs
This repository contains code associated with the ROLAND project and more.
You can first walk through the *how-to* sections to run experiments on existing
public datasets.
After understanding how to run and analyze experiments, you can read through the *development topics* to extend the framework with your own datasets.

<!-- ## TODO: add figures to illustrate the ROLAND framework. -->

## How to Download Datasets
Most of the datasets used in our paper can be found at `https://snap.stanford.edu/data/index.html`.

```bash
# Create a directory for datasets (or use your own dataset directory).
mkdir ./all_datasets/
cd ./all_datasets
wget 'https://snap.stanford.edu/data/soc-sign-bitcoinotc.csv.gz'
wget 'https://snap.stanford.edu/data/soc-sign-bitcoinalpha.csv.gz'
wget 'https://snap.stanford.edu/data/as-733.tar.gz'
wget 'https://snap.stanford.edu/data/CollegeMsg.txt.gz'
wget 'https://snap.stanford.edu/data/soc-redditHyperlinks-body.tsv'
wget 'https://snap.stanford.edu/data/soc-redditHyperlinks-title.tsv'
wget 'http://snap.stanford.edu/data/web-redditEmbeddings-subreddits.csv'

# Unzip files.
gunzip CollegeMsg.txt.gz
gunzip soc-sign-bitcoinalpha.csv.gz
gunzip soc-sign-bitcoinotc.csv.gz
tar xf ./as-733.tar.gz

# Rename files; this step is required by our loader.
# You can leave the web-redditEmbeddings-subreddits.csv file unchanged.
mv ./soc-sign-bitcoinotc.csv ./bitcoinotc.csv
mv ./soc-sign-bitcoinalpha.csv ./bitcoinalpha.csv

mv ./soc-redditHyperlinks-body.tsv ./reddit-body.tsv
mv ./soc-redditHyperlinks-title.tsv ./reddit-title.tsv
```
Checking with `ls | wc -l` should report 740 files, including the zipped `as-733.tar.gz`.
The total disk space required is approximately 950 MiB.
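As a quick sanity check on the downloaded edge lists, the files can be inspected with pandas. The SNAP Bitcoin files are headerless CSVs whose rows are `SOURCE, TARGET, RATING, TIME`; the synthetic rows below stand in for the real file, so this is a sketch rather than a verbatim load of the dataset:

```python
import io
import pandas as pd

# A few synthetic rows in the SOURCE, TARGET, RATING, TIME format of
# soc-sign-bitcoinotc.csv; swap the StringIO for the real file path.
sample = io.StringIO("1,2,4,1289241911\n2,3,-2,1289243210\n1,3,1,1289250000\n")
edges = pd.read_csv(sample, names=["source", "target", "rating", "time"])

print(len(edges))             # number of edges loaded
print(edges["rating"].min())  # ratings on this dataset are signed
```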
## How to Run Single Experiments from Our Paper
**WARNING**: for each `yaml` file in `./run/configs/ROLAND`, you need to update the `dataset.dir` field to the correct path of the datasets downloaded above.

The ROLAND project focuses on link prediction for homogeneous dynamic graphs.
Here we demonstrate example runs using the configuration files provided in `./run/configs/ROLAND`.

To run the link-prediction task on the `CollegeMsg.txt` dataset with default settings:
```bash
cd ./run
python3 main_dynamic.py --cfg configs/ROLAND/roland_gru_ucimsg.yaml --repeat 1
```
For other datasets:
```bash
python3 main_dynamic.py --cfg configs/ROLAND/roland_gru_btcalpha.yaml --repeat 1

python3 main_dynamic.py --cfg configs/ROLAND/roland_gru_btcotc.yaml --repeat 1

python3 main_dynamic.py --cfg configs/ROLAND/roland_gru_ucimsg.yaml --repeat 1

python3 main_dynamic.py --cfg configs/ROLAND/roland_gru_reddittitle.yaml --repeat 1

python3 main_dynamic.py --cfg configs/ROLAND/roland_gru_redditbody.yaml --repeat 1
```
The `--repeat` argument controls the number of random seeds used for each experiment. For example, setting `--repeat 3` runs each experiment three times with three different random seeds.

To explore training results:
```bash
cd ./run
tensorboard --logdir=./runs_live_update --port=6006
```
**WARNING**: the x-axis of plots in TensorBoard is **not** epochs; it shows snapshot IDs instead (e.g., the $i^{th}$ day or the $i^{th}$ week).
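Each snapshot ID corresponds to one time window of the dynamic graph. As a minimal sketch (a hypothetical helper, not the repo's loader), timestamped edges can be bucketed into daily snapshots like this:

```python
from collections import defaultdict

def split_into_snapshots(edges, seconds_per_snapshot=86_400):
    """Bucket (src, dst, timestamp) edges into consecutive snapshots.

    Snapshot 0 starts at the earliest timestamp; with the default
    granularity each snapshot covers one day, mirroring a daily
    snapshot frequency.
    """
    t0 = min(t for _, _, t in edges)
    snapshots = defaultdict(list)
    for src, dst, t in edges:
        snapshots[(t - t0) // seconds_per_snapshot].append((src, dst))
    return dict(snapshots)

edges = [(1, 2, 0), (2, 3, 3600), (1, 3, 90_000)]  # third edge falls on day 1
print(sorted(split_into_snapshots(edges)))  # → [0, 1]
```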

<!-- ## Examples on Heterogeneous Graph Snapshots
```bash
Under development.
``` -->
## How to Run Grid Search / Batch Experiments
To run grid search / batch experiments, one needs a `main.py` file, a `base_config.yaml` file, and a `grid.txt` file. The main and config files are the same as in the single-experiment setup above.
Suppose one wants to do link prediction on the `CollegeMsg.txt` dataset with configurations from `configs/ROLAND/roland_gru_ucimsg.yaml`, and in addition wants to try out (1) *different numbers of GNN message-passing layers* and (2) *different learning rates*.
In this case, one can use the following grid file:
```text
# grid.txt, lines starting with # are comments.
gnn.layers_mp mp [2,3,4,5]
optim.base_lr lr [0.003,0.01,0.03]
```
**WARNING**: the format of each line is crucial: `NAME_IN_YAML<space>SHORT_ALIAS<space>LIST_OF_VALUES`, and there should **not** be any space inside the list of values.

The `grid.txt` above generates $4\times 3=12$ different configurations by setting `gnn.layers_mp` and `optim.base_lr` to the respective values in the base config file `roland_gru_ucimsg.yaml`.
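The Cartesian-product expansion can be sketched directly. The parsing below assumes only the `NAME ALIAS LIST` line format described in the warning and is an illustration, not the repo's grid parser:

```python
import ast
import itertools

grid_txt = """\
# grid.txt, lines starting with # are comments.
gnn.layers_mp mp [2,3,4,5]
optim.base_lr lr [0.003,0.01,0.03]
"""

# Each non-comment line is NAME<space>ALIAS<space>LIST (no spaces in the list).
options = []
for line in grid_txt.splitlines():
    if not line.strip() or line.startswith("#"):
        continue
    name, alias, values = line.split(" ")
    options.append([(name, v) for v in ast.literal_eval(values)])

# One configuration per combination of values: 4 x 3 = 12.
configs = [dict(combo) for combo in itertools.product(*options)]
print(len(configs))  # → 12
print(configs[0])    # → {'gnn.layers_mp': 2, 'optim.base_lr': 0.003}
```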

Please see `./run/grids/ROLAND/example_grid.txt` for a complete example of a grid-search text file.

To run the experiment using `example_grid.txt`:
```bash
bash ./run_roland_batch.sh
```
## How to Export Tensorboard Results to CSV
We provide a simple script, `tabulate_events.py`, to aggregate results from a batch of tensorboard files; please feel free to look into it and modify it.
```bash
# Usage: python3 ./tabulate_events.py <tensorboard_logdir> <output_file_name>
python3 ./tabulate_events.py ./live_update ./out.csv
```
## Development Topic: Use Your Own Dataset
We provide two examples of constructing your own datasets; please refer to
(1) `./graphgym/contrib/loader/roland_template.py` and (2) `./graphgym/contrib/loader/roland_template_hetero.py` for examples of building loaders.
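As a minimal, hypothetical illustration of what such a loader does (not the template code itself), a TSV of timestamped edges can be grouped into weekly snapshot edge lists with pandas:

```python
import io
import pandas as pd

# Hypothetical minimal loader step: read a TSV of (src, dst, ts) edges
# and group them into weekly snapshots, one edge list per week.
tsv = io.StringIO(
    "src\tdst\tts\n"
    "a\tb\t2014-01-01\n"
    "b\tc\t2014-01-03\n"
    "a\tc\t2014-01-10\n"
)
df = pd.read_csv(tsv, sep="\t", parse_dates=["ts"])
snapshots = {
    str(week): list(zip(group["src"], group["dst"]))
    for week, group in df.groupby(df["ts"].dt.to_period("W"))
}
print(len(snapshots))  # → 2 (the first two edges share a week)
```

A real loader would additionally attach node/edge features and register itself with GraphGym; see the two template files above for the actual interface.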
@@ -0,0 +1,196 @@
from yacs.config import CfgNode as CN

from graphgym.register import register_config


def set_cfg_roland(cfg):
    """
    This function sets the default config values for customized options.
    :return: customized configuration used by the experiment.
    """

    # ----------------------------------------------------------------------- #
    # Customized options
    # ----------------------------------------------------------------------- #

    # Used to identify experiments; tensorboard logs will be saved to this path.
    # Options: any string.
    cfg.remark = ''

    # ----------------------------------------------------------------------- #
    # Additional GNN options.
    # ----------------------------------------------------------------------- #
    # Method to update node embeddings from old node embeddings and new node
    # features.
    # Options: {'moving_average', 'mlp', 'gru'}
    cfg.gnn.embed_update_method = 'moving_average'

    # How many layers to use in the MLP updater.
    # Options: integers >= 1.
    # NOTE: there is a known issue when set to 1; use >= 2 for now.
    # Only effective when cfg.gnn.embed_update_method == 'mlp'.
    cfg.gnn.mlp_update_layers = 2

    # What kind of skip-connection to use.
    # Options: {'none', 'identity', 'affine'}.
    cfg.gnn.skip_connection = 'none'

    # The batch size for making link predictions; useful when the number of
    # negative edges is huge. Use a smaller number depending on GPU memory size.
    cfg.gnn.link_pred_batch_size = 500000
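The `embed_update_method` options above control how an old node embedding is combined with a freshly computed one. As a hedged NumPy illustration of the `'moving_average'` variant (not the repo's implementation; the per-node keep ratio here is a hypothetical input):

```python
import numpy as np

def moving_average_update(h_old, h_new, keep_ratio):
    """Blend the previous node state with the freshly computed one.

    keep_ratio holds one value in [0, 1] per node: 1.0 keeps the old
    embedding, 0.0 replaces it entirely. A sketch, not the repo's code.
    """
    keep = keep_ratio[:, None]  # broadcast over the feature dimension
    return keep * h_old + (1.0 - keep) * h_new

h_old = np.ones((3, 4))
h_new = np.zeros((3, 4))
keep = np.array([1.0, 0.5, 0.0])
out = moving_average_update(h_old, h_new, keep)
print(out[:, 0].tolist())  # → [1.0, 0.5, 0.0]
```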
    # ----------------------------------------------------------------------- #
    # Meta-Learning options.
    # ----------------------------------------------------------------------- #
    cfg.meta = CN()
    # Whether to do meta-learning via initialization moving average.
    # Options: {True, False}
    cfg.meta.is_meta = False

    # Weight used in the moving average for model parameters.
    # After fine-tuning the model in period t and getting model M[t],
    # set W_init = (1 - alpha) * W_init + alpha * M[t].
    # For the next period, use W_init as the initialization for fine-tuning.
    # Set cfg.meta.alpha = 1.0 to recover the original algorithm.
    # Options: float between 0.0 and 1.0.
    cfg.meta.alpha = 0.9
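The initialization update described in the comment above can be illustrated numerically. The scalar weights below are toy stand-ins for real model parameters:

```python
# Illustration of the meta-initialization moving average:
# W_init = (1 - alpha) * W_init + alpha * M[t].
alpha = 0.9
w_init = 0.0
fine_tuned = [1.0, 1.0, 1.0]  # toy scalar "models" M[t] from three periods

for m_t in fine_tuned:
    w_init = (1 - alpha) * w_init + alpha * m_t

# With alpha = 0.9 the initialization converges quickly toward the
# recent fine-tuned weights; alpha = 1.0 would just copy M[t].
print(round(w_init, 3))  # → 0.999
```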
    # ----------------------------------------------------------------------- #
    # Additional training options.
    # ----------------------------------------------------------------------- #
    # How many snapshots for truncated back-propagation through time.
    # Set to a very large integer to use full back-prop through time.
    # Options: integers >= 1.
    cfg.train.tbptt_freq = 10

    # Early-stopping tolerance in live-update.
    # Options: integers >= 1.
    cfg.train.internal_validation_tolerance = 5

    # Computing MRR is slow in the baseline setting.
    # Only start to compute MRR on the test set after a certain time.
    # Options: integers >= 0.
    cfg.train.start_compute_mrr = 0

    # ----------------------------------------------------------------------- #
    # Additional dataset options.
    # ----------------------------------------------------------------------- #

    # How to handle node features in the AS-733 dataset.
    # Options: {'one', 'one_hot_id', 'one_hot_degree_global'}
    cfg.dataset.AS_node_feature = 'one'

    # Method used to sample negative edges for edge_label_index.
    # Options:
    # 'uniform': all non-existing edges have the same probability of being
    #     sampled as negative edges.
    # 'src': non-existing edges from high-degree nodes are more likely to be
    #     sampled as negative edges.
    # 'dest': non-existing edges pointing to high-degree nodes are more likely
    #     to be sampled as negative edges.
    cfg.dataset.negative_sample_weight = 'uniform'
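As a hedged sketch of how the `'uniform'` and `'src'` schemes could assign per-node sampling weights (`negative_source_weights` is a hypothetical helper, not the repo's sampler):

```python
from collections import Counter

def negative_source_weights(edges, scheme):
    """Per-node weights for sampling negative-edge source endpoints.

    'uniform' weighs every node equally; 'src' weighs nodes by their
    out-degree, so high-degree sources appear more often as negatives.
    """
    out_deg = Counter(src for src, _ in edges)
    nodes = sorted({n for e in edges for n in e})
    if scheme == "uniform":
        return {n: 1.0 for n in nodes}
    if scheme == "src":
        return {n: float(out_deg[n]) for n in nodes}
    raise ValueError(scheme)

edges = [(1, 2), (1, 3), (2, 3)]
print(negative_source_weights(edges, "src"))  # → {1: 2.0, 2: 1.0, 3: 0.0}
```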
    # Whether to load the dataset as heterogeneous graphs.
    # Options: {True, False}.
    cfg.dataset.is_hetero = False

    # Whether to look for and load a cached graph. By default (load_cache=False)
    # the loader loads the raw tsv file from disk and constructs the graph
    # from scratch.
    cfg.dataset.load_cache = False

    cfg.dataset.premade_datasets = 'fresh'

    cfg.dataset.include_node_features = False

    # Options: 'chronological_temporal' or 'default'.
    # 'chronological_temporal': only for temporal graphs; for example,
    # the first 80% of snapshots are for training, the subsequent 10% of
    # snapshots are for validation, and the last 10% are for testing.
    cfg.dataset.split_method = 'default'

    # In the case of live-update, whether to predict all edges at time t+1.
    cfg.dataset.link_pred_all_edges = False
    # ----------------------------------------------------------------------- #
    # Customized options: `transaction` for ROLAND dynamic graphs.
    # ----------------------------------------------------------------------- #

    # Example argument group.
    cfg.transaction = CN()

    # Whether to use snapshots.
    cfg.transaction.snapshot = False

    # Snapshot split method 1: number of snapshots.
    # Split the dataset into a fixed number of snapshots.
    cfg.transaction.snapshot_num = 100

    # Snapshot split method 2: snapshot frequency.
    # E.g., one snapshot contains transactions within 1 day.
    cfg.transaction.snapshot_freq = 'D'

    cfg.transaction.check_snapshot = False

    # How to use transaction history: 'full' or 'rolling'.
    cfg.transaction.history = 'full'

    # Type of loss: 'supervised' or 'meta'.
    cfg.transaction.loss = 'meta'

    # Feature dimension for integer edge features.
    cfg.transaction.feature_int_dim = 32
    cfg.transaction.feature_edge_int_num = [50, 8, 252, 252, 3, 3]
    cfg.transaction.feature_node_int_num = [0]

    # Feature dimension for the amount (float) edge feature.
    cfg.transaction.feature_amount_dim = 64

    # Feature dimension for the time (float) edge feature.
    cfg.transaction.feature_time_dim = 64

    cfg.transaction.node_feature = 'raw'

    # How many days to look into the future.
    cfg.transaction.horizon = 1

    # Prediction mode for the task: 'before' or 'after'.
    cfg.transaction.pred_mode = 'before'

    # Number of periods to be captured.
    # Set to a list of integers to use pre-defined periodicity,
    # e.g., [1, 7, 28, 31, ...].
    cfg.transaction.time_enc_periods = [1]

    # If 'enc_before_diff': attention weight = diff(enc(t1), enc(t2)).
    # If 'diff_before_enc': attention weight = enc(t1 - t2).
    cfg.transaction.time_enc_mode = 'enc_before_diff'

    # How to compute the keep ratio while updating the recurrent GNN.
    # The keep ratio (for each node) is a function of its degree in [0, t)
    # and its degree in snapshot t.
    cfg.transaction.keep_ratio = 'linear'

    # ----------------------------------------------------------------------- #
    # Customized options: metrics.
    # ----------------------------------------------------------------------- #

    cfg.metric = CN()
    # How many negative edges for each node to compute rank-based evaluation
    # metrics such as MRR and recall at K.
    # E.g., if the multiplier = 1000 and a node has 3 positive edges, then we
    # compute the MRR using 1000 randomly generated negative edges
    # plus 3 existing positive edges.
    # Use 100 ~ 1000 for fast and reliable results.
    cfg.metric.mrr_num_negative_edges = 1000

Review comment: How can we set negative edges as all the edges?

Reply: We can set
    # How to compute MRR.
    # Available: f = 'min', 'max', 'mean'.
    # Step 1: get p* = f(scores of positive edges).
    # Step 2: compute the rank r of p* among all negative edges.
    # Step 3: RR = 1 / rank.
    # Step 4: average over all users.
    # Expected: MRR(min) <= MRR(mean) <= MRR(max).
    cfg.metric.mrr_method = 'max'
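The four steps above can be sketched for a single node; `reciprocal_rank` is a hypothetical helper (not the repo's evaluator), and step 4 would average this quantity over all nodes:

```python
def reciprocal_rank(pos_scores, neg_scores, method="max"):
    """Steps 1-3 from the comment above: pick p* = f(positive scores),
    rank it against the negative scores, and return 1 / rank."""
    f = {"min": min, "max": max, "mean": lambda s: sum(s) / len(s)}[method]
    p_star = f(pos_scores)
    rank = 1 + sum(1 for s in neg_scores if s > p_star)  # 1-based rank
    return 1.0 / rank

pos = [0.7, 0.2]
neg = [0.9, 0.5, 0.1]
print(reciprocal_rank(pos, neg, "max"))            # → 0.5 (only 0.9 beats 0.7)
print(round(reciprocal_rank(pos, neg, "min"), 3))  # → 0.333 (0.9 and 0.5 beat 0.2)
```

Since `'min'` ranks the weakest positive edge and `'max'` the strongest, MRR(min) <= MRR(mean) <= MRR(max), matching the comment above.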


register_config('roland', set_cfg_roland)
Review comment: You can start writing the example, without worrying about the heterogeneous layer yet.