Command-line Tools
Graphormer reuses fairseq's fairseq-train command-line tool for training. This section mainly documents the additional parameters introduced by Graphormer,
along with the fairseq-train parameters that Graphormer uses.
Model
--arch, type = enum, options: graphormer_base, graphormer_slim, graphormer_large
    predefined Graphormer architectures
--encoder-ffn-embed-dim, type = int
    encoder embedding dimension for FFN
--encoder-layers, type = int
    number of Graphormer encoder layers
--encoder-embed-dim, type = int
    encoder embedding dimension
--share-encoder-input-output-embed, type = bool
    if set, share encoder input and output embeddings
--encoder-learned-pos, type = bool
    if set, use learned positional embeddings in the encoder
--no-token-positional-embeddings, type = bool
    if set, disables positional embeddings (outside self attention)
--max-positions, type = int
    number of positional embeddings to learn
--activation-fn, type = enum, options: gelu, relu
    activation function to use
--encoder-normalize-before, type = bool
    if set, apply layernorm before each encoder block
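The options described as "if set" read as switches that take no value, while the others take an argument. A minimal sketch of that distinction, assuming this is how the options behave; the parser below is a hypothetical stand-in built with Python's argparse, not Graphormer's or fairseq's actual parser, and the numeric values are purely illustrative.

```python
import argparse

# Hypothetical stand-in parser: it mirrors only a few of the model options above.
parser = argparse.ArgumentParser(prog="fairseq-train")
parser.add_argument(
    "--arch",
    choices=["graphormer_base", "graphormer_slim", "graphormer_large"],
)
parser.add_argument("--encoder-layers", type=int)
parser.add_argument("--encoder-embed-dim", type=int)
parser.add_argument("--activation-fn", choices=["gelu", "relu"])
# Assumption: an "if set" option is a value-less switch that defaults to off.
parser.add_argument("--encoder-normalize-before", action="store_true")

args = parser.parse_args([
    "--arch", "graphormer_base",
    "--encoder-layers", "12",       # illustrative value
    "--encoder-embed-dim", "768",   # illustrative value
    "--activation-fn", "gelu",
    "--encoder-normalize-before",   # switch: present means enabled
])
print(args.arch, args.encoder_layers, args.encoder_normalize_before)
```

Omitting `--encoder-normalize-before` from the argument list would leave the corresponding attribute at `False` in this sketch.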
Training
--task, type = enum, options: graph_prediction, is2re
    the task for training
    graph_prediction: ordinary graph-level prediction task, predicting a single target for a graph
    is2re: the IS2RE task of the Open Catalyst Challenge
--criterion, type = enum, options: l1_loss, binary_logloss, multiclass_cross_entropy, mae_deltapos, l1_loss_with_flag, binary_logloss_with_flag, multiclass_cross_entropy_with_flag
    the criterion, or objective function, for training
    l1_loss: mean absolute error (MAE) for regression tasks
    binary_logloss: binary cross entropy for binary classification
    multiclass_cross_entropy: multi-class cross entropy for multi-class classification
    mae_deltapos: criterion for the IS2RE task of the Open Catalyst Challenge
    l1_loss_with_flag: l1_loss with FLAG
    binary_logloss_with_flag: binary_logloss with FLAG
    multiclass_cross_entropy_with_flag: multiclass_cross_entropy with FLAG
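The criterion names follow a regular pattern: each of the three graph-level criteria has a FLAG variant with the suffix `_with_flag`. A small sketch of that naming rule follows; the helper `pick_criterion` and the problem labels `regression`, `binary`, and `multiclass` are hypothetical names introduced here for illustration and are not part of Graphormer.

```python
# Criterion names come from the option list above; the helper is hypothetical.
BASE_CRITERIA = {
    "regression": "l1_loss",
    "binary": "binary_logloss",
    "multiclass": "multiclass_cross_entropy",
}


def pick_criterion(problem: str, use_flag: bool = False) -> str:
    """Return a --criterion value for a graph-level prediction problem."""
    name = BASE_CRITERIA[problem]
    # The FLAG variants differ from the base names only by this suffix.
    return f"{name}_with_flag" if use_flag else name


print(pick_criterion("regression"))
print(pick_criterion("binary", use_flag=True))
```

The `mae_deltapos` criterion is left out of the mapping because the list above gives no FLAG variant for it.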
--apply-graphormer-init, type = bool
    if set, use custom parameter initialization for Graphormer
--dropout, type = float
    dropout probability
--attention-dropout, type = float
    dropout probability for attention weights
--act-dropout, type = float
    dropout probability after activation in FFN
--seed, type = int
    random seed
--pretrained-model-name, type = enum, default = none, options: pcqm4mv1_graphormer_base, pcqm4mv2_graphormer_base
    name of the pre-trained model to load
--load-pretrained-model-output-layer, type = bool
    if set, the weights of the final fully connected layer in the pre-trained model are loaded
--optimizer, type = enum
    optimizers from fairseq
--lr, type = float
    learning rate
--lr-scheduler, type = enum
    learning rate scheduler from fairseq
--fp16, type = bool
    if set, use mixed precision training
--data-buffer-size, type = int, default = 10
    number of batches to preload
--batch-size, type = int
    number of examples in a batch
--max-epoch, type = int, default = 0
    force stop training at specified epoch
--save-dir, type = str, default = checkpoints
    path to save checkpoints
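To show how the training options combine on one command line, here is a hedged sketch that only assembles and prints a fairseq-train command; it does not run it. All values are illustrative rather than recommended settings, and a real invocation will likely need further arguments that this section does not cover.

```python
import shlex

# Options that take a value. All values are illustrative assumptions.
options = {
    "--task": "graph_prediction",
    "--criterion": "l1_loss",
    "--arch": "graphormer_base",
    "--optimizer": "adam",
    "--lr": "2e-4",
    "--batch-size": "64",
    "--max-epoch": "10",
    "--seed": "1",
    "--save-dir": "checkpoints",
}
# Switches described above as "if set"; assumed to take no value.
switches = ["--fp16", "--apply-graphormer-init"]

argv = ["fairseq-train"]
for flag, value in options.items():
    argv += [flag, value]
argv += switches

# shlex.join quotes each argument so the string is safe to paste into a shell.
command = shlex.join(argv)
print(command)
```

Building the command as a list first keeps each flag paired with its value and avoids quoting mistakes when a value contains spaces.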
Dataset
--dataset-name, type = str, default = pcqm4m
    name of the dataset
--dataset-source, type = str, default = ogb
    source of the graph dataset, one of: pyg, dgl, ogb
--num-classes, type = int, default = -1
    number of classes or regression targets
--num-atoms, type = int, default = 512 * 9
    number of atom types in the graph
--num-edges, type = int, default = 512 * 3
    number of edge types in the graph
--num-in-degree, type = int, default = 512
    number of in degree types in the graph
--num-out-degree, type = int, default = 512
    number of out degree types in the graph
--num-spatial, type = int, default = 512
    number of spatial types in the graph
--num-edge-dis, type = int, default = 128
    number of edge dis types in the graph
--multi-hop-max-dist, type = int, default = 5
    max number of edges considered in the edge encoding
--spatial-pos-max, type = int, default = 1024
    max distance of attention in graph
--edge-type, type = str, default = "multi_hop"
    edge type in the graph
--user-data-dir, type = str, default = ""
    path to the module of user-defined dataset
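Several dataset defaults are written as products, so it can help to see the resolved values. The sketch below mirrors a few of the documented defaults in a hypothetical argparse parser (a stand-in for illustration, not Graphormer's own parser) and prints what an empty command line would yield.

```python
import argparse

# Hypothetical stand-in parser mirroring some defaults documented above.
parser = argparse.ArgumentParser()
parser.add_argument("--dataset-name", type=str, default="pcqm4m")
parser.add_argument(
    "--dataset-source", type=str, default="ogb", choices=["pyg", "dgl", "ogb"]
)
parser.add_argument("--num-atoms", type=int, default=512 * 9)
parser.add_argument("--num-edges", type=int, default=512 * 3)
parser.add_argument("--multi-hop-max-dist", type=int, default=5)
parser.add_argument("--edge-type", type=str, default="multi_hop")

# Parsing an empty argument list returns the defaults only.
defaults = parser.parse_args([])
print(defaults.num_atoms)   # 512 * 9 = 4608
print(defaults.num_edges)   # 512 * 3 = 1536
print(defaults.dataset_source, defaults.edge_type)
```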