Method
This paper constructs a heterogeneous graph containing word nodes and document nodes. The architecture of the proposed PEGCN model is shown in Figure 1. The leftmost part shows the model input: the Token Embedding and the Position Embedding are summed to form the word vectors. The model then feeds these vectors to the GCN layer and the BERT layer in parallel. Finally, the outputs of the two layers are interpolated and passed to the softmax layer for classification. In the GCN part of Figure 1, two stacked GCN layers form the graph network. The bottom half of Figure 1 shows the pre-trained BERT layer. A minimal sketch of this two-branch design follows the figure.
Figure 1. PEGCN network.
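The sketch below illustrates the overall data flow described above: summed token and position embeddings, a two-layer GCN branch, a BERT branch, and an interpolated softmax output. It is a simplified, hedged reconstruction, not the authors' implementation: for a self-contained example it uses a per-document token graph `a_hat` rather than the paper's corpus-level heterogeneous word/document graph, and the class `PEGCNSketch`, the interpolation weight `lam`, the mean-pooling of GCN node states, and the `bert-base-uncased` checkpoint are all assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
from transformers import BertModel


class GCNLayer(nn.Module):
    """One graph convolution: H' = ReLU(A_hat H W), where a_hat is a
    normalized adjacency matrix (here, per-document, for simplicity)."""
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.linear = nn.Linear(in_dim, out_dim)

    def forward(self, a_hat, h):
        # a_hat: (B, L, L) batched adjacency; h: (B, L, in_dim)
        return F.relu(self.linear(torch.bmm(a_hat, h)))


class PEGCNSketch(nn.Module):
    """Two-branch classifier: a 2-layer GCN and a pre-trained BERT
    encoder, whose outputs are interpolated before the softmax."""
    def __init__(self, vocab_size, max_len, hid, num_classes, lam=0.5):
        super().__init__()
        # Token Embedding + Position Embedding form the input word vectors
        self.tok_emb = nn.Embedding(vocab_size, hid)
        self.pos_emb = nn.Embedding(max_len, hid)
        # Two stacked GCN layers, as in Figure 1
        self.gcn1 = GCNLayer(hid, hid)
        self.gcn2 = GCNLayer(hid, hid)
        self.gcn_head = nn.Linear(hid, num_classes)
        # Pre-trained BERT branch (checkpoint choice is an assumption)
        self.bert = BertModel.from_pretrained("bert-base-uncased")
        self.bert_head = nn.Linear(self.bert.config.hidden_size, num_classes)
        self.lam = lam  # interpolation weight between branches (assumed)

    def forward(self, input_ids, attention_mask, a_hat):
        # Sum token and position embeddings to get the word vectors
        pos = torch.arange(input_ids.size(1), device=input_ids.device)
        x = self.tok_emb(input_ids) + self.pos_emb(pos)  # (B, L, hid)

        # GCN branch: two graph convolutions, then mean-pool node states
        g = self.gcn2(a_hat, self.gcn1(a_hat, x)).mean(dim=1)
        gcn_logits = self.gcn_head(g)

        # BERT branch: encode the sequence, classify from the [CLS] state
        cls = self.bert(input_ids=input_ids,
                        attention_mask=attention_mask).last_hidden_state[:, 0]
        bert_logits = self.bert_head(cls)

        # Interpolate the two outputs, then softmax for classification
        logits = self.lam * gcn_logits + (1.0 - self.lam) * bert_logits
        return F.log_softmax(logits, dim=-1)
```

Under these assumptions, setting `lam` closer to 1 weights the prediction toward the graph branch and closer to 0 toward the BERT branch; the paper itself only states that the two outputs are interpolated.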