The first of those layers receives the set of neighbor nodes for the given node and processes them to generate an output for the given node. Each subsequent layer receives the processing results generated by the previous layer for the set of neighbor nodes and processes them to generate a further output for the given node. The output of the last layer designed for the graph convolution operation may be the word representation 528 for the given node. By stacking more than one layer (for example, K layers), the word representation 528 may contain information from the K-hop neighbors of the given node.
Each of the layers may apply an aggregation function to its input, and such a layer may sometimes be referred to as a graph aggregation layer. In some embodiments, the processing for two or more given nodes in the sentence graph 512 may be performed in parallel by the graph aggregation layers. That is, multiple identical graph aggregation layers may be configured for the respective nodes so that their graph convolution operations are performed in parallel.
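The stacking of graph aggregation layers described above may be illustrated by the following minimal sketch. It is not the implementation of the graph convolution module 526. The mean aggregation function, the inclusion of the node's own vector in the aggregation, and the names `mean_aggregate`, `graph_aggregation_layer`, and `stack_layers` are all assumptions made for illustration only.

```python
from typing import Dict, List

Vector = List[float]


def mean_aggregate(vectors: List[Vector]) -> Vector:
    """Element-wise mean of equally sized vectors (assumed aggregation function)."""
    count = len(vectors)
    return [sum(column) / count for column in zip(*vectors)]


def graph_aggregation_layer(
    features: Dict[int, Vector], neighbors: Dict[int, List[int]]
) -> Dict[int, Vector]:
    """Apply one aggregation layer to every node of the graph.

    Each node's output depends only on the previous layer's outputs, so the
    per-node computations are independent and could run in parallel.
    """
    outputs: Dict[int, Vector] = {}
    for node, own_vector in features.items():
        # Assumption: the node's own vector is aggregated with its neighbors'.
        inputs = [own_vector] + [features[n] for n in neighbors[node]]
        outputs[node] = mean_aggregate(inputs)
    return outputs


def stack_layers(
    features: Dict[int, Vector], neighbors: Dict[int, List[int]], num_layers: int
) -> Dict[int, Vector]:
    """Stack num_layers aggregation layers; the result plays the role of the
    per-node word representations."""
    for _ in range(num_layers):
        features = graph_aggregation_layer(features, neighbors)
    return features


# Hypothetical path graph 0 - 1 - 2 - 3; only node 3 carries a non-zero signal.
path_neighbors = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
path_features = {0: [0.0], 1: [0.0], 2: [0.0], 3: [1.0]}

two_layers = stack_layers(path_features, path_neighbors, num_layers=2)
three_layers = stack_layers(path_features, path_neighbors, num_layers=3)
```

In this hypothetical path graph, node 3 is three hops away from node 0. With two layers the output for node 0 remains zero, whereas with three layers it becomes non-zero, which is consistent with K stacked layers collecting information from K-hop neighbors.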
In some embodiments, the sentence representation generation module 530 is configured to generate the sentence representation 412 based on the word representations 528 determined for all the words in the sentence 402. The sentence representation generation module 530 may be implemented as an output layer in the GNN. Thus, the graph convolution module 526 and the sentence representation generation module 530 may together constitute a GNN.
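One possible form of such an output layer may be sketched as follows. How the word representations 528 are combined is not specified above. The element-wise mean used here is only an assumed readout, and the function name `sentence_representation` is hypothetical.

```python
from typing import Dict, List


def sentence_representation(word_representations: Dict[int, List[float]]) -> List[float]:
    """Combine per-word vectors into a single sentence vector.

    Assumption: the output layer is a simple element-wise mean over all word
    representations; other readouts (sum, max, attention) are equally possible.
    """
    vectors = list(word_representations.values())
    if not vectors:
        raise ValueError("at least one word representation is required")
    return [sum(column) / len(vectors) for column in zip(*vectors)]


# Hypothetical two-word sentence with two-dimensional word representations.
example = sentence_representation({0: [1.0, 2.0], 1: [3.0, 4.0]})
```

A mean readout yields a sentence vector with the same dimensionality as the word representations regardless of sentence length, which is one reason it is a common default choice.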