What is claimed is:

1. A computer-implemented method, comprising:
  applying, by one or more processors, a machine learning model to a sentence containing a plurality of words to generate a sentence graph comprising nodes representing the plurality of words and edges connecting the nodes, the edges indicating relationships between the words represented by the nodes connected therebetween, and the relationships comprising at least one syntactic relationship, wherein:
    the at least one syntactic relationship comprises at least one of a dependency relationship from a first word of the plurality of words to a second word of the plurality of words or an opposite dependency relationship from the second word to the first word, the opposite dependency relationship being opposite from the dependency relationship; and
    generating the sentence graph comprises constructing at least one of (i) a first directed edge from a first node representing the first word to a second node representing the second word to indicate the dependency relationship or (ii) a second directed edge from the second node to the first node to indicate the opposite dependency relationship;
  applying, by one or more processors, the machine learning model to determine word representations for the plurality of words based on the sentence graph by applying a graph convolution operation on respective sets of neighbor nodes for respective ones of the nodes based on weights specific for the respective sets of neighbor nodes, wherein a set of neighbor nodes for a node has edges connected with the node, wherein the weights indicate contributions of the respective sets of neighbor nodes to the word representations, and wherein the weights comprise at least one of a first set of weights specific to types of relationships indicated by edges between the respective sets of neighbor nodes and the respective ones of the nodes or a second set of weights determined based on a number of nodes having edges connected with respective neighbor nodes in the respective sets of neighbor nodes and a number of nodes in the respective sets of neighbor nodes; and
  determining, by one or more processors and based on the word representations, a sentence representation for the sentence for use in a natural language processing task related to the sentence.

2. The method of claim 1, wherein generating the sentence graph further comprises:
  in response to a lack of a syntactic relationship between a third word and a fourth word of the plurality of words, determining, by one or more processors, whether the third and fourth words are adjacent to each other in the sentence; and
  in response to determining that the third and fourth words are adjacent to each other, constructing, by one or more processors, at least one edge in the sentence graph to connect a third node representing the third word and a fourth node representing the fourth word, the at least one edge indicating a sequential relationship between the third and fourth words.

3. The method of claim 2, wherein constructing the at least one edge comprises:
  constructing a third directed edge from the third node to the fourth node in the sentence graph; and
  constructing a fourth directed edge from the fourth node to the third node in the sentence graph, the third and fourth directed edges both indicating the sequential relationship.

4. The method of claim 1, wherein generating the sentence graph further comprises:
  constructing, by one or more processors, a further edge in the sentence graph to connect one of the nodes with itself, the further edge indicating a self-relationship.

5.
The method of claim 1, wherein applying the graph convolution operation comprises:
  applying, by one or more processors, a first graph convolution on the set of neighbor nodes based on the first set of weights to obtain a first intermediate representation;
  applying, by one or more processors, a second graph convolution on the set of neighbor nodes based on the second set of weights to obtain a second intermediate representation; and
  combining, by one or more processors, the first and second intermediate representations to obtain a word representation.

6. The method of claim 1, wherein:
  the edges comprise a third directed edge from a given node of the sentence graph to a further node in the set of neighbor nodes and a fourth directed edge from the further node to the given node, the third directed edge indicating a first relationship from a given word represented by the given node to a further word represented by the further node, the fourth directed edge indicating a second relationship from the further word to the given word; and
  the first set of weights comprises a weight specific to a type of the second relationship instead of to a type of the first relationship.

7. The method of claim 1, wherein determining the word representations comprises determining, by one or more processors, a word representation for a word of the plurality of words in parallel with determining at least one further word representation for at least one further word of the plurality of words.

8. The method of claim 1, wherein the set of neighbor nodes for the node have directed edges with the node to indicate relationships from words of the plurality of words represented by the set of neighbor nodes to a word represented by the node.

9.
A system comprising:
  a processing unit; and
  a memory coupled to the processing unit and storing instructions thereon, the instructions, when executed by the processing unit, performing acts comprising:
    applying a machine learning model to a sentence containing a plurality of words to generate a sentence graph comprising nodes representing the plurality of words and edges connecting the nodes, the edges indicating relationships between the words represented by the nodes connected therebetween, and the relationships comprising at least one syntactic relationship, wherein:
      the at least one syntactic relationship comprises at least one of a dependency relationship from a first word of the plurality of words to a second word of the plurality of words or an opposite dependency relationship from the second word to the first word, the opposite dependency relationship being opposite from the dependency relationship; and
      generating the sentence graph comprises constructing at least one of (i) a first directed edge from a first node representing the first word to a second node representing the second word to indicate the dependency relationship or (ii) a second directed edge from the second node to the first node to indicate the opposite dependency relationship;
    applying the machine learning model to determine word representations for the plurality of words based on the sentence graph by applying a graph convolution operation on respective sets of neighbor nodes for respective ones of the nodes based on weights specific for the respective sets of neighbor nodes, wherein a set of neighbor nodes for a node has edges connected with the node, wherein the weights indicate contributions of the respective sets of neighbor nodes to the word representations, and wherein the weights comprise at least one of a first set of weights specific to types of relationships indicated by edges between the respective sets of neighbor nodes and the respective ones of the nodes or a second set of weights determined based on a number of nodes having edges connected with respective neighbor nodes in the respective sets of neighbor nodes and a number of nodes in the respective sets of neighbor nodes; and
    determining, based on the word representations, a sentence representation for the sentence for use in a natural language processing task related to the sentence.

10. The system of claim 9, wherein generating the sentence graph further comprises:
  in response to a lack of a syntactic relationship between a third word and a fourth word of the plurality of words, determining whether the third and fourth words are adjacent to each other in the sentence; and
  in response to determining that the third and fourth words are adjacent to each other, constructing at least one edge in the sentence graph to connect a third node and a fourth node representing the third and fourth words, the at least one edge indicating a sequential relationship between the third and fourth words.

11. The system of claim 10, wherein constructing the at least one edge comprises:
  constructing a third directed edge from the third node to the fourth node in the sentence graph; and
  constructing a fourth directed edge from the fourth node to the third node in the sentence graph, the third and fourth directed edges both indicating the sequential relationship.

12. The system of claim 9, wherein generating the sentence graph further comprises constructing a further edge in the sentence graph to connect one of the nodes with itself, the further edge indicating a self-relationship.

13.
The system of claim 9, wherein applying the graph convolution operation comprises:
  applying a first graph convolution on the set of neighbor nodes based on the first set of weights to obtain a first intermediate representation;
  applying a second graph convolution on the set of neighbor nodes based on the second set of weights to obtain a second intermediate representation; and
  combining the first and second intermediate representations to obtain a word representation.

14. A computer program product, the computer program product comprising a non-transitory computer readable storage medium having program instructions embodied therewith, the program instructions executable by an electronic device to cause the electronic device to perform acts comprising:
  applying a machine learning model to a sentence containing a plurality of words to generate a sentence graph comprising nodes representing the plurality of words and edges connecting the nodes, the edges indicating relationships between the words represented by the nodes connected therebetween, and the relationships comprising at least one syntactic relationship, wherein:
    the at least one syntactic relationship comprises at least one of a dependency relationship from a first word of the plurality of words to a second word of the plurality of words or an opposite dependency relationship from the second word to the first word, the opposite dependency relationship being opposite from the dependency relationship; and
    generating the sentence graph comprises constructing at least one of (i) a first directed edge from a first node representing the first word to a second node representing the second word to indicate the dependency relationship or (ii) a second directed edge from the second node to the first node to indicate the opposite dependency relationship;
  applying the machine learning model to determine word representations for the plurality of words based on the sentence graph by applying a graph convolution operation on respective sets of neighbor nodes for respective ones of the nodes based on weights specific for the respective sets of neighbor nodes, wherein a set of neighbor nodes for a node has edges connected with the node, wherein the weights indicate contributions of the respective sets of neighbor nodes to the word representations, and wherein the weights comprise at least one of a first set of weights specific to types of relationships indicated by edges between the respective sets of neighbor nodes and the respective ones of the nodes or a second set of weights determined based on a number of nodes having edges connected with respective neighbor nodes in the respective sets of neighbor nodes and a number of nodes in the respective sets of neighbor nodes; and
  determining, based on the word representations, a sentence representation for the sentence for use in a natural language processing task related to the sentence.
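
By way of non-limiting illustration only, the acts recited above may be sketched in Python as follows. The sketch is not part of the claims and rests on several assumptions that the claims do not fix: dependency pairs are supplied directly as hypothetical (source, target) word indices instead of being produced by a machine learning model; each relationship type carries a single scalar weight; the degree-based weight is taken as 1/sqrt(deg(u) * deg(v)); the two intermediate representations are combined by addition; and the sentence representation is obtained by mean pooling. All function and variable names are hypothetical.

```python
from math import sqrt

# Hypothetical labels for the relationship types named in the claims.
DEP, REV, SEQ, SELF = "dep", "rev", "seq", "self"


def build_sentence_graph(n_words, dependencies):
    """Map each directed edge (src, dst) to its relationship type."""
    edges = {}
    for src, dst in dependencies:
        edges[(src, dst)] = DEP  # dependency relationship
        edges[(dst, src)] = REV  # opposite dependency relationship
    for i in range(n_words - 1):
        # Adjacent words lacking a syntactic relationship get sequential
        # edges in both directions (both directions were added above for
        # every dependency, so checking one direction is enough).
        if (i, i + 1) not in edges:
            edges[(i, i + 1)] = SEQ
            edges[(i + 1, i)] = SEQ
    for i in range(n_words):
        edges[(i, i)] = SELF  # self-relationship
    return edges


def graph_convolution(features, edges, type_weight):
    """One convolution step; returns one vector per word."""
    n, dim = len(features), len(features[0])
    # Neighbors of v are the nodes u with a directed edge u -> v.
    incoming = {v: [] for v in range(n)}
    for (u, v), kind in edges.items():
        incoming[v].append((u, kind))
    degree = {v: len(incoming[v]) for v in range(n)}
    outputs = []
    for v in range(n):
        by_type = [0.0] * dim    # first intermediate representation
        by_degree = [0.0] * dim  # second intermediate representation
        for u, kind in incoming[v]:
            w_type = type_weight[kind]
            # Assumed normalization; self-loops keep every degree >= 1.
            w_degree = 1.0 / sqrt(degree[u] * degree[v])
            for k in range(dim):
                by_type[k] += w_type * features[u][k]
                by_degree[k] += w_degree * features[u][k]
        # Combine the two intermediate representations (here: by addition).
        outputs.append([a + b for a, b in zip(by_type, by_degree)])
    return outputs


def sentence_representation(word_reprs):
    """Mean-pool the word representations (an assumed pooling choice)."""
    n = len(word_reprs)
    return [sum(column) / n for column in zip(*word_reprs)]
```

For the example sentence "the black cat sleeps" with hypothetical dependency pairs (2, 0), (2, 1) and (3, 2), `build_sentence_graph(4, [(2, 0), (2, 1), (3, 2)])` yields dependency and opposite dependency edges for each pair, sequential edges between "the" and "black" (which are adjacent but share no syntactic relationship), and one self-edge per word. Because each word representation depends only on the input features of its neighbor nodes, the loop over nodes in `graph_convolution` could equally be evaluated in parallel.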