Given the sentence representation 412, the decoder 420 generates an output 422. The manner in which the decoder 420 processes the sentence representation 412 to obtain the output 422 depends on the specific natural language processing task to be performed by the system 400. For example, in a machine translation task, the decoder 420 determines, based on the sentence representation 412, an output sentence in a target natural language that has the same semantic meaning as the input sentence 402 in its source natural language. In a natural language inference (NLI) task, the decoder 420 can determine whether the input sentence 402 semantically entails another input sentence, based on the sentence representations determined by the encoder 410 for the two input sentences. As a further example, the decoder 420 can label semantic roles in the input sentence 402, or recognize entities of a knowledge base that appear in the input sentence 402, based on the sentence representation 412. Other natural language processing tasks may include text summarization, reading comprehension, relation extraction, and so on. The scope of the embodiments of the present invention is not limited in this regard.
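By way of a non-limiting illustration, the NLI case can be sketched as follows. The sketch assumes that the encoder has already produced two fixed-length sentence representations, here called `u` and `v`, and that the decoder is a single linear layer followed by a softmax over three labels. The feature combination (concatenating `u`, `v`, their element-wise absolute difference and their element-wise product), the label set, and the names `combine`, `nli_decode` and `random_params` are assumptions made for illustration only; the description above does not prescribe any particular decoder structure.

```python
import math
import random

# Assumed label set for illustration; the description does not fix one.
LABELS = ("entailment", "neutral", "contradiction")


def combine(u, v):
    """Concatenate u, v, |u - v| and u * v into one feature vector."""
    if len(u) != len(v):
        raise ValueError("sentence representations must have equal length")
    diff = [abs(a - b) for a, b in zip(u, v)]
    prod = [a * b for a, b in zip(u, v)]
    return list(u) + list(v) + diff + prod


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def nli_decode(u, v, weights, bias):
    """Return (label, probabilities) for a pair of sentence representations.

    weights holds one row of 4 * len(u) values per label; bias holds one
    value per label. Both are hypothetical parameters that would normally
    be learned during training.
    """
    feats = combine(u, v)
    scores = [
        sum(w * x for w, x in zip(row, feats)) + b
        for row, b in zip(weights, bias)
    ]
    probs = softmax(scores)
    best = max(range(len(LABELS)), key=probs.__getitem__)
    return LABELS[best], probs


def random_params(dim, seed=0):
    """Untrained placeholder parameters, used only to make the sketch run."""
    rng = random.Random(seed)
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(4 * dim)] for _ in LABELS]
    bias = [0.0] * len(LABELS)
    return weights, bias


if __name__ == "__main__":
    u = [0.2, -0.5, 0.1]   # stand-in representation of the first sentence
    v = [0.3, 0.4, -0.1]   # stand-in representation of the second sentence
    weights, bias = random_params(dim=len(u))
    label, probs = nli_decode(u, v, weights, bias)
    print(label, probs)
```

Because the parameters in this sketch are random rather than trained, the predicted label carries no meaning; the sketch only shows how one decoder might map two sentence representations to a task-specific output. A decoder for another task, such as machine translation or semantic role labeling, would consume the same kind of representation but produce a sequence of tokens or labels instead of a single class.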