Generally, the hidden layer(s) 406 allow knowledge about the input nodes of the input layer 402 to be shared amongst the output nodes of the output layer 404. To do so, an activation function f is applied to the input nodes through the hidden layer(s) 406. In an example, the activation function f may be non-linear. Different non-linear activation functions f are available including, for instance, a rectifier function f(x)=max(0, x). In an example, a particular non-linear activation function f is selected based on cross-validation. For example, given known example pairs (x, y), where x∈X and y∈Y, a function f:X→Y is selected when such a function results in the best matches (e.g., the best representations of actual correlation data).
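As a non-limiting illustration, the selection of an activation function by cross-validation may be sketched as follows. The sketch is a simplification under stated assumptions: the candidate set, the helper names (relu, fit_scale_and_bias, cross_val_error, select_activation), the single-unit model y ≈ a·act(x) + b, and the squared-error score are all hypothetical and are not taken from the described system. The sketch only shows how known pairs (x, y) may be held out in turn to score each candidate.

```python
import math


def relu(x):
    """Rectifier function: max(0, x)."""
    return max(0.0, x)


# Hypothetical candidate activation functions (an assumption of this sketch).
CANDIDATES = {"relu": relu, "tanh": math.tanh, "identity": lambda x: x}


def fit_scale_and_bias(act, pairs):
    """Least-squares fit of y ~ a * act(x) + b on the given (x, y) pairs."""
    h = [act(x) for x, _ in pairs]
    y = [t for _, t in pairs]
    n = len(pairs)
    mean_h = sum(h) / n
    mean_y = sum(y) / n
    var_h = sum((v - mean_h) ** 2 for v in h)
    if var_h == 0.0:
        # Activation output is constant on this fold: predict the mean.
        return 0.0, mean_y
    a = sum((hv - mean_h) * (yv - mean_y) for hv, yv in zip(h, y)) / var_h
    return a, mean_y - a * mean_h


def cross_val_error(act, pairs, k=4):
    """Mean squared error over k held-out folds (every k-th pair per fold)."""
    total = 0.0
    for fold in range(k):
        held_out = pairs[fold::k]
        train = [p for i, p in enumerate(pairs) if i % k != fold]
        a, b = fit_scale_and_bias(act, train)
        total += sum((a * act(x) + b - y) ** 2 for x, y in held_out)
    return total / len(pairs)


def select_activation(pairs, k=4):
    """Return the name of the candidate with the lowest cross-validated error."""
    return min(CANDIDATES, key=lambda name: cross_val_error(CANDIDATES[name], pairs, k))


# Synthetic example pairs (x, y) generated from a rectifier-shaped relationship.
pairs = [(x / 2, 2 * relu(x / 2) + 1) for x in range(-10, 11)]
best = select_activation(pairs)
print(best)
```

In this sketch the synthetic pairs follow a rectifier-shaped relationship, so the rectifier candidate yields the lowest held-out error. With actual correlation data, a different candidate may score best.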