The reconstruction of the original speech signal from the haptic cues may be achieved via a second neural network, hereinafter called a reconstruction neural network. The reconstruction neural network is trained to perform the reverse of the operations of the neural network 1310. It receives as input the compressed haptic cues from the neural network 1310 and is trained to generate an output that matches the original input features (the spectrogram 1306) as closely as possible, i.e., with as small an error amount as possible. The training of the reconstruction neural network attempts to reduce this error amount (e.g., using gradient descent) until a satisfactory minimum is reached (e.g., after a certain number of iterations). At this point, the reconstructed spectrogram generated by the reconstruction neural network is compared with the input spectrogram 1306 and an error amount is determined. In one embodiment, the reconstructed spectrogram is further converted into a reconstructed audio signal, which is compared with the input audio signal 1302 to determine the error. The error may be computed via cross-power spectral density or other methods. In other embodiments, the error is not computed using a reconstruction neural network, but using some other function that attempts to determine a relationship between the output actuator signals and the spectrogram 1306, such as a correlation. The difference in the identified relationship may be deemed to be the error amount.
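The training procedure described above can be sketched as follows. This is a minimal, illustrative example only, assuming the haptic-cue network (neural network 1310) produces a fixed-length vector of compressed cues per spectrogram frame; the layer sizes, the `ReconstructionNetwork` module, and the `train_reconstruction` helper are hypothetical and are not taken from the embodiment's actual architecture.

```python
import torch
import torch.nn as nn

# Illustrative dimensions (assumptions, not from the specification):
# each spectrogram frame has N_FREQ frequency bins, and the haptic-cue
# network (neural network 1310) compresses it to N_CUES cue values.
N_FREQ, N_CUES = 128, 8

class ReconstructionNetwork(nn.Module):
    """Maps compressed haptic cues back toward the original spectrogram frame."""
    def __init__(self):
        super().__init__()
        self.decode = nn.Sequential(
            nn.Linear(N_CUES, 64),
            nn.ReLU(),
            nn.Linear(64, N_FREQ),
        )

    def forward(self, cues):
        return self.decode(cues)

def train_reconstruction(encoder, spectrogram_frames, epochs=100, lr=1e-3):
    """Train the reconstruction network to approximately invert the encoder.

    encoder: the (frozen) haptic-cue network, e.g., neural network 1310.
    spectrogram_frames: tensor of shape (num_frames, N_FREQ), i.e., spectrogram 1306.
    """
    recon_net = ReconstructionNetwork()
    optimizer = torch.optim.SGD(recon_net.parameters(), lr=lr)  # gradient descent
    loss_fn = nn.MSELoss()

    for _ in range(epochs):  # stop after a fixed number of iterations
        with torch.no_grad():
            cues = encoder(spectrogram_frames)      # compressed haptic cues
        reconstructed = recon_net(cues)             # attempted reconstruction
        error = loss_fn(reconstructed, spectrogram_frames)
        optimizer.zero_grad()
        error.backward()
        optimizer.step()                            # reduce the error amount
    return recon_net, error.item()                  # final error vs. spectrogram 1306
```

For the alternative mentioned above that does not use a reconstruction neural network, the error amount could likewise be approximated by correlating the output actuator signals with the spectrogram frames (e.g., via a correlation coefficient), with a weaker correlation corresponding to a larger error amount.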