A probability distribution defines a tolerable range of samples. A slight change in the observed subject may cause a change in the raw information observed by a sensor, but the changed information may still fall within the probability distribution. For example, the probability distribution may be common information shared between an encoder and a decoder. If samples x1, x2, and x3 fall within the probability distribution defined by the common information, the encoder may determine that there is no change to the probability distribution, and thus no feature needs to be encoded and transmitted. On the other hand, if samples x4 and x5 fall outside the probability distribution, the encoder encodes these samples for transmission. The encoded features may be an update of the distribution (e.g., a new expectation value and a new variance calculated from the samples x4 and x5), and the decoder may use this information to update the probability distribution.
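The decision logic described above may be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not an implementation from the source: it assumes the common information is a one-dimensional Gaussian model, treats samples within three standard deviations of the mean as falling inside the distribution, and uses hypothetical names (DistributionCodec, encode, decode).

    import math
    import statistics

    class DistributionCodec:
        # Shared Gaussian model (the common information) held by both
        # the encoder and the decoder.
        def __init__(self, mean, variance, tolerance=3.0):
            self.mean = mean            # current expectation value
            self.variance = variance    # current variance
            self.tolerance = tolerance  # in-range threshold, in standard deviations

        def in_range(self, sample):
            return abs(sample - self.mean) <= self.tolerance * math.sqrt(self.variance)

        def update(self, new_mean, new_variance):
            self.mean, self.variance = new_mean, new_variance

    def encode(codec, samples):
        # Returns None when every sample fits the shared distribution
        # (nothing is encoded or transmitted); otherwise returns updated
        # parameters computed from the out-of-range samples.
        outliers = [s for s in samples if not codec.in_range(s)]
        if not outliers:
            return None  # e.g., x1, x2, x3: no feature needs to be sent
        new_mean = statistics.fmean(outliers)
        new_variance = (statistics.variance(outliers)
                        if len(outliers) > 1 else codec.variance)
        codec.update(new_mean, new_variance)  # keep the encoder's copy in sync
        return (new_mean, new_variance)       # e.g., x4, x5: transmit the update

    def decode(codec, feature):
        # Applies a received update to the decoder's copy of the common information.
        if feature is not None:
            codec.update(*feature)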
Using common information in this manner may enable transmission of information that is more robust (e.g., over a noisy or hostile channel) than transmitting every sample. Shannon capacity theory assumes that two data blocks, or even the individual bits within one data block, are independently distributed. Therefore, the Shannon capacity limit does not take into account the possibility of structural and/or logical relevance among the information (e.g., correlation of information along the time axis) or among multiple encoders observing the same information source. In the examples discussed herein, channel efficiency may be improved by selectively transmitting some features and not transmitting others, as illustrated in the sketch below.
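Continuing the hypothetical sketch above, a short usage example makes the efficiency gain concrete: only the two out-of-range samples trigger any transmission, so a single update message replaces five per-sample messages.

    encoder_side = DistributionCodec(mean=0.0, variance=1.0)
    decoder_side = DistributionCodec(mean=0.0, variance=1.0)

    # x1..x3 fall within the shared distribution; x4 and x5 fall outside it.
    samples = [0.2, -0.5, 1.1, 7.4, 8.1]

    feature = encode(encoder_side, samples)  # -> (new mean, new variance) from x4, x5
    decode(decoder_side, feature)            # decoder's model now matches the encoder's

    assert decoder_side.mean == encoder_side.mean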