Protocol generation system 1200 also includes composite statistics 1210, which may be based on, for example, composites and composite values detected in data elements. Composite statistics 1210 can include distributions 1212, outliers 1214 and/or other statistics (e.g., mean or median values, probability of detection, etc.). A distribution may indicate, for example, a distribution of values of a given composite across a set of data elements. The set of data elements may include, for example, all data elements received at a given system (e.g., stream processing system) or device, data elements from a given source, data elements including an identifier of a particular composite (e.g., corresponding to the value), data elements received within a defined recent time period, and so on. For example, for a composite that pertains to a concentration of a particular substance or type of cell, a distribution may include a count (or probability) of data elements having values for the composite that fall within each of 20 (or another number of) defined ranges.
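By way of illustration only, the per-range count (or probability) described above may be sketched as follows. The function name `composite_distribution`, the use of equal-width ranges, and the `low` and `high` bounds are assumptions made for this example and are not specified in the description; values outside the defined ranges are simply not counted.

```python
def composite_distribution(values, low, high, num_ranges=20):
    """Return (counts, probabilities) of values per defined range.

    Assumption: the defined ranges are equal-width and span [low, high].
    """
    if num_ranges <= 0 or high <= low:
        raise ValueError("need num_ranges > 0 and high > low")
    width = (high - low) / num_ranges
    counts = [0] * num_ranges
    for v in values:
        if v < low or v > high:
            continue  # outside every defined range: not counted
        # A value equal to `high` is placed in the last range.
        idx = min(int((v - low) / width), num_ranges - 1)
        counts[idx] += 1
    total = sum(counts)
    probabilities = [c / total for c in counts] if total else [0.0] * num_ranges
    return counts, probabilities
```

For example, calling `composite_distribution(concentrations, 0.0, 10.0)` on a list of concentration values would yield 20 counts, one per 0.5-wide range, together with the corresponding probabilities.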
Distributions 1212 or other types of analysis (e.g., based on standard-deviation analysis or a predictive approach) may be used to detect outliers 1214 for each of one or more composites. Distributions 1212 and/or outliers 1214 may be used to automatically generate a threshold or other relationship for a composite. The threshold or relationship may be defined so as to determine whether a given value for a composite is itself an outlier, is within a tail of a distribution, etc. The threshold or relationship may be generated based solely on the composite values or may be based on a multivariate data set that includes paired composite values and outcome values (e.g., where an outcome value may itself be represented by a composite in a same or related data element).
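By way of illustration only, a standard-deviation analysis of the kind mentioned above may be sketched as follows. The function names, the multiplier `k` and its default of 3.0, and the use of the population standard deviation are assumptions made for this example. The sketch generates thresholds from the composite values alone and does not address the multivariate case involving paired outcome values.

```python
from statistics import mean, pstdev


def outlier_thresholds(values, k=3.0):
    """Return (lower, upper) thresholds at mean -/+ k standard deviations.

    Assumption: a value farther than k standard deviations from the
    mean is treated as an outlier.
    """
    if not values:
        raise ValueError("values must be non-empty")
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation
    return mu - k * sigma, mu + k * sigma


def is_outlier(value, thresholds):
    """Return True if value lies outside the (lower, upper) thresholds."""
    lower, upper = thresholds
    return value < lower or value > upper


def detect_outliers(values, k=3.0):
    """Return the values that fall outside the generated thresholds."""
    thresholds = outlier_thresholds(values, k)
    return [v for v in values if is_outlier(v, thresholds)]
```

Thresholds generated in this manner from previously received values may then be applied to each newly received value of the same composite, for example `is_outlier(new_value, outlier_thresholds(history))`.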