In certain embodiments, the ensemble model may analyze the 2D and 3D images in multiple chunks and provide multiple chunk classifications across an entire series or set of images, such as a series of images captured while a driver made a trip of 10 minutes or longer from a first location to a second location. In such an embodiment, the ensemble model may use the timestamps associated with each of the 2D and 3D images so that the chunk classifications are analyzed in chronological order. In other embodiments, the ensemble model may validate each of the chunks in the series of images to determine whether the 2D and 3D images to be used for the chunk are valid for the ensemble, for example, whether the 2D and 3D images to be used for the chunk include enough frames to be analyzed.
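By way of illustration only, the chunked analysis described above may be sketched as follows. The sketch combines the chronological ordering and the chunk validation in a single flow for brevity, although the description presents them as separate embodiments. The names `Frame`, `split_into_chunks`, `is_valid_chunk`, and `classify_series`, the fixed 60-second chunk length, the minimum frame count, and the `classify` callable standing in for the ensemble model are all hypothetical assumptions and are not taken from the disclosure.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class Frame:
    timestamp: float  # seconds since the start of the trip (assumed unit)
    kind: str         # "2d" or "3d" (assumed labels)


def split_into_chunks(frames: List[Frame], chunk_seconds: float) -> List[List[Frame]]:
    """Sort frames by timestamp and group them into fixed-length time windows."""
    ordered = sorted(frames, key=lambda f: f.timestamp)
    chunks: Dict[int, List[Frame]] = {}
    for frame in ordered:
        index = int(frame.timestamp // chunk_seconds)
        chunks.setdefault(index, []).append(frame)
    # Return the chunks in chronological order of their time window.
    return [chunks[i] for i in sorted(chunks)]


def is_valid_chunk(chunk: List[Frame], min_frames: int) -> bool:
    """A chunk is treated as valid only if it has enough 2D and enough 3D frames."""
    count_2d = sum(1 for f in chunk if f.kind == "2d")
    count_3d = sum(1 for f in chunk if f.kind == "3d")
    return count_2d >= min_frames and count_3d >= min_frames


def classify_series(
    frames: List[Frame],
    classify: Callable[[List[Frame]], object],
    chunk_seconds: float = 60.0,  # assumed chunk length
    min_frames: int = 3,          # assumed validity threshold
) -> List[Optional[object]]:
    """Return one classification per chunk, or None for chunks that fail validation."""
    results: List[Optional[object]] = []
    for chunk in split_into_chunks(frames, chunk_seconds):
        if is_valid_chunk(chunk, min_frames):
            results.append(classify(chunk))
        else:
            results.append(None)  # too few frames to analyze
    return results
```

In this sketch, a chunk that fails validation yields `None` rather than being dropped, so that the position of each result still corresponds to its time window in the series. Other embodiments might instead skip invalid chunks or merge them with a neighboring chunk.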