The text-based output data structure 1000 can include multiple entries of descriptive information detailing a driver's behavior as described herein. For example, data structure 1000 shows four separate entries 1002, 1004, 1006 and 1008. Each of the descriptive entries may include various details relating to the driver operating a vehicle. For example, each of the entries 1002-1008 indicates the driver's name, a timestamp indicating when the entry was recorded, and a driver behavior or action that was predicted by the prediction model based on a 2D and/or 3D image that was taken at the time indicated by the timestamp. For example, entry 1002 indicates that a driver named “Aaron” was operating the vehicle normally (“Normal Driving”) at the time 10:01. Entry 1004 indicates that Aaron was “Texting” at the time 10:04, and entry 1006 indicates that Aaron reverted to normal driving at the time 10:05. Five seconds later, at time 10:10, entry 1008 indicates that the predictive model determined that Aaron was making a call on his phone. Accordingly, data structure 1000 describes a 10 second timeframe, or chunk of time, during which a driver (Aaron) was operating a vehicle, together with Aaron's various behaviors and actions over that time period.
Moreover, while the data structure 1000 shows a 10 second period of driver behavior, other time periods or timeframes are contemplated herein, such as an entire vehicular trip or a series of days or months. In other embodiments, the data capture process can operate continuously, such that the computing device continuously captures images of the driver and the predictive model continuously provides an output of data structures, such as data structure 1000.
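By way of illustration only, the entries of data structure 1000 could be represented and rendered as text along the following lines. This is a minimal sketch and not the disclosed implementation: the names `BehaviorEntry`, `to_text` and `seconds`, the comma-separated layout, the reading of timestamps as minutes and seconds, and the label “Making a Call” for entry 1008 are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BehaviorEntry:
    """One descriptive entry (hypothetical layout, assumed for illustration)."""
    driver: str     # driver's name
    timestamp: str  # time the image was taken, assumed here to be "MM:SS"
    behavior: str   # behavior or action label predicted by the model


def to_text(entries):
    """Render the entries as text, one line per entry (assumed format)."""
    return "\n".join(f"{e.driver}, {e.timestamp}, {e.behavior}" for e in entries)


def seconds(timestamp):
    """Convert an assumed "MM:SS" timestamp to a count of seconds."""
    minutes, secs = timestamp.split(":")
    return int(minutes) * 60 + int(secs)


# Entries 1002, 1004, 1006 and 1008 as described above.
# "Making a Call" is an assumed label; the description gives no exact text.
entries = [
    BehaviorEntry("Aaron", "10:01", "Normal Driving"),
    BehaviorEntry("Aaron", "10:04", "Texting"),
    BehaviorEntry("Aaron", "10:05", "Normal Driving"),
    BehaviorEntry("Aaron", "10:10", "Making a Call"),
]

print(to_text(entries))
```

In a continuously operating embodiment, a new `BehaviorEntry` could simply be appended each time the model produces a prediction, and the same rendering applied to whatever timeframe is of interest.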