In some implementations, the model 128 for a user's speech and/or appearance may be trained or otherwise developed using one or more suitable machine learning (ML) techniques. In such implementations, the successful or unsuccessful passive authentication of the user, and/or the successful or unsuccessful active authentication of the user, may be used to further train and/or otherwise refine the model to provide a more accurate result. For example, if the user is successfully actively authenticated following an unsuccessful attempt at passive authentication, the model may be refined to incorporate the speech data and/or video data characteristics that were previously provided and that led to the unsuccessful passive attempt. Accordingly, the model may, during subsequent authentication attempts, be more likely to successfully passively authenticate the user if the same or similar speech data and/or video data is provided during those attempts. In this way, the model for a user may be retrained and/or refined over time to provide a more accurate determination of the user's identity. Moreover, the retraining of the model over time may account for changes in the user's speech and/or appearance with age, season of the year, time of day, and so forth.
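The sketch below illustrates one way such a refinement loop could be structured; it is a minimal, hypothetical example only. The source does not specify the model form or ML technique, so the class name UserAuthModel, the averaged-embedding template, the cosine-similarity score, and the fixed threshold are all assumptions made for illustration.

```python
import numpy as np

class UserAuthModel:
    """Minimal sketch of a per-user model (e.g., the model 128) represented
    as an averaged embedding of speech/appearance features. The embedding
    form, threshold, and update rule are illustrative assumptions only."""

    def __init__(self, enrollment_embeddings, threshold=0.7):
        # Initial template built from enrollment samples.
        self.template = np.mean(enrollment_embeddings, axis=0)
        self.threshold = threshold
        # Features from failed passive attempts, held until the user is
        # actively authenticated (or discarded if active auth also fails).
        self.pending_features = []

    def passive_authenticate(self, features):
        """Return True if the captured speech/video features match the template."""
        score = self._similarity(features, self.template)
        if score >= self.threshold:
            return True
        # Remember the features that led to the unsuccessful passive attempt.
        self.pending_features.append(features)
        return False

    def on_active_authentication(self, succeeded):
        """Called after an active authentication attempt that followed a
        failed passive attempt."""
        if succeeded and self.pending_features:
            # Refine the template to incorporate the previously rejected
            # characteristics, so the same or similar data is more likely
            # to pass passive authentication in subsequent attempts.
            samples = [self.template] + self.pending_features
            self.template = np.mean(samples, axis=0)
        # Discard the held features in either case.
        self.pending_features = []

    @staticmethod
    def _similarity(a, b):
        # Cosine similarity between feature vectors.
        return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))


# Example: an unsuccessful passive attempt followed by successful active
# authentication causes the template to absorb the new characteristics.
model = UserAuthModel(enrollment_embeddings=[np.ones(4), np.ones(4) * 0.9])
features = np.array([0.2, 0.2, 0.2, 0.2])
if not model.passive_authenticate(features):
    model.on_active_authentication(succeeded=True)
```

In this sketch, refinement occurs only when a failed passive attempt is followed by a successful active authentication, mirroring the feedback loop described above; a production system would instead retrain or fine-tune whatever ML model is actually used, using whatever update rule is appropriate for that model.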