The audio component and/or the video component may be modified to improve synchronization (310). Synchronization may be improved in various ways. For example, either the audio component or the video component may be modified such that more highly correlated video frames and audio bins correspond to one another relative to the shared timeline. In some implementations, audio bins and/or video frames may be stretched, compressed, or otherwise transformed to account for any modifications that are performed to improve synchronization. Alternatively, or in addition, new video frames and/or audio bins may be generated, or existing video frames and/or audio bins may be removed. For example, as discussed further herein, audio bins may be analyzed to generate predicted video feature sets, which may be used to generate portions of a video frame. Similarly, video frames may be analyzed to generate predicted audio feature sets, which may be used to generate an audio bin. The generated video frame and/or audio bin may be inserted into the video component or the audio component, respectively, to improve synchronization. In some implementations, stretching and inserting video content or audio content may include duplicating video frames or audio bins.
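
As one illustrative, non-limiting sketch, the stretching, compressing, duplicating, and removing operations described above might be expressed as follows. The function names `retime` and `shift`, the representation of a component as a simple sequence of frames or bins, and the use of nearest-index sampling are assumptions made for illustration only; an actual implementation may interpolate or generate new content rather than duplicating existing units.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")  # a video frame or an audio bin (hypothetical representation)


def retime(units: Sequence[T], target_len: int) -> List[T]:
    """Stretch or compress a sequence to target_len units.

    Uses nearest-index sampling: stretching duplicates units,
    compressing removes units.
    """
    if target_len < 0:
        raise ValueError("target_len must be non-negative")
    if not units or target_len == 0:
        return []
    n = len(units)
    # Output slot i takes the source unit at floor(i * n / target_len).
    return [units[(i * n) // target_len] for i in range(target_len)]


def shift(units: Sequence[T], offset: int) -> List[T]:
    """Shift a sequence along the shared timeline, keeping its length.

    offset > 0 delays the content: the first unit is duplicated at the
    front and trailing units are removed.
    offset < 0 advances the content: leading units are removed and the
    last unit is duplicated at the end.
    """
    units = list(units)
    if not units or offset == 0:
        return units
    k = min(abs(offset), len(units))
    if offset > 0:
        return [units[0]] * k + units[: len(units) - k]
    return units[k:] + [units[-1]] * k
```

For example, under these assumptions, `shift(audio_bins, 2)` would delay the audio component by two bins relative to the video component, while `retime(video_frames, len(video_frames) + 3)` would stretch the video component by three frames through duplication.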