System and computerized method for subtitles synchronization of audiovisual content using the human voice detection for synchronization

Patent number
US11445266B2
Publication date
2022-09-13
Applicant
IChannel.IO Ltd. (IL Petah Tikva)
Inventor
Oren Jack Maurice
IPC classification
H04N7/00; H04N21/488; H04N21/43; G10L15/26; G10L25/57
Technical field
subtitle, subtitles, voice, human, audio, correction, s430, segments, analyzer, content
Region: Petah Tikva

Abstract

Audiovisual content in the form of video clip files, whether streamed or broadcast, may further contain subtitles. Such subtitles are provided with timing information so that each subtitle is displayed synchronously with the spoken words. However, at times the synchronization with the audio portion of the audiovisual content has a timing offset which, when above a predetermined threshold, is bothersome. The system and method determine time spans in which a human speaks and attempt to synchronize those time spans with the subtitle content. An indication is provided when an incurable synchronization mismatch exists, as well as when the subtitles and audio are well synchronized. When an offset exists, the system is further able to determine the type of offset (constant or dynamic) and to provide the necessary adjustment information, so that the timing provided with the subtitles may be corrected and the synchronization deficiency resolved.
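
As an illustration of what "adjustment information" could look like, the following Python sketch applies a correction to subtitle timings. It assumes, purely for illustration, that the adjustment is a pair of numbers, a shift in seconds and a scale, applied as t' = scale * t + shift; a constant offset is then the case scale = 1, and a linear drift the case scale != 1. The function name `apply_correction`, the cue tuple layout, and the linear model are hypothetical and are not taken from the patent text; a non-linear drift would need a different model.

```python
from typing import List, Tuple

# Hypothetical cue layout: (start_seconds, end_seconds, text).
Cue = Tuple[float, float, str]


def apply_correction(cues: List[Cue], shift: float = 0.0, scale: float = 1.0) -> List[Cue]:
    """Return new cues with each time t mapped to scale * t + shift.

    shift=0.0 and scale=1.0 leave the timings unchanged.
    """
    return [
        (scale * start + shift, scale * end + shift, text)
        for start, end, text in cues
    ]


# Subtitles that appear half a second too early are pushed later.
cues = [(1.0, 2.0, "Hello"), (4.0, 6.0, "How are you?")]
print(apply_correction(cues, shift=0.5))
```

Returning a new list rather than editing the cues in place keeps the original timing available, which is convenient when several candidate corrections are compared.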

Description

CROSS-REFERENCE TO RELATED APPLICATIONS

This application is a continuation of PCT Application No. PCT/IL2019/051023, filed on Sep. 12, 2019, which claims the benefit of U.S. Provisional Application No. 62/730,556, filed on Sep. 13, 2018, the contents of which are hereby incorporated by reference.

TECHNICAL FIELD

The disclosure relates to synchronization of an audiovisual content and its subtitles, and in particular the synchronization of the audio signal of the audiovisual content and its corresponding subtitles, and even more particularly when the audio signal is a human voice.

BACKGROUND

The approaches described in this section are approaches that could be pursued, but not necessarily approaches that have been previously conceived or pursued. Therefore, unless otherwise indicated, it should not be assumed that any of the approaches described in this section qualify as prior art merely by virtue of their inclusion in this section. Similarly, issues identified with respect to one or more approaches should not be assumed to have been recognized in any prior art on the basis of this section, unless otherwise indicated.

Claims

What is claimed is:

1. A method for synchronization of subtitles and human voice segments of an audiovisual content comprises:
gathering a plurality of audio segments of the audiovisual content;
gathering a plurality of subtitles of the audiovisual content;
generating a list of possible start offsets and end offsets for the plurality of audio segments and the plurality of subtitles;
determine a best cost and a plurality of factor values by performing a calculation on each of the start offsets and end offsets of the plurality of audio segments and the plurality of subtitles;
generating a notification that no match has been found upon determination that the best cost is not less than a value defined as infinity;
generating a notification that no correction is necessary upon determination that the plurality of factors have each a predetermined value and that the best cost is less than a value defined as infinity; and
performing a subtitle offset correction upon determination that the plurality of factors do not have each a predetermined value and that the best cost is less than a value defined as infinity.

2. The method of claim 1, wherein the plurality of audio segments were created using a voice activity detection technique.

3. The method of claim 1, wherein a first factor of the plurality of factors has a value of ‘0’ and a second factor of the plurality of factors has a value of ‘1’.

4. The method of claim 1, wherein performing a calculation on each of the start offsets and end offsets of the plurality of audio segments and the plurality of subtitles comprises repetition of:
selecting a start offset and an end offset of an audio segment and a subtitle;
calculating a corrected subtitle time per the selected offsets;
calculating the plurality of factors for the selected offsets;
calculating cost for mismatches and if it is over a predetermined threshold value set the cost to infinity;
save the calculated cost and the plurality of factors upon determination that this is the best cost;
until all the start offsets and the end offsets of the audiovisual content and subtitles have been selected.

5. The method of claim 1, wherein the offset is a linear drift.

6. The method of claim 1, wherein the offset is a non-linear drift.
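
The search loop of claims 1 and 4 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation. It assumes that the two factors are a shift and a scale (so that '0' and '1' mean "no change"), that each candidate pairs the first subtitle start with some voice segment start and the last subtitle end with some voice segment end, and that the mismatch cost is the total time during which exactly one of "subtitle shown" and "voice active" holds. The names `find_correction`, `mismatch_cost`, `max_cost` and `tol` are hypothetical.

```python
import math
from itertools import product
from typing import List, Optional, Tuple

Span = Tuple[float, float]  # (start, end) in seconds


def total(spans: List[Span]) -> float:
    """Summed duration of all spans."""
    return sum(end - start for start, end in spans)


def overlap(a: List[Span], b: List[Span]) -> float:
    """Time covered by both lists; spans within one list must not overlap."""
    return sum(
        max(0.0, min(a_end, b_end) - max(a_start, b_start))
        for (a_start, a_end), (b_start, b_end) in product(a, b)
    )


def mismatch_cost(subs: List[Span], voice: List[Span]) -> float:
    """Assumed cost: time where a subtitle shows without voice, or vice versa."""
    return total(subs) + total(voice) - 2.0 * overlap(subs, voice)


def find_correction(
    subs: List[Span], voice: List[Span], max_cost: float = 1.0, tol: float = 1e-6
) -> Tuple[str, Optional[Tuple[float, float]]]:
    """Try every candidate and return (status, (shift, scale))."""
    first_sub, last_sub = subs[0][0], subs[-1][1]
    best_cost, best_factors = math.inf, None
    for (v_start, _), (_, v_end) in product(voice, voice):
        if v_end <= v_start:
            continue  # the anchors must keep the time order
        # Factors of the linear map t -> scale * t + shift for this candidate.
        scale = (v_end - v_start) / (last_sub - first_sub)
        shift = v_start - scale * first_sub
        corrected = [(scale * s + shift, scale * e + shift) for s, e in subs]
        cost = mismatch_cost(corrected, voice)
        if cost > max_cost:
            cost = math.inf  # too many mismatches: reject this candidate
        if cost < best_cost:
            best_cost, best_factors = cost, (shift, scale)
    if best_factors is None:  # best cost is not less than infinity
        return "no_match", None
    shift, scale = best_factors
    if abs(shift) <= tol and abs(scale - 1.0) <= tol:
        return "in_sync", best_factors
    return "correct", best_factors


voice = [(10.0, 12.0), (15.0, 18.0), (30.0, 33.0)]
subs = [(8.0, 10.0), (13.0, 16.0), (28.0, 31.0)]  # shown 2 s too early
print(find_correction(subs, voice))  # ('correct', (2.0, 1.0))
```

The symmetric cost matters in this sketch: counting only subtitle time without voice would reward a candidate that squeezes every subtitle into a single long voice segment. A linear map covers the constant offset and the linear drift of claim 5; the non-linear drift of claim 6 is outside what this sketch models.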