
System and computerized method for subtitles synchronization of audiovisual content using the human voice detection for synchronization

Patent number
US11445266B2
Publication date
2022-09-13
Applicant
IChannel.IO Ltd. (IL Petah Tikva)
Inventor
Oren Jack Maurice
IPC classification
H04N7/00; H04N21/488; H04N21/43; G10L15/26; G10L25/57
Technical field
subtitle, subtitles, voice, human, audio, correction, s430, segments, analyzer, content
Region: Petah Tikva

Abstract

Audiovisual content in the form of video clip files, streamed or broadcast, may further contain subtitles. Such subtitles are provided with timing information so that each subtitle is displayed synchronously with the spoken words. However, at times the subtitles are offset in time from the audio portion of the audiovisual content, and an offset above a predetermined threshold is bothersome. The system and method determine the time spans in which a human speaks and attempt to synchronize those time spans with the subtitle content. An indication is provided when the synchronization cannot be corrected, as well as when the subtitles and audio are already well synchronized. When an offset exists, the system and method further determine the type of offset (constant or dynamic) and provide the necessary adjustment information, so that the timing provided with the subtitles may be corrected and the synchronization deficiency resolved.

Description


FIG. 4 is an exemplary and non-limiting schematic illustration of a flowchart 400 for detection and correction of subtitle and human voice misalignment according to an embodiment, and FIG. 5 is an exemplary and non-limiting schematic illustration of a flowchart 500 detailing the cost calculation of the method for detection and correction of subtitle and human voice misalignment according to an embodiment. The method begins by obtaining a list of audio voice activity detection (VAD) segments and a list of subtitles with the timing associated with each subtitle (S405). Each segment, in either list, has a start time and an end time, signifying the timing and duration of the segment. The VAD segments are detected using, for example but not by way of limitation, prior-art solutions referenced herein. The method then generates a collection of start/end audio VAD/subtitle offset sets (S410). Each such set points to a specific VAD segment and a specific subtitle (each up to a predefined distance from the start of its list) as a possible start, and to another VAD segment and another subtitle (each up to a predefined distance from the end of its list) as a possible end. These sets cover all the possibilities for start and end, on either list, resulting in X^4 such sets. Thereafter the method initializes the best cost found so far to infinity and iterates over each of these possible sets (S420), calculating the A and B factors for each selected set (S425). This is performed as follows: Af = Ss − As and Bf = (Se − Ss)/(Ae − As), where: Ss is the start time of the selected “start” subtitle in the specific set; Se is the end time of the selected “end” subtitle; As is the start time of the selected “start” audio VAD segment; and Ae is the end time of the selected “end” audio VAD segment. Following that, a new list of corrected subtitle timings is determined as follows: Ss[i] = (Ss[i] − Ss)/Bf − Af + Ss and Se[i] = (Se[i] − Ss)/Bf − Af + Ss. Next (S430), the cost for this set of A and B factors is determined as follows.
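The factor calculation and retiming of S425 can be illustrated with a minimal Python sketch, shown here before turning to the cost calculation. It is not taken from the patent: the `Segment` type and the function names `compute_factors` and `retime` are hypothetical, times are assumed to be in seconds, and the guard against a zero-length span is an added assumption.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """A VAD segment or a subtitle, with times in seconds (assumed unit)."""
    start: float
    end: float


def compute_factors(sub_start, sub_end, vad_start, vad_end):
    """Return (Af, Bf) for one candidate set, per S425.

    Af = Ss - As              (offset between the chosen start points)
    Bf = (Se - Ss)/(Ae - As)  (ratio of subtitle span to audio span)
    """
    audio_span = vad_end.end - vad_start.start
    sub_span = sub_end.end - sub_start.start
    if audio_span == 0 or sub_span == 0:
        # Assumption: degenerate spans are rejected rather than divided by.
        raise ValueError("start and end candidates must span a nonzero time")
    a_f = sub_start.start - vad_start.start
    b_f = sub_span / audio_span
    return a_f, b_f


def retime(subs, anchor, a_f, b_f):
    """Apply t' = (t - Ss)/Bf - Af + Ss to every subtitle start and end.

    `anchor` is Ss, the start time of the selected "start" subtitle.
    """
    def fix(t):
        return (t - anchor) / b_f - a_f + anchor

    return [Segment(fix(s.start), fix(s.end)) for s in subs]


# Subtitles that lag the detected speech by a constant 2 seconds.
vad = [Segment(8, 10), Segment(18, 20), Segment(28, 30)]
subs = [Segment(10, 12), Segment(20, 22), Segment(30, 32)]
a_f, b_f = compute_factors(subs[0], subs[-1], vad[0], vad[-1])
corrected = retime(subs, subs[0].start, a_f, b_f)
```

In this example the factors come out as Af = 2 and Bf = 1, and the corrected subtitles coincide with the VAD segments, which is the constant-offset case.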
In S430-05 the cost accumulator is set to 0, the number of detected mismatches is set to 0, and the pointers into the audio VAD list and the subtitle list are set to 0 (Pa = Ps = 0). The method then loops until both pointers reach the end of their lists, preferably following the rest of the steps described with respect to FIG. 5. The distance between the pointed-to segments is determined by D = |As[Pa] − Ss[Ps]| + |Ae[Pa] − Se[Ps]|. If (S430-10) the pointed-to segments are close enough to count as a match (D <= Dm), but not a perfect match (D > Dp) (S430-15), the distance between them is added to the accumulated cost (S430-20). Thereafter both Pa and Ps are incremented, unless one has reached the end of its list, in which case it is not incremented any further. If the pointed-to segments are close enough to count as a perfect match (D <= Dp) (S430-15), execution continues with S430-25, where both Pa and Ps are incremented, unless one has reached the end of its list, in which case it is not incremented any further. In the case where the distance is too large (D > Dm) (S430-10), the mismatch counter is incremented (S430-30), and the pointer whose segment start time is “further behind” is then incremented (unless that pointer has reached the end of its list, in which case the other one is incremented) (S430-35, S430-40, S430-45). Once both pointers reach the end of their respective lists (S430-50), the number of mismatches is evaluated (S430-55). If it is above a predetermined threshold value, the cost of this set is determined to be infinite (S430-60), meaning the set is not considered a good option. However, if the number of mismatches is below or equal to the predetermined threshold value, the accumulated cost is provided (S430-65). This cost is compared (S435) to the best accumulated cost thus far.
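The two-pointer cost calculation of FIG. 5 can be sketched in Python as follows. This is one illustrative reading of the steps, not the patented implementation: the function name `alignment_cost` and its parameter names are hypothetical, segments are assumed to be `(start, end)` tuples sorted by start time, and the choices to score the final pair once before stopping and to return infinity for an empty list are assumptions the text does not spell out.

```python
import math


def alignment_cost(vad, subs, d_match, d_perfect, max_mismatches):
    """Accumulated cost of pairing VAD segments with (retimed) subtitles.

    d_match (Dm) and d_perfect (Dp) are distance thresholds; max_mismatches
    is the predetermined mismatch threshold.
    """
    if not vad or not subs:
        return math.inf  # assumption: nothing to align counts as no match

    cost = 0.0                               # S430-05
    mismatches = 0
    pa = ps = 0
    last_a, last_s = len(vad) - 1, len(subs) - 1

    while True:
        a_start, a_end = vad[pa]
        s_start, s_end = subs[ps]
        d = abs(a_start - s_start) + abs(a_end - s_end)
        at_end = pa == last_a and ps == last_s   # S430-50

        if d <= d_match:                     # S430-10: counts as a match
            if d > d_perfect:                # S430-15: not a perfect match
                cost += d                    # S430-20
            if at_end:
                break
            # Advance both pointers; one already at its end stays put.
            pa = min(pa + 1, last_a)
            ps = min(ps + 1, last_s)
        else:                                # D > Dm
            mismatches += 1                  # S430-30
            if at_end:
                break
            # Advance the pointer that is "further behind", unless it is
            # already at the end of its list (S430-35 to S430-45).
            if pa == last_a:
                ps += 1
            elif ps == last_s:
                pa += 1
            elif a_start <= s_start:
                pa += 1
            else:
                ps += 1

    # S430-55 to S430-65: too many mismatches makes this set unusable.
    return math.inf if mismatches > max_mismatches else cost


vad = [(0, 1), (5, 6), (10, 11)]
near = [(0, 1), (5.2, 6.1), (10, 11)]       # one imperfect match
sparse = [(0, 1), (10, 11)]                 # one VAD segment has no subtitle
cost_near = alignment_cost(vad, near, 1.0, 0.1, 0)
cost_ok = alignment_cost(vad, sparse, 1.0, 0.1, 1)
cost_bad = alignment_cost(vad, sparse, 1.0, 0.1, 0)
```

Here `cost_near` is about 0.3 (only the imperfect pair contributes), `cost_ok` is 0 because a single mismatch is tolerated, and `cost_bad` is infinite because the same mismatch exceeds a threshold of zero.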
If the cost is lower for this set, the newly determined cost is saved (S440) as the best cost, and the respective A and B factors are saved as the best factors thus far. Once all the sets have been evaluated (S445), three options exist. The first option is that the best cost is still infinity, which means that no good match was found (S450), and the user is informed that the subtitle sync cannot be corrected (S455). The second option is that the best cost is not infinity, the best A factor is 0, and the best B factor is 1 (S460). The user is therefore informed (S470) that the subtitle sync appears to be perfect as-is and no correction is necessary. The third option is that the best cost is not infinity, but the best factors are not both Af = 0 and Bf = 1 (S460). The user is then informed (S465) that the subtitle sync is not good, but can be corrected by applying these factors to the subtitles.
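The outer search (S420 to S445) and the three outcomes (S450 to S470) can be sketched as below. The names `find_best_factors`, `classify`, `pairwise_cost` and the `window` parameter are hypothetical. `pairwise_cost` is a deliberately simplified stand-in for the FIG. 5 cost so that the sketch is self-contained, and comparing the factors to 0 and 1 with a small tolerance is an assumption rather than something the text specifies.

```python
import math
from itertools import product


def pairwise_cost(vad, subs):
    """Simplified stand-in for the FIG. 5 cost (assumption, not the patent's)."""
    if len(vad) != len(subs):
        return math.inf
    return sum(abs(a[0] - s[0]) + abs(a[1] - s[1]) for a, s in zip(vad, subs))


def find_best_factors(vad, subs, cost_fn=pairwise_cost, window=2):
    """Try every start/end candidate set and keep the lowest-cost (Af, Bf).

    `window` is the predefined distance from each list's start and end.
    """
    best_cost, best = math.inf, None
    starts_a = range(min(window, len(vad)))
    ends_a = range(max(len(vad) - window, 0), len(vad))
    starts_s = range(min(window, len(subs)))
    ends_s = range(max(len(subs) - window, 0), len(subs))

    for ia, ja, i_s, j_s in product(starts_a, ends_a, starts_s, ends_s):
        a_s, a_e = vad[ia][0], vad[ja][1]
        s_s, s_e = subs[i_s][0], subs[j_s][1]
        if a_e <= a_s or s_e <= s_s:
            continue  # assumption: skip sets with an empty or inverted span
        a_f = s_s - a_s                      # S425
        b_f = (s_e - s_s) / (a_e - a_s)
        corrected = [((s - s_s) / b_f - a_f + s_s,
                      (e - s_s) / b_f - a_f + s_s) for s, e in subs]
        cost = cost_fn(vad, corrected)       # S430
        if cost < best_cost:                 # S435, S440
            best_cost, best = cost, (a_f, b_f)
    return best_cost, best


def classify(best_cost, best, tol=1e-9):
    """Map the search result to the three outcomes described in the text."""
    if math.isinf(best_cost):
        return "uncorrectable"               # S450, S455
    a_f, b_f = best
    if math.isclose(a_f, 0.0, abs_tol=tol) and math.isclose(b_f, 1.0, abs_tol=tol):
        return "in_sync"                     # S470
    return "correctable"                     # S465


vad = [(8, 10), (18, 20), (28, 30)]
late_subs = [(10, 12), (20, 22), (30, 32)]   # constant 2-second lag
late_cost, late_best = find_best_factors(vad, late_subs)
late_result = classify(late_cost, late_best)
sync_result = classify(*find_best_factors(vad, vad))
```

With these inputs the lagging subtitles are reported as correctable with Af = 2 and Bf = 1, while subtitles identical to the VAD segments are reported as already in sync.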

Claims
