Devices, systems, and methods for distributed voice processing

Patent number

US10867604B2

Publication date

2020-12-15

Applicant

Sonos, Inc. (Santa Barbara, CA, US)

Inventors

Connor Kristopher Smith; John Tolomei; Betty Lee

IPC classification

G10L15/22; G10L15/08; H04R3/00; G10L15/30; H04R1/40

Technical field

playback,wake,nmd,sound,vas,word,voice,device,may,in

Region: Santa Barbara, CA

Abstract

Systems and methods for distributed voice processing are disclosed herein. In one example, the method includes detecting sound via a microphone array of a first playback device and analyzing, via a first wake-word engine of the first playback device, the detected sound. The first playback device may transmit data associated with the detected sound to a second playback device over a local area network. A second wake-word engine of the second playback device may analyze the transmitted data associated with the detected sound. The method may further include identifying that the detected sound contains either a first wake word or a second wake word based on the analysis via the first and second wake-word engines, respectively. Based on the identification, sound data corresponding to the detected sound may be transmitted over a wide area network to a remote computing device associated with a particular voice assistant service.

Description


In general, the detected-sound data form a digital representation (i.e., a sound-data stream), S_DS, of the sound detected by the microphones 222. In practice, the sound-data stream S_DS may take a variety of forms. As one possibility, the sound-data stream S_DS may be composed of frames, each of which may include one or more sound samples. The frames may be streamed (i.e., read out) from the one or more buffers 568 for further processing by downstream components, such as the wake-word engine 570 and the voice extractor 572 of the NMD 503.
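As an illustration only, the framing and read-out described above might be sketched as follows. The function names, the frame size, and the use of plain integers as sound samples are assumptions made for this sketch and do not come from the patent; the two consumers merely stand in for downstream components such as the wake-word engine 570 and the voice extractor 572.

```python
from typing import Callable, Iterable, Iterator, List, Sequence

Frame = List[int]  # hypothetical representation: one frame = a list of samples


def frames_from_samples(samples: Sequence[int], frame_size: int) -> Iterator[Frame]:
    """Group a sequence of sound samples into consecutive frames.

    The final frame may hold fewer than frame_size samples.
    """
    if frame_size <= 0:
        raise ValueError("frame_size must be positive")
    for start in range(0, len(samples), frame_size):
        yield list(samples[start:start + frame_size])


def stream_frames(frames: Iterable[Frame],
                  consumers: Sequence[Callable[[Frame], None]]) -> None:
    """Read frames out in order and hand each one to every downstream consumer."""
    for frame in frames:
        for consume in consumers:
            consume(frame)


# Usage: two lists stand in for the downstream components.
wake_word_input: List[Frame] = []
extractor_input: List[Frame] = []
stream_frames(frames_from_samples(list(range(10)), frame_size=4),
              [wake_word_input.append, extractor_input.append])
```

Each consumer receives the same frames in the same order, which is one simple way to let several components process a single sound-data stream.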

In some implementations, at least one buffer 568 captures detected-sound data utilizing a sliding window approach in which a given amount (i.e., a given window) of the most recently captured detected-sound data is retained as a sound specimen in the at least one buffer 568 while older detected-sound data are overwritten when they fall outside of the window. For example, at least one buffer 568 may temporarily retain 20 frames of a sound specimen at a given time, discard the oldest frame after an expiration time, and then capture a new frame, which is added to the 19 prior frames of the sound specimen.
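A minimal sketch of such a sliding window, assuming a fixed window of 20 frames, is given below. The class and method names are hypothetical. As a simplification, the oldest frame is dropped at the moment a new frame is captured rather than after a separate expiration time.

```python
from collections import deque
from typing import Deque, List


class SlidingWindowBuffer:
    """Retains only the most recently captured frames (hypothetical sketch)."""

    def __init__(self, window_frames: int = 20) -> None:
        # A deque with maxlen silently discards the oldest item
        # whenever a new item is appended to a full buffer.
        self._frames: Deque[object] = deque(maxlen=window_frames)

    def capture(self, frame: object) -> None:
        """Add a new frame; the oldest frame falls out of a full window."""
        self._frames.append(frame)

    def specimen(self) -> List[object]:
        """Return the retained sound specimen, oldest frame first."""
        return list(self._frames)


# Usage: after 21 captures, frame 0 has been overwritten and the
# buffer holds the 19 prior frames plus the newest one.
buf = SlidingWindowBuffer(window_frames=20)
for frame_id in range(21):
    buf.capture(frame_id)
```

A bounded deque keeps memory use constant regardless of how long sound is captured, which matches the intent of overwriting data that fall outside the window.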
