Moreover, according to this embodiment, each of the plurality of devices 20 may transmit, to the server apparatus 10, recording state information indicating the recording state at the time the user's speech was recorded, and the server apparatus 10 may, when performing voice recognition on the plurality of recorded data, apply weighting according to the recording state information to interpret the contents of the user's speech. Performing weighting according to the recording state makes it possible to further increase the accuracy of the voice recognition.
Moreover, according to this embodiment, the recording state information may include one or more of the recording level, the noise level, and the influence of echoes. This makes it possible to perform the voice recognition in consideration of the recording level, the noise level, or the influence of echoes at the time of recording.
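The weighting described above can be illustrated with a minimal sketch. The embodiment does not specify a weighting formula, so everything below is an assumption made for illustration: the names `Recording`, `recording_weight`, and `interpret_speech` are hypothetical, the three recording state values are assumed to be normalized to the range 0.0 to 1.0, the weight is assumed to be a simple product, and voting is done over whole transcripts rather than over individual words.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Recording:
    """One device's recorded data plus its recording state (hypothetical layout)."""
    device_id: str
    transcript: str         # recognition result for this device's recording
    recording_level: float  # assumed scale: 0.0 (inaudible) to 1.0 (ideal)
    noise_level: float      # assumed scale: 0.0 (quiet) to 1.0 (very noisy)
    echo_influence: float   # assumed scale: 0.0 (none) to 1.0 (strong)


def recording_weight(rec: Recording) -> float:
    """Assumed weight: a higher level and lower noise and echo weigh more."""
    return rec.recording_level * (1.0 - rec.noise_level) * (1.0 - rec.echo_influence)


def interpret_speech(recordings: list[Recording]) -> str:
    """Return the transcript whose supporting recordings have the largest total weight."""
    if not recordings:
        raise ValueError("at least one recording is required")
    totals: defaultdict[str, float] = defaultdict(float)
    for rec in recordings:
        totals[rec.transcript] += recording_weight(rec)
    return max(totals, key=totals.get)


recordings = [
    Recording("device-a", "turn on the light", 0.9, 0.1, 0.1),
    Recording("device-b", "turn on the night", 0.5, 0.6, 0.2),
    Recording("device-c", "turn on the night", 0.4, 0.5, 0.3),
]
result = interpret_speech(recordings)
```

In this sketch a single recording made under good conditions can outweigh two recordings made under poor conditions, which is the intended effect of weighting by recording state rather than taking a simple majority.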
Moreover, according to this embodiment, the server apparatus 10 may select, from among the plurality of devices 20, the device that outputs the voice assistant according to a predetermined priority. This makes it possible to select a device suitable as the output destination of the voice assistant.
Moreover, according to this embodiment, the predetermined priority may be determined based on one or more of the usage state of the device (in use or not in use), the output unit used in the device, the distance between the device and the user, and the performance of the output unit of the device. This makes it possible to output the voice assistant from a device that is more preferable for the user.
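The selection of the output destination can likewise be sketched as a scoring function over the four factors listed above. The embodiment does not define how the factors are combined, so the following is only one possible arrangement under stated assumptions: the names `Device`, `OUTPUT_UNIT_SCORE`, `priority`, and `select_output_device` are hypothetical, the factors are assumed to be summed with equal weight, a device in use is assumed to be preferred over one not in use (the opposite policy could equally be chosen), and the per-unit scores are arbitrary illustrative values.

```python
from dataclasses import dataclass


@dataclass
class Device:
    """Candidate output destination (hypothetical layout)."""
    device_id: str
    in_use: bool           # usage state of the device
    output_unit: str       # output unit used in the device, e.g. "speaker"
    distance_m: float      # distance between the device and the user, in meters
    output_quality: float  # performance of the output unit, assumed 0.0 to 1.0


# Assumed, illustrative preference among output units.
OUTPUT_UNIT_SCORE = {"speaker": 1.0, "headphones": 0.8, "display": 0.5}


def priority(dev: Device) -> float:
    """Assumed priority: equal-weight sum of the four factors."""
    usage = 1.0 if dev.in_use else 0.0
    unit = OUTPUT_UNIT_SCORE.get(dev.output_unit, 0.0)
    proximity = 1.0 / (1.0 + dev.distance_m)  # nearer devices score higher
    return usage + unit + proximity + dev.output_quality


def select_output_device(devices: list[Device]) -> Device:
    """Return the device with the highest priority."""
    if not devices:
        raise ValueError("at least one device is required")
    return max(devices, key=priority)


devices = [
    Device("tv", in_use=False, output_unit="speaker", distance_m=4.0, output_quality=0.9),
    Device("phone", in_use=True, output_unit="speaker", distance_m=0.5, output_quality=0.4),
]
selected = select_output_device(devices)
```

A summed score keeps each factor independently adjustable; a strict lexicographic ordering of the factors would be an alternative if one factor should always dominate the others.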
Hardware Configuration Example