In order to solve the above-described problems and achieve the object, a voice assistant system according to a first aspect of some embodiments of the present invention has a server apparatus performing voice assistant and a plurality of devices, in which the server apparatus and the devices are communicatively connected to each other, each of the plurality of devices records the same user's speech through a microphone and then transmits recorded data of the same user's speech to the server apparatus, and the server apparatus receives the recorded data transmitted from each of the plurality of devices and then voice-recognizes two or more pieces of the received recorded data in accordance with a predetermined standard, thereby interpreting the contents of the user's speech to perform the voice assistant.
The plurality of devices may start recording the user's speech after a predetermined verbal start command from the user is input through the microphone.
Each of the plurality of devices may further transmit, to the server apparatus, recording state information indicating a recording state at the time of recording the user's speech, and the server apparatus may interpret the contents of the user's speech while performing weighting according to the recording state information when voice-recognizing the two or more pieces of the received recorded data.
The recording state information may include at least one of a recording level, a noise level, and an echo. The recording state information may include all of a recording level, a noise level, and an echo.
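The weighting according to the recording state information may be illustrated, purely by way of example, as follows. This is a minimal sketch under stated assumptions and is not the claimed implementation: the names Recording, weight, and interpret are hypothetical, the recording level, noise level, and echo are assumed to be normalized to the range 0.0 to 1.0, the weighting formula is an arbitrary illustrative choice, and each transcript is assumed to have already been produced by voice-recognizing the recorded data of one device.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Recording:
    """Recorded data from one device plus its recording state information."""
    device_id: str
    transcript: str         # assumed result of voice-recognizing this device's data
    recording_level: float  # assumed normalized to 0.0-1.0 (higher is better)
    noise_level: float      # assumed normalized to 0.0-1.0 (lower is better)
    echo: float             # assumed normalized to 0.0-1.0 (lower is better)


def weight(rec: Recording) -> float:
    """Illustrative weight: favors a high recording level, low noise, low echo."""
    return rec.recording_level * (1.0 - rec.noise_level) * (1.0 - rec.echo)


def interpret(recordings: list[Recording]) -> str:
    """Return the transcript with the largest total weight across devices."""
    if len(recordings) < 2:
        raise ValueError("two or more pieces of recorded data are required")
    scores: dict[str, float] = defaultdict(float)
    for rec in recordings:
        scores[rec.transcript] += weight(rec)
    return max(scores, key=scores.get)
```

In this sketch, a device that recorded the speech at a high level with little noise and echo contributes more to the interpretation than a device that recorded under poor conditions, so a misrecognition originating from a poorly placed device is less likely to prevail.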
The server apparatus may further select, according to a predetermined priority, a device that outputs the voice assistant from among the plurality of devices.
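The selection of the output device according to a predetermined priority may likewise be sketched as follows. The sketch is illustrative only: the priority table PRIORITY, the device kinds listed in it, and the names Device and select_output_device are assumptions introduced for this example and are not specified above.

```python
from dataclasses import dataclass

# Hypothetical predetermined priority: a lower number means a more preferred
# output device. The kinds listed here are assumptions for illustration.
PRIORITY = {"smart_speaker": 0, "smartphone": 1, "laptop": 2}


@dataclass
class Device:
    device_id: str
    kind: str


def select_output_device(devices: list[Device]) -> Device:
    """Pick the device with the best (lowest) priority; unknown kinds rank last."""
    if not devices:
        raise ValueError("at least one device is required")
    return min(devices, key=lambda d: PRIORITY.get(d.kind, len(PRIORITY)))
```

With such a table, only the selected device outputs the voice assistant even though every device transmitted recorded data of the same speech.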