When voice-recognizing the plurality of recorded data of the user's speech sent out from the plurality of devices 20, the voice recognition module 12 may interpret the contents of the user's speech while weighting each piece of recorded data according to recording state information indicating the recording state at the time the user's speech was recorded.
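By way of a non-limiting illustration, the weighting may be sketched as a weighted vote over the transcripts obtained from each device. The description does not fix a particular weighting method, so the function name `interpret_speech`, the representation of the recording state as a single numeric weight, and the voting scheme itself are assumptions made only for this example.

```python
from collections import defaultdict


def interpret_speech(recordings):
    """Pick the transcript with the largest total weight.

    `recordings` is a list of (transcript, weight) pairs, one per device.
    The weight is a hypothetical score derived from the recording state
    information (assumed here: higher means a better recording).
    """
    scores = defaultdict(float)
    for transcript, weight in recordings:
        # Identical transcripts from different devices reinforce each other.
        scores[transcript] += weight
    return max(scores, key=scores.get)


# Hypothetical recognition results for one utterance recorded by three devices.
recordings = [
    ("what is the weather today", 0.9),   # good recording state
    ("what is the whether to day", 0.3),  # poor recording state
    ("what is the weather today", 0.5),
]
best = interpret_speech(recordings)
```

In this sketch a single well-recorded transcript may outweigh several poorly recorded ones, which is one possible reading of weighting according to the recording state.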
The voice assistant module 13 may select, from among the plurality of devices 20, the device that outputs the voice assistant according to a predetermined priority. The predetermined priority may be determined based on one or more of: whether the device 20 is in use or not in use, the type of the output unit used in the device 20, the distance between the device 20 and the user, and the performance of the output unit of the device 20.
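As one possible illustration of selection according to a predetermined priority, the four criteria may be combined into a sort key. The order in which the criteria are applied, the ranking of output unit types in `OUTPUT_TYPE_RANK`, and the field names of the `Device` class are all assumptions of this sketch and are not prescribed by the embodiment.

```python
from dataclasses import dataclass


@dataclass
class Device:
    name: str
    in_use: bool          # whether the device is currently in use
    output_type: str      # type of the output unit
    distance_m: float     # distance between the device and the user
    output_score: float   # hypothetical performance score of the output unit


# Assumed ranking of output unit types (lower value = higher priority).
OUTPUT_TYPE_RANK = {"headphones": 0, "speaker": 1, "display": 2}


def priority_key(device):
    """Build a tuple so that the smallest key is the preferred device."""
    return (
        0 if device.in_use else 1,                        # in-use devices first
        OUTPUT_TYPE_RANK.get(device.output_type, len(OUTPUT_TYPE_RANK)),
        device.distance_m,                                # closer is better
        -device.output_score,                             # higher score is better
    )


def select_output_device(devices):
    return min(devices, key=priority_key)


devices = [
    Device("laptop", False, "speaker", 0.5, 0.6),
    Device("smartphone", True, "speaker", 1.0, 0.4),
    Device("tablet", True, "display", 0.3, 0.8),
]
chosen = select_output_device(devices)
```

Because tuples compare element by element, later criteria in this sketch act only as tie-breakers for earlier ones; a weighted sum of the criteria would be an equally valid alternative.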
The storage module 14 may have a device table 14a in which the names of users utilizing the voice assistant and device information on the plurality of devices used by those users are registered in association with each other. The device information may include a device name, a model name, an IP address, and the type and specification of the mounted output unit (for example, in the case of a speaker, output sound pressure level, frequency characteristics, crossover frequency, input impedance, allowable input, and the like, and, in the case of a display, screen size, resolution, and the like). In this embodiment, the users utilizing the voice assistant of the server apparatus 10 and the device information therefor are registered in the device table 14a beforehand. The voice assistant module 13 may register the user names and the device information in the device table 14a in response to a request from the devices 20.
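The device table 14a may be pictured, purely as an illustrative sketch, as a mapping from user names to the device information of the devices each user uses. The class names `DeviceInfo` and `DeviceTable`, the method names, and the example values (including the documentation-range IP addresses) are hypothetical and are not part of the embodiment.

```python
from dataclasses import dataclass, field


@dataclass
class DeviceInfo:
    device_name: str
    model_name: str
    ip_address: str
    output_type: str                                  # e.g. "speaker" or "display"
    output_spec: dict = field(default_factory=dict)   # specification of the output unit


class DeviceTable:
    """Associates each user name with the devices that user uses."""

    def __init__(self):
        self._rows = {}  # user name -> {device name -> DeviceInfo}

    def register(self, user_name, info):
        # Registering the same device name again replaces the earlier entry.
        self._rows.setdefault(user_name, {})[info.device_name] = info

    def devices_for(self, user_name):
        return list(self._rows.get(user_name, {}).values())


table = DeviceTable()
table.register("user_a", DeviceInfo(
    "living-room-speaker", "model-x", "192.0.2.10", "speaker",
    {"output_sound_pressure_level_db": 88, "input_impedance_ohm": 8},
))
table.register("user_a", DeviceInfo(
    "tablet", "model-y", "192.0.2.11", "display",
    {"screen_size_inch": 10.1, "resolution": "1920x1200"},
))
```

A registration request from a device 20 would, under these assumptions, translate into a single call to `register`, and the selection of an output device could start from `devices_for` for the recognized user.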