In this embodiment, the first CNN 31 and the second CNN 32 are trained so as to output a discrimination result R1 of the infarction region for each pixel included in the input CT image Bc1. The training data is a data set of a large number of CT images Bc0 of the brain including an infarction region, together with the infarction region A1 specified in each of the CT images Bc0. In a case in which the CT image Bc1 is input to the input layer 31a of the first CNN 31, the feature amount map output from each processing layer of the first CNN 31 and the second CNN 32 is sequentially input to the processing layer in the next stage, and the discrimination result R1 of the infarction region for each pixel of the CT image Bc1 is output from the output layer 32b of the second CNN 32. The discrimination result R1 output from the second CNN 32 is the result of discriminating whether each pixel of the CT image Bc1 belongs to the infarction region or to a region other than the infarction region.
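The data flow described above, in which each processing layer passes its feature amount map to the next stage and the final layer yields a per-pixel discrimination, can be illustrated with the following minimal sketch. It is not the implementation of the embodiment: the kernels `SMOOTH` and `CENTER`, the bias value, the threshold, and the function names `conv3x3`, `first_cnn`, `second_cnn`, and `discriminate` are all hypothetical, and the fixed weights stand in for weights that would be obtained by training. The operation is the cross-correlation commonly called convolution in CNNs.

```python
import math

# Hypothetical fixed 3x3 kernels; in practice the weights would be learned.
SMOOTH = [[1 / 9.0] * 3 for _ in range(3)]
CENTER = [[0.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 0.0]]


def conv3x3(feature_map, kernel, bias=0.0, relu=True):
    """Zero-padded 3x3 convolution over a 2D list, optionally followed by ReLU."""
    h, w = len(feature_map), len(feature_map[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = bias
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    yy, xx = y + dy, x + dx
                    if 0 <= yy < h and 0 <= xx < w:
                        acc += feature_map[yy][xx] * kernel[dy + 1][dx + 1]
            out[y][x] = max(acc, 0.0) if relu else acc
    return out


def first_cnn(ct_image):
    """Stand-in for the first CNN 31: two processing layers applied in sequence."""
    fmap = conv3x3(ct_image, SMOOTH)   # feature amount map of the first layer
    return conv3x3(fmap, CENTER)       # passed on to the next-stage layer


def second_cnn(feature_map):
    """Stand-in for the second CNN 32: per-pixel probability of infarction."""
    logits = conv3x3(feature_map, CENTER, bias=-0.5, relu=False)
    return [[1.0 / (1.0 + math.exp(-v)) for v in row] for row in logits]


def discriminate(ct_image, threshold=0.5):
    """Return R1 as a mask: 1 = infarction region, 0 = any other region."""
    probs = second_cnn(first_cnn(ct_image))
    return [[1 if p >= threshold else 0 for p in row] for row in probs]


if __name__ == "__main__":
    bc1 = [[1.0] * 5 for _ in range(5)]  # toy stand-in for the CT image Bc1
    for row in discriminate(bc1):
        print(row)
```

The output mask has the same height and width as the input, which mirrors the statement that the discrimination result R1 is produced for each pixel of the CT image Bc1.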
In addition, the first CNN 31 and the third CNN 33 are trained so as to output, as a discrimination result R2, the MR estimated image Dm1, which is an estimated image of the MR image obtained by capturing an image of the same subject as that in the input CT image Bc1. The training data is a data set of a large number of image sets, each obtained by registration between a CT image Bc0 of the brain and an MR image Bm0 obtained by capturing an image of the same subject as that in the CT image Bc0.
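Because the first CNN 31 is shared by both training tasks, one possible way to express the two training objectives is sketched below. The text does not specify the loss functions or how the two tasks are combined, so everything here is an assumption: per-pixel binary cross-entropy is assumed for the discrimination result R1 against the infarction region A1, mean absolute error is assumed for the MR estimated image Dm1 against the registered MR image Bm0, and the function names and the `mr_weight` parameter are hypothetical.

```python
import math


def binary_cross_entropy(pred_probs, infarction_mask, eps=1e-7):
    """Mean per-pixel cross-entropy between predicted R1 and the mask for A1."""
    total, count = 0.0, 0
    for p_row, t_row in zip(pred_probs, infarction_mask):
        for p, t in zip(p_row, t_row):
            p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
            total += -(t * math.log(p) + (1 - t) * math.log(1.0 - p))
            count += 1
    return total / count


def mean_absolute_error(estimated_mr, registered_mr):
    """Mean per-pixel absolute difference between Dm1 and the registered Bm0."""
    total, count = 0.0, 0
    for e_row, m_row in zip(estimated_mr, registered_mr):
        for e, m in zip(e_row, m_row):
            total += abs(e - m)
            count += 1
    return total / count


def joint_loss(pred_probs, infarction_mask, estimated_mr, registered_mr,
               mr_weight=1.0):
    """Assumed combined objective: segmentation loss plus weighted MR loss."""
    seg = binary_cross_entropy(pred_probs, infarction_mask)
    mr = mean_absolute_error(estimated_mr, registered_mr)
    return seg + mr_weight * mr


if __name__ == "__main__":
    r1 = [[0.9, 0.2], [0.1, 0.8]]    # toy predicted infarction probabilities
    a1 = [[1, 0], [0, 1]]            # toy infarction mask
    dm1 = [[0.5, 0.4], [0.3, 0.2]]   # toy MR estimated image
    bm0 = [[0.6, 0.4], [0.2, 0.2]]   # toy registered MR image
    print(joint_loss(r1, a1, dm1, bm0))
```

The pixel-wise comparison between Dm1 and Bm0 is meaningful only because each image set is obtained by registration, so that corresponding pixels of the CT image Bc0 and the MR image Bm0 refer to the same anatomical position.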