In one set of embodiments the method comprises (and the processing circuitry is configured to) forming one or more point clouds using the determined three-dimensional position(s) of one or more identified and matched features, e.g. using the depth maps created, for each pair of video cameras and/or sensors in the array, e.g. each pair between which identified features have been matched. These initial “sparse” point cloud(s) may not contain many data points, e.g. owing to them representing only a single or a few identified features. However, such point cloud(s) may act as a helpful guide for the creation of denser and more accurate point cloud(s).
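By way of illustration only, the formation of a sparse point cloud for one pair of cameras might be sketched as follows. The sketch assumes a rectified pair of pinhole cameras, so that depth follows from the horizontal disparity of a matched feature; this camera model, and the names `StereoPair`, `Match` and `sparse_point_cloud`, are hypothetical and are not taken from the embodiments described.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class StereoPair:
    """Assumed calibration of one rectified camera pair (hypothetical model)."""
    focal_px: float    # focal length, in pixels
    baseline_m: float  # distance between the two camera centres, in metres
    cx: float          # principal point, x (pixels)
    cy: float          # principal point, y (pixels)


# One matched feature: (u_left, v_left, u_right), all in pixels.
Match = Tuple[float, float, float]
Point3D = Tuple[float, float, float]


def sparse_point_cloud(pair: StereoPair, matches: List[Match]) -> List[Point3D]:
    """Back-project each matched feature to a 3D point in the left camera frame."""
    cloud: List[Point3D] = []
    for u_left, v_left, u_right in matches:
        disparity = u_left - u_right
        if disparity <= 0:
            # A non-positive disparity cannot give a valid depth; skip the match.
            continue
        z = pair.focal_px * pair.baseline_m / disparity
        x = (u_left - pair.cx) * z / pair.focal_px
        y = (v_left - pair.cy) * z / pair.focal_px
        cloud.append((x, y, z))
    return cloud
```

With only one or a few matched features per pair, the returned list is correspondingly short, which is the sense in which the initial point cloud is sparse.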
Preferably the information from the point cloud(s) (e.g. the location of the identified feature(s)) is used in an iterative process to re-analyse the identified and matched feature(s). For example, the positions in the point cloud(s) may be tested against (e.g. the determined positions of) one or more of the identified features (whether matched or not) to determine whether they have been correctly matched. This may be used to change the matching of identified features from different video camera(s) and/or sensor(s) and/or to refine the position of the identified and matched feature(s) in the point cloud(s). The number of iterations used may depend on the precision desired and/or on the processing time available.
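By way of illustration only, one possible form of such an iterative process is sketched below. It assumes that each identified feature has a candidate three-dimensional position, that a feature is treated as correctly matched when it lies within a distance `tolerance` of its nearest point in the point cloud, and that each point is refined to the mean of the features accepted for it. The acceptance test, the averaging step, and the names `refine_point_cloud`, `tolerance`, `precision`, `max_iterations` and `time_budget_s` are all hypothetical choices made for the sketch.

```python
import math
import time
from typing import List, Tuple

Point3D = Tuple[float, float, float]


def refine_point_cloud(
    cloud: List[Point3D],
    candidates: List[Point3D],
    tolerance: float,
    precision: float,
    max_iterations: int = 10,
    time_budget_s: float = 1.0,
) -> Tuple[List[Point3D], List[bool]]:
    """Return the refined cloud and, per candidate, whether it was accepted."""
    points = list(cloud)
    if not points:
        return [], [False] * len(candidates)

    deadline = time.monotonic() + time_budget_s
    accepted = [False] * len(candidates)
    for _ in range(max_iterations):
        # Test every candidate feature against the current point cloud.
        groups: List[List[Point3D]] = [[] for _ in points]
        accepted = []
        for cand in candidates:
            nearest = min(range(len(points)), key=lambda i: math.dist(points[i], cand))
            ok = math.dist(points[nearest], cand) <= tolerance
            accepted.append(ok)
            if ok:
                groups[nearest].append(cand)

        # Refine each point to the mean of the candidates accepted for it.
        largest_shift = 0.0
        for i, group in enumerate(groups):
            if not group:
                continue
            mean = tuple(sum(axis) / len(group) for axis in zip(*group))
            largest_shift = max(largest_shift, math.dist(points[i], mean))
            points[i] = mean

        # Stop once the desired precision is reached or the time runs out.
        if largest_shift < precision or time.monotonic() >= deadline:
            break
    return points, accepted
```

Here `precision` and `time_budget_s` stand in for the precision desired and the processing time available, so either may end the iteration before `max_iterations` is reached.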