This process corresponds to structured subsampling of the point cloud data: the feature vector stores k points from the original point cloud data, namely the data points closest to the selected basis points in the basis point set. Thus, other information about the data points (e.g., Red-Green-Blue (RGB) values) can be saved as part of the fixed representation. However, the disclosure is not limited thereto and in some examples, the system 100 may only include distance values (e.g., Euclidean distances) in the feature vector without departing from the disclosure. While the above examples are described with regard to Euclidean distances, the disclosure is not limited thereto and other distance metrics may be used without departing from the disclosure. In addition, to identify the nearest data point in the point cloud data, the system 100 may use data structures such as ball trees to perform a nearest neighbor search without departing from the disclosure.
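As a non-limiting illustration of the encoding described above, the following minimal sketch finds, for each basis point, the nearest data point and stores its distance together with its RGB value. The function name `encode_with_basis_points`, the dictionary layout of the data points, and the sample coordinates are assumptions made for illustration only and do not appear in the disclosure. The sketch uses a brute-force search for clarity; a ball tree or similar structure could replace it.

```python
import math

def encode_with_basis_points(cloud, basis, metric=math.dist):
    """For each basis point, return (distance, rgb) of its nearest cloud point.

    `cloud` is a list of dicts with hypothetical keys "xyz" and "rgb";
    `basis` is a list of coordinate tuples; `metric` is any distance
    function taking two points (Euclidean by default).
    """
    features = []
    for b in basis:
        # Brute-force nearest neighbor search over the whole cloud.
        nearest = min(cloud, key=lambda p: metric(b, p["xyz"]))
        features.append((metric(b, nearest["xyz"]), nearest["rgb"]))
    return features

# Hypothetical three-point cloud with per-point color.
cloud = [
    {"xyz": (0.0, 0.0, 0.0), "rgb": (255, 0, 0)},
    {"xyz": (1.0, 0.0, 0.0), "rgb": (0, 255, 0)},
    {"xyz": (0.0, 2.0, 0.0), "rgb": (0, 0, 255)},
]
basis = [(0.9, 0.0, 0.0), (0.0, 1.5, 0.0)]

features = encode_with_basis_points(cloud, basis)
for distance, rgb in features:
    print(round(distance, 3), rgb)
```

Passing a different callable as `metric` (e.g., a Manhattan distance function) changes the metric without changing the length or layout of the resulting feature vector.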
While the basis point set includes fewer basis points (e.g., k data points) than the point cloud data (e.g., n data points), the distance data 116 represents or encodes the point cloud data with enough fidelity that the system 100 may accurately capture details and perform surface reconstruction of the point cloud data. Because the distance data 116 is represented by a fixed-length feature vector and the basis points are ordered to give a notion of neighborhood, the distance data 116 can be processed efficiently by a trained model.
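The fixed-length property can be illustrated with a short sketch, under the assumption that the same k basis points are reused for every input. The helper name `distance_feature_vector`, the value k = 8, and the randomly generated point clouds are hypothetical and chosen only for illustration. Two point clouds with different numbers of data points (n = 50 and n = 500) yield feature vectors of the same length k.

```python
import math
import random

def distance_feature_vector(cloud, basis):
    """Return one nearest-point Euclidean distance per basis point, in basis order."""
    return [min(math.dist(b, p) for p in cloud) for b in basis]

def random_points(rng, count):
    """Draw `count` points uniformly from the cube [-1, 1]^3."""
    return [tuple(rng.uniform(-1.0, 1.0) for _ in range(3)) for _ in range(count)]

rng = random.Random(0)
k = 8  # hypothetical number of basis points
basis = random_points(rng, k)

small_cloud = random_points(rng, 50)   # n = 50 data points
large_cloud = random_points(rng, 500)  # n = 500 data points

vec_small = distance_feature_vector(small_cloud, basis)
vec_large = distance_feature_vector(large_cloud, basis)

# Both vectors have length k regardless of n.
print(len(vec_small), len(vec_large))
```

Because the basis order is fixed, index i of every feature vector refers to the same basis point, which is what allows a downstream model to consume the vectors as ordinary fixed-size inputs.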