Data Representation
In Chapter 10, we described an algorithm for data representation based on the idea of an optimal manifold, due to Chigirev and Bialek (2005). Given a set of unlabeled data as input, the algorithm produces two results:
• a set of manifold points, around which the input data are clustered;
• a stochastic map, which projects the input data onto the manifold.
Using the idea of the Grassberger-Procaccia correlation dimension described in Section 13.10, outline an experiment for validating the Chigirev-Bialek algorithm as a possible estimator of manifold-dimensional complexity.
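One ingredient such an experiment would likely need is a numerical estimate of the correlation dimension itself. The sketch below is a minimal illustration, not a prescribed solution: it computes the correlation sum C(r), the fraction of point pairs closer than r, and takes the least-squares slope of log C(r) versus log r as the dimension estimate. The function names, the choice of radii, and the synthetic test sets (a line and a flat sheet embedded in three dimensions) are all assumptions made for illustration. In an actual experiment, the manifold points produced by the Chigirev-Bialek algorithm would take the place of the synthetic points, and the estimate would be compared against the known intrinsic dimension of the data-generating process.

```python
import bisect
import math
import random


def pairwise_distances(points):
    """Return the sorted Euclidean distances over all distinct point pairs."""
    dists = []
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            dists.append(math.dist(points[i], points[j]))
    dists.sort()
    return dists


def correlation_dimension(points, radii):
    """Estimate the correlation dimension as the slope of log C(r) vs. log r.

    C(r) is the fraction of distinct pairs whose distance is less than r.
    The radii are assumed to lie in the scaling region; choosing them well
    is part of the experiment and is not automated here.
    """
    dists = pairwise_distances(points)
    n_pairs = len(dists)
    log_r, log_c = [], []
    for r in radii:
        count = bisect.bisect_left(dists, r)  # number of distances < r
        if count > 0:  # skip radii with an empty correlation sum
            log_r.append(math.log(r))
            log_c.append(math.log(count / n_pairs))
    # Ordinary least-squares slope of log C(r) against log r.
    mean_x = sum(log_r) / len(log_r)
    mean_y = sum(log_c) / len(log_c)
    sxx = sum((x - mean_x) ** 2 for x in log_r)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(log_r, log_c))
    return sxy / sxx


# Hypothetical stand-ins for the manifold points returned by the algorithm:
# a 1-D line and a 2-D sheet, each embedded in 3-D space.
rng = random.Random(0)
line_points = [(rng.random(), 0.0, 0.0) for _ in range(500)]
sheet_points = [(rng.random(), rng.random(), 0.0) for _ in range(500)]

# Eight radii spaced geometrically between 0.02 and 0.1 (an assumed range).
radii = [0.02 * 5.0 ** (k / 7) for k in range(8)]

dim_line = correlation_dimension(line_points, radii)
dim_sheet = correlation_dimension(sheet_points, radii)
print("estimated dimension, line :", dim_line)
print("estimated dimension, sheet:", dim_sheet)
```

Because the point sets are finite and bounded, the estimates tend to fall slightly below the true dimensions of 1 and 2. The same bias should be expected when the estimator is applied to the manifold points of the algorithm, so the experiment should repeat the comparison over several sample sizes and radius ranges.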