Data sets

Most of the data sets used in my recent papers are available from the LEAR data webpage.

Data for non-linear dimensionality reduction

If we observe a system with only a few degrees of freedom using a high-dimensional sensor (e.g. a camera measuring thousands of dimensions given by the pixel-color intensities), and we assume that the mapping from the state of the system (e.g. the direction in which a face is looking) to our measurement of it (the image we take of the face) is continuous, then our measurements will lie on some low-dimensional subspace of the sensor space. This subspace may be non-linear, which renders linear feature extraction methods (e.g. Principal Component Analysis, Independent Component Analysis) unfit to extract it.
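A small synthetic illustration of this point, not taken from the data sets below: a single degree of freedom (the position of a bump moving across a hypothetical 100-pixel sensor) is mapped continuously to pixel intensities. The sensor size, bump width, and 95% variance threshold are all assumptions chosen for the sketch. Although the state is one-dimensional, PCA needs more than one component to capture most of the variance.

```python
import numpy as np

# One degree of freedom: the position of a bump moving across a 1-D "camera".
# All sizes below are illustrative assumptions.
n_samples, n_pixels = 200, 100
positions = np.linspace(20, 80, n_samples)
pixels = np.arange(n_pixels)

# Continuous mapping from the state (position) to the measurement (intensities).
images = np.exp(-0.5 * ((pixels[None, :] - positions[:, None]) / 3.0) ** 2)

# PCA via SVD of the mean-subtracted data; squared singular values are
# proportional to the variances along the principal directions.
centered = images - images.mean(axis=0)
s = np.linalg.svd(centered, compute_uv=False)
explained = np.cumsum(s**2) / np.sum(s**2)

# Number of linear components needed to explain 95% of the variance.
n_components_95 = int(np.searchsorted(explained, 0.95) + 1)
print("components needed for 95% of the variance:", n_components_95)
```

The measurements trace out a one-dimensional curve in the 100-dimensional sensor space, but because that curve is bent, no single linear direction describes it.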

Two of the data sets used in my papers are available via HTTP.

Both sets are MATLAB files containing the following variables:

IMS: rows of 1600 columns, each containing a 40x40 pixel gray-value image (the mean image is subtracted!)
MEAN: the mean image
PCA: the PCA projection of the images
varX: the variances of the individual pixels
Evals: the leading eigenvalues of the covariance matrix of IMS
Evecs: the corresponding leading eigenvectors of the covariance matrix of IMS, used to map from PCA to IMS
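
A minimal sketch of how these variables relate to each other, written in Python with NumPy. The array orientations are assumptions, not something stated above: Evecs is taken to hold one eigenvector per column (1600 x k), PCA one projected image per row (n x k), and MEAN a vector of 1600 values. The helper name reconstruct_images is hypothetical. The sketch uses random stand-in data with the same layout rather than the real files; the real variables could be read with scipy.io.loadmat, which returns a dictionary keyed by variable name.

```python
import numpy as np

def reconstruct_images(pca, evecs, mean):
    """Map PCA coordinates back to pixel space and add the mean image back.

    Assumed shapes: pca (n, k), evecs (1600, k), mean (1600,) or (1, 1600).
    """
    centered = pca @ evecs.T           # approximates IMS (mean-subtracted)
    return centered + np.ravel(mean)   # full gray-value images, one per row

# Random stand-in data laid out like the variables described above.
rng = np.random.default_rng(0)
n, d, k = 50, 1600, 10                 # 50 images of 40x40 pixels, 10 components
raw = rng.normal(size=(n, d))

MEAN = raw.mean(axis=0)                # mean image
IMS = raw - MEAN                       # mean-subtracted images, one per row
varX = IMS.var(axis=0, ddof=1)         # per-pixel variances

cov = np.cov(IMS, rowvar=False)        # 1600 x 1600 covariance matrix of IMS
all_evals, all_evecs = np.linalg.eigh(cov)
order = np.argsort(all_evals)[::-1][:k]
Evals = all_evals[order]               # leading eigenvalues
Evecs = all_evecs[:, order]            # corresponding eigenvectors (columns)
PCA = IMS @ Evecs                      # PCA projection of the images

recon = reconstruct_images(PCA, Evecs, MEAN)
first_image = recon[0].reshape(40, 40) # back to a 40x40 image
```

Under these assumptions, PCA @ Evecs.T recovers IMS only approximately, since only the leading eigenvectors are kept, and the per-pixel variances in varX sum to the trace of the covariance matrix.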

I used these data sets in several papers, including 2006 papers in IEEE PAMI and Pattern Recognition.