ANR Project MACARON: large-scale machine learning and applications
Statistical modeling requires representing measurements of a physical phenomenon as computationally manageable data before learning a model that fits the observations. Recently, models involving a large number of parameters have achieved significant success in solving difficult prediction tasks. Unfortunately, using general huge-dimensional models to tackle scientific and technological problems raises new methodological challenges: (i) exploiting the prediction capabilities of huge-dimensional models often goes hand in hand with using a large amount of training data, and computational techniques that scale in both model and data size remain to be developed; (ii) huge-dimensional models are hard to visualize and interpret, which is problematic whenever understanding these models is important, e.g., in experimental sciences.
The project MACARON is an endeavor to develop new mathematical and algorithmic tools for addressing the above challenges. Our ultimate goal is to use data to solve scientific problems and to automatically convert data into scientific knowledge using machine learning techniques. Our project therefore has two axes: a methodological one, and an applied one driven by explicit problems. The methodological axis addresses the limitations of current machine learning methods in simultaneously dealing with large-scale data and huge models. The applied axis addresses open scientific problems in bioinformatics, computer vision, image processing, and neuroscience, where massive amounts of data are currently produced and where huge-dimensional models yield similar computational problems.
External present and past collaborations
Scientific Events we co-organized