ANR Project MACARON: large-scale machine learning and applications

MACARON is a project funded by ANR that started in October 2014 and ended in March 2019.

Project description

Statistical modeling requires representing measurements of some physical phenomenon as computationally manageable data, before learning a model that fits some observations. Recently, models involving a large number of parameters have achieved significant success in solving difficult prediction tasks. Unfortunately, using general huge-dimensional models to tackle scientific and technological problems raises new methodological challenges: (i) exploiting the prediction capabilities of huge-dimensional models often requires a large amount of training data, and computational techniques that scale with both model and data size remain to be developed; (ii) huge-dimensional models are hard to visualize and interpret, which is problematic whenever understanding these models is important, e.g., in experimental sciences.

The project MACARON is an endeavor to develop new mathematical and algorithmic tools for solving the above challenges. Our ultimate goal is to use data to solve scientific problems and to automatically convert data into scientific knowledge using machine learning techniques. Therefore, our project has two axes: a methodological one, and an applied one driven by explicit problems. The methodological axis addresses the limitations of current machine learning in simultaneously dealing with large-scale data and huge models. The applied axis addresses open scientific problems in bioinformatics, computer vision, image processing, and neuroscience, where a massive amount of data is currently produced, and where huge-dimensional models yield similar computational problems.

Members

Scientific leader

Julien Mairal, Inria, Thoth team

Permanent researchers

Zaid Harchaoui, Inria, Thoth team, now assistant professor at University of Washington, Seattle
Michael Blum, CNRS, TIMC laboratory
Laurent Jacob, CNRS, LBBE laboratory
Joseph Salmon, Telecom ParisTech, now professor at University of Montpellier

Research engineers

Ghislain Durif, Inria, Thoth team, now permanent research engineer at CNRS
Francois Gindraud, Inria, Thoth team and LBBE

PhD students

Hongzhou Lin (co-advised by Zaid Harchaoui and Julien Mairal), defended in December 2017, now post-doc at MIT.
Thomas Dias Alves (co-advised by Michael Blum and Julien Mairal), defended in October 2017, now data scientist at HP.
Federico Pierucci (co-advised by Anatoli Juditsky, Zaid Harchaoui and Jerome Malick), defended in March 2017, now research scientist at StartMeUp.
Arthur Mensch (co-advised by Gael Varoquaux, Bertrand Thirion and Julien Mairal), defended in September 2018, now post-doc at ENS Ulm.
Daan Wynen (co-advised by Julien Mairal and Cordelia Schmid).
Alberto Bietti (advised by Julien Mairal).
Nikita Dvornik (co-advised by Julien Mairal and Cordelia Schmid).
Dexiong Chen (co-advised by Julien Mairal and Laurent Jacob).
Andrei Kulunchakov (co-advised by Julien Mairal and Anatoli Juditsky).

External present and past collaborations

Dmitriy Drusvyatskiy and Courtney Paquette from the University of Washington (collaboration with Zaid Harchaoui and Julien Mairal)
Gael Varoquaux and Bertrand Thirion (collaboration with Julien Mairal)
Jean-Philippe Vert and Elsa Bernard from Institut Curie (collaboration with Laurent Jacob and Julien Mairal)
Bin Yu from UC Berkeley (collaboration with Julien Mairal through a grant from the France Berkeley Fund)
Anatoli Juditsky and Jerome Malick (collaboration with Zaid Harchaoui and Julien Mairal)
Cordelia Schmid (collaborations with Julien Mairal)
Yonina Eldar from The Technion and Andreas Tillmann (collaborations with Julien Mairal)

Scientific events we co-organized

The picture of macarons at the top of the page is under a Creative Commons licence; rights: Julien Haler.