References

[1]: A. Beck and M. Teboulle. A fast iterative shrinkage-thresholding algorithm for linear inverse problems. SIAM Journal on Imaging Sciences, 2(1):183–202, 2009.
[2]: J. M. Borwein and A. S. Lewis. Convex analysis and nonlinear optimization: Theory and examples. Springer, 2006.
[3]: P. Brucker. An O(n) algorithm for quadratic knapsack problems. Operations Research Letters, 3:163–166, 1984.
[4]: E. J. Candès, M. Wakin, and S. Boyd. Enhancing sparsity by reweighted ℓ₁ minimization. Journal of Fourier Analysis and Applications, 14:877–905, 2008.
[5]: B. V. Cherkassky and A. V. Goldberg. On implementing the push-relabel method for the maximum flow problem. Algorithmica, 19(4):390–410, 1997.
[6]: S. F. Cotter, J. Adler, B. Rao, and K. Kreutz-Delgado. Forward sequential algorithms for best basis selection. In IEE Proceedings - Vision, Image and Signal Processing, pages 235–244, 1999.
[7]: A. Cutler and L. Breiman. Archetypal analysis. Technometrics, 36(4):338–347, 1994.
[8]: J. Duchi, S. Shalev-Shwartz, Y. Singer, and T. Chandra. Efficient projections onto the ℓ₁-ball for learning in high dimensions. In Proceedings of the International Conference on Machine Learning (ICML), 2008.
[9]: B. Efron, T. Hastie, I. Johnstone, and R. Tibshirani. Least angle regression. Annals of Statistics, 32(2):407–499, 2004.
[10]: J. Friedman, T. Hastie, H. Höfling, and R. Tibshirani. Pathwise coordinate optimization. Annals of Applied Statistics, 1(2):302–332, 2007.
[11]: J. Friedman, T. Hastie, and R. Tibshirani. A note on the group lasso and a sparse group lasso. Technical report, Preprint arXiv:1001.0736, 2010.
[12]: W. J. Fu. Penalized regressions: The bridge versus the Lasso. Journal of Computational and Graphical Statistics, 7:397–416, 1998.
[13]: A. V. Goldberg and R. E. Tarjan. A new approach to the maximum flow problem. In Proc. of ACM Symposium on Theory of Computing, pages 136–146, 1986.
[14]: P. O. Hoyer. Non-negative sparse coding. In Proc. IEEE Workshop on Neural Networks for Signal Processing, 2002.
[15]: R. Jenatton, J. Mairal, G. Obozinski, and F. Bach. Proximal methods for sparse hierarchical dictionary learning. In Proceedings of the International Conference on Machine Learning (ICML), 2010.
[16]: R. Jenatton, J. Mairal, G. Obozinski, and F. Bach. Proximal methods for hierarchical sparse coding. Journal of Machine Learning Research, 12:2297–2334, 2011.
[17]: D. D. Lee and H. S. Seung. Algorithms for non-negative matrix factorization. In Advances in Neural Information Processing Systems, 2001.
[18]: N. Maculan and J. R. G. Galdino de Paula. A linear-time median-finding algorithm for projecting a vector on the simplex of R^n. Operations Research Letters, 8(4):219–222, 1989.
[19]: J. Mairal. Sparse coding for machine learning, image processing and computer vision. PhD thesis, Ecole Normale Supérieure, Cachan, 2010.
[20]: J. Mairal, F. Bach, J. Ponce, and G. Sapiro. Online dictionary learning for sparse coding. In Proceedings of the International Conference on Machine Learning (ICML), 2009.
[21]: J. Mairal, F. Bach, J. Ponce, and G. Sapiro. Online learning for matrix factorization and sparse coding. Journal of Machine Learning Research, 11:19–60, 2010.
[22]: J. Mairal, R. Jenatton, G. Obozinski, and F. Bach. Network flow algorithms for structured sparsity. In Advances in Neural Information Processing Systems, 2010.
[23]: J. Mairal, R. Jenatton, G. Obozinski, and F. Bach. Convex and network flow optimization for structured sparsity. Journal of Machine Learning Research, 12:2649–2689, 2011.
[24]: J. Mairal and B. Yu. Supervised feature selection in graphs with path coding penalties and network flows. Journal of Machine Learning Research, 2013.
[25]: J. Mairal. Optimization with first-order surrogate functions. In Proceedings of the International Conference on Machine Learning (ICML), 2013.
[26]: J. Mairal. Stochastic majorization-minimization algorithms for large-scale optimization. In Advances in Neural Information Processing Systems, 2013.
[27]: S. Mallat and Z. Zhang. Matching pursuit in a time-frequency dictionary. IEEE Transactions on Signal Processing, 41(12):3397–3415, 1993.
[28]: N. Meinshausen and P. Bühlmann. Stability selection. Technical report, Preprint arXiv:0809.2932.
[29]: G. Obozinski, B. Taskar, and M. I. Jordan. Joint covariate selection and joint subspace selection for multiple classification problems. Statistics and Computing, pages 1–22.
[30]: M. R. Osborne, B. Presnell, and B. A. Turlach. On the Lasso and its dual. Journal of Computational and Graphical Statistics, 9(2):319–337, 2000.
[31]: P. Sprechmann, I. Ramirez, G. Sapiro, and Y. C. Eldar. Collaborative hierarchical sparse modeling. Technical report, 2010. Preprint arXiv:1003.0400v1.
[32]: R. Tibshirani, M. Saunders, S. Rosset, J. Zhu, and K. Knight. Sparsity and smoothness via the fused lasso. Journal of the Royal Statistical Society Series B, 67(1):91–108, 2005.
[33]: J. A. Tropp. Algorithms for simultaneous sparse approximation. Part II: Convex relaxation. Signal Processing, special issue "Sparse approximations in signal and image processing", 86:589–602, April 2006.
[34]: J. A. Tropp, A. C. Gilbert, and M. J. Strauss. Algorithms for simultaneous sparse approximation. Part I: Greedy pursuit. Signal Processing, special issue "Sparse approximations in signal and image processing", 86:572–588, April 2006.
[35]: S. Weisberg. Applied Linear Regression. Wiley, New York, 1980.
[36]: T. T. Wu and K. Lange. Coordinate descent algorithms for Lasso penalized regression. Annals of Applied Statistics, 2(1):224–244, 2008.
[37]: Y. Chen, J. Mairal, and Z. Harchaoui. Fast and robust archetypal analysis for representation learning. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2014.
[38]: M. Yuan and Y. Lin. Model selection and estimation in regression with grouped variables. Journal of the Royal Statistical Society Series B, 68:49–67, 2006.
[39]: H. Zou and T. Hastie. Regularization and variable selection via the elastic net. Journal of the Royal Statistical Society Series B, 67(2):301–320, 2005.