Tutorials
=========

.. note::
   Many of the datasets I used are available `here `_. I am waiting for
   authorization from Criteo before putting their dataset online as well in
   ``.npz`` format.

Examples for binary classification
----------------------------------

The following code performs binary classification with :math:`\ell_2`-regularized logistic regression, with no intercept, on the criteo dataset (21 GB, a huge sparse matrix)::

   import cyanure as cyan
   import scipy.sparse
   import numpy as np

   # load the criteo dataset (21 GB): n=45840617, p=999999
   dataY = np.load('criteo_y.npz', allow_pickle=True)
   y = dataY['y']
   X = scipy.sparse.load_npz('criteo_X.npz')

   # normalize the rows of X in place, without performing any copy
   cyan.preprocess(X, normalize=True, columns=False)

   # declare a binary classifier for l2-regularized logistic regression
   classifier = cyan.BinaryClassifier(loss='logistic', penalty='l2')

   # uses the 'auto' solver by default and performs at most 500 epochs
   classifier.fit(X, y, lambd=0.1/X.shape[0], max_epochs=500, tol=1e-3, it0=5)

Before commenting on these choices, let us run the above code on a regular three-year-old quad-core workstation with 32 GB of memory (Intel Xeon CPU E5-1630 v4, $400 retail price)::

   Matrix X, n=45840617, p=999999
   *********************************
   Catalyst Accelerator
   MISO Solver
   Incremental Solver with uniform sampling
   Lipschitz constant: 0.250004
   Logistic Loss is used
   L2 regularization
   Epoch: 5, primal objective: 0.456014, time: 92.5784
   Best relative duality gap: 14383.9
   Epoch: 10, primal objective: 0.450885, time: 227.593
   Best relative duality gap: 1004.69
   Epoch: 15, primal objective: 0.450728, time: 367.939
   Best relative duality gap: 6.50049
   Epoch: 20, primal objective: 0.450724, time: 502.954
   Best relative duality gap: 0.068658
   Epoch: 25, primal objective: 0.450724, time: 643.323
   Best relative duality gap: 0.00173208
   Epoch: 30, primal objective: 0.450724, time: 778.363
   Best relative duality gap: 0.00173207
   Epoch: 35, primal objective: 0.450724, time: 909.426
   Best relative duality gap: 9.36947e-05
   Time elapsed : 928.114

The solver used was *catalyst-miso*; the problem was solved up to the accuracy tol=0.001 in about 15 minutes after 35 epochs (not counting the time needed to load the dataset from the hard drive). The regularization parameter was chosen as :math:`\lambda=\frac{1}{10n}`, which is close to the optimal value given by cross-validation. Even though performing a grid search with cross-validation would be more costly, this shows that processing such a large dataset does not necessarily require a massive investment in Amazon EC2 credits, GPUs, or distributed computing architectures.
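If one wishes to select :math:`\lambda` on a held-out set rather than rely on the value above, a simple validation loop is enough. The following sketch is illustrative only: it reuses ``X`` and ``y`` from the previous snippet, the grid of values is arbitrary, and it assumes the classifier exposes a scikit-learn-style ``predict`` method (consistent with the near-scikit-learn API mentioned at the end of this tutorial)::

   import numpy as np
   import cyanure as cyan

   # hold out 20% of the rows for validation (illustrative split)
   n = X.shape[0]
   perm = np.random.permutation(n)
   val, train = perm[:n // 5], perm[n // 5:]

   best_acc, best_lambd = -1.0, None
   for scale in [1.0, 0.1, 0.01]:  # arbitrary grid: lambda = scale/n_train
       lambd = scale / train.size
       clf = cyan.BinaryClassifier(loss='logistic', penalty='l2')
       clf.fit(X[train], y[train], lambd=lambd, max_epochs=500, tol=1e-3)
       acc = np.mean(clf.predict(X[val]) == y[val])  # validation accuracy
       if acc > best_acc:
           best_acc, best_lambd = acc, lambd

   print('best lambda: %g (validation accuracy: %.4f)' % (best_lambd, best_acc))

On a dataset of criteo's size, each fit is of course as costly as the run above, so such a search is better carried out on a subsample.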
In the next example, we use the squared hinge loss with :math:`\ell_1`-regularization, choosing the regularization parameter such that the obtained solution has about 10% non-zero coefficients. We also fit an intercept. As shown below, the solution is obtained in 26 seconds on a laptop with a quad-core i7-8565U CPU (specifically a Dell XPS 13 9380)::

   import cyanure as cyan
   import numpy as np
   import scipy.sparse

   # load the rcv1 dataset (about 1 GB): n=781265, p=47152
   data = np.load('rcv1.npz', allow_pickle=True)
   y = data['y']
   X = scipy.sparse.csc_matrix(data['X'].all()).T  # n x p matrix, csr format

   # normalize the rows of X in place, without performing any copy
   cyan.preprocess(X, normalize=True, columns=False)

   # declare a binary classifier for squared hinge loss + l1 regularization
   classifier = cyan.BinaryClassifier(loss='sqhinge', penalty='l1', fit_intercept=True)

   # uses the 'auto' solver by default and performs at most 500 epochs
   classifier.fit(X, y, lambd=0.000005, max_epochs=500, tol=1e-3)

which yields::

   Matrix X, n=781265, p=47152
   Memory parameter: 20
   *********************************
   QNing Accelerator
   MISO Solver
   Incremental Solver with uniform sampling
   Lipschitz constant: 1
   Squared Hinge Loss is used
   L1 regularization
   Epoch: 10, primal objective: 0.0915524, time: 7.33038
   Best relative duality gap: 0.387338
   Epoch: 20, primal objective: 0.0915441, time: 15.524
   Best relative duality gap: 0.00426003
   Epoch: 30, primal objective: 0.0915441, time: 25.738
   Best relative duality gap: 0.000312145
   Time elapsed : 26.0225
   Total additional line search steps: 8
   Total skipping l-bfgs steps: 0

Multiclass classification
-------------------------

Let us now do something a bit more involved and perform multinomial logistic regression on the *ckn_mnist* dataset (10 classes, n=60000, p=2304, dense matrix), with multi-task group-Lasso regularization, using the same laptop as before and choosing a regularization parameter that yields a solution with 5% non-zero coefficients::

   import cyanure as cyan
   import numpy as np

   # load the ckn_mnist dataset: 10 classes, n=60000, p=2304
   data = np.load('ckn_mnist.npz')
   y = data['y']
   X = data['X']

   # center and normalize the rows of X in place, without performing any copy
   cyan.preprocess(X, centering=True, normalize=True, columns=False)

   # declare a multinomial logistic classifier with group-Lasso regularization
   classifier = cyan.MultiClassifier(loss='multiclass-logistic', penalty='l1l2')

   # uses the 'auto' solver by default and performs at most 500 epochs
   classifier.fit(X, y, lambd=0.0001, max_epochs=500, tol=1e-3, it0=5)

which produces::

   Matrix X, n=60000, p=2304
   Memory parameter: 20
   *********************************
   QNing Accelerator
   MISO Solver
   Incremental Solver with uniform sampling
   Lipschitz constant: 0.25
   Multiclass logistic Loss is used
   Mixed L1-L2 norm regularization
   Epoch: 5, primal objective: 0.340267, time: 30.2643
   Best relative duality gap: 0.332051
   Epoch: 10, primal objective: 0.337646, time: 62.0562
   Best relative duality gap: 0.0695877
   Epoch: 15, primal objective: 0.337337, time: 93.9541
   Best relative duality gap: 0.0172626
   Epoch: 20, primal objective: 0.337293, time: 125.683
   Best relative duality gap: 0.0106066
   Epoch: 25, primal objective: 0.337285, time: 170.044
   Best relative duality gap: 0.00409663
   Epoch: 30, primal objective: 0.337284, time: 214.419
   Best relative duality gap: 0.000677961
   Time elapsed : 215.074
   Total additional line search steps: 4
   Total skipping l-bfgs steps: 0

Learning the multiclass classifier took about 3 minutes and 35 seconds.
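To verify the announced sparsity level, one can inspect the fitted coefficient matrix. The sketch below is illustrative: it assumes the parameters are retrieved with ``get_weights()`` and that the returned matrix has one row per feature and one column per class; if your version exposes the coefficients differently, adapt the accessor accordingly. ::

   import numpy as np

   # assumed accessor: W is the p x k coefficient matrix of the model above
   W = classifier.get_weights()

   # the mixed l1l2 penalty zeroes out entire rows of W
   # (one group = one feature shared across the 10 classes)
   nnz_rows = np.sum(np.any(W != 0, axis=1))
   print('fraction of selected features: %.3f' % (nnz_rows / float(W.shape[0])))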
To conclude, we provide a last, more classical example: learning :math:`\ell_2`-regularized logistic regression classifiers on the same dataset, in a one-vs-all fashion. ::

   import cyanure as cyan
   import numpy as np

   # load the ckn_mnist dataset: 10 classes, n=60000, p=2304
   data = np.load('ckn_mnist.npz')
   y = data['y']
   X = data['X']

   # center and normalize the rows of X in place, without performing any copy
   cyan.preprocess(X, centering=True, normalize=True, columns=False)

   # declare a one-vs-all classifier with l2-regularized logistic regression
   classifier = cyan.MultiClassifier(loss='logistic', penalty='l2')

   # uses the 'auto' solver by default and performs at most 500 epochs
   classifier.fit(X, y, lambd=0.01/X.shape[0], max_epochs=500, tol=1e-3)

The 10 classifiers are then learned in parallel on the four CPU cores (still on the same laptop), which gives the following output after about one minute::

   Matrix X, n=60000, p=2304
   Solver 4 has terminated after 30 epochs in 36.3953 seconds
      Primal objective: 0.00877348, relative duality gap: 8.54385e-05
   Solver 8 has terminated after 30 epochs in 37.5156 seconds
      Primal objective: 0.0150244, relative duality gap: 0.000311491
   Solver 9 has terminated after 30 epochs in 38.4993 seconds
      Primal objective: 0.0161167, relative duality gap: 0.000290268
   Solver 7 has terminated after 30 epochs in 39.5971 seconds
      Primal objective: 0.0105672, relative duality gap: 6.49337e-05
   Solver 0 has terminated after 40 epochs in 45.1612 seconds
      Primal objective: 0.00577768, relative duality gap: 3.6291e-05
   Solver 6 has terminated after 40 epochs in 45.8909 seconds
      Primal objective: 0.00687928, relative duality gap: 0.000175357
   Solver 2 has terminated after 40 epochs in 45.9899 seconds
      Primal objective: 0.0104324, relative duality gap: 1.63646e-06
   Solver 5 has terminated after 40 epochs in 47.1608 seconds
      Primal objective: 0.00900643, relative duality gap: 3.42144e-05
   Solver 3 has terminated after 30 epochs in 12.8874 seconds
      Primal objective: 0.00804966, relative duality gap: 0.000200631
   Solver 1 has terminated after 40 epochs in 15.8949 seconds
      Primal objective: 0.00487406, relative duality gap: 0.000584138
   Time for the one-vs-all strategy
   Time elapsed : 62.9996

Note that the toolbox also provides the classes ``LinearSVC`` and ``LogisticRegression``, which are near-compatible with scikit-learn's API.
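For users coming from scikit-learn, the same experiment can thus be written in that familiar style. The snippet below is a sketch under that assumption: the exact constructor arguments accepted by these classes may differ across versions, so check the API reference before relying on them. ::

   import cyanure as cyan
   import numpy as np

   data = np.load('ckn_mnist.npz')
   y = data['y']
   X = data['X']
   cyan.preprocess(X, centering=True, normalize=True, columns=False)

   # scikit-learn-style estimator (constructor arguments are assumptions)
   clf = cyan.LogisticRegression()
   clf.fit(X, y)
   y_pred = clf.predict(X)  # predicted labels
   print('training accuracy: %.4f' % np.mean(y_pred == y))

   # LinearSVC follows the same pattern (squared hinge loss,
   # as in scikit-learn's convention)
   svc = cyan.LinearSVC()
   svc.fit(X, y)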