Python API
The BinaryClassifier Class
class cyanure.BinaryClassifier(loss='square', penalty='l2', fit_intercept=False)
Bases: cyanure.ERM
The binary classification class, which derives from ERM. The goal is to minimize the following objective:
\[\min_{w,b} \frac{1}{n} \sum_{i=1}^n L\left( y_i, w^\top x_i + b\right) + \psi(w),\]
where \(L\) is a classification loss, \(\psi\) is a regularization function (or constraint), \(w\) is a p-dimensional vector representing model parameters, and \(b\) is an optional unregularized intercept. We expect binary labels in {-1,+1}.
Parameters
- loss: string, default=’square’
  Loss function to be used. Possible choices are:
  - ‘square’ => \(L(y,z) = \frac{1}{2} ( y-z)^2\)
  - ‘logistic’ => \(L(y,z) = \log(1 + e^{-y z} )\)
  - ‘sqhinge’ or ‘squared_hinge’ => \(L(y,z) = \frac{1}{2} \max( 0, 1- y z)^2\)
  - ‘safe-logistic’ => \(L(y,z) = e^{ yz - 1 } - y z\) if \(yz \leq 1\) and \(0\) otherwise
- penalty: string, default=’l2’
  Regularization function \(\psi\). Possible choices are:
  - ‘none’ => \(\psi(w) = 0\)
  - ‘l2’ => \(\psi(w) = \frac{\lambda}{2} \|w\|_2^2\)
  - ‘l1’ => \(\psi(w) = \lambda \|w\|_1\)
  - ‘elastic-net’ => \(\psi(w) = \lambda \|w\|_1 + \frac{\lambda_2}{2}\|w\|_2^2\)
  - ‘fused-lasso’ => \(\psi(w) = \lambda \sum_{i=2}^p |w[i]-w[i-1]| + \lambda_2\|w\|_1 + \frac{\lambda_3}{2}\|w\|_2^2\)
  - ‘l1-ball’ => encodes the constraint \(\|w\|_1 \leq \lambda\)
  - ‘l2-ball’ => encodes the constraint \(\|w\|_2 \leq \lambda\)
- fit_intercept: boolean, default=False
  learns an unregularized intercept b
Methods
- eval(self, X, y[, lambd, lambd2, lambd3]): get the value of the objective function and compute a relative duality gap; see fit for the format of the parameters.
- fit(self, X, y[, lambd, lambd2, lambd3, …]): the fitting function (the one that does the job).
- get_weights(self): get the model parameters (either w or the tuple (w,b)).
- predict(self, X): predict the labels given an input matrix X (same format as fit).
- score(self, X, y): compute the classification accuracy of the model on new test data (X,y).
fit(self, X, y, lambd=0, lambd2=0, lambd3=0, solver='auto', tol=0.001, it0=10, max_epochs=500, l_qning=20, f_restart=50, verbose=True, restart=False, nthreads=-1, seed=0)
The fitting function (the one that does the job).
Parameters
- X: numpy array or scipy sparse CSR matrix
  input n x p matrix; the samples are on the rows
- y: labels, numpy array
  vector of size n with {-1,+1} labels for binary classification; labels in {0,1} are automatically converted
- lambd: float, default=0
  first regularization parameter \(\lambda\)
- lambd2: float, default=0
  second regularization parameter \(\lambda_2\), if needed
- lambd3: float, default=0
  third regularization parameter \(\lambda_3\), if needed
- solver: string, default=’auto’
  Optimization solver. Possible choices are:
  - ‘ista’
  - ‘fista’
  - ‘catalyst-ista’
  - ‘qning-ista’ (proximal quasi-Newton method)
  - ‘svrg’
  - ‘catalyst-svrg’ (accelerated SVRG with Catalyst)
  - ‘qning-svrg’ (quasi-Newton SVRG)
  - ‘acc-svrg’ (SVRG with direct acceleration)
  - ‘miso’
  - ‘catalyst-miso’ (accelerated MISO with Catalyst)
  - ‘qning-miso’ (quasi-Newton MISO)
  - ‘auto’
  See the LaTeX documentation for more details. If you are unsure, use ‘auto’.
- tol: float, default=1e-3
  Tolerance parameter. For almost all combinations of loss and penalty functions, this parameter is based on a duality gap. Assuming the (non-negative) objective function is \(f\) and its optimal value is \(f^\star\), the algorithm stops with the guarantee
  \[f(x_t) - f^\star \leq \mathrm{tol} \cdot f(x_t).\]
- max_epochs: int, default=500
  Maximum number of iterations of the algorithm, in terms of passes over the data
- it0: int, default=10
  frequency of duality-gap computation
- verbose: boolean, default=True
  whether to display information
- nthreads: int, default=-1
  maximum number of cores the method may use (-1 = all cores). Note that more cores is not always better.
- seed: int, default=0
  random seed
- restart: boolean, default=False
  use a restart strategy (useful for computing a regularization path)
- univariate: boolean, default=True
  whether the problem is univariate or multivariate
- l_qning: int, default=20
  memory parameter for the qning method
- f_restart: int, default=50
  restart strategy for fista
Returns
- numpy array
  information about the optimization process (number of iterations, objective function values, duality gap); this output will be documented in more detail in the future if people ask.
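For illustration, here is a minimal usage sketch (the data is synthetic and the regularization value arbitrary; nothing beyond numpy and cyanure itself is assumed):

import numpy as np
from cyanure import BinaryClassifier

# Synthetic binary problem: 1000 samples, 100 features, labels in {-1,+1}.
np.random.seed(0)
X = np.random.randn(1000, 100)
y = np.sign(X[:, 0] + 0.1 * np.random.randn(1000))

clf = BinaryClassifier(loss='logistic', penalty='l2', fit_intercept=True)
clf.fit(X, y, lambd=0.1 / X.shape[0], tol=1e-3, verbose=False)

print(clf.score(X, y))    # classification accuracy on the training data
w, b = clf.get_weights()  # tuple (w, b) since fit_intercept=True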
eval(self, X, y, lambd=0, lambd2=0, lambd3=0)
Get the value of the objective function and compute a relative duality gap; see fit for the format of the parameters.
get_weights(self)
Get the model parameters (either w or the tuple (w,b)).
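Since fit accepts a restart flag that is useful for regularization paths, a warm-restarted l1 path can be sketched as follows (the lambda grid is illustrative, and the warm-start behaviour of restart=True is assumed from the parameter description above):

import numpy as np
from cyanure import BinaryClassifier

np.random.seed(0)
X = np.random.randn(500, 50)
y = np.sign(np.random.randn(500))

clf = BinaryClassifier(loss='squared_hinge', penalty='l1')
path = []
for lambd in np.logspace(-1, -4, 10) / X.shape[0]:
    # restart=True reuses the previous solution as a warm start.
    clf.fit(X, y, lambd=lambd, restart=True, verbose=False)
    path.append(clf.get_weights().copy())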
The Regression Class
class cyanure.Regression(loss='square', penalty='l2', fit_intercept=False)
Bases: cyanure.ERM
The regression class. The objective is the same as for the BinaryClassifier class, but we use a regression loss only (see below), and the targets are real values.
Parameters
- loss: string, default=’square’
  Only the square loss is implemented at this point:
  - ‘square’ => \(L(y,z) = \frac{1}{2} ( y-z)^2\)
- penalty: string, default=’l2’
  same as for the class BinaryClassifier
- fit_intercept: boolean, default=False
  learns an unregularized intercept b
Methods
- eval(self, X, y[, lambd, lambd2, lambd3]): get the value of the objective function and compute a relative duality gap; see fit for the format of the parameters.
- fit(self, X, y[, lambd, lambd2, lambd3, …]): same as for the class BinaryClassifier, except that we do not necessarily expect binary labels in y.
- get_weights(self): get the model parameters (either w or the tuple (w,b)).
- predict(self, X): predict the targets given an input matrix X (same format as fit).
fit(self, X, y, lambd=0, lambd2=0, lambd3=0, solver='auto', tol=0.001, it0=10, max_epochs=500, l_qning=20, f_restart=50, verbose=True, restart=False, nthreads=-1, seed=0)
The fitting function is the same as for the class BinaryClassifier, except that we do not necessarily expect binary labels in y.
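A minimal ridge-regression sketch (synthetic data, arbitrary regularization value):

import numpy as np
from cyanure import Regression

# Synthetic problem: real-valued targets generated by a linear model.
np.random.seed(0)
X = np.random.randn(1000, 100)
y = X @ np.random.randn(100) + 0.01 * np.random.randn(1000)

reg = Regression(loss='square', penalty='l2', fit_intercept=True)
reg.fit(X, y, lambd=0.1 / X.shape[0], verbose=False)
y_pred = reg.predict(X)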
eval(self, X, y, lambd=0, lambd2=0, lambd3=0)
Get the value of the objective function and compute a relative duality gap; see fit for the format of the parameters.
get_weights(self)
Get the model parameters (either w or the tuple (w,b)).
The MultiClassifier Class
class cyanure.MultiClassifier(loss='square', penalty='l2', fit_intercept=False)
Bases: cyanure.ERM
The multi-class classification class. The goal is to minimize the following objective:
\[\min_{W,b} \frac{1}{n} \sum_{i=1}^n L\left( y_i, W^\top x_i + b\right) + \psi(W),\]
where \(L\) is a classification loss, \(\psi\) is a regularization function (or constraint), \(W=[w_1,\ldots,w_k]\) is a (p x k) matrix that carries the k predictors, where k is the number of classes, and \(y_i\) is a label in \(\{1,\ldots,k\}\). \(b\) is an optional k-dimensional vector representing an unregularized intercept.
Parameters
- loss: string, default=’square’
  Loss function to be used. Possible choices are:
  - any loss function compatible with the class BinaryClassifier (‘square’, ‘logistic’, ‘sqhinge’, ‘safe-logistic’). In such a case, the loss function encodes a one-vs-all strategy based on the chosen binary-classification loss.
  - ‘multiclass-logistic’, also called multinomial or softmax logistic:
    \[L(y, W^\top x + b) = \log\left(\sum_{j=1}^k e^{w_j^\top x + b_j - w_y^\top x - b_y} \right)\]
- penalty: string, default=’l2’
  Regularization function \(\psi\). Possible choices are:
  - any penalty function compatible with the class BinaryClassifier (‘none’, ‘l2’, ‘l1’, ‘elastic-net’, ‘fused-lasso’, ‘l1-ball’, ‘l2-ball’). In such a case, the penalty is applied to each predictor \(w_j\) individually:
    \[\psi(W) = \sum_{j=1}^k \psi(w_j).\]
  - ‘l1l2’, the multi-task group Lasso regularization:
    \[\psi(W) = \lambda \sum_{j=1}^p \|W^j\|_2, ~~~~\text{where}~W^j~\text{is the j-th row of}~W.\]
  - ‘l1linf’:
    \[\psi(W) = \lambda \sum_{j=1}^p \|W^j\|_\infty.\]
  - ‘l1l2+l1’, the multi-task group Lasso regularization plus l1:
    \[\psi(W) = \sum_{j=1}^p \lambda \|W^j\|_2 + \lambda_2 \|W^j\|_1, ~~~~\text{where}~W^j~\text{is the j-th row of}~W.\]
- fit_intercept: boolean, default=False
  learns an unregularized intercept b, which is a k-dimensional vector
Methods
- eval(self, X, y[, lambd, lambd2, lambd3]): get the value of the objective function and compute a relative duality gap; see fit for the format of the parameters.
- fit(self, X, y[, lambd, lambd2, lambd3, …]): same as BinaryClassifier, but y should be an n-dimensional vector of integers.
- get_weights(self): get the model parameters (either W or the tuple (W,b)).
- predict(self, X): predict the class labels.
- score(self, X, y): compute the classification accuracy of the model on new test data (X,y).
fit(self, X, y, lambd=0, lambd2=0, lambd3=0, solver='auto', tol=0.001, it0=10, max_epochs=500, l_qning=20, f_restart=50, verbose=True, restart=False, nthreads=-1, seed=0)
Same as BinaryClassifier, but y should be an n-dimensional vector of integers.
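A minimal multinomial-logistic sketch (synthetic data with 5 classes; the regularization value is arbitrary, and the 0-based integer labels are assumed to be accepted as-is by fit):

import numpy as np
from cyanure import MultiClassifier

# Synthetic problem: 1000 samples, 100 features, 5 classes.
np.random.seed(0)
X = np.random.randn(1000, 100)
y = np.random.randint(0, 5, size=1000)  # integer labels

clf = MultiClassifier(loss='multiclass-logistic', penalty='l1l2')
clf.fit(X, y, lambd=0.1 / X.shape[0], verbose=False)
print(clf.score(X, y))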
eval(self, X, y, lambd=0, lambd2=0, lambd3=0)
Get the value of the objective function and compute a relative duality gap; see fit for the format of the parameters.
get_weights(self)
Get the model parameters (either W or the tuple (W,b)).
The MultiVariateRegression Class
class cyanure.MultiVariateRegression(loss='square', penalty='l2', fit_intercept=False)
Bases: cyanure.ERM
The multivariate regression class. The objective is the same as for the MultiClassifier class, but we use a regression loss only (see below), and the targets \(y_i\) are k-dimensional vectors.
Parameters
- loss: string, default=’square’
  Only the square loss is implemented at this point. Given two k-dimensional vectors y,z:
  - ‘square’ => \(L(y,z) = \frac{1}{2} \|y-z\|^2\)
- penalty: string, default=’l2’
  same as for the class MultiClassifier
- fit_intercept: boolean, default=False
  learns an unregularized intercept b
Methods
- eval(self, X, y[, lambd, lambd2, lambd3]): get the value of the objective function and compute a relative duality gap; see fit for the format of the parameters.
- fit(self, X, y[, lambd, lambd2, lambd3, …]): same as ERM.fit, but y should be n x k, where k is the size of the target for each data point.
- get_weights(self): get the model parameters (either W or the tuple (W,b)).
- predict(self, X): predict the targets.
fit(self, X, y, lambd=0, lambd2=0, lambd3=0, solver='auto', tol=0.001, it0=10, max_epochs=500, l_qning=20, f_restart=50, verbose=True, restart=False, nthreads=-1, seed=0)
Same as ERM.fit, but y should be n x k, where k is the size of the target for each data point.
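A minimal sketch with k-dimensional targets (synthetic data, arbitrary regularization value):

import numpy as np
from cyanure import MultiVariateRegression

# Synthetic problem: 1000 samples, 100 features, targets of dimension k=5.
np.random.seed(0)
X = np.random.randn(1000, 100)
Y = X @ np.random.randn(100, 5) + 0.01 * np.random.randn(1000, 5)

reg = MultiVariateRegression(loss='square', penalty='l1l2')
reg.fit(X, Y, lambd=0.1 / X.shape[0], verbose=False)  # Y is n x k
Y_pred = reg.predict(X)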
eval(self, X, y, lambd=0, lambd2=0, lambd3=0)
Get the value of the objective function and compute a relative duality gap; see fit for the format of the parameters.
get_weights(self)
Get the model parameters (either W or the tuple (W,b)).
Scikit-learn compatible classes
class cyanure.LinearSVC(loss='sqhinge', penalty='l2', fit_intercept=False, C=1, max_iter=500)
Bases: cyanure.BinaryClassifier
A compatibility class for scikit-learn users, restricted to the squared hinge loss. It is perfectly equivalent to the BinaryClassifier class, but the regularization parameter (here “C”) is provided during the class initialization. Note that \(C = \frac{1}{2n \lambda}\).
Parameters
- loss: should be ‘sqhinge’ or ‘squared_hinge’
- penalty: same as BinaryClassifier
- fit_intercept: same as BinaryClassifier
- C: regularization parameter
- max_iter: maximum number of iterations for the optimization solver
Methods
- eval(self, X, y[, lambd, lambd2, lambd3]): get the value of the objective function and compute a relative duality gap; see fit for the format of the parameters.
- fit(self, X, y[, C, verbose, lambd2, …]): same as BinaryClassifier.fit, but the parameter C replaces lambd, and max_iter replaces max_epochs.
- get_weights(self): get the model parameters (either w or the tuple (w,b)).
- predict(self, X): predict the labels given an input matrix X (same format as fit).
- score(self, X, y): compute the classification accuracy of the model on new test data (X,y).
fit(self, X, y, C=None, verbose=None, lambd2=0, lambd3=0, solver='auto', tol=0.001, it0=10, max_iter=None, l_qning=20, f_restart=50, restart=False, nthreads=-1, seed=0)
Same as BinaryClassifier.fit, but the parameter C replaces lambd, and max_iter replaces max_epochs.
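A minimal scikit-learn-style sketch (synthetic data; the value of C is arbitrary, and the C given at construction is assumed to be used when fit leaves it at None):

import numpy as np
from cyanure import LinearSVC

np.random.seed(0)
X = np.random.randn(1000, 100)
y = np.sign(X[:, 0] + 0.1 * np.random.randn(1000))

# C plays the role of 1/(2*n*lambd): larger C means weaker regularization.
svc = LinearSVC(penalty='l2', fit_intercept=True, C=1.0)
svc.fit(X, y, verbose=False)
print(svc.score(X, y))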
eval(self, X, y, lambd=0, lambd2=0, lambd3=0)
Get the value of the objective function and compute a relative duality gap; see fit for the format of the parameters.
get_weights(self)
Get the model parameters (either w or the tuple (w,b)).
predict(self, X)
Predict the labels given an input matrix X (same format as fit).
score(self, X, y)
Compute the classification accuracy of the model on new test data (X,y).
class cyanure.LogisticRegression(penalty='l2', fit_intercept=False, C=1, max_iter=500)
Bases: cyanure.BinaryClassifier
A compatibility class for scikit-learn users, restricted to the logistic loss. It is perfectly equivalent to the BinaryClassifier class with loss=’logistic’, but the regularization parameter (here “C”) is provided during the class initialization. Note that \(C= \frac{1}{n \lambda}\).
Parameters
- penalty: same as BinaryClassifier
- fit_intercept: same as BinaryClassifier
- C: regularization parameter
- max_iter: maximum number of iterations for the optimization solver
Methods
- eval(self, X, y[, lambd, lambd2, lambd3]): get the value of the objective function and compute a relative duality gap; see fit for the format of the parameters.
- fit(self, X, y[, C, lambd2, lambd3, solver, …]): same as BinaryClassifier.fit, but the parameter C replaces lambd, and max_iter replaces max_epochs.
- get_weights(self): get the model parameters (either w or the tuple (w,b)).
- predict(self, X): predict the labels given an input matrix X (same format as fit).
- score(self, X, y): compute the classification accuracy of the model on new test data (X,y).
eval(self, X, y, lambd=0, lambd2=0, lambd3=0)
Get the value of the objective function and compute a relative duality gap; see fit for the format of the parameters.
fit(self, X, y, C=None, lambd2=0, lambd3=0, solver='auto', tol=0.001, it0=10, max_iter=None, l_qning=20, f_restart=50, verbose=None, restart=False, nthreads=-1, seed=0)
Same as BinaryClassifier.fit, but the parameter C replaces lambd, and max_iter replaces max_epochs.
get_weights(self)
Get the model parameters (either w or the tuple (w,b)).
predict(self, X)
Predict the labels given an input matrix X (same format as fit).
score(self, X, y)
Compute the classification accuracy of the model on new test data (X,y).
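A minimal sketch, analogous to the LinearSVC one above (synthetic data; the value of C is arbitrary, and the C given at construction is assumed to be used when fit leaves it at None):

import numpy as np
from cyanure import LogisticRegression

np.random.seed(0)
X = np.random.randn(1000, 100)
y = np.sign(X[:, 0] + 0.1 * np.random.randn(1000))

# C plays the role of 1/(n*lambd): larger C means weaker regularization.
lr = LogisticRegression(penalty='l2', fit_intercept=True, C=1.0)
lr.fit(X, y, verbose=False)
y_pred = lr.predict(X)  # labels in {-1,+1}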