Dictionary Learning and Matrix Factorization Toolbox

3 Dictionary Learning and Matrix Factorization Toolbox

This is the section for dictionary learning and matrix factorization, corresponding to [20, 21].

3.1 Function spams.trainDL

This is the main function of the toolbox, implementing the learning algorithms of [21]. Given a training set x¹,…, . It aims at solving

min

D ∈ C

lim

n → +∞

∑

i=1

min

αⁱ

⎛
⎜
⎝

||xⁱ−Dαⁱ||₂² + ψ(αⁱ)

⎞
⎟
⎠

. (1)

ψ is a sparsity-inducing regularizer and C is a constraint set for the dictionary. As shown in [21] and in the help file below, various combinations can be used for ψ and C for solving different matrix factorization problems. What is more, positivity constraints can be added to α as well. The function admits several modes for choosing the optimization parameters, using the parameter-free strategy proposed in [20], or using the parameters t₀ and ρ presented in [21]. Note that for problems of a reasonable size, and when ψ is the ℓ₁-norm, the function spams.trainDL_Memory can be faster but uses more memory.

#
# Name: spams.trainDL
#
# Usage: spams.trainDL(X,return_model= FALSE,model= NULL,D = NULL,numThreads = -1,batchsize = -1,
#               K= -1,lambda1= NULL,lambda2= 10e-10,iter=-1,t0=1e-5,mode='PENALTY',
#               posAlpha=FALSE,posD=FALSE,expand=FALSE,modeD='L2',whiten=FALSE,clean=TRUE,
#               verbose=TRUE,gamma1=0.,gamma2=0.,rho=1.0,iter_updateD=NULL,
#               stochastic_deprecated=FALSE,modeParam=0,batch=FALSE,log_deprecated=FALSE,
#               logName='')
#
# Description:
#     spams.trainDL is an efficient implementation of the
#     dictionary learning technique presented in
#
#     "Online Learning for Matrix Factorization and Sparse Coding"
#     by Julien Mairal, Francis Bach, Jean Ponce and Guillermo Sapiro
#     arXiv:0908.0050
#
#     "Online Dictionary Learning for Sparse Coding"
#     by Julien Mairal, Francis Bach, Jean Ponce and Guillermo Sapiro
#     ICML 2009.
#
#     Note that if you use mode=1 or 2, if the training set has a
#     reasonable size and you have enough memory on your computer, you
#     should use spams.trainDL_Memory instead.
#
#
#     It addresses the dictionary learning problems
#        1) if mode=0
#     min_{D in C} (1/n) sum_{i=1}^n (1/2)||x_i-Dalpha_i||_2^2  s.t. ...
#                                                  ||alpha_i||_1 <= lambda1
#        2) if mode=1
#     min_{D in C} (1/n) sum_{i=1}^n  ||alpha_i||_1  s.t.  ...
#                                           ||x_i-Dalpha_i||_2^2 <= lambda1
#        3) if mode=2
#     min_{D in C} (1/n) sum_{i=1}^n (1/2)||x_i-Dalpha_i||_2^2 + ...
#                                  lambda1||alpha_i||_1 + lambda1_2||alpha_i||_2^2
#        4) if mode=3, the sparse coding is done with OMP
#     min_{D in C} (1/n) sum_{i=1}^n (1/2)||x_i-Dalpha_i||_2^2  s.t. ...
#                                                  ||alpha_i||_0 <= lambda1
#        5) if mode=4, the sparse coding is done with OMP
#     min_{D in C} (1/n) sum_{i=1}^n  ||alpha_i||_0  s.t.  ...
#                                           ||x_i-Dalpha_i||_2^2 <= lambda1
#        6) if mode=5, the sparse coding is done with OMP
#     min_{D in C} (1/n) sum_{i=1}^n 0.5||x_i-Dalpha_i||_2^2 +lambda1||alpha_i||_0
#
#
#     C is a convex set verifying
#        1) if modeD=0
#           C={  D in Real^{m x p}  s.t.  forall j,  ||d_j||_2^2 <= 1 }
#        2) if modeD=1
#           C={  D in Real^{m x p}  s.t.  forall j,  ||d_j||_2^2 + ...
#                                                  gamma1||d_j||_1 <= 1 }
#        3) if modeD=2
#           C={  D in Real^{m x p}  s.t.  forall j,  ||d_j||_2^2 + ...
#                                  gamma1||d_j||_1 + gamma2 FL(d_j) <= 1 }
#        4) if modeD=3
#           C={  D in Real^{m x p}  s.t.  forall j,  (1-gamma1)||d_j||_2^2 + ...
#                                  gamma1||d_j||_1 <= 1 }
#
#     Potentially, n can be very large with this algorithm.
#
# Inputs:
#       X:  double m x n matrix   (input signals)
#               m is the signal size
#               n is the number of signals to decompose
#       return_model:
#               if true the function will return the model
#               as a named list ('A' = A, 'B' = B, 'iter' = n)
#       model:        NULL or model (as A,B,iter) to use as initialisation
#       D: (optional) double m x p matrix   (dictionary)
#         p is the number of elements in the dictionary
#         When D is not provided, the dictionary is initialized
#         with random elements from the training set.
#       K: (size of the dictionary, optional is D is provided)
#       lambda1:  (parameter)
#       lambda2:  (optional, by default 0)
#       iter: (number of iterations).  If a negative number is
#          provided it will perform the computation during the
#          corresponding number of seconds. For instance iter=-5
#          learns the dictionary during 5 seconds.
#       mode: (optional, see above, by default 2)
#       posAlpha: (optional, adds positivity constraints on the
#         coefficients, false by default, not compatible with
#         mode =3,4)
#       modeD: (optional, see above, by default 0)
#       posD: (optional, adds positivity constraints on the
#         dictionary, false by default, not compatible with
#         modeD=2)
#       gamma1: (optional parameter for modeD >= 1)
#       gamma2: (optional parameter for modeD = 2)
#       batchsize: (optional, size of the minibatch, by default
#          512)
#       iter_updateD: (optional, number of BCD iterations for the dictionary
#          update step, by default 1)
#       modeParam: (optimization mode).
#          1) if modeParam=0, the optimization uses the
#             parameter free strategy of the ICML paper
#          2) if modeParam=1, the optimization uses the
#             parameters rho as in arXiv:0908.0050
#          3) if modeParam=2, the optimization uses exponential
#             decay weights with updates of the form
#             A_{t} <- rho A_{t-1} + alpha_t alpha_t^T
#       rho: (optional) tuning parameter (see paper arXiv:0908.0050)
#       t0: (optional) tuning parameter (see paper arXiv:0908.0050)
#       clean: (optional, true by default. prunes
#          automatically the dictionary from unused elements).
#       verbose: (optional, true by default, increase verbosity)
#       numThreads: (optional, number of threads for exploiting
#          multi-core / multi-cpus. By default, it takes the value -1,
#          which automatically selects all the available CPUs/cores).
#       expand:    undocumented; modify at your own risks!
#       whiten:    undocumented; modify at your own risks!
#       stochastic_deprecated:    undocumented; modify at your own risks!
#       batch:    undocumented; modify at your own risks!
#       log_deprecated:    undocumented; modify at your own risks!
#       logName:    undocumented; modify at your own risks!
#
# Output:
#       D:     double m x p matrix   (dictionary)
#       model:  the model as A B iter
#         D <- spams.trainDL(X,return_model = FALSE,...)
#             v <- spams.trainDL(X,return_model = TRUE,...)
#             D <- v[[1]]
#             model <- v[[2]]
#             A = model[['A']]
#             B = model[['B']]
#             iter = model[['iter']]
#
# Authors:
# Julien MAIRAL, 2009 (spams, matlab interface and documentation)
# Jean-Paul CHIEZE 2011-2012 (R interface)
#
# Note:
#     this function admits a few experimental usages, which have not
#     been extensively tested:
#         - single precision setting
#

The following piece of code illustrates how to use this function. The following piece of code contains usage examples:

3.2 Function spams.trainDL_Memory

Memory-consuming version of spams.trainDL. This function is well adapted to small/medium-size problems: It requires storing all the coefficients α and is therefore impractical for very large datasets. However, in many situations, one can afford this memory cost and it is better to use this method, which is faster than spams.trainDL. Note that unlike spams.trainDL this function does not allow warm-restart.

#
# Name: spams.trainDL_Memory
#
# Usage: spams.trainDL_Memory(X,D = NULL,numThreads = -1,batchsize = -1,K= -1,lambda1= NULL,iter=-1,
#                      t0=1e-5,mode='PENALTY',posD=FALSE,expand=FALSE,modeD='L2',
#                      whiten=FALSE,clean=TRUE,gamma1=0.,gamma2=0.,rho=1.0,iter_updateD=1,
#                      stochastic_deprecated=FALSE,modeParam=0,batch=FALSE,
#                      log_deprecated=FALSE,logName='')
#
# Description:
#     spams.trainDL_Memory is an efficient but memory consuming
#     variant of the dictionary learning technique presented in
#
#     "Online Learning for Matrix Factorization and Sparse Coding"
#     by Julien Mairal, Francis Bach, Jean Ponce and Guillermo Sapiro
#     arXiv:0908.0050
#
#     "Online Dictionary Learning for Sparse Coding"
#     by Julien Mairal, Francis Bach, Jean Ponce and Guillermo Sapiro
#     ICML 2009.
#
#     Contrary to the approaches above, the algorithm here
#        does require to store all the coefficients from all the training
#        signals. For this reason this variant can not be used with large
#        training sets, but is more efficient than the regular online
#        approach for training sets of reasonable size.
#
#     It addresses the dictionary learning problems
#        1) if mode=1
#     min_{D in C} (1/n) sum_{i=1}^n  ||alpha_i||_1  s.t.  ...
#                                         ||x_i-Dalpha_i||_2^2 <= lambda1
#        2) if mode=2
#     min_{D in C} (1/n) sum_{i=1}^n (1/2)||x_i-Dalpha_i||_2^2 + ...
#                                                      lambda1||alpha_i||_1
#
#     C is a convex set verifying
#        1) if modeD=0
#           C={  D in Real^{m x p}  s.t.  forall j,  ||d_j||_2^2 <= 1 }
#        1) if modeD=1
#           C={  D in Real^{m x p}  s.t.  forall j,  ||d_j||_2^2 + ...
#                                                  gamma1||d_j||_1 <= 1 }
#        1) if modeD=2
#           C={  D in Real^{m x p}  s.t.  forall j,  ||d_j||_2^2 + ...
#                                  gamma1||d_j||_1 + gamma2 FL(d_j) <= 1 }
#
#     Potentially, n can be very large with this algorithm.
#
# Inputs:
#       X:  double m x n matrix   (input signals)
#               m is the signal size
#               n is the number of signals to decompose
#       D: (optional) double m x p matrix   (dictionary)
#         p is the number of elements in the dictionary
#         When D is not provided, the dictionary is initialized
#         with random elements from the training set.
#       K: (size of the dictionary, optional is D is provided)
#       lambda1:  (parameter)
#       iter: (number of iterations).  If a negative number is
#          provided it will perform the computation during the
#          corresponding number of seconds. For instance iter=-5
#          learns the dictionary during 5 seconds.
#       mode: (optional, see above, by default 2)
#       modeD: (optional, see above, by default 0)
#       posD: (optional, adds positivity constraints on the
#         dictionary, false by default, not compatible with
#         modeD=2)
#       gamma1: (optional parameter for modeD >= 1)
#       gamma2: (optional parameter for modeD = 2)
#       batchsize: (optional, size of the minibatch, by default
#         512)
#       iter_updateD: (optional, number of BCD iterations for the dictionary
#           update step, by default 1)
#       modeParam: (optimization mode).
#         1) if modeParam=0, the optimization uses the
#            parameter free strategy of the ICML paper
#         2) if modeParam=1, the optimization uses the
#            parameters rho as in arXiv:0908.0050
#         3) if modeParam=2, the optimization uses exponential
#            decay weights with updates of the form
#            A_{t} <- rho A_{t-1} + alpha_t alpha_t^T
#       rho: (optional) tuning parameter (see paper arXiv:0908.0050)
#       t0: (optional) tuning parameter (see paper arXiv:0908.0050)
#       clean: (optional, true by default. prunes
#         automatically the dictionary from unused elements).
#       numThreads: (optional, number of threads for exploiting
#         multi-core / multi-cpus. By default, it takes the value -1,
#         which automatically selects all the available CPUs/cores).
#       expand:    undocumented; modify at your own risks!
#       whiten:    undocumented; modify at your own risks!
#       stochastic_deprecated:    undocumented; modify at your own risks!
#       batch:    undocumented; modify at your own risks!
#       log_deprecated:    undocumented; modify at your own risks!
#       logName:    undocumented; modify at your own risks!
#
# Output:
#       D:     double m x p matrix   (dictionary)
#       model:  the model as A B iter
#        D <- spams.trainDL_Memory(X,...)
#
# Authors:
# Julien MAIRAL, 2009 (spams, matlab interface and documentation)
# Jean-Paul CHIEZE 2011-2012 (R interface)
#
# Note:
#     this function admits a few experimental usages, which have not
#     been extensively tested:
#         - single precision setting (even though the output alpha is double
#           precision)
#

The following piece of code illustrates how to use this function. The following piece of code contains usage examples:

library(spams)
I = readPNG(paste(.imagesDir,'lena.png',sep= '/'))
if (length(dim(I)) == 3) {
  A = matrix(I,nrow = nrow(I),ncol = 3 * ncol(I))
} else {
  A = I
}

m = 8;n = 8;
X = spams.im2col_sliding(A,m,n)

X = X - matrix(rep(colMeans(X),nrow(X)),nrow(X),ncol(X),byrow = T)
X = X / matrix(rep(sqrt(colSums(X*X)),nrow(X)),nrow(X),ncol(X),byrow=T)
X = X[,seq(from = 1,to = ncol(X),by = 10)]
lambda1 = 0.15

############# FIRST EXPERIMENT  ##################
tic = proc.time()
D = spams.trainDL_Memory(X,K = 100,lambda1 = lambda1, numThreads = 4,iter = 100)
tac = proc.time()
t = (tac - tic)[['elapsed']]
.printf("time of computation for Dictionary Learning: %f\n",t)

.objective(X,D,lambda1)

#### SECOND EXPERIMENT ####
tic = proc.time()
D = spams.trainDL(X,K = 100,lambda1 = lambda1, numThreads = 4,iter = 100)
tac = proc.time()
t = (tac - tic)[['elapsed']]
.printf("time of computation for Dictionary Learning: %f\n",t)

.objective(X,D,lambda1)

3.3 Function spams.structTrainDL

This function allows to use spams.trainDL with structured regularization functions for the coefficients α. It internally uses the FISTA algorithm.

#
# Name: spams.structTrainDL
#
# Usage: spams.structTrainDL(X,return_model= FALSE,model= NULL,D = NULL,graph = NULL,tree = NULL,
#                     numThreads = -1,tol = 0.000001,fixed_step = TRUE,ista = FALSE,
#                     batchsize = -1,K= -1,lambda1= NULL,lambda2= 10e-10,lambda3 = 0.,
#                     iter=-1,t0=1e-5,regul='none',posAlpha=FALSE,posD=FALSE,expand=FALSE,
#                     modeD='L2',whiten=FALSE,clean=TRUE,verbose=TRUE,gamma1=0.,gamma2=0.,
#                     rho=1.0,iter_updateD=NULL,stochastic_deprecated=FALSE,modeParam=0,
#                     batch=FALSE,log_deprecated=FALSE,logName='')
#
# Description:
#     spams.structTrainDL is an efficient implementation of the
#     dictionary learning technique presented in
#
#     "Online Learning for Matrix Factorization and Sparse Coding"
#     by Julien Mairal, Francis Bach, Jean Ponce and Guillermo Sapiro
#     arXiv:0908.0050
#
#     "Online Dictionary Learning for Sparse Coding"
#     by Julien Mairal, Francis Bach, Jean Ponce and Guillermo Sapiro
#     ICML 2009.
#
#
#     It addresses the dictionary learning problems
#        min_{D in C} (1/n) sum_{i=1}^n 0.5||x_i-Dalpha_i||_2^2 + lambda1 psi(alpha)
#        where the regularization function psi depends on regul
#        (see spams.proximalFlat for the description of psi,
#         and regul below for allowed values of regul)
#
#     C is a convex set verifying
#        1) if modeD=0
#           C={  D in Real^{m x p}  s.t.  forall j,  ||d_j||_2^2 <= 1 }
#        2) if modeD=1
#           C={  D in Real^{m x p}  s.t.  forall j,  ||d_j||_2^2 + ...
#                                                  gamma1||d_j||_1 <= 1 }
#        3) if modeD=2
#           C={  D in Real^{m x p}  s.t.  forall j,  ||d_j||_2^2 + ...
#                                  gamma1||d_j||_1 + gamma2 FL(d_j) <= 1 }
#        4) if modeD=3
#           C={  D in Real^{m x p}  s.t.  forall j,  (1-gamma1)||d_j||_2^2 + ...
#                                  gamma1||d_j||_1 <= 1 }
#
#     Potentially, n can be very large with this algorithm.
#
# Inputs:
#       X:  double m x n matrix   (input signals)
#               m is the signal size
#               n is the number of signals to decompose
#       return_model:
#               if true the function will return the model
#               as a named list ('A' = A, 'B' = B, 'iter' = n)
#       model:        NULL or model (as A,B,iter) to use as initialisation
#       D: (optional) double m x p matrix   (dictionary)
#         p is the number of elements in the dictionary
#         When D is not provided, the dictionary is initialized
#         with random elements from the training set.
#       K: (size of the dictionary, optional is D is provided)
#       lambda1:  (parameter)
#       lambda2:  (optional, by default 0)
#       lambda3: (optional, regularization parameter, 0 by default)
#       iter: (number of iterations).  If a negative number is
#          provided it will perform the computation during the
#          corresponding number of seconds. For instance iter=-5
#          learns the dictionary during 5 seconds.
#       regul: choice of regularization : one of
#           'l0' 'l1' 'l2' 'linf' 'none' 'elastic-net' 'fused-lasso'
#           'graph' 'graph-ridge' 'graph-l2' 'tree-l0' 'tree-l2' 'tree-linf'
#       tree: struct (see documentation of spams.proximalTree);
#           needed for regul of graph kind.
#       graph: struct (see documentation of spams.proximalGraph);
#           needed for regul of tree kind.
#       posAlpha: (optional, adds positivity constraints on the
#           coefficients, false by default.
#       modeD: (optional, see above, by default 0)
#       posD: (optional, adds positivity constraints on the
#         dictionary, false by default, not compatible with
#         modeD=2)
#       gamma1: (optional parameter for modeD >= 1)
#       gamma2: (optional parameter for modeD = 2)
#       batchsize: (optional, size of the minibatch, by default
#          512)
#       iter_updateD: (optional, number of BCD iterations for the dictionary
#          update step, by default 1)
#       modeParam: (optimization mode).
#          1) if modeParam=0, the optimization uses the
#             parameter free strategy of the ICML paper
#          2) if modeParam=1, the optimization uses the
#             parameters rho as in arXiv:0908.0050
#          3) if modeParam=2, the optimization uses exponential
#             decay weights with updates of the form
#             A_{t} <- rho A_{t-1} + alpha_t alpha_t^T
#       ista: (optional, use ista instead of fista, false by default).
#       tol: (optional, tolerance for stopping criteration, which is a relative duality gap
#       fixed_step: (deactive the line search for L in fista and use K instead)
#       rho: (optional) tuning parameter (see paper arXiv:0908.0050)
#       t0: (optional) tuning parameter (see paper arXiv:0908.0050)
#       clean: (optional, true by default. prunes
#          automatically the dictionary from unused elements).
#       verbose: (optional, true by default, increase verbosity)
#       numThreads: (optional, number of threads for exploiting
#          multi-core / multi-cpus. By default, it takes the value -1,
#          which automatically selects all the available CPUs/cores).
#       expand:    undocumented; modify at your own risks!
#       whiten:    undocumented; modify at your own risks!
#       stochastic_deprecated:    undocumented; modify at your own risks!
#       batch:    undocumented; modify at your own risks!
#       log_deprecated:    undocumented; modify at your own risks!
#       logName:    undocumented; modify at your own risks!
#
# Output:
#       D:     double m x p matrix   (dictionary)
#       model:  the model as A B iter
#         D <- spams.structTrainDL(X,return_model = FALSE,...)
#             v <- spams.structTrainDL(X,return_model = TRUE,...)
#             D <- v[[1]]
#             model <- v[[2]]
#             A = model[['A']]
#             B = model[['B']]
#             iter = model[['iter']]
#
# Authors:
# Julien MAIRAL, 2009 (spams, matlab interface and documentation)
# Jean-Paul CHIEZE 2011-2012 (R interface)
#
# Note:
#     this function admits a few experimental usages, which have not
#     been extensively tested:
#         - single precision setting
#

The following piece of code illustrates how to use this function. The following piece of code contains usage examples:

library(spams)
I = readPNG(paste(.imagesDir,'boat.png',sep= '/'))
if (length(dim(I)) == 3) {
  A = matrix(I,nrow = nrow(I),ncol = 3 * ncol(I))
} else {
  A = I
}

m = 8;n = 8;
X = spams.im2col_sliding(A,m,n)

X = X - matrix(rep(colMeans(X),nrow(X)),nrow(X),ncol(X),byrow = T)
X = X / matrix(rep(sqrt(colSums(X*X)),nrow(X)),nrow(X),ncol(X),byrow=T)

lambda1 = 0.05
K = 64
tol = 1e-3
iter = 20
########## FIRST EXPERIMENT ###########
regul = 'l1'
.printf("with Fista Regression %s\n",regul)
tic = proc.time()
D <- spams.structTrainDL(X,K = K,lambda1 = lambda1, tol = tol,numThreads = 4, batchsize = 400,iter = iter,regul = regul)
tac = proc.time()
t = (tac - tic)[['elapsed']]
.printf("time of computation for Dictionary Learning: %f\n",t)
.objective(X,D,lambda1)
#
regul = 'l2'
.printf("with Fista Regression %s\n",regul)
tic = proc.time()
D <- spams.structTrainDL(X,K = K,lambda1 = lambda1, tol = tol,numThreads = 4, batchsize = 400,iter = iter,regul = regul)
tac = proc.time()
t = (tac - tic)[['elapsed']]
.printf("time of computation for Dictionary Learning: %f\n",t)
.objective(X,D,lambda1)
#
regul = 'elastic-net'
.printf("with Fista  %s\n",regul)
tic = proc.time()
D <- spams.structTrainDL(X,K = K,lambda1 = lambda1, tol = tol,numThreads = 4, batchsize = 400,iter = iter,regul = regul)
tac = proc.time()
t = (tac - tic)[['elapsed']]
.printf("time of computation for Dictionary Learning: %f\n",t)
.objective(X,D,lambda1)

# pause :
# readline("pause ")
########### GRAPH
lambda1 = 0.1
tol = 1e-5
K = 10
eta_g = as.vector(c(1, 1, 1, 1, 1),mode='double')
groups = as(matrix(as.vector(c(0, 0, 0, 1, 0,
  0, 0, 0, 0, 0,
  0, 0, 0, 0, 0,
  0, 0, 0, 0, 0,
  0, 0, 1, 0, 0),mode='logical'),ncol = 5,byrow = T),'CsparseMatrix')

groups_var = as(matrix(as.vector(c(1, 0, 0, 0, 0,
  1, 0, 0, 0, 0,
  1, 0, 0, 0, 0,
  1, 1, 0, 0, 0,
  0, 1, 0, 1, 0,
  0, 1, 0, 1, 0,
  0, 1, 0, 0, 1,
  0, 0, 0, 0, 1,
  0, 0, 0, 0, 1,
  0, 0, 1, 0, 0),mode='logical'),ncol = 5,byrow = T),'CsparseMatrix')
graph = list('eta_g'= eta_g,'groups' = groups,'groups_var' = groups_var)

regul = 'graph'
.printf("with Fista  %s\n",regul)
tic = proc.time()
D <- spams.structTrainDL(X,K = K,lambda1 = lambda1, tol = tol,numThreads = 4, batchsize = 400,iter = iter,regul = regul, graph = graph)
tac = proc.time()
t = (tac - tic)[['elapsed']]
.printf("time of computation for Dictionary Learning: %f\n",t)
.objective(X,D,lambda1)
#
regul = 'graph-ridge'
.printf("with Fista  %s\n",regul)
tic = proc.time()
D <- spams.structTrainDL(X,K = K,lambda1 = lambda1, tol = tol,numThreads = 4, batchsize = 400,iter = iter,regul = regul, graph = graph)
tac = proc.time()
t = (tac - tic)[['elapsed']]
.printf("time of computation for Dictionary Learning: %f\n",t)
.objective(X,D,lambda1)
# pause :
# readline("pause ")
###########   TREE

lambda1 = 0.001
tol = 1e-5
treedata = "0 1. [] -> 1 4
1 1. [0 1 2] -> 2 3
4 2. [] -> 5 6
2 1. [3 4]
3 2. [5]
5 2. [6 7]
6 2.5 [8] -> 7
7 2.5 [9]
"
own_variables = as.vector(c(0, 0, 3, 5, 6, 6, 8, 9),mode= 'integer')
N_own_variables = as.vector(c(0,3,2,1,0,2,1,1),mode= 'integer')
eta_g = as.vector(c(1,1,1,2,2,2,2.5,2.5),mode = 'double')
groups = matrix(as.vector(c(0,0,0,0,0,0,0,0,
  1,0,0,0,0,0,0,0,
  0,1,0,0,0,0,0,0,
  0,1,0,0,0,0,0,0,
  1,0,0,0,0,0,0,0,
  0,0,0,0,1,0,0,0,
  0,0,0,0,1,0,0,0,
  0,0,0,0,0,0,1,0),mode='logical'),ncol = 8,byrow = T)
groups = as(groups,'CsparseMatrix')
tree = list('eta_g'= eta_g,'groups' = groups,'own_variables' = own_variables,
  'N_own_variables' = N_own_variables)

#
regul = 'tree-l0'
.printf("with Fista  %s\n",regul)
tic = proc.time()
D <- spams.structTrainDL(X,K = K,lambda1 = lambda1, tol = tol,numThreads = 4, batchsize = 400,iter = iter,regul = regul, tree = tree)
tac = proc.time()
t = (tac - tic)[['elapsed']]
.printf("time of computation for Dictionary Learning: %f\n",t)
.objective(X,D,lambda1)
#
gstruct = spams.groupStructOfString(treedata)
x = spams.treeOfGroupStruct(gstruct)
tree = x[[2]]
regul = 'tree-l2'
.printf("with Fista  %s\n",regul)
tic = proc.time()
D <- spams.structTrainDL(X,K = K,lambda1 = lambda1, tol = tol,numThreads = 4, batchsize = 400,iter = iter,regul = regul, tree = tree)
tac = proc.time()
t = (tac - tic)[['elapsed']]
.printf("time of computation for Dictionary Learning: %f\n",t)
.objective(X,D,lambda1)
#
regul = 'tree-linf'
.printf("with Fista  %s\n",regul)
tic = proc.time()
D <- spams.structTrainDL(X,K = K,lambda1 = lambda1, tol = tol,numThreads = 4, batchsize = 400,iter = iter,regul = regul, tree = tree)
tac = proc.time()
t = (tac - tic)[['elapsed']]
.printf("time of computation for Dictionary Learning: %f\n",t)
.objective(X,D,lambda1)

3.4 Function nmf

This function is an example on how to use the function spams.trainDL for the problem of non-negative matrix factorization formulated in [17]. Note that spams.trainDL can be replaced by spams.trainDL_Memory in this function for small or medium datasets.

#
# Name: nmf
#
# Usage: spams.nmf(X,return_lasso= FALSE,model= NULL,numThreads = -1,batchsize = -1,K= -1,iter=-1,
#           t0=1e-5,clean=TRUE,rho=1.0,modeParam=0,batch=FALSE)
#
# Description:
#     spams.trainDL is an efficient implementation of the
#     non-negative matrix factorization technique presented in
#
#     "Online Learning for Matrix Factorization and Sparse Coding"
#     by Julien Mairal, Francis Bach, Jean Ponce and Guillermo Sapiro
#     arXiv:0908.0050
#
#     "Online Dictionary Learning for Sparse Coding"
#     by Julien Mairal, Francis Bach, Jean Ponce and Guillermo Sapiro
#     ICML 2009.
#
#     Potentially, n can be very large with this algorithm.
#
# Inputs:
#       X:  double m x n matrix   (input signals)
#               m is the signal size
#               n is the number of signals to decompose
#       return_lasso:
#               if true the function will return 2 matrices in a list.
#       K: (number of required factors)
#       iter: (number of iterations).  If a negative number
#         is provided it will perform the computation during the
#         corresponding number of seconds. For instance iter=-5
#         learns the dictionary during 5 seconds.
#       batchsize: (optional, size of the minibatch, by default
#          512)
#       modeParam: (optimization mode).
#          1) if modeParam=0, the optimization uses the
#             parameter free strategy of the ICML paper
#          2) if modeParam=1, the optimization uses the
#             parameters rho as in arXiv:0908.0050
#          3) if modeParam=2, the optimization uses exponential
#             decay weights with updates of the form
#             A_{t} <- rho A_{t-1} + alpha_t alpha_t^T
#       rho: (optional) tuning parameter (see paper
#       arXiv:0908.0050)
#       t0: (optional) tuning parameter (see paper
#       arXiv:0908.0050)
#       clean: (optional, true by default. prunes automatically
#         the dictionary from unused elements).
#       batch: (optional, false by default, use batch learning
#         instead of online learning)
#       numThreads: (optional, number of threads for exploiting
#            multi-core / multi-cpus. By default, it takes the value -1,
#            which automatically selects all the available CPUs/cores).
#       model: struct (optional) learned model for "retraining" the data.
#
# Output:
#       U: double m x p matrix
#       V: double p x n matrix   (optional)
#       model: struct (optional) learned model to be used for
#         "retraining" the data.
#         U <- spams.nmf(X,return_lasso = FALSE,...)
#         v <- spams.nmf(X,return_lasso = TRUE,...)
#         U <- v[[1]]
#         V <- v[[2]]
#
# Authors:
# Julien MAIRAL, 2009 (spams, matlab interface and documentation)
# Jean-Paul CHIEZE 2011-2012 (R interface)
#

The following piece of code illustrates how to use this function. The following piece of code contains usage examples:

library(spams)
I = readPNG(paste(.imagesDir,'boat.png',sep= '/'))
if (length(dim(I)) == 3) {
A = matrix(I,nrow = nrow(I),ncol = 3 * ncol(I))
} else {
A = I
}
m = 16;n = 16;
X = spams.im2col_sliding(A,m,n)
X = X[,seq(from = 1,to = ncol(X),by = 10)]
X = X / matrix(rep(sqrt(colSums(X*X)),nrow(X)),nrow(X),ncol(X),byrow=T)

########## FIRST EXPERIMENT ###########
tic = proc.time()
res <- spams.nmf(X,return_lasso= TRUE,K = 49,numThreads=4,iter = -5)
tac = proc.time()
t = (tac - tic)[['elapsed']]
.printf("time of computation for Dictionary Learning: %f\n",t)
U = res[[1]]
V = res[[2]]
.printf("Evaluating cost function...\n")
R = mean(0.5 * colSums((X - U %*% V) ^ 2))
.printf("objective function: %f\n",R)

3.5 Function nnsc

This function is an example on how to use the function spams.trainDL for the problem of non-negative sparse coding as defined in [14]. Note that spams.trainDL can be replaced by spams.trainDL_Memory in this function for small or medium datasets.

#
# Name: nmf
#
# Usage: spams.nnsc(X,return_lasso= FALSE,model= NULL,lambda1= NULL,numThreads = -1,batchsize = -1,
#            K= -1,iter=-1,t0=1e-5,clean=TRUE,rho=1.0,modeParam=0,batch=FALSE)
#
# Description:
#     spams.trainDL is an efficient implementation of the
#     non-negative sparse coding technique presented in
#
#     "Online Learning for Matrix Factorization and Sparse Coding"
#     by Julien Mairal, Francis Bach, Jean Ponce and Guillermo Sapiro
#     arXiv:0908.0050
#
#     "Online Dictionary Learning for Sparse Coding"
#     by Julien Mairal, Francis Bach, Jean Ponce and Guillermo Sapiro
#     ICML 2009.
#
#     Potentially, n can be very large with this algorithm.
#
# Inputs:
#       X:  double m x n matrix   (input signals)
#               m is the signal size
#               n is the number of signals to decompose
#       return_lasso:
#               if true the function will return 2 matrices in a list.
#       K: (number of required factors)
#       lambda1: (parameter)
#       iter: (number of iterations).  If a negative number
#          is provided it will perform the computation during the
#          corresponding number of seconds. For instance iter=-5
#          learns the dictionary during 5 seconds.
#       batchsize: (optional, size of the minibatch, by default
#          512)
#       modeParam: (optimization mode).
#          1) if modeParam=0, the optimization uses the
#             parameter free strategy of the ICML paper
#          2) if modeParam=1, the optimization uses the
#             parameters rho as in arXiv:0908.0050
#          3) if modeParam=2, the optimization uses exponential
#             decay weights with updates of the form
#             A_{t} <- rho A_{t-1} + alpha_t alpha_t^T
#       rho: (optional) tuning parameter (see paper
#       arXiv:0908.0050)
#       t0: (optional) tuning parameter (see paper
#       arXiv:0908.0050)
#       clean: (optional, true by default. prunes automatically
#          the dictionary from unused elements).
#       batch: (optional, false by default, use batch learning
#          instead of online learning)
#       numThreads: (optional, number of threads for exploiting
#          multi-core / multi-cpus. By default, it takes the value -1,
#          which automatically selects all the available CPUs/cores).
#       model: struct (optional) learned model for "retraining" the data.
#
# Output:
#       U: double m x p matrix
#       V: double p x n matrix   (optional)
#       model: struct (optional) learned model to be used for
#          "retraining" the data.
#          U <- spams.nnsc(X,return_lasso = FALSE,...)
#          v <- spams.nnsc(X,return_lasso = TRUE,...)
#          U <- v[[1]]
#          V <- v[[2]]
#
# Authors:
# Julien MAIRAL, 2009 (spams, matlab interface and documentation)
# Jean-Paul CHIEZE 2011-2012 (R interface)
#

3.6 Function spams.archetypalAnalysis

The function optimizes the archetypal analysis formulation of [7]. It follows the methodology presented in the following paper [37].

#
# Name: spams.archetypalAnalysis
#
# Usage: spams.archetypalAnalysis(X,p = 10,Z0 = NULL,returnAB = FALSE,robust=FALSE,epsilon=1e-3,
#                          computeXtX=TRUE,stepsFISTA=3,stepsAS=50,randominit=TRUE,
#                          numThreads=-1)
#
# Description:
#     documentation to appear soon
#
# Inputs:
#       X:  double m x n matrix   (input signals)
#               m is the signal size
#               n is the number of signals to decompose
#
# Output:
#       Z: double %
#
# Authors:
# Yuansi Chen and Julien MAIRAL, 2014 (spams, matlab interface and documentation)
# Jean-Paul CHIEZE 2011-2012 (R interface)
#

The following piece of code illustrates how to use this function.