Putting up a question like this one gives me a bad conscience... nevertheless I find it surprisingly difficult to google. I am experimenting with
    lapack_int LAPACKE_dgesvd( int matrix_order, char jobu, char jobvt,
                               lapack_int m, lapack_int n, double* a,
                               lapack_int lda, double* s, double* u, lapack_int ldu,
                               double* vt, lapack_int ldvt, double* superb );
which promises a singular value decomposition. Having already gotten over my fear of Fortran, I found a gold mine of information here: http://www.netlib.no/netlib/lapack/double/dgesvd.f
That page explains all parameters except the LAPACKE-specific double* superb (and the order parameter, but in Fortran everything is COL_MAJOR anyway).
Next, at http://software.intel.com/sites/products/documentation/doclib/mkl_sa/11/mkl_lapack_examples/lapacke_dgesvd_row.c.htm I found a program which seems to hint that superb is some kind of worker cache.
However, if that were true, what would be the reason for LAPACKE_dgesvd_work(..)?
A second question: in the example they use min(M,N)-1 as the size of superb. Why?
According to http://www.netlib.no/netlib/lapack/double/dgesvd.f , about the parameter WORK of the Fortran version:

    WORK (workspace/output) DOUBLE PRECISION array, dimension (MAX(1,LWORK))
    On exit, if INFO = 0, WORK(1) returns the optimal LWORK; if INFO > 0,
    WORK(2:MIN(M,N)) contains the unconverged superdiagonal elements of an
    upper bidiagonal matrix B whose diagonal is in S (not necessarily sorted).
    B satisfies A = U * B * VT, so it has the same singular values as A, and
    singular vectors related by U and VT.
There is a good chance that superb is the superdiagonal of this upper bidiagonal matrix B, which has the same singular values as A. This also explains the length: the superdiagonal of a MIN(M,N) x MIN(M,N) bidiagonal matrix has exactly MIN(M,N)-1 elements.
A look at lapack-3.5.0/lapacke/src/lapacke_dgesvd.c (downloaded from http://www.netlib.org/lapack/) confirms it.
The source code also shows that the high-level function LAPACKE_dgesvd() calls the middle-level interface LAPACKE_dgesvd_work(). If you use the high-level interface, you don't have to care about the optimal size of WORK: it is computed and WORK is allocated inside LAPACKE_dgesvd().
I wonder if there is any gain in using the middle-level interface instead... maybe when the function is called many times on small matrices of the same size.